2d Array and a SegFault

**John...** · 08-29-2016

Hello everyone. I'm encountering a segmentation fault in the following code for large values of SIZE. It executes as expected when SIZE equals 1000. But when SIZE equals 10000, I get a segfault.

Does anyone know why I am receiving this segmentation fault?

Code:

void populate_grid(unsigned int size, unsigned long long grid[][size]){
   unsigned int i, j;


   //seed the grid with a few values
   for(j = 0;j < size;j++)
   {
      grid[0][j] = j + 1;
      grid[j][j] = 1;
   }


   //use n choose r recurrence relation
   for(i = 1;i < size;i++)
      for(j = i + 1;j < size;j++)
         grid[i][j] = grid[i][j - 1] + grid[i - 1][j - 1];


}

main.c

Code:

#define SIZE 10000


int main(){
   unsigned long long grid[SIZE][SIZE];


   populate_grid(SIZE, grid);
//   out(SIZE, grid);


//   choose(50, 20, SIZE, grid);


   return 0;
}

**John...** · 08-29-2016

Upon some further research, it appears I am overflowing my OS stack. I appended the qualifier static to the array in main and this solved the issue.

**rstanley** · 08-30-2016

Or better, use malloc() to dynamically allocate the memory on the heap.

**John...** · 08-30-2016

rstanely - yes that is the plan for the next version. Thank you!

So I've written a brute-force approach to compare the two version's performance. But now I've encountered a floating point exception, and I'm not sure where it is occurring. Again the error only occurs on "large-ish" values of size. It works as expected for size equals 50, but generates the floating point exception at size equals 70.

Code:

unsigned long long factorial(unsigned int n){
   unsigned long long i, value;
   for(i = value = 1;i <= n;i++)
      value *= i;
   return value;
}


void populate_grid(unsigned int size, unsigned long long grid[][size]){
   for(unsigned int i = 0;i < size;i++)
      for(unsigned int j = i;j < size;j++)
         grid[i][j] = factorial(j + 1) / (factorial(i + 1) * factorial(j - i));
}

**algorism** · 08-30-2016

The largest value for a 64-bit unsigned int is 18,446,744,073,709,551,615.
The largest factorial that fits in a 64-bit unsigned int is 20!
50! is 65 digits long.
70! is 100 digits long.

At some point, the value is wrapping around and producing 0, which will then keep giving zero from that point on. So you ended up dividing by zero at some point. Running a little test program yields:

Code:

.
  1:                      1
  2:                      2
  3:                      6
  4:                     24
  5:                    120
  6:                    720
  7:                   5040
  8:                  40320
  9:                 362880
 10:                3628800
 11:               39916800
 12:              479001600
 13:             6227020800
 14:            87178291200
 15:          1307674368000
 16:         20922789888000
 17:        355687428096000
 18:       6402373705728000
 19:     121645100408832000
 20:    2432902008176640000
 21:   14197454024290336768     wrong answers from here on...
 22:   17196083355034583040
 23:    8128291617894825984
 24:   10611558092380307456
 25:    7034535277573963776
 26:   16877220553537093632
 27:   12963097176472289280
 28:   12478583540742619136
 29:   11390785281054474240
 30:    9682165104862298112
 31:    4999213071378415616
 32:   12400865694432886784
 33:    3400198294675128320
 34:    4926277576697053184
 35:    6399018521010896896
 36:    9003737871877668864
 37:    1096907932701818880
 38:    4789013295250014208
 39:    2304077777655037952
 40:   18376134811363311616
 41:   15551764317513711616
 42:    7538058755741581312
 43:   10541877243825618944
 44:    2673996885588443136
 45:    9649395409222631424
 46:    1150331055211806720
 47:   17172071447535812608
 48:   12602690238498734080
 49:    8789267254022766592
 50:   15188249005818642432
 51:   18284192274659147776
 52:    9994050523088551936
 53:   13175843659825807360
 54:   10519282829630636032
 55:    6711489344688881664
 56:    6908521828386340864
 57:    6404118670120845312
 58:    2504001392817995776
 59:     162129586585337856
 60:    9727775195120271360
 61:    3098476543630901248
 62:    7638104968020361216
 63:    1585267068834414592
 64:    9223372036854775808
 65:    9223372036854775808
 66:                      0
 67:                      0
 68:                      0
 69:                      0
 70:                      0
  ...

**algorism** · 08-30-2016

If this is another Project Euler problem (which one?) then you should analyze the calculation you're doing since it seems to me you might be able to do it without calculating factorials at all.

If you really need giant numbers, you can use a "big number" library like https://gmplib.org/

**John...** · 08-30-2016

Haha algorism, nice to see you poking around my thread. It IS indeed a project euler problem (https://projecteuler.net/problem=53). My approach is to use my recurrence relation above (first populate_grid() function posted) to generate n choose r values for a fixed r while looping through n (thus avoiding using factorials). Once I reach an n for a given r that outputs a value greater than or equal to one million, I can stop generating choose values for that particular r and calculate how many n's for that particular r produce a 'n choose r' value at or above one million. I believe my answer is a few lines away, but I got sidetracked by my exceptions!

I often find myself too concerned with the decimal value project euler wants instead of learning from my errors along the way. If you would like to take a gander at the next problem I plan on solving, here it is.

https://projecteuler.net/problem=101

The issue I'm having with problem 101 is finding the right linear algebra library to use. I've been struggling with GSL CBLAS. My solution involves interpolating the data to form the desired n'th degree polynomial with an Ax=b matrix-vector equation for n = 1,2,...,10. Obviously I can solve this by hand, in relatively trivial time, but I'm treating it as an exercise in coding linear algebra problems. But I digress... I will start a new thread for that guy once I get started on that problem.

**algorism** · 08-30-2016

If the answer to problem 53 is 4075 then I have a 20 line solution that uses only 32-bit ints. It takes advantage of pascal's triangle and constrains maximum values to 1000001.

Problem 101 looks like a doozy!

**John...** · 08-30-2016

Code:

void solve53(unsigned int size, unsigned int grid[][size], unsigned int ceiling){


   int i, j, sum = 0;


   //seed the grid with a few values
   for(j = 0;j < size;j++)
   {
      grid[0][j] = j + 1;
      grid[j][j] = 1;
   }


   //use n choose r recurrence relation
   for(i = 1;i < size;i++)
      for(j = i + 1;j < size;j++)
      {
         if((grid[i][j] = grid[i][j - 1] + grid[i - 1][j - 1]) > ceiling)
         {
            sum += size - j;
            break;
         }
      }
   printf("%d\n", sum);
}

Ahh yes, it is Pascal's triangle. I was using using a combinatoric identity from my textbook and didn't realize it essentially generates the triangle. Pretty neat. I assume our solutions are somewhat similar? And yes! You've got the correct answer.

**algorism** · 08-30-2016

Originally Posted by John...

I assume our solutions are somewhat similar?

Similar in some ways, different in others. I like the way you took advantage of the fact that once a value was over 1000000, the rest of the values on that line (up to the middle) will also be over 1000000.

The only advantage of mine is that I only store a single line of the triangle. Combining both ideas may yield an optimal solution!

Code:

#include <stdio.h>

#define MAXVAL  1000000
#define MAXLINE 100

int main(void) {
    int n[MAXLINE] = {1, 1};  // line 1 of Pascal's triangle
    int cnt = 0;

    for (int line = 2; line <= MAXLINE; line++) {
        int prev = 1; // first number is always 1
        for (int i = 1; i < line; i++) { // start at 2nd number
            int r = prev + n[i];
            if (r > MAXVAL) {
                cnt++;
                r = MAXVAL;  // constrain value size
            }
            prev = n[i];
            n[i] = r;
        }
        n[line] = 1; // last number is always 1
    }

    printf("%d\n", cnt);
    return 0;
}

BTW, I was able to solve 101 in about 40 lines using a little trick. No external libraries needed.

**John...** · 08-30-2016

BTW, I was able to solve 101 in about 40 lines using a little trick. No external libraries needed.

Well, that is very interesting. Did you still use linear algebra concepts and just implement your own routine(s)?

**algorism** · 08-30-2016

Originally Posted by John...

Well, that is very interesting. Did you still use linear algebra concepts and just implement your own routine(s)?

No linear algebra (unless it's somehow implied in the method).

Here's the idea applied to the n-cubed example.

The first row is the values of the function. The subsequent rows are the differences between each pair of values in the previous row. Keep going until you get 0's (the previous row is all the same value).

Code:

1   8  27  64  125
  7  19  37  61
   12  18  24
     6    6
        0

Working back up from that point gives us the next value in the original sequence:

Code:

1   8  27  64  125    216
  7  19  37  61     91
   12  18  24    30
     6    6    6
        0

To get the so-called FITs, just add one value at a time:

Code:

1                = 1

1   8         +7 = 15
  7

1   8   27    +31 = 58
  7   19      +12 = 31
    12

1    8    27    64   +61 = 125   (the correct answer)
  7    19    37      +24 = 61
    12    18         +6 = 24
       6

You get the correct answer when you have enough values in the initial sequence to reach the row with repeated values. The FITs are 1, 15 and 58.

**John...** · 08-30-2016

Wow. That is amazing. I have no idea how this works. What kind of method would you call this? Do you know of any sort of proof or reason why this should produce fits so nicely?

**algorism** · 08-31-2016

Apparently it's called "the method of common differences".
Sequences: The Method of Common Differences

**John...** · 08-31-2016

Okay, so what I gather is that the method of common differences allows one to be able to estimate the degree of the polynomial of the generating function accurately. We do this by taking differences of successive known terms in the sequence until we come across a row with all same values. Apparently, calculus is a way of understanding why, if it takes until the n-th row to produce a row with all same values, the polynomial is n-th degree. I've studied calculus, and I don't quite see the reasoning.

And this is all well and good, but I still don't see how this generates FIT's that should be produced by bad generating functions.

I don't know about your math background, but do you know of the calculus explanation the page speaks of? Also any ideas about the FIT's in particular? Sorry for the barrage of questions, but I find your solution so very awesome and I wish to understand.

Thread: 2d Array and a SegFault

Thread Tools

Search Thread

Display

2d Array and a SegFault

Similar Threads

SegFault Help!

Segfault... Why?

Segfault

Reading a txt file into an array: unexpected segfault

segfault with gcc, but not with TC

Tags for this Thread