Thread: fseek failing - Error: Invalid argument

  1. #1
    Registered User
    Join Date
    Feb 2009
    Posts
    26

    fseek failing - Error: Invalid argument

    Hi all,

    I have a file consisting of a huge array of floats written in binary mode (fwrite). It's actually a contiguous 2D array. I need to access various parts of the array, and for this I use fseek to jump to a different point and then fread to read in the data.

    I noticed that some of the output coming out of fread was arrays filled entirely with zeros. I traced the problem to my fseek statement. For certain values of the offset, fseek simply fails, and checking errno gives me the error "Invalid argument". My code looks like this:
    Code:
     
    
       int start_row = 181375, n_cols = 2960, n_rows_to_read = 9000;
    
       float *array;
    
       fp = fopen(filename, "rb");
    
       array = (float*)malloc(n_rows_to_read*n_cols*sizeof(float));
      
       if( fseek(fp, start_row*n_cols*sizeof(float), SEEK_CUR) != 0)
       {
            perror("fseek failed");
      
            exit(0);
        }
    
       /** fread and write into outfile.. **/
    For the exact start row that I have set here, 181375, the code works fine. But beyond that, from 181376 onwards, fseek fails and the error, as I already said, is reported as "Invalid argument". I was not able to find any information on why fseek would give an invalid argument error.

    When I read in 9000 rows starting from 181375, the output is fine for all 9000 rows after it. So I know for a fact that the file isn't blank beyond row 181375. But if I am not hitting end of file, why else would fseek give this error?

    Appreciate any help from you guys..

    thanks,

    Avinash
    Last edited by avi2886; 05-07-2009 at 04:57 PM. Reason: Grammar fault.. wrote hitting EOF instead of not hitting eof

  2. #2
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    What is 4 * 2960 * 181376 ? Does that fit in a 32bit int?


    Quzah.
    Hope is the first step on the road to disappointment.

  3. #3
    Registered User
    Join Date
    Sep 2008
    Location
    Toronto, Canada
    Posts
    1,834
    Because when you go beyond row 181375, the offset becomes

    181375 * 2960 * 4 = 2,147,480,000

    ... which is close to the permissible limit of 2,147,483,647 (2^31 - 1).

    Beyond that your calculation overflows, making the parameter a negative integer... which is invalid.

    Go find the fseek() function that can take 64-bit integer offsets.
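
    The arithmetic above can be checked with a small C sketch. The helper names row_offset_bytes and fits_in_32bit_long are made up for illustration, and the sketch assumes a platform where long is 32 bits (as with MSVC) and sizeof(float) is 4.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: byte offset of the start of a row, computed in
   64-bit arithmetic so the product cannot wrap for sizes like these. */
int64_t row_offset_bytes(int64_t row, int64_t n_cols)
{
    return row * n_cols * (int64_t)sizeof(float);
}

/* Hypothetical helper: nonzero if the offset fits in a signed 32-bit long,
   which is what fseek() accepts on the platform discussed here. */
int fits_in_32bit_long(int64_t offset)
{
    return offset >= INT32_MIN && offset <= INT32_MAX;
}

/* Prints the offset for the last row that works and the first that fails. */
void print_boundary(void)
{
    int64_t rows[2] = { 181375, 181376 };
    for (int i = 0; i < 2; i++) {
        int64_t off = row_offset_bytes(rows[i], 2960);
        printf("row %lld -> %lld bytes, fits: %s\n",
               (long long)rows[i], (long long)off,
               fits_in_32bit_long(off) ? "yes" : "no");
    }
}
```

    Calling print_boundary() shows that row 181375 still fits while row 181376 does not, which matches the point where the original fseek starts failing.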

  4. #4
    Registered User
    Join Date
    Feb 2009
    Posts
    26
    Oh, I never thought about that. So the problem is that fseek takes its offset as a signed integer, and my number doesn't fit in one. Makes sense.

    I guess I'll have to use a conditional and fseek in steps that each fit in an int. Right?

    Thanks a lot for the reply. I never thought in that direction.

    cheers,

    Avinash

  5. #5
    int x = *((int *) NULL); Cactus_Hugger's Avatar
    Join Date
    Jul 2003
    Location
    Banks of the River Styx
    Posts
    902
    Code:
    fseek(fp, start_row*n_cols*sizeof(float), SEEK_CUR)
    In addition to what others have said, SEEK_CUR is a seek relative to the current location in the file. Are you sure you don't want SEEK_SET?

    And a negative number is not by definition invalid... depends if you seek off either end of the file. (Edit: But overflowing your integer is - take their advice.)
    long time; /* know C? */
    Unprecedented performance: Nothing ever ran this slow before.
    Any sufficiently advanced bug is indistinguishable from a feature.
    Real Programmers confuse Halloween and Christmas, because dec 25 == oct 31.
    The best way to accelerate an IBM is at 9.8 m/s/s.
    recursion (re - cur' - zhun) n. 1. (see recursion)

  6. #6
    Registered User
    Join Date
    Sep 2008
    Location
    Toronto, Canada
    Posts
    1,834
    Damn!! SEEK_SET... I should have caught that.

  7. #7
    Registered User
    Join Date
    Feb 2009
    Posts
    26
    I fixed the integer overflow and it works fine now! Thanks for the help, guys.
    Posting my code here in case anyone is interested:
    Code:
    int max_fseekrow;
    
    int start_row = 181375, n_cols = 2960, n_rows_to_read = 9000;
    
       float *array;
    
       max_fseekrow = 2147483647 /n_cols*sizeof(float);
    
       fp = fopen(filename, "rb");
    
       array = (float*)malloc(n_rows_to_read*n_cols*sizeof(float));
    
       fseek_checkpt:
    
       if(start_row > max_fseekrow)
       {
           /* seek as far as fits in an int, then loop for the remainder */
           fseek(fp, max_fseekrow*n_cols*sizeof(float), SEEK_CUR);
           start_row = start_row - max_fseekrow;
         
           goto fseek_checkpt;
       }
    
       else
       {
          fseek(fp, start_row*n_cols*sizeof(float), SEEK_CUR);
       }
    
    
       /** fread and printout **/
    @Cactus: Yeah, actually the first few lines of my file are comment/title lines. So in my code I have to fgets them out before my binary array begins and I can start to fseek. Hence the SEEK_CUR. It didn't seem connected, which is why I left it out of the sample code I posted.

  8. #8
    Registered User
    Join Date
    Sep 2008
    Location
    Toronto, Canada
    Posts
    1,834
    Except that your
    max_fseekrow = 2147483647 /n_cols*sizeof(float)

    should be

    max_fseekrow = 2147483647 / (n_cols * sizeof(float) )
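
    To see how far apart the two expressions are, here is a minimal sketch that evaluates both. The function names are invented for illustration, and the numbers in the note below assume sizeof(float) is 4.

```c
#include <stddef.h>

/* As posted: the division by n_cols happens first, and the quotient is
   then multiplied by sizeof(float). */
size_t max_rows_as_posted(int n_cols)
{
    return 2147483647 / n_cols * sizeof(float);
}

/* Corrected: divide by the size in bytes of one whole row. */
size_t max_rows_corrected(int n_cols)
{
    return 2147483647 / (n_cols * sizeof(float));
}
```

    With n_cols = 2960, the posted version yields 2902004 rows while the corrected one yields 181375. The posted limit is so large that the stepping branch is effectively never taken, so the single fseek still overflows.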

  9. #9
    Registered User
    Join Date
    Feb 2009
    Posts
    26
    Hi all,

    Sorry to be posting to an old thread, but I just realised that the problem I announced as fixed above was not actually fixed properly. The issue was that I was unable to fseek beyond a certain number of rows in my file because the offset overflowed the maximum value an int can hold.

    To fix this, I tried a stepped fseek as in the code above. Basically, if my offset was beyond the maximum int, I fseeked as far as the maximum possible row and then fseeked again over the remaining rows. I thought this had fixed the problem because I started to get some output, but I neglected to verify that output at the time. I now realise that although the file pointer moves to some point in the file, it is not the correct point.

    I've tried googling but didn't turn up anything. Does anyone know whether doing two fseeks one after the other is known to have any issues?

    To test the code, I created a huge binary file in which each row is an array of floats filled with the row number. So by printing out the array that is read, I can tell which row I am reading. This code is appended below.
    Code:
    #include <stdio.h>
    #include <stdlib.h>
    
    int main()
    {
    
       FILE *fp;
    
       float *array;
    
       int i, j;
    
       int  n_records = 181500, mz_range = 2960; /** 181375 is the no of rows at which my offset over flows the max int value. So just picked an arbitrary value above that to be the no of rows **/
    
       
       if( (fp =fopen("fseek_test.txt","wb")) == NULL)
       {
         printf("Error.. can't open fseek_test.txt\n");
         exit(0);
       }
    
    
       array = (float*) malloc(mz_range*sizeof(float));
    
       fprintf(fp,"File created for the purpose of testing my stepped fseek code\n");
    
    
       for(i=0; i<n_records; i++)
       {
         
          for( j=0; j<mz_range; j++)
          {
            array[j] = i;
          }
    
        
          fwrite(array, sizeof(float), mz_range, fp);
    
       }
          
    
      fclose(fp);    
         
      printf("Test file created!!\n");
    
      return 0;
    
    }
    The code to test if my stepped fseek works is given below.
    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    
    int main()
    {
      
      FILE *fp;
    
      char szbuf[5000];
    
      int i, max_fseekrow, mz_range = 2960;
      
      int row = 181500, offset_row;
    
      float *array;
    
    
      if( (fp=fopen("fseek_test.txt","rb")) ==NULL)
      {
        printf("Error.. unable to open fseek_test.txt\n");
        
        exit(0);
      }
    
    
      array = (float*) malloc(mz_range*sizeof(float));
    
      fgets(szbuf,5000,fp); /** reading out the title line **/
    
      printf("%s\n\n",szbuf);
    
    
      max_fseekrow = (int) 2147483647 /( mz_range * sizeof(float) ); /** Computing max_fseekrow.. 2147483647 is the biggest possible 32 bit integer **/
    
       
      offset_row = row;
    
    
      printf("Offset row= %d\n",offset_row);
    
      fseek_checkpt:   
      if( offset_row > max_fseekrow)
        {
    
           fseek(fp, max_fseekrow*mz_range*sizeof(float), SEEK_CUR);
    
           printf("Done with the first fp movement\n");
    
           if( (fread(array, sizeof(float), mz_range, fp)) != mz_range)
           {
    
             printf("Possible read error in fread");
    
             exit(0);
    
           }
    
       
           for( i=0; i<10; i++)
           {
    
              printf("%f\t",array[i]);
           }
    
    
    
           offset_row = offset_row - max_fseekrow -1;
    
           goto fseek_checkpt;
        }
    
       else
       {
    
         if( ( fseek(fp, offset_row*mz_range*sizeof(float), SEEK_CUR)) != 0)
         {
           printf("Error.. fseek didnt work!\n");
           exit(0);
         } 
    
       } 
    
    
      
       
       if( (fread(array, sizeof(float), mz_range, fp)) != mz_range)
       {
    
         printf("Possible read error in fread");
    
         exit(0);
    
       }
    
       
       for( i=0; i<10; i++)
       {
    
          printf("%f\t",array[i]);
       }
    
    
    
      fclose(fp);
    
      return 0;
     }
    As you can see, the code calculates the maximum number of rows that I can fseek over without the offset overflowing an int. If the number of rows to skip exceeds this maximum, I first fseek over max_fseekrow rows. Then, to get to the point I actually want, I fseek over the rest (offset_row - max_fseekrow). [Note: since I read in 1 row to check that the first fseek worked properly, I set offset_row to offset_row - max_fseekrow - 1.]

    The fread after the first fseek gives the proper row. But the fread after the second fseek only prints zeros, irrespective of whether I offset by offset_row rows or by just 1 row. Commenting out the second fseek and just doing freads, however, gives me proper output. So I am sure the file isn't empty or anything. Does anyone have any idea what I am doing wrong? I would be grateful for any advice.

    best,

    Avinash

  10. #10
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    You should be looking at the fseeko()/fseeko64() family of functions, which use an off_t/off64_t instead of an integer to specify the offset. Using certain macro definitions, you can cause the off_t type to become 64-bit, allowing seeks larger than 2^31-1.
    Code:
    //try
    //{
    	if (a) do { f( b); } while(1);
    	else   do { f(!b); } while(1);
    //}

  11. #11
    Registered User
    Join Date
    Feb 2009
    Posts
    26
    Thanks for the quick reply, brewbuck. I am developing on Windows using VC++. Do you happen to know how to get fseeko64() in the VC++ environment? Or do you know of any alternatives that I can use on Windows?

    thanks

  12. #12
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by avi2886 View Post
    Thanks for the quick reply, brewbuck. I am developing on Windows using VC++. Do you happen to know how to get fseeko64() in the VC++ environment? Or do you know of any alternatives that I can use on Windows?

    thanks
    The support for these (POSIX standard!) functions is flaky in MSVC. In that case, look at fgetpos() and fsetpos(). They are less straightforward to use (they take an fpos_t *, a pointer to an opaque position type) but by wrapping them in convenience functions you can abstract that away.

    Since fsetpos() does not allow SEEK_CUR or SEEK_END-like behavior, you'll need to emulate them yourself (with unportable twiddling) -- SEEK_END can be emulated by using a normal fseek( fp, 0, SEEK_END ) to get to the end of the file (which works even for files larger than 2^31-1) followed by fgetpos() to get the fpos_t corresponding to the end of the file.

    Yeah, it blows.

  13. #13
    Registered User
    Join Date
    Feb 2009
    Posts
    26
    I haven't worked with the fpos_t datatype before. Does it store the position in bytes? I mean, if I wanted to move a certain number of bytes away from the current position, how should I increment the position?

    I tried
    Code:
     fgetpos(fp, &curr_pos);
    
       
      offset_pos = curr_pos + offset_row*mz_range*sizeof(float) ;
    
      fsetpos(fp, &offset_pos);
    Here offset_row*mz_range*sizeof(float) is the number of bytes that I want to move away from the current position. But this doesn't seem to work either.

  14. #14
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by avi2886 View Post
    I haven't worked with the fpos_t datatype before. Does it store the position in bytes? I mean if I wanted to move to a certain bytes away from the current position, how should I increment the position?
    fpos_t is intended to be an opaque type which you should not manipulate directly. That's why I referred to this as "unportable twiddling." Unfortunately, until MSVC provides better (i.e. standard) support for general large file seeks, this is the best solution I'm aware of.

    You'll need to find where fpos_t is defined, look at it and figure out how to adjust the values contained therein to control the offset. I did this about a year ago, and if I remember it wasn't too difficult. Again, it's completely non-portable though.

    Windows obviously supports large files, and large file seeking. But in typical MS fashion these features are only really accessible through non-standard APIs.

  15. #15
    Registered User
    Join Date
    Sep 2008
    Location
    Toronto, Canada
    Posts
    1,834
    _fseeki64
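
    A minimal sketch of how _fseeki64 could be used for the file layout in this thread (a text title line followed by rows of floats). The SEEK64 macro and the read_row helper are hypothetical names; the non-Windows branch is only an assumption so the sketch also builds where fseeko() is available. The offset is computed in 64-bit arithmetic before the call, so it does not wrap at 2^31 - 1.

```c
#define _FILE_OFFSET_BITS 64   /* only matters for the non-Windows branch */
#include <stdio.h>
#include <stdint.h>

#ifdef _WIN32
  /* MSVC: _fseeki64 takes a 64-bit offset. */
  #define SEEK64(fp, off, whence) _fseeki64((fp), (off), (whence))
#else
  /* Assumption for other platforms: fseeko() with a 64-bit off_t. */
  #include <sys/types.h>
  #define SEEK64(fp, off, whence) fseeko((fp), (off_t)(off), (whence))
#endif

/* Hypothetical helper: read one row of n_cols floats into out.
   data_start is the byte position where the binary data begins, for
   example the value of ftell() right after fgets() has consumed the
   title line. Returns the number of floats actually read. */
size_t read_row(FILE *fp, int64_t data_start, int64_t row,
                size_t n_cols, float *out)
{
    int64_t offset = data_start
                   + row * (int64_t)n_cols * (int64_t)sizeof(float);

    if (SEEK64(fp, offset, SEEK_SET) != 0)
        return 0;   /* seek failed */

    return fread(out, sizeof(float), n_cols, fp);
}
```

    Seeking with SEEK_SET from a remembered data_start also avoids depending on wherever earlier reads left the file position, which is the concern raised in post #5.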

    Last Post: 11-29-2003, 02:03 AM