Thread: Determine file size

  1. #1
    Registered User
    Join Date
    Aug 2009
    Posts
    7

    Determine file size

    I have a very dumb beginner question:

    How do I determine the size in bytes of a file in C?

    Thank you

  2. #2
    and the hat of sweating
    Join Date
    Aug 2007
    Location
    Toronto, ON
    Posts
    3,545
    To use only standard C functions? Open a file, keep reading bytes until you reach EOF, close the file.

    To do it the easy & less portable way? Use something like the POSIX stat() function (or _stat() as Microsoft likes to call it).
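    A minimal sketch of the stat() route, assuming a POSIX system. The helper name file_size_stat is made up for illustration and is not part of any library.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Returns the size in bytes of the file at `path`, or -1 on error.
   file_size_stat is a hypothetical helper name, not a POSIX function. */
long long file_size_stat(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0) {
        perror("stat");         /* report why the lookup failed */
        return -1;
    }
    return (long long)st.st_size;
}
```

    The file is never opened or read, so the cost does not depend on how large it is.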
    "I am probably the laziest programmer on the planet, a fact with which anyone who has ever seen my code will agree." - esbo, 11/15/2008

    "the internet is a scary place to be thats why i dont use it much." - billet, 03/17/2010

  3. #3
    Registered User
    Join Date
    Aug 2009
    Posts
    7
    Yes, using only C standard functions.

    Code:
    int br;
    while(!feof(fp)) {
        fscanf(fp, "%d", &br);
        filesize+=sizeof(br);
    }
    It is OK if all the data in the file is of type integer. But what happens if the file contains different data types?

  4. #4
    Registered User
    Join Date
    Sep 2007
    Posts
    1,012
    First off, while(!feof(fp)) is almost never the right thing to do. You should instead check the return value of your read function, which will tell you whether you've hit EOF.
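    As a rough sketch of what "check the return value" means for the fscanf() loop, something along these lines would stop as soon as a read fails. The name count_ints is made up for illustration, and note that it counts parsed values, not bytes.

```c
#include <stdio.h>

/* Counts the integers that fscanf() manages to parse from fp.
   count_ints is a hypothetical name used only for this sketch. */
size_t count_ints(FILE *fp)
{
    int value;
    size_t count = 0;

    /* fscanf() returns the number of items converted, so the loop ends
       on end-of-file (EOF is returned) and on malformed input (0). */
    while (fscanf(fp, "%d", &value) == 1)
        count++;

    return count;
}
```

    With the feof() version, the end-of-file flag is only set after a read has already failed, so the loop body typically runs one extra time when the file ends in a newline.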

    To get the file size (or more accurately, to find out how many bytes you can read, which is not necessarily the same thing), you would not want to use fscanf(); instead, something like:
    Code:
    size_t n = 0;
    while(getc(fp) != EOF) n++;
    But to really get the file size, you want a platform-specific function, as cpjust mentioned.
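    If you do stay with standard C, a variant of the same idea that reads in blocks with fread() should be noticeably quicker than one getc() call per byte. This is only a sketch: count_readable_bytes is a made-up name and the 4096-byte buffer is an arbitrary choice.

```c
#include <stdio.h>

/* Counts the bytes that can be read from fp, starting at the current
   position. Returns (size_t)-1 if a read error occurs.
   count_readable_bytes is a hypothetical helper name. */
size_t count_readable_bytes(FILE *fp)
{
    unsigned char buf[4096];    /* arbitrary block size */
    size_t total = 0;
    size_t got;

    /* fread() returns how many bytes it actually read; 0 means
       end-of-file or an error, which ferror() tells apart below. */
    while ((got = fread(buf, 1, sizeof buf, fp)) > 0)
        total += got;

    if (ferror(fp))
        return (size_t)-1;
    return total;
}
```

    It still reads the whole file, so the cost grows with the file size.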

    Someone may mention the trick of using fseek() to go to the end of the file and then ftell() to get its size, but as tends to be the case with C, there are problems with that approach: 1) fseek() with SEEK_END (i.e., the end of the file) on a binary stream need not work, and 2) ftell() on a text stream might return a value that is meaningless to you.

    C is fun like that.

  5. #5
    Registered User
    Join Date
    Jun 2009
    Location
    US of A
    Posts
    305
    Quote Originally Posted by cas View Post
    Someone may mention the trick of using fseek() to go to the end of the file and then ftell() to get its size, but as tends to be the case with C, there are problems with that approach: 1) fseek() with SEEK_END (i.e., the end of the file) on a binary stream need not work, and 2) ftell() on a text stream might return a value that is meaningless to you.

    C is fun like that.
    You hit me on the right spot. I was experimenting with the code below to find the file size, but it always returns 21 (though the file size is 145 KB).
    Code:
    #include <stdio.h>
    
    int main()
    {
    	FILE *fp;
    	long file_size;
    	int size;
    
    	fp = fopen("C:\\Documents and Settings\\ROHAN\\Desktop\\DESKTOP\\filejava.txt", "rb");
    	if(fp == NULL)
    	{
    		printf("\n Error opening the file");
    		return 0;
    	}
    
    	size = file_length(fp);
    	printf("\n Size of file is %d", size);
    
    	close(fp);
    
    	return 0;
    }
    
    
    int file_length(FILE *f)
    {
    	int pos;
    	int end;
    
    	pos = ftell (f);
    	fseek (f, 0, SEEK_END);
    	end = ftell (f);
    	fseek (f, pos, SEEK_SET);
    
    	return end;
    }
    But I can't understand why.

  6. #6
    Registered User
    Join Date
    Sep 2004
    Location
    California
    Posts
    3,268
    But I can't understand why.
    Because the C standard explicitly states:
    Quote Originally Posted by c99 standard
    Setting the file position indicator to end-of-file, as with fseek(file, 0, SEEK_END), has
    undefined behavior for a binary stream (because of possible trailing null characters) or for any stream
    with state-dependent encoding that does not assuredly end in the initial shift state.
    bit∙hub [bit-huhb] n. A source and destination for information.

  7. #7
    Registered User
    Join Date
    Oct 2008
    Location
    TX
    Posts
    2,059
    Quote Originally Posted by roaan View Post
    ...
    But I can't understand why.
    For starters you're opening a FILE stream but calling close() which takes a file descriptor as an argument.

    Don't know about Windows but Unix doesn't make any distinction between binary and text files. So fseek() followed by ftell() will give you the correct size of the file. Alternatively you can call getc() to read a byte at a time until EOF. Both methods will work on any of the Unixes.
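    A cleaned-up sketch of the fseek()/ftell() helper, for platforms where seeking to the end of a binary stream is supported (as on Unix for regular files). It keeps the file_length name from the post above but returns long, which is what ftell() returns, and checks every call.

```c
#include <stdio.h>

/* Returns the length of the stream in bytes, or -1L on failure, and
   restores the original file position before returning.
   Assumption: SEEK_END works on this stream, which the C standard
   does not guarantee for binary streams. */
long file_length(FILE *f)
{
    long pos, end;

    pos = ftell(f);                     /* remember where we were */
    if (pos == -1L)
        return -1L;
    if (fseek(f, 0L, SEEK_END) != 0)    /* jump to the end */
        return -1L;
    end = ftell(f);                     /* offset of the end = size */
    if (fseek(f, pos, SEEK_SET) != 0)   /* go back */
        return -1L;
    return end;
}
```

    The caller would declare or define file_length() before using it, print the result with %ld, and release the stream with fclose(fp) since it came from fopen().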

  8. #8
    Registered User
    Join Date
    Jun 2009
    Location
    US of A
    Posts
    305
    Quote Originally Posted by itCbitC View Post
    For starters you're opening a FILE stream but calling close() which takes a file descriptor as an argument.

    Don't know about Windows but Unix doesn't make any distinction between binary and text files. So fseek() followed by ftell() will give you the correct size of the file. Alternatively you can call getc() to read a byte at a time until EOF. Both methods will work on any of the Unixes.
    Oh yes, that was a typo, and I was wondering why it was giving me an error. I have now changed it to fclose().

    Also, I am on a Windows machine, so fseek() and ftell() give garbage, as pointed out in the post above quoting the C99 standard.

  9. #9
    Registered User
    Join Date
    Oct 2008
    Posts
    1,262
    Well, reading all the data with getc is a horrible solution, really. Even though the standard apparently doesn't describe another portable way, I would never use such a thing. Imagine a 2 GB file where you need to know the filesize and then only a portion of the file. It would take several seconds to determine the filesize, whereas with unportable techniques it would take only a few clock cycles. Just write it for your OS and rewrite it if you need to port it.

    Anyway, does the same hold true for C++ files? As in, doing seekg and tellg on an std::ifstream being illegal?

  10. #10
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    It would take several seconds to determine the filesize
    Plus, it flushes everything else out of the disk cache, which dramatically reduces the computer's responsiveness afterwards, at least on Linux.

  11. #11
    Registered User
    Join Date
    Oct 2008
    Location
    TX
    Posts
    2,059
    No doubt getc() is a drag on large files. There are obvious tradeoffs between the portability and efficiency of a solution. IMO stat() or fseek()/ftell() are the preferred ways to get the file size, but on Windows they apparently won't work. getc() is portable but very inefficient. So there you have it: the pros and cons of each solution. Perhaps the OP should pick the one that best suits the requirements.

  12. #12
    and the hat of sweating
    Join Date
    Aug 2007
    Location
    Toronto, ON
    Posts
    3,545
    Quote Originally Posted by itCbitC View Post
    No doubt getc() is a drag on large files. There are obvious tradeoffs between the portability and efficiency of a solution. IMO stat() or fseek()/ftell() are the preferred ways to get the file size, but on Windows they apparently won't work. getc() is portable but very inefficient. So there you have it: the pros and cons of each solution. Perhaps the OP should pick the one that best suits the requirements.
    Why won't stat() (a.k.a. _stat()) work on Windows?

  13. #13
    Registered User
    Join Date
    Oct 2008
    Location
    TX
    Posts
    2,059
    Quote Originally Posted by cpjust View Post
    Why won't stat() (a.k.a. _stat()) work on Windows?
    I've never coded on Windows so I couldn't say, but the fact that stat() on Unix is _stat() on Windows shows that it is platform dependent.

  14. #14
    Registered User
    Join Date
    Oct 2008
    Posts
    1,262
    Quote Originally Posted by itCbitC View Post
    I've never coded on Windows so I couldn't say, but the fact that stat() on Unix is _stat() on Windows shows that it is platform dependent.
    I feel happy for you! I've had the misfortune of having to code on VS about 3 times. Wow, it's bad. They add warnings about all the basic string functions they don't like. strcat? No, use strncatuhegreg instead (I don't remember the exact name, only that it was a needless extension intended for people who can't code).
    And then.. _stat? Are you kidding me? Those naming conventions are nearly as bad as PHP.
    What will we do to that name? Hmm let's add an underscore prefix. Or two. Let's add a random character to the end of it. Let's just scrap a function entirely and make our own.

    It's disgusting.

    I hope, for your sake, that you never will have to code on Windows. At least not on VS, I reckon other compilers like gcc do a pretty good job.
    Don't even get me started on that dev-c++ crap. For fun, compile something and then open it with a disassembler. Don't be surprised to see one register being written to only to never be used before it's overwritten again. Oh, and the automatic indentation is the worst I've ever seen. Not to mention the many crashes I've experienced.

    Again. Stick to non-Windows if you can.

  15. #15
    and the hat of sweating
    Join Date
    Aug 2007
    Location
    Toronto, ON
    Posts
    3,545
    VC++ is awesome, but Microsoft's ridiculous renaming of standard functions is stupid.
    Last time I used them it didn't care if you typed stat() or _stat(), but maybe they've completely deprecated them in the newest versions of VC++?

    Either way, it's easy to fix with a simple #ifdef/#define.
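    Something like the following is what I have in mind. It is only a sketch: PORTABLE_STAT, portable_stat_t and portable_file_size are made-up names, and it assumes the compiler predefines _WIN32 on Windows (MSVC and MinGW do).

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Pick the platform's spelling once, then use one name everywhere.
   PORTABLE_STAT and portable_stat_t are hypothetical names. */
#ifdef _WIN32
#define PORTABLE_STAT _stat
typedef struct _stat portable_stat_t;
#else
#define PORTABLE_STAT stat
typedef struct stat portable_stat_t;
#endif

/* Returns the file size in bytes, or -1 if the lookup fails. */
long long portable_file_size(const char *path)
{
    portable_stat_t st;

    if (PORTABLE_STAT(path, &st) != 0)
        return -1;
    return (long long)st.st_size;
}
```

    For files larger than 2 GB on Windows, the 64-bit variant _stat64 would probably be the one to map to instead.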
