Thread: end of file byte

  1. #1
    Registered User
    Join Date
    May 2009
    Posts
    242

    end of file byte

    I'm just wondering what this byte actually looks like. My textbook says that it's dependent on OS but often ctrl + Z.

    Ok, fair enough. But as I see it, there are only 256 possible distinct values for a byte, and all are listed in a chart of ASCII codes, in which some of the ones 31 and lower are still mysterious to me.

    But if you want to interpret those bytes as numbers rather than characters, you can't exclude any of the 256 possible values. An int, for example, needs all 256 possibilities in each byte if it's going to store its 4 billion plus possible values in 4 bytes of memory.

    Do we actually need more than 4 bytes to store an arbitrary integer on disk so that there's available space to tell us when we've reached the end of the file?

    Could someone explain the mechanics of this?

    What I'm not seeing is how a file can at the same time leave all 256 options open for storing a numeric value in a byte and at the same time have some specific byte value that definitively signals the end of a file.

  2. #2
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    End-of-file bytes are used in systems that do not store exact file lengths in the filesystem's directory structure. DOS, Windows and Linux all have the exact (to a single byte) length stored in the filesystem.

    PDP-11 RSTS/E or i8080 or Z80 CP/M would be examples of OSes that store "the number of blocks used by the file" rather than the number of bytes, so they need an end-of-file byte to show where the actual data ends within the last block.

    Since this only matters for text files, it's fine to reserve certain bytes for indication purposes. Likewise, we have newlines, carriage returns, and other control characters to tell the output device to go to the next line, and such.
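
    To make the marker idea concrete, here is a minimal sketch, assuming a CP/M-style text file whose last block is padded with Ctrl-Z (byte value 0x1A). The names text_length and CTRL_Z are made up for illustration; they are not from any OS or library.

```c
#include <stddef.h>
#include <string.h>

/* Ctrl-Z: the byte value CP/M-style text files use as an end-of-text marker. */
#define CTRL_Z 0x1A

/* Return the number of text bytes before the first Ctrl-Z in buf,
 * or n if no marker is present (the text fills every block exactly). */
size_t text_length(const unsigned char *buf, size_t n)
{
    const unsigned char *marker = memchr(buf, CTRL_Z, n);
    return marker ? (size_t)(marker - buf) : n;
}
```

    The filesystem only knows the file occupies whole blocks; the reader scans for the marker to find where the text really stops. That only works because text never legitimately contains 0x1A.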

    Binary files, such as data files and JPEGs, do not use control characters for these purposes, so in these files the special characters do not indicate, for example, end-of-file. For such files we should use the "binary" modifier to tell the stream management "do not treat control characters as special", so you just read each byte for what it is. No value has any special meaning: all values 0-255 are "valid data".
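
    As a rough illustration of the binary case in C, the sketch below opens a file with "rb" and counts its bytes. The function name count_bytes is hypothetical. The point to notice is that getc returns an int: the values 0-255 are data, and EOF is a separate negative value that the library returns once the filesystem's recorded length is exhausted. It is never a byte stored in the file.

```c
#include <stdio.h>

/* Count every byte in the file at path, opened in binary mode so that
 * no control character is translated or treated as end-of-file.
 * Returns -1 if the file cannot be opened. */
long count_bytes(const char *path)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    long count = 0;
    int c;  /* int, not char: it must hold 0..255 and also EOF */
    while ((c = getc(fp)) != EOF)
        count++;

    fclose(fp);
    return count;
}
```

    A file containing the bytes 0x41 0x1A 0xFF would be counted as 3 bytes here. Opened in text mode, some C runtimes (on Windows, for instance) may stop at the 0x1A instead.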

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  3. #3
    Registered User
    Join Date
    May 2009
    Posts
    242
    ok, makes sense. thanks!
