fread bytes of data from txt file: break on newline


  1. #1
    Registered User
    Join Date
    Dec 2007
    Posts
    23

    fread bytes of data from txt file: break on newline

    I am reading txt data in binary. But I am not sure how to break the lines up, based on "\n", after reading the file content into a buffer. Any hints?
    fedora 6, gcc 4.1.2

  2. #2
    CSharpener vart
    Join Date
    Oct 2006
    Location
    Rishon LeZion, Israel
    Posts
    6,484
    Use strchr to locate the '\n' character (do not forget to make the buffer nul-terminated beforehand).
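    A minimal sketch of that suggestion, assuming the whole file has already been read with fread into a buffer one byte larger than the file and that a '\0' was stored after the last byte read. The helper name split_lines and the fixed-size lines[] array are hypothetical, not something from the original post.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits a nul-terminated buffer in place at each '\n'
   and stores a pointer to the start of every line in lines[].
   Returns the number of lines stored (at most max_lines). */
size_t split_lines(char *buf, char **lines, size_t max_lines)
{
    size_t count = 0;
    char *start = buf;

    assert(buf != NULL && lines != NULL);

    while (*start != '\0' && count < max_lines) {
        char *nl = strchr(start, '\n');   /* next newline, or NULL */
        lines[count++] = start;
        if (nl == NULL)
            break;                        /* last line has no '\n' */
        *nl = '\0';                       /* terminate this line */
        start = nl + 1;                   /* continue after the newline */
    }
    return count;
}
```

    Each entry of lines[] then points at an ordinary C string inside the original buffer, so nothing is copied. Note that strchr stops at the first '\0', so this only works if the file itself contains no nul bytes.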
    The first 90% of a project takes 90% of the time,
    the last 10% takes the other 90% of the time.

  3. #3
    Registered User
    Join Date
    Jan 2008
    Posts
    58
    If you're reading the file in binary mode then '\n' isn't meaningful. You have to convert the computer's format of a new line yourself.
    Code:
    for ( int i = 0; block[i] != '\0'; ++i )
    {
        // Windows new line conversion
        if ( block[i] == 0xD && block[i + 1] == 0xA )
        {
            putchar( '\n' );
        }
        else
        {
            putchar( block[i] );
        }
    }

  4. #4
    CSharpener vart
    Join Date
    Oct 2006
    Location
    Rishon LeZion, Israel
    Posts
    6,484
    Quote: Originally posted by Banana Man
    If you're reading the file in binary mode then '\n' isn't meaningful. You have to convert the computer's format of a new line yourself.
    Code:
    for ( int i = 0; block[i] != '\0'; ++i )
    {
        // Windows new line conversion
        if ( block[i] == 0xD && block[i + 1] == 0xA )
        {
            putchar( '\n' );
        }
        else
        {
            putchar( block[i] );
        }
    }
    Did you notice that your "conversion" code does not skip the 0xA char?
    Also, for a Unix-formatted txt file this code has no meaning at all.

  5. #5
    Registered User
    Join Date
    Jan 2008
    Posts
    58
    Did you notice that your "conversion" code does not skip the 0xA char?
    Oops.
    Code:
    for ( int i = 0; block[i] != '\0'; ++i )
    {
        // Windows new line conversion
        if ( block[i] == 0xD && block[i + 1] == 0xA )
        {
            ++i;
            putchar( '\n' );
        }
        else
        {
            putchar( block[i] );
        }
    }
    Also, for a Unix-formatted txt file this code has no meaning at all.
    Read the comment please. It's an example for Windows because Windows uses two characters to represent a new line. For Unix or any other OS you'd have to change the code because different OS's treat new lines differently.
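    The same Windows conversion can also be done in place on the buffer instead of printing with putchar. This is only a sketch: it assumes the byte count returned by fread is passed in as len, so the buffer does not need to be nul-terminated and block[i + 1] is never read past the end. The name crlf_to_lf is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: rewrites every "\r\n" pair in buf[0..len) as "\n",
   in place, and returns the new length. A lone '\r' is kept as-is. */
size_t crlf_to_lf(char *buf, size_t len)
{
    size_t in, out = 0;

    assert(buf != NULL || len == 0);

    for (in = 0; in < len; ++in) {
        /* Drop a '\r' only when a '\n' follows it inside the buffer. */
        if (buf[in] == '\r' && in + 1 < len && buf[in + 1] == '\n')
            continue;
        buf[out++] = buf[in];
    }
    return out;
}
```

    Because the output index never runs ahead of the input index, the conversion needs no second buffer. On a file that already uses Unix line endings it leaves the data unchanged.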

  6. #6
    CSharpener vart
    Join Date
    Oct 2006
    Location
    Rishon LeZion, Israel
    Posts
    6,484
    So when somebody needs this conversion, he will just open the file in text mode and get the conversion for free.

    If the file is opened in binary, it is because no conversion is needed, I suppose.

  7. #7
    Registered User
    Join Date
    Jan 2008
    Posts
    58
    So when somebody needs this conversion, he will just open the file in text mode and get the conversion for free.

    If the file is opened in binary, it is because no conversion is needed, I suppose.
    That's fine, I guess, if 911help can open the file in text mode instead of binary mode. If he can't, you're not really helping, because you've dismissed the exact problem he's having by saying "If the file is opened in binary, it is because no conversion is needed".

  8. #8
    CSharpener vart
    Join Date
    Oct 2006
    Location
    Rishon LeZion, Israel
    Posts
    6,484
    He asked how to break the buffer on '\n', not how to convert a '\r' '\n' pair into '\n'...

  9. #9
    uint64_t...think positive xuftugulus
    Join Date
    Feb 2008
    Location
    Pacem
    Posts
    355
    I assume from the user's signature that he is on a Linux box, so his files will probably contain only a one-character line separator, the '\n' character. Still, it is possible that the user might be reading a Windows-encoded text file.
    The fact that the file is read in binary burdens the programmer with the translation of 'special' characters.
    To split the input in his buffer where newlines appear he could alternatively use strtok, unless replacing the newline character with '\0' is not desirable behavior for the buffer. A traditional strchr call, or even a manual scan with a char *, would also be useful, but I would prefer strtok, as every line I parse would be ready to use in whatever way I needed.
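    A rough sketch of the strtok approach, assuming a nul-terminated, writable buffer. Using "\r\n" as the delimiter set makes it handle both Unix and Windows line endings. The name tokenize_lines and the lines[] array are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: tokenizes a nul-terminated buffer on '\r' and '\n'
   with strtok, storing each line start in lines[]. The buffer is modified,
   and empty lines are skipped because strtok merges adjacent delimiters. */
size_t tokenize_lines(char *buf, char **lines, size_t max_lines)
{
    size_t count = 0;
    char *tok;

    assert(buf != NULL && lines != NULL);

    for (tok = strtok(buf, "\r\n");
         tok != NULL && count < max_lines;
         tok = strtok(NULL, "\r\n")) {
        lines[count++] = tok;
    }
    return count;
}
```

    The trade-off is that blank lines disappear from the result, and strtok keeps hidden static state, so it is not safe to interleave with another strtok loop. If blank lines matter, a strchr loop is probably the better fit.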
    Code:
    ...
        goto johny_walker_red_label;
    johny_walker_blue_label: exit(-149$);
    johny_walker_red_label : exit( -22$);
    A typical example of ...cheap programming practices.

  10. #10
    Registered User
    Join Date
    Jan 2008
    Posts
    58
    He asked how to break the buffer on '\n', not how to convert a '\r' '\n' pair into '\n'...
    But binary mode doesn't convert an OS specific new line into '\n', and that's why I posted. Whatever. If he's on Linux it doesn't matter and you're right.

  11. #11
    Captain Crash brewbuck
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,263
    Quote: Originally posted by xuftugulus
    I assume from the user's signature that he is on a Linux box, so his files will probably contain only a one-character line separator, the '\n' character. Still, it is possible that the user might be reading a Windows-encoded text file.
    The fact that the file is read in binary burdens the programmer with the translation of 'special' characters.
    I guess the obvious realization that breaking on '\n' works equally correctly, whether the line terminator is '\n' or '\r\n', did not occur to you? There's a reason things are made this way.

    Old Mac OS 9 and prior was the oddball, using just '\r'. Finally they woke up.
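    To illustrate the point about breaking on '\n': a sketch of a line iterator that splits on '\n' only and then drops a '\r' sitting right before it, so both terminators give the same lines. It assumes the byte count from fread is passed as len and leaves the buffer untouched. The name next_line and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: finds the line starting at *pos in buf[0..len).
   Sets *line_len to its length without the terminator ("\n" or "\r\n"),
   advances *pos past the terminator, and returns a pointer to the line.
   Returns NULL once *pos reaches len. */
const char *next_line(const char *buf, size_t len, size_t *pos, size_t *line_len)
{
    const char *start, *nl;
    size_t n;

    assert(buf != NULL && pos != NULL && line_len != NULL);

    if (*pos >= len)
        return NULL;

    start = buf + *pos;
    nl = (const char *)memchr(start, '\n', len - *pos);
    n = (nl != NULL) ? (size_t)(nl - start) : len - *pos;

    /* Advance past the '\n' if there was one. */
    *pos += n + (nl != NULL ? 1 : 0);

    /* A '\r' right before the '\n' belongs to a Windows-style terminator. */
    if (nl != NULL && n > 0 && start[n - 1] == '\r')
        --n;

    *line_len = n;
    return start;
}
```

    The returned lines are not nul-terminated, so they have to be used together with line_len (for example with fwrite, or printf's "%.*s"). Old Mac files that use a bare '\r' would still come back as one long line.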
