Thread: another file i/o question

  1. #1
    Registered User
    Join Date
    Feb 2003
    Posts
    596

    another file i/o question

    What difference does it make whether a file is opened in text mode vs. binary mode? I seem to be finding the same results either way. When I write an int using write() the file contains the 4 bytes that make up the int. When I write the int using <<, the file contains the ascii codes for the digits that represent the int. The same things occur regardless of whether I use fout.open( "iotest.dat" ) or fout.open( "iotest.dat", ios::binary ). It seems that the only thing that matters is the output function, not the file type. Why?

    Code:
    #include <iostream>
    #include <fstream>
    using namespace std;
    
    int main () {
      int intval1 = 1919006563; // this is what I should get back when the chars are put together
      int intval2, intval3;
      unsigned char buffer[4];
      void* vp;
      
      cout << "First test - view contents of binary file written with unformatted output:\n";
      ofstream fout( "iotest.dat", ios::binary );
      if( fout == NULL ) {
        cout << "couldn't open for output\n";
        return 1;
      }
    
      buffer[0] = (unsigned char)99;
      buffer[1] = (unsigned char)183;
      buffer[2] = (unsigned char)97;
      buffer[3] = (unsigned char)114;
      vp = buffer;
      fout.write( (const char*)vp, 4 );
      vp = &intval1;
      fout.write( (const char*)vp, 4 );
      fout.close();
    
      ifstream fin( "iotest.dat", ios::binary );
      if( fin == NULL ) {
        cout << "couldn't open for input\n";
        return 1;
      }
    
      vp = &intval2;
      fin.read( (char*)vp, 4 );
      cout << intval2 << endl;
      vp = buffer;
      fin.read( (char*)vp, 4 );
      for( int i = 0; i < 4; ++i ) {
        cout << (int)buffer[i] << endl;
      }
      while( !fin.eof() ) {
        fin.read( (char*)vp, 1 );
        if( !fin.eof() )
          cout << (int)buffer[0] << " ";
        else cout << "reached eof";
      }
      cout << endl;
      fin.close();
      
      cout << "\nSecond test - view contents of binary file written with formatted output:\n";
      
      fout.open( "iotest.dat", ios::binary );
      if( fout == NULL ) {
        cout << "couldn't open for output\n";
        return 1;
      }
      fout << intval1;
      fout << (unsigned char)99;
      fout << (unsigned char)183;
      fout << (unsigned char)97;
      fout << (unsigned char)114;
      fout.close();
      
      fin.open( "iotest.dat", ios::binary );
      if( fin == NULL ) {
        cout << "couldn't open for input\n";
        return 1;
      }
      while( !fin.eof() ) {
        fin.read( (char*)vp, 1 );
        if( !fin.eof() )
          cout << (int)buffer[0] << " ";
        else cout << "reached eof";
      }
      cout << endl;
      fin.close();
      
      cout << "\nThird test - view contents of text file with unformatted output:\n";
      
      fout.open( "iotest.dat" );
      if( fout == NULL ) {
        cout << "couldn't open for output\n";
        return 1;
      }
      buffer[0] = (unsigned char)99;
      buffer[1] = (unsigned char)183;
      buffer[2] = (unsigned char)97;
      buffer[3] = (unsigned char)114;
      vp = buffer;
      fout.write( (const char*)vp, 4 );
      vp = &intval1;
      fout.write( (const char*)vp, 4 );
      fout.close();
    
      fin.open( "iotest.dat", ios::binary );
      if( fin == NULL ) {
        cout << "couldn't open for input\n";
        return 1;
      }
      cout << "Unformatted input: ";
      while( !fin.eof() ) {
        fin.read( (char*)buffer, 1 );
        if( !fin.eof() )
          cout << (int)buffer[0] << " ";
        else cout << "reached eof";
      }
      cout << endl;
      fin.clear();
      fin.seekg(0);
      cout << "Formatted input: ";
      while( fin >> buffer[0] )
        cout << (int)buffer[0] << " ";
      cout << endl;
      fin.close();
    
      cout << "\nFourth test - view contents of text file with formatted output:\n";
      
      fout.open( "iotest.dat" );
      if( fout == NULL ) {
        cout << "couldn't open for output\n";
        return 1;
      }
      buffer[0] = (unsigned char)99;
      buffer[1] = (unsigned char)183;
      buffer[2] = (unsigned char)97;
      buffer[3] = (unsigned char)114;
      for( int i = 0; i < 4; ++i )
        fout << buffer[i];
      fout << intval1;
      fout.close();
    
      fin.open( "iotest.dat", ios::binary );
      if( fin == NULL ) {
        cout << "couldn't open for input\n";
        return 1;
      }
      cout << "Unformatted input: ";
      while( !fin.eof() ) {
        fin.read( (char*)buffer, 1 );
        if( !fin.eof() )
          cout << (int)buffer[0] << " ";
        else cout << "reached eof";
      }
      cout << endl;
      fin.clear();
      fin.seekg(0);
      cout << "Formatted input: ";
      while( fin >> buffer[0] ) {
        cout << (int)buffer[0] << " ";
      }
      cout << endl;
      fin.close();
          
      return 0;
    }
    Last edited by R.Stiltskin; 01-29-2009 at 11:48 AM.

  2. #2
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    The one example everyone uses is \r (which is ASCII 10 I believe) which, if read in text mode, will either go away or become \n. 0x7f and 0xff might act strangely (I think 0x7f is supposed to be DEL.)

  3. #3
    Registered User
    Join Date
    Feb 2003
    Posts
    596
    OK, I tested \r (it's actually ASCII 13), and using << it simply did not appear in the list of chars I read from the file, but using read() it does appear. This occurred when I opened the file as a text file for output, but opened it as binary for input. So, what is the actual difference between a text file and a binary file? Other than strange effects like this one with \r, I can't see any difference when I look at the file contents. Given a file that I didn't write, how can I tell whether it is a text file or a binary file?

  4. #4
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,613
    Use a binary extension such as bin or dat. Most formal file formats like rtf also have specifications, so anything that you would work with in the real world has implementation rules.

    And concerning the code, I prefer stream.is_open() myself. It's always been somewhat unclear to me exactly what bits are considered when streams are used in Boolean contexts, and what you wrote seems to rely on the knowledge that streams use operator void*() in these. The member will perform the check and return the correct response but is self-documenting.
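
    To make the two checks concrete, here is a minimal sketch. It assumes the usual rule that a stream tested in a Boolean context reports !fail() (failbit or badbit set), and that a failed open sets failbit, so both checks should agree right after opening. The helper name open_checks_agree is made up for illustration.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: opens path for reading and reports whether the two
// "did it open?" checks give the same answer.
bool open_checks_agree(const std::string& path) {
    std::ifstream fin(path.c_str(), std::ios::binary);
    bool by_member = fin.is_open();  // explicit and self-documenting
    bool by_state  = !fin.fail();    // what `if (fin)` tests
    return by_member == by_state;
}
```

    The two can diverge later on: after a failed read the stream is still open, but it tests false until clear() is called.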

    It also seems that you were never taught how to do a priming read:
    Code:
    stream >> item;  // priming read
    while (stream) {
      processData(item);
      stream >> item; // driving read
    }
    
    if (!stream.eof()) {
       // there were read errors
    }
    That's the gist of it. It will catch most problems, and it's safer than what you've been doing. I think this method applies to binary reads as well, but in my opinion it's still a pain in the ass.
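
    The same pattern is often written with the read itself as the loop condition, so the body only runs after a successful read. A minimal sketch, where sum_ints is a made-up stand-in for processData and the ok flag is an illustrative way of reporting read errors:

```cpp
#include <istream>
#include <sstream>

// Hypothetical example: sums whitespace-separated ints from a stream.
// Sets ok to false if the loop stopped for a reason other than end of input.
long sum_ints(std::istream& in, bool& ok) {
    long total = 0;
    int item;
    while (in >> item) {   // read, then test; the body runs only on success
        total += item;
    }
    ok = in.eof();         // stopping before eof suggests malformed input
    return total;
}
```

    The same shape should carry over to unformatted input, e.g. while (in.read(buf, n)), since read() also returns the stream.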

  5. #5
    Registered User Codeplug's Avatar
    Join Date
    Mar 2003
    Posts
    4,981
    On *nix there is no difference. On Windows, non-binary will cause the following translations:
    Code -> File (writes) : \n -> \r\n
    File -> Code (reads) : \r\n -> \n

    gg
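
    One way to see the translation for yourself is to write a single '\n' in each mode and measure the file afterwards. This is only a sketch; the function name newline_size and the file names in the usage note are invented for the example.

```cpp
#include <fstream>
#include <string>

// Writes a single '\n' to path using the given open mode and returns the
// resulting file size in bytes, or -1 if the file could not be opened.
long newline_size(const std::string& path, std::ios::openmode mode) {
    {
        std::ofstream fout(path.c_str(), mode | std::ios::out | std::ios::trunc);
        if (!fout.is_open()) return -1;
        fout << '\n';
    }   // fout is closed here, so the data has been flushed

    // Reopen in binary mode, positioned at the end, to measure the raw size.
    std::ifstream fin(path.c_str(), std::ios::binary | std::ios::ate);
    if (!fin.is_open()) return -1;
    return static_cast<long>(fin.tellg());
}
```

    Calling newline_size("nl.dat", std::ios::binary) should give 1 everywhere. Going by the description above, newline_size("nl.txt", std::ios::out) should give 2 on Windows and 1 on *nix.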

  6. #6
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    You ask the person who wrote it. Any text file can be interpreted as a binary file (since it is, in fact, binary data), without much harm. But a binary file may or may not make much sense as text. (If you get a lot of non-printable characters, it's probably binary; but there's no actual rule.)
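
    For a program that has to guess on its own, the "lots of non-printable characters" rule of thumb could be sketched like this. Everything here is an assumption made for illustration: the name looks_binary, the 4096-byte sample, and the 10% threshold are arbitrary, and bytes above 127 are deliberately not counted because they may be ordinary text in some encoding.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical heuristic: guesses "binary" when more than 10% of the first
// 4096 bytes are control characters other than tab, LF, and CR.
// An empty or unreadable file is reported as not binary.
bool looks_binary(const std::string& path) {
    std::ifstream fin(path.c_str(), std::ios::binary);
    char buf[4096];
    fin.read(buf, sizeof buf);
    std::size_t n = static_cast<std::size_t>(fin.gcount());

    std::size_t suspicious = 0;
    for (std::size_t i = 0; i < n; ++i) {
        unsigned char c = static_cast<unsigned char>(buf[i]);
        bool common_text_control = (c == '\t' || c == '\n' || c == '\r');
        if ((c < 32 && !common_text_control) || c == 127) ++suspicious;
    }
    return n > 0 && suspicious * 10 > n;
}
```

    As noted above, there is no actual rule, so a guess like this can be wrong in both directions.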

  7. #7
    Registered User
    Join Date
    Feb 2003
    Posts
    596
    thanks to all of you.

    whiteflags: thanks for the suggestions. It seems, though, that
    Code:
    fin.read( (char*)(&inbuff), 1 );
    makes more sense for non-printable data. I tried "fin >>" but it has a VERY STRANGE & PUZZLING RESULT (see below).

    Codeplug: does Windows distinguish the two file types strictly based on their extensions, or does the OS use some sort of file metadata written to the disk that we don't have access to?

    tabstop: I had in mind situations in which there aren't two people to discuss the file, i.e. some sort of agent program, bot, etc., that acquires a file & has to determine what it contains with no outside assistance. By the way, I'm not sure what you had in mind regarding 0xff and 0x7f. I don't see any unusual results for those values.

    **************

    This code gives me (in Linux) the same output no matter how it is opened. I changed the extensions for the 2 cases in which it is opened for writing as a text file, in case anyone wants to see what that does (if anything) in Windows.

    Notice that when I use "fin >>" to read in '13' '10' something odd happens to inbuff, with the result that in each case the first call to
    Code:
    cout << " " << (int)inbuff;
    has no output -- not even the blank space. In the last test I replaced cout with printf, which acts even more strangely. It does nothing for the first "13 10", but executes a linefeed (or a CR/LF) for the second "13 10". But inbuff is just an unsigned char. It can only hold values [0, 255], and how can one '13' be different from another '13'? Anybody know what's happening here?

    Code:
    #include <iostream>
    #include <fstream>
    #include <cstdio>   // for printf, used in the fourth test
    using namespace std;
    
    int main () {
    //  int intval1 = 1919006563; // this is what I should get back when the chars are put together
    //  int intval2, intval3;
      unsigned char cbuff[4];
      unsigned short sbuff[1];
      unsigned char inbuff;
      void* vp;
      
      cout << "First test - file opened as binary for writing, opened as binary for reading:\n";
      ofstream fout( "iotest.dat", ios::binary );
      if( !fout.is_open() ) {
        cout << "couldn't open for output\n";
        return 1;
      }
    
      cbuff[0] = (unsigned char)13;
      cbuff[1] = (unsigned char)10;
      cbuff[2] = (unsigned char)255;
      cbuff[3] = (unsigned char)127;
      sbuff[0] = (unsigned short)0xa0d;
      
      vp = cbuff;
      fout.write( (const char*)vp, 4 );
      vp = sbuff;
      fout.write( (const char*)vp, 2 );
      fout.close();
    
      ifstream fin( "iotest.dat", ios::binary );
      if( !fin.is_open() ) {
        cout << "couldn't open for input\n";
        return 1;
      }
      inbuff = 99;
      cout << "inbuff contents before input: " << inbuff << endl;
      cout << "formatted input:";
      fin >> inbuff;
      while( !fin.eof() ) {
        cout << " " << (int)inbuff;
        fin >> inbuff;
      }
      cout << "." << endl;
      fin.clear();
      fin.seekg(0);
      cout << "unformatted input:";
      fin.read( (char*)(&inbuff), 1 );
      while( !fin.eof() ) {
        cout << " " << (int)inbuff;
        fin.read( (char*)(&inbuff), 1 );
      }
      cout << "." << endl;
      fin.close();
      
      cout << "Second test - file opened as binary for writing, opened as text for reading:\n";
      fout.open( "iotest.dat", ios::binary );
      if( !fout.is_open() ) {
        cout << "couldn't open for output\n";
        return 1;
      }
    
      cbuff[0] = (unsigned char)13;
      cbuff[1] = (unsigned char)10;
      cbuff[2] = (unsigned char)255;
      cbuff[3] = (unsigned char)127;
      sbuff[0] = (unsigned short)0xa0d;
      
      vp = cbuff;
      fout.write( (const char*)vp, 4 );
      vp = sbuff;
      fout.write( (const char*)vp, 2 );
      fout.close();
    
      fin.open( "iotest.dat" );
      if( !fin.is_open() ) {
        cout << "couldn't open for input\n";
        return 1;
      }
    
      inbuff = 99;
      cout << "inbuff contents before input: " << inbuff << endl;
      cout << "formatted input:";
      fin >> inbuff;
      while( !fin.eof() ) {
        cout << " " << (int)inbuff;
        fin >> inbuff;
      }
      cout << "." << endl;
      fin.clear();
      fin.seekg(0);
      cout << "unformatted input:";
      fin.read( (char*)(&inbuff), 1 );
      while( !fin.eof() ) {
        cout << " " << (int)inbuff;
        fin.read( (char*)(&inbuff), 1 );
      }
      cout << "." << endl;
      fin.close();
      
      cout << "Third test - file opened as text for writing, opened as binary for reading:\n";
      fout.open( "iotest.txt" );
      if( !fout.is_open() ) {
        cout << "couldn't open for output\n";
        return 1;
      }
    
      cbuff[0] = (unsigned char)13;
      cbuff[1] = (unsigned char)10;
      cbuff[2] = (unsigned char)255;
      cbuff[3] = (unsigned char)127;
      sbuff[0] = (unsigned short)0xa0d;
      
      vp = cbuff;
      fout.write( (const char*)vp, 4 );
      vp = sbuff;
      fout.write( (const char*)vp, 2 );
      fout.close();
    
      fin.open( "iotest.txt", ios::binary );  // read back the file this test just wrote
      if( !fin.is_open() ) {
        cout << "couldn't open for input\n";
        return 1;
      }
    
      inbuff = 99;
      cout << "inbuff contents before input: " << inbuff << endl;
      cout << "formatted input:";
      fin >> inbuff;
      while( !fin.eof() ) {
        cout << " " << (int)inbuff;
        fin >> inbuff;
      }
      cout << "." << endl;
      fin.clear();
      fin.seekg(0);
      cout << "unformatted input:";
      fin.read( (char*)(&inbuff), 1 );
      while( !fin.eof() ) {
        cout << " " << (int)inbuff;
        fin.read( (char*)(&inbuff), 1 );
      }
      cout << "." << endl;
      fin.close();
      
      cout << "Fourth test - file opened as text for writing, opened as text for reading:\n";
      fout.open( "iotest.txt" );
      if( !fout.is_open() ) {
        cout << "couldn't open for output\n";
        return 1;
      }
    
      cbuff[0] = (unsigned char)13;
      cbuff[1] = (unsigned char)10;
      cbuff[2] = (unsigned char)255;
      cbuff[3] = (unsigned char)127;
      sbuff[0] = (unsigned short)0xa0d;
      
      vp = cbuff;
      fout.write( (const char*)vp, 4 );
      vp = sbuff;
      fout.write( (const char*)vp, 2 );
      fout.close();
    
      fin.open( "iotest.txt" );  // read back the file this test just wrote
      if( !fin.is_open() ) {
        cout << "couldn't open for input\n";
        return 1;
      }
    
      inbuff = 99;
      cout << "inbuff contents before input: " << inbuff << endl;
      cout << "formatted input:";
      fin >> inbuff;
      while( !fin.eof() ) {
        printf(" %d\n", inbuff);
    //    cout << " " << (int)inbuff;
        fin >> inbuff;
      }
      cout << "." << endl;
      fin.clear();
      fin.seekg(0);
      cout << "unformatted input:";
      fin.read( (char*)(&inbuff), 1 );
      while( !fin.eof() ) {
        cout << " " << (int)inbuff;
        fin.read( (char*)(&inbuff), 1 );
      }
      cout << "." << endl;
      fin.close();
    }
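
    The missing 13 and 10 in the formatted-input runs above are consistent with how >> works on a char: by default it skips leading whitespace, and both 13 (CR) and 10 (LF) count as whitespace in the default locale, so they would never reach inbuff at all. A minimal sketch using an in-memory stream with the same bytes; the helper name extract_bytes is made up for the example.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads every byte from in with operator>>. Leading whitespace (including
// 13 and 10) is skipped unless keep_ws is true, which applies noskipws.
std::vector<int> extract_bytes(std::istream& in, bool keep_ws) {
    if (keep_ws) in >> std::noskipws;
    std::vector<int> seen;
    unsigned char c;
    while (in >> c) seen.push_back(static_cast<int>(c));
    return seen;
}
```

    With the bytes 13 10 255 127 13 10 as input, the default call should collect only 255 and 127, while the keep_ws call should collect all six.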

  8. #8
    Registered User Codeplug's Avatar
    Join Date
    Mar 2003
    Posts
    4,981
    All files are just a sequence of bytes to the OS. It's the MS-CRT that does the newline translations for any file opened as non-binary.

    For C++ file I/O, read() and write() are the only way to do binary I/O. I'm not surprised that you would have strange results using formatted I/O extractors (>>) on binary data.

    gg
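
    To tie this back to the first post, here is a small sketch of the two output styles side by side. The function names and file paths are made up for the example; 1919006563 is the value used in the original test program.

```cpp
#include <fstream>
#include <string>

// Writes value with unformatted write() and reads it back with read().
// Returns true if the round trip reproduced the value exactly.
bool raw_round_trip(const std::string& path, int value) {
    {
        std::ofstream fout(path.c_str(), std::ios::binary);
        fout.write(reinterpret_cast<const char*>(&value), sizeof value);
    }   // closed and flushed here
    int back = 0;
    std::ifstream fin(path.c_str(), std::ios::binary);
    fin.read(reinterpret_cast<char*>(&back), sizeof back);
    return fin.gcount() == static_cast<std::streamsize>(sizeof back)
        && back == value;
}

// Writes value with formatted << and returns the text that ended up in the file.
std::string formatted_text(const std::string& path, int value) {
    {
        std::ofstream fout(path.c_str(), std::ios::binary);
        fout << value;   // stores the decimal digits, not the raw bytes
    }
    std::ifstream fin(path.c_str(), std::ios::binary);
    std::string text;
    std::getline(fin, text);
    return text;
}
```

    raw_round_trip("rt_raw.dat", 1919006563) stores sizeof(int) raw bytes, whereas formatted_text("rt_fmt.dat", 1919006563) stores the ten digit characters, whatever mode the file was opened in.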
