Thread: Writing std::string in binary mode

  1. #1
    Registered User VirtualAce's Avatar
    Join Date
    Aug 2001
    Posts
    9,607

    Writing std::string in binary mode

    The problem:
    Reading/Writing a binary data file that contains several variable length strings as well as binary data.

    Code:
    std::ofstream outFile("CraftTest.dat", std::ios_base::binary);
    ...
    ...
    //Write the string and its null terminator
    outFile << m_CraftName.c_str();
    
    //Skip past the null
    outFile.seekp(1,std::ios_base::cur);
    
    //Write binary data
    outFile.write((char *)&m_CraftData,sizeof(BinaryCraftData));
    What happens here is that the null terminator is written to disk. However, I must do the seekp because the stream position is not advanced past the terminating null, so without the seekp the next write blows away the null terminator.

    However when reading the data file back in nothing works correctly except reading in the string. After that point none of the data following the string is correct even if I do a seekg() to get past the null.

    How can I read in this variable length string without writing the length to the file? I would prefer to do this without having to manually read in each character until I encounter a null. My initial thoughts were that if I wrote the string to disk correctly with the null that this would work.

    Code:
    std::ifstream inFile("CraftTest.dat", std::ios_base::binary);
    ...
    ...
    //Read the string and its null terminator
    inFile >> m_CraftName;
    
    //Read in binary data
    inFile.read((char *)&m_CraftData,sizeof(BinaryCraftData));
    This reads in the string correctly but now every read after this point is pure garbage as if my file position is off causing me to get some very weird numbers. Just in case the string read was not reading the null I tried to do a seekg() after it to skip the null and this still did not fix anything.

    If I had just used C I/O I would have been done by now.
    Last edited by VirtualAce; 01-01-2009 at 01:00 AM.

  2. #2
    Registered User VirtualAce's Avatar
    Join Date
    Aug 2001
    Posts
    9,607
    Got it working. Was forced to write the size to disk and manually read and write each character of the string to the disk in a loop.

  3. #3
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by Bubba
    What happens here is that the null terminator is written to disk. However, I must do the seekp because the stream position is not advanced past the terminating null, so without the seekp the next write blows away the null terminator.
    Are you sure the null terminator is written to disk? I had the impression that it was not, and a quick check with a test program and a hex editor appears to confirm that.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  4. #4
    Registered User VirtualAce's Avatar
    Join Date
    Aug 2001
    Posts
    9,607
    Only if you use c_str(). However the stream position is not affected by the terminating null even though it is written to disk. I'm a huge fan of streams and for the most part they are great. However for binary data I'm thinking that good old C I/O is much easier. This was quite painful.

  5. #5
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by Bubba
    Only if you use c_str().
    I do not think so. Try this test program:
    Code:
    #include <cstdio>   // EOF
    #include <iostream>
    #include <fstream>
    #include <string>
    
    int main()
    {
        using namespace std;
        ofstream out("out.txt", ios_base::binary);
        std::string str = "hello world!";
        out << str.c_str();
        //out << '\0';
        out.close();
    
        ifstream in("out.txt", ios_base::binary);
        int ch;
        size_t count = 0;
        while ((ch = in.get()) != EOF)
        {
            ++count;
        }
        cout << count << endl;
    }
    I reason that if the null terminator is written to disk, the count should be 13. However, I get a count of 12. Uncommenting the line with out << '\0' gives a count of 13, so an explicitly written null character is counted, as expected. Therefore, I conclude that the null terminator was not written to disk, contrary to what you stated.

  6. #6
    Registered User VirtualAce's Avatar
    Join Date
    Aug 2001
    Posts
    9,607
    It must have been something else then when I saw the null in XVI 32. Perhaps I did not reload the file after altering and running my code.

    It appears from the source that this is the operation that << does for const char *.
    Code:
    out.write(str, strlen(str));
    Which would leave out the null terminator.

  7. #7
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Right, so my guess is that if you actually wrote out the null terminator in your earlier example, perhaps the problem would be fixed without having to write the size to disk. The use of c_str() would also be unnecessary in that case.

    EDIT:
    Actually, maybe not. The null terminator is not regarded as whitespace, so inFile >> m_CraftName would read it in as well. Perhaps a better solution is to write a space after writing m_CraftName, then when reading ignore that space and go on to read the binary data.
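
    To illustrate the point about whitespace, here is a minimal sketch using an in-memory stream instead of the real file. The helper names extractToken and demoExtraction are made up for this example, and the "binary" bytes are chosen so that none of them is a whitespace character.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: pull one token out of a byte buffer with operator>>.
std::string extractToken(const std::string& bytes)
{
    std::istringstream in(bytes);
    std::string token;
    in >> token;  // stops at whitespace or end of input, not at '\0'
    return token;
}

// Call from main() to check the behaviour described above.
void demoExtraction()
{
    // "Craft", a '\0', then two non-whitespace "binary" bytes: >> takes all 8.
    const std::string withNull("Craft\0\x01\x02", 8);
    assert(extractToken(withNull).size() == 8);

    // Same layout with a space as separator: >> stops after "Craft".
    const std::string withSpace("Craft \x01\x02", 8);
    assert(extractToken(withSpace) == "Craft");
}
```

    Note that the space trick only holds up if the name itself never contains whitespace.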
    Last edited by laserlight; 01-01-2009 at 02:18 AM.

  8. #8
    Registered User VirtualAce's Avatar
    Join Date
    Aug 2001
    Posts
    9,607
    As usual it's not as simple as it sounds b/c there are a lot of gotchas. The only way I got it to work was from an idea off of gamedev which was to write 2 specific functions to read and write STL strings to/from disk. It works very well but requires the string size be written to disk. I made the functions part of a Serializeable base class so every object that needs serialization has the functions available to them. Next time I read/write binary data I think I will move away from streams since they are primarily for text files.
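
    For anyone reading later, a pair of functions like that might look roughly as follows. This is only a sketch under assumptions, not the gamedev code: the names writeString, readString and demoLengthPrefixed are hypothetical, the length is stored as a 32-bit value in host byte order (so the file is not portable across endianness), and it needs C++11 for <cstdint> and vector::data().

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: writes a 32-bit length (host byte order), then the characters.
void writeString(std::ostream& out, const std::string& s)
{
    assert(s.size() <= 0xFFFFFFFFu);  // length must fit the 32-bit prefix
    const std::uint32_t len = static_cast<std::uint32_t>(s.size());
    out.write(reinterpret_cast<const char*>(&len), sizeof len);
    out.write(s.data(), static_cast<std::streamsize>(len));
}

// Hypothetical helper: reads what writeString wrote; false on a short read.
bool readString(std::istream& in, std::string& s)
{
    std::uint32_t len = 0;
    if (!in.read(reinterpret_cast<char*>(&len), sizeof len))
        return false;
    std::vector<char> buf(len);  // a corrupt length could make this huge
    if (len != 0 && !in.read(buf.data(), static_cast<std::streamsize>(len)))
        return false;
    s.assign(buf.begin(), buf.end());
    return true;
}

// Round trip through an in-memory stream: a string followed by raw binary data.
void demoLengthPrefixed()
{
    std::stringstream file(std::ios::in | std::ios::out | std::ios::binary);
    const int payload = 1234;
    writeString(file, "Star Fury");
    file.write(reinterpret_cast<const char*>(&payload), sizeof payload);

    std::string name;
    int readBack = 0;
    assert(readString(file, name) && name == "Star Fury");
    file.read(reinterpret_cast<char*>(&readBack), sizeof readBack);
    assert(readBack == payload);
}
```

    Both functions take plain std::ostream/std::istream references, so the same code works on an fstream opened with std::ios_base::binary.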

    All this came about because I moved my data files over to binary since using XML became way too cumbersome and ugly, and did not make the loading/saving process any easier. I.e., using XML bought me nothing.
    Last edited by VirtualAce; 01-01-2009 at 02:37 AM.

  9. #9
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by Bubba
    I moved my data files over to binary because using XML became way too cumbersome, ugly, and did not make the loading/saving process any easier. For this I just whipped up a console app in about 10 minutes and would have been done had I not run into this string issue.
    Just a thought, but have you considered using SQLite? One of its use cases is as a replacement for fopen(), i.e., to replace a custom file format with a database schema but without needing the use of a database server.

  10. #10
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    Don't mix formatted I/O with raw I/O. In other words, if on a stream you use write(), don't use << on the same stream.

    To write a std::string to a binary stream, first write() the size, then write() the data(), in one block. For reading, read() the size, resize() the string and read() the data block to &s[0]. Note, however, that this is only guaranteed to work in C++0x; C++03 gives you no guarantees that the data of std::string is contiguous and writeable. If you want a proper C++03 method of doing it, read() the size, resize() a std::vector, read() to the vector and range-assign from the vector to the string.

    Code:
    std::string::size_type sz = s.size();
    stream.write(reinterpret_cast<char*>(&sz), sizeof(std::string::size_type));
    stream.write(s.data(), sz);
    
    // Reading common:
    stream.read(reinterpret_cast<char*>(&sz), sizeof(std::string::size_type));
    
    // C++03 with C++0x semantics:
    s.resize(sz);
    stream.read(&s[0], sz);
    
    // C++17 and later (before that, data() returns const char*):
    s.resize(sz);
    stream.read(s.data(), sz);
    
    // C++03:
    std::vector<char> buf(sz);
    stream.read(&buf[0], sz);
    s.assign(buf.begin(), buf.end());
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

  11. #11
    and the hat of sweating
    Join Date
    Aug 2007
    Location
    Toronto, ON
    Posts
    3,545
    Would it work better if you used a std::vector<unsigned char> when reading/writing to disk?
    std::string is meant for strings, not binary data...
    "I am probably the laziest programmer on the planet, a fact with which anyone who has ever seen my code will agree." - esbo, 11/15/2008

    "the internet is a scary place to be thats why i dont use it much." - billet, 03/17/2010

  12. #12
    Registered User VirtualAce's Avatar
    Join Date
    Aug 2001
    Posts
    9,607
    The string does not contain binary data. It is mixed in with binary data. Regardless, what CornedBee suggests is the best approach. Right now I'm doing a string resize after reading the length from disk.

    All in all this is ridiculous since it's a simple matter of writing/reading a string until you reach some terminating character, whatever it may be. It seems a huge limitation to have to use the vector. Fixed size strings would have worked just as well. In this instance the STL was completely useless and I should have used fixed length strings and C I/O functions to read/write the data. But rather than change the code and structures I basically did what CornedBee suggested but without the vector.

    The only downside to a fixed length string is I have to write the entire string regardless if only a couple of characters are used and I must decide on a max size for the string.

  13. #13
    Registered User
    Join Date
    Nov 2010
    Posts
    1
    thanx cornedbee

  14. #14
    Registered User
    Join Date
    Oct 2008
    Posts
    1,262
    If you really don't want a size prepended, you could just as well read the string one byte at a time from the stream until you find the 0-terminator, appending each byte to a string. Yes, this comes at the cost of a tiny bit of speed, but unless it's an extremely common operation that shouldn't be noticeable.

    Writing the data is easy: simply write() str.c_str() with the length str.length()+1. The C string is guaranteed to end in a 0 byte, and adding one to the length includes it in the write. Reading, as I said, is then no more than a loop reading one character at a time.

    How slow is reading one character at a time? Well, I think more data will actually be read/cached in several places. I've looked into the STL I'm using before, and the ifstream had another "layer" before reading from the file directly, I believe. So even if you requested only one byte, more would be read so that it could be accessed quicker later. But I'm not completely sure on this...
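
    A minimal sketch of the null-terminated approach described above, with the read loop delegated to std::getline using '\0' as the delimiter. The names writeCString, readCString and demoNullTerminated are hypothetical, and the sketch assumes the string never contains an embedded '\0'.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Writes the characters plus the terminating '\0' (length() + 1 bytes).
void writeCString(std::ostream& out, const std::string& s)
{
    out.write(s.c_str(), static_cast<std::streamsize>(s.length() + 1));
}

// Reads up to the next '\0'; the '\0' is consumed but not stored.
// Returns false if the stream failed or ended before a terminator was found.
bool readCString(std::istream& in, std::string& s)
{
    std::getline(in, s, '\0');
    return in.good();
}

// Round trip through an in-memory stream: a string followed by raw binary data.
void demoNullTerminated()
{
    std::stringstream file(std::ios::in | std::ios::out | std::ios::binary);
    const int payload = 42;
    writeCString(file, "Hello world");
    file.write(reinterpret_cast<const char*>(&payload), sizeof payload);

    std::string s;
    int readBack = 0;
    assert(readCString(file, s) && s == "Hello world");  // the space survives
    file.read(reinterpret_cast<char*>(&readBack), sizeof readBack);
    assert(readBack == payload);
}
```

    std::getline goes through the stream buffer, so the file is not hit once per character.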

  15. #15
    Anti-Poster
    Join Date
    Feb 2002
    Posts
    1,401
    This thread is nearly two years old. I think Bubba has taken care of it by now.
    If I did your homework for you, then you might pass your class without learning how to write a program like this. Then you might graduate and get your degree without learning how to write a program like this. You might become a professional programmer without knowing how to write a program like this. Someday you might work on a project with me without knowing how to write a program like this. Then I would have to do you serious bodily harm. - Jack Klein
