Thread: read full file to string in one go

  1. #1
    Registered User
    Join Date
    Aug 2013
    Posts
    10

    read full file to string in one go

    as a casual c++ user i recently stumbled upon a problem, which i didn't expect to be one at all. something as trivial as reading a file at once turned out to be not so trivial after all in c++.

    in c i'd simply use fread and be done with it. for c++ however there doesn't seem to be such a thing. i had a look around here and searched for "read file" as well as across the net. i found a bunch of approaches and most of them are summarized there -> Insane Coding: How to read in a file in C++.

    the interesting ones are:

    Code:
    std::string get_file_contents(const char *filename)
    {
      std::ifstream in(filename, std::ios::in | std::ios::binary);
      if (in)
      {
        std::string contents;
        in.seekg(0, std::ios::end);
        contents.resize(in.tellg());
        in.seekg(0, std::ios::beg);
        in.read(&contents[0], contents.size());
        in.close();
        return(contents);
      }
      throw(errno);
    }
    and

    Code:
    std::string get_file_contents(const char *filename)
    {
      std::ifstream in(filename, std::ios::in | std::ios::binary);
      if (in)
      {
        std::ostringstream contents;
        contents << in.rdbuf();
        in.close();
        return(contents.str());
      }
      throw(errno);
    }
    the first one looked good at first. "in.read(&contents[0], contents.size());" works, but only for the raw content. the counters of the string however are not updated and hence size() for example reports 0 and many member functions such as find() don't work. so that one was off the table.

    the second one (and in fact all others i found) seems to require a copy to a string and that's where the catch is. my files are text but quite large, as in several gigabytes, so a copy, which doubles the ram usage, is simply not an option. and neither are line-by-line approaches (getline and the like), of course.


    and now i'm wondering, is there really no direct way of getting a text file into a string via the standard library? i remember even roguewave's RWCString had a readFile member function in the 90s.

  2. #2
    and the hat of int overfl Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    Seems to work for me, running the code with its own source file.
    Code:
    $ ls -l bar.cpp
    -rw-rw-r-- 1 sc sc 577 Jun 10 06:08 bar.cpp
    $ g++ -std=c++11 bar.cpp
    $ ./a.out 
    #include <iostream>
    #include <fstream>
    #include <string>
    using namespace std;
    
    std::string get_file_contents(const char *filename)
    {
      std::ifstream in(filename, std::ios::in | std::ios::binary);
      if (in)
      {
        std::string contents;
        in.seekg(0, std::ios::end);
        contents.resize(in.tellg());
        in.seekg(0, std::ios::beg);
        in.read(&contents[0], contents.size());
        in.close();
        return(contents);
      }
    }
    
    int main()
    {
      std::string r = get_file_contents("bar.cpp");
      std::cout << r << std::endl;
      std::cout << "Size=" << r.size() << std::endl;
      return 0;
    }
    
    Size=577
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  3. #3
    Registered User
    Join Date
    Aug 2013
    Posts
    10
    yeah as a separate function which then requires a copy again. hence the string has everything and size() works. but try it directly:

    Code:
    #include <cstdio>
    #include <cstdlib>
    #include <fstream>
    #include <string>
    #include <sys/stat.h>
    
    #define LFILE "testfile"
    
    using namespace std;
    
    
    int main() {
    	ifstream in;
    	string str;
    	struct stat s;
    
    	if (stat(LFILE, &s)) {
    		perror("ERROR: couldn't stat logfile");
    		exit(1);
    	}
    
    	str.reserve(s.st_size + 1);
    	in.open(LFILE);
    	in.read(&str[0], s.st_size);
    	in.close();
    	printf("%lu\n", str.size());
    	exit(0);
    }
    the crucial thing in this case is that i cannot afford a copy as mentioned in my first post already.

  4. #4
    and the hat of int overfl Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    > my files are text but quite large, as in several gigabytes, so a copy, which doubles the ram usage
    Why do you need the whole file in memory at the same time?

    How much time is spent processing the data once the file is in memory?
    If you're spending minutes in processing, saving seconds on the file I/O is a waste of effort.

  5. #5
    Registered User
    Join Date
    Aug 2013
    Posts
    10
    so there's indeed no way to read a file into a string directly in c++ ?

  6. #6
    C++ Witch laserlight
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    No, it can be done, but the issue is that based on what you wrote, you probably don't want to do that.
    Last edited by laserlight; 06-10-2019 at 02:30 AM.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  7. #7
    Registered User
    Join Date
    Aug 2013
    Posts
    10
    Quote Originally Posted by laserlight
    you probably don't want to do that.
    oh i absolutely want to do that. this is such a basic task, i'm still buffled (pun!) that there's no obvious way to do it.
    please do tell!
    Last edited by YocR; 06-10-2019 at 02:38 AM.

  8. #8
    C++ Witch laserlight
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Actually, you have already found a number of approaches. The copy issue is a non-issue: the copy will either be elided (the old named return value optimisation), or you can take steps to ensure a move or the use of a reference instead.
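    A minimal sketch of the two pre-C++11 options mentioned here, assuming hypothetical helpers make_text and store_into (neither name appears in the thread): returning a named local by value, which compilers may construct directly in the caller (named return value optimisation, not guaranteed by the standard), and swapping into an existing string, which exchanges buffers instead of copying characters.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical producer: returns a named local by value.
// Compilers are allowed (not required) to elide the copy here (NRVO).
std::string make_text(std::size_t n)
{
    std::string result(n, 'x');
    return result;
}

// Hypothetical consumer: moves the contents into an existing string
// without relying on C++11 move semantics, by swapping buffers.
void store_into(std::string &dest, std::size_t n)
{
    std::string tmp = make_text(n);
    dest.swap(tmp);              // constant time, no per-character copy
    assert(dest.size() == n);
}
```

    Usage would look like `std::string big; store_into(big, 100000);`, after which `big` owns the data and the temporary is destroyed holding whatever `big` had before.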

  9. #9
    Registered User
    Join Date
    Aug 2013
    Posts
    10
    move sounds like the best option. how would i ensure a move?

  10. #10
    C++ Witch laserlight
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    I would choose your second example (from memory, it did well in timed tests some time back) and just compile as C++11 or later. I'd suggest inverting the if statement though: use it to throw an exception derived from std::exception, so that it's clear you do return a string at the end, though this shouldn't matter.

    Nonetheless, Salem's point remains: are you really sure you cannot process your input in smaller chunks?
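    The inverted if statement could be sketched roughly as follows. This is only one possible reading of the suggestion: std::runtime_error and the message text are assumptions, chosen because std::runtime_error derives from std::exception.

```cpp
#include <cassert>
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <string>

std::string get_file_contents(const char *filename)
{
    assert(filename != 0);
    std::ifstream in(filename, std::ios::in | std::ios::binary);
    if (!in)  // failure is handled first, so the normal path ends in a return
        throw std::runtime_error(std::string("cannot open ") + filename);

    std::ostringstream contents;
    contents << in.rdbuf();   // stream the whole file into the buffer
    return contents.str();    // str() hands back a copy of the buffer
}
```

    A caller would wrap the call in `try { ... } catch (const std::exception &e) { ... }` and could print `e.what()` on failure.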

  11. #11
    Registered User
    Join Date
    Aug 2013
    Posts
    10
    yes sorry, i didn't mention any requirements. it has to work everywhere so c++11 is not an option and exceptions are generally disabled via compiler option.

    anyway, my example from post #3 does work after all but with one change: the problem was reserve(). if i use resize() instead it seems fine.

    either way, thanks for all replies
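    The difference between the two calls can be shown with a small sketch (the function names are made up for illustration): reserve() only sets aside capacity, while resize() actually changes how many characters the string holds.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// reserve(n) allocates storage but the string still holds 0 characters.
std::size_t size_after_reserve(std::size_t n)
{
    std::string s;
    s.reserve(n);
    assert(s.capacity() >= n);  // the storage is there...
    return s.size();            // ...but size() is still 0
}

// resize(n) creates n characters (filled with '\0'), so size() becomes n.
std::size_t size_after_resize(std::size_t n)
{
    std::string s;
    s.resize(n);
    return s.size();
}
```

    This is why writing through &str[0] after reserve() leaves size() at 0 and find() seeing nothing: the string was never told that those characters exist.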

  12. #12
    C++ Witch laserlight
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by YocR
    it has to work everywhere so c++11 is not an option
    It's 2019: C++11 is likely to be available. If it isn't, then you need to be sure what exactly you're supporting: how can you be sure that C++98 is available?

    Anyway, move semantics was introduced in C++11, so if C++11 is not an option, you'll have to do the same thing and depend on copy elision, which should happen since C++98.

    Quote Originally Posted by YocR
    exceptions are generally disabled via compiler option.
    You're throwing an exception with throw(errno), except that it isn't a usual exception. If you cannot use exceptions, then you need some other way to indicate an error: return an empty string?

    Quote Originally Posted by YocR
    anyway, my example from post #3 does work after all but with one change: the problem was reserve(). if i use resize() instead it seems fine.
    If you don't want to rely on copy elision, you can use this but with a reference parameter instead, then say, return a bool to indicate success/failure.
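    A minimal sketch of that reference-parameter variant, assuming a hypothetical name read_file and no exceptions. It relies on std::string storing its characters contiguously, which is guaranteed only since C++11 but holds on the common pre-C++11 implementations.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: fills `out` with the file's bytes.
// Returns false on any failure and leaves `out` empty in that case.
bool read_file(const char *filename, std::string &out)
{
    assert(filename != 0);
    out.clear();

    std::ifstream in(filename, std::ios::in | std::ios::binary);
    if (!in)
        return false;

    in.seekg(0, std::ios::end);
    const std::streamoff len = in.tellg();   // -1 if the position is unknown
    if (len < 0)
        return false;
    in.seekg(0, std::ios::beg);
    if (len == 0)
        return true;                         // empty file, nothing to read

    out.resize(static_cast<std::size_t>(len));
    in.read(&out[0], static_cast<std::streamsize>(len));
    if (in.gcount() != static_cast<std::streamsize>(len)) {
        out.clear();                         // short read counts as failure
        return false;
    }
    return true;
}
```

    The caller owns the string, so nothing is returned by value: `std::string text; if (!read_file("testfile", text)) { /* handle error */ }`.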

  13. #13
    Registered User
    Join Date
    Aug 2013
    Posts
    10
    Quote Originally Posted by laserlight
    you need some other way to indicate an error: return an empty string?

    If you don't want to rely on copy elision, you can use this but with a reference parameter instead, then say, return a bool to indicate success/failure.
    thanks again but since i don't put this in a function (as post #3 shows) there is no return for me to come up with.
    but if i wanna be sure, i can just compare str.size() against s.st_size.

    EDIT: even better, with that "&str[0]" way, i can even keep using fread so i don't have to deal with streams at all
    Last edited by YocR; 06-10-2019 at 04:19 AM. Reason: addition
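    The fread idea might look roughly like this. It is a sketch under assumptions: the name fread_into_string is hypothetical, the size comes from fseek/ftell rather than stat, and ftell returns a long, so on platforms where long is 32 bits this would not cover multi-gigabyte files (stat or a platform-specific 64-bit call would be needed there).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: reads the whole file into `out` with a single fread.
// Returns false on failure and leaves `out` empty in that case.
bool fread_into_string(const char *filename, std::string &out)
{
    assert(filename != 0);
    out.clear();

    std::FILE *fp = std::fopen(filename, "rb");
    if (!fp)
        return false;

    bool ok = false;
    if (std::fseek(fp, 0, SEEK_END) == 0) {
        const long len = std::ftell(fp);
        if (len >= 0 && std::fseek(fp, 0, SEEK_SET) == 0) {
            if (len == 0) {
                ok = true;                              // empty file
            } else {
                out.resize(static_cast<std::size_t>(len));  // resize, not reserve
                // success only if every byte arrived
                ok = std::fread(&out[0], 1, out.size(), fp) == out.size();
            }
        }
    }
    std::fclose(fp);
    if (!ok)
        out.clear();
    return ok;
}
```

    Comparing the fread return value against the expected size plays the same role as comparing str.size() against s.st_size.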
