Thread: Loading a File

  1. #1
    Registered User
    Join Date
    Dec 2010
    Posts
    19

    Loading a File

    Hi,

    I'm trying to load a text file very quickly. My test file is 630KB. At present, I'm trying to find the fastest algorithm that works. For testing purposes, all processing has been removed; the file is only loaded. I'm porting a Python algorithm which, on average, takes 0.0065 seconds to load the file.

    My first attempt, on average, takes approximately 0.42 seconds, which is FAR too slow. It is of the form:
    Code:
    string line;
    ifstream obj_file(path);
    if (obj_file.is_open()) {
    	// Test the read itself; looping on eof() runs the body once more after the last line.
    	while (getline(obj_file, line)) {
    	}
    }
    Method two takes, on average, 0.009 seconds, which is nearly as good as the Python version. However, newlines do not seem to be preserved, and the end of "memblock" contains garbage (I believe due to the same reason as method three). Method two:
    Code:
    ifstream file;
    file.open(path, ios::in);
    if (file.is_open()) {
    	long begin = file.tellg();
    	file.seekg(0, ios::end);
    	long end = file.tellg();
    	long size = end - begin;
    
    	char* memblock = new char[size];
    	file.seekg(0, ios::beg);
    	file.read(memblock, size);
    	file.close();
    
    	delete [] memblock;
    }
    Method three takes, on average, 0.005 seconds, which is good. However, again, "buffer" contains garbage at the end. Furthermore, if the fputs call were uncommented, "Reading error" would be output. The amount by which "lSize" differs from "result" appears to be the number of newlines in the file; i.e., fread is ignoring newlines!
    Code:
    typedef uint8_t using_type;
    
    FILE* pFile = fopen(path,"r");
    fseek(pFile,0,SEEK_END);
    long lSize = ftell(pFile)/sizeof(using_type);
    rewind(pFile);
    
    using_type* buffer = (using_type*)malloc(sizeof(using_type)*lSize);
    if (buffer == NULL) {
    	//fputs ("Memory error",stderr);
    }
    // copy the file into the buffer:
    size_t result = fread(buffer,1,lSize,pFile);
    if (result != lSize) {
    	//fputs ("Reading error",stderr);
    }
    fclose (pFile);
    free (buffer);
    It would seem that fread and file.read do not preserve newlines in the output buffer. This is a major problem, and I don't know how to work around it other than by using getline(), which is incredibly slow.

    Help, please!
    Thanks,
    Ian

  2. #2
    tabstop (and the Hat of Guessing)
    Join Date
    Nov 2007
    Posts
    14,336
    This may be the difference between CR-LF and \n. Make sure your file is being written in binary mode, and/or being read in binary mode.
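
    To make the binary-mode suggestion concrete, here is a minimal sketch in C++, assuming the goal is simply to get every byte of the file into memory. The helper name read_file_binary is made up for illustration and does not appear in the original post. Opening with std::ios::binary turns off newline translation, so on platforms that store line endings as CR-LF the size obtained from tellg should match the number of bytes that read actually delivers.

```cpp
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from the original post): reads the whole file
// at `path` into a std::string without any newline translation.
std::string read_file_binary(const std::string& path) {
    std::ifstream file(path.c_str(), std::ios::in | std::ios::binary);
    if (!file) {
        throw std::runtime_error("cannot open " + path);
    }

    // Measure the file by seeking to the end, then go back to the start.
    file.seekg(0, std::ios::end);
    const std::streamoff size = file.tellg();
    if (size < 0) {
        throw std::runtime_error("cannot determine size of " + path);
    }
    file.seekg(0, std::ios::beg);

    // The string owns its length, so no terminating '\0' is required.
    std::string buffer(static_cast<std::size_t>(size), '\0');
    if (size > 0) {
        file.read(&buffer[0], static_cast<std::streamsize>(size));
    }
    // Keep only the bytes that were actually read.
    buffer.resize(static_cast<std::size_t>(file.gcount()));
    return buffer;
}
```

    Calling read_file_binary(path) returns a string that knows its own length. The raw buffers in methods two and three are never terminated, and in text mode fewer bytes may arrive than the file size suggests, which would plausibly explain the garbage at the end. In binary mode any CR-LF pairs are kept as they are, so code that later splits the buffer into lines has to tolerate a trailing '\r'.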

  3. #3
    Registered User
    Join Date
    Dec 2010
    Posts
    19
    It was written and is being read in text mode, not binary mode. It needs to be human readable.

  4. #4
    tabstop (and the Hat of Guessing)
    Join Date
    Nov 2007
    Posts
    14,336
    Quote: Originally Posted by Geometrian
    It was written and is being read in text mode, not binary mode. It needs to be human readable.
    Are you ever printing integers as integers (as opposed to in the middle of a string)? If not, then the readability of the file is unaffected by text vs. binary.

    (EDIT: That is to say, if you never use fwrite with integer or float data, then bob's your uncle.)
