Hi,
I'm trying to load a text file very quickly. My test file is 630 KB. At the moment I'm just trying to find the fastest approach that works, so for testing purposes all processing has been removed; the file is only loaded. I'm porting a Python algorithm which, on average, takes 0.0065 seconds to load the file.
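In case the methodology matters, each attempt is timed roughly like this (only a sketch; load_file is a placeholder for whichever method is under test, and the real harness averages over many runs):
Code:
#include <chrono>
#include <iostream>

void load_file(const char* path);   // hypothetical: one of the methods below

void time_one_run(const char* path) {
    auto start = std::chrono::steady_clock::now();
    load_file(path);                 // the method being measured
    auto stop = std::chrono::steady_clock::now();
    std::cout << std::chrono::duration<double>(stop - start).count() << " s\n";
}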
My first attempt, on average, takes approximately 0.42 seconds, which is FAR too slow. It is of the form:
Code:
#include <fstream>
#include <string>
using namespace std;

string line;
ifstream obj_file(path);            // path: the 630 KB test file
if (obj_file.is_open()) {
    while (!obj_file.eof()) {       // read line by line until EOF
        getline(obj_file, line);
    }
}
Method two takes, on average, 0.009 seconds, which is nearly as fast as the Python version. However, newlines do not seem to be preserved, and the end of "memblock" contains garbage (I believe for the same reason as in method three). Method two:
Code:
#include <fstream>
using namespace std;

ifstream file;
file.open(path, ios::in);
if (file.is_open()) {
    long begin = file.tellg();          // position at start of file
    file.seekg(0, ios::end);
    long end = file.tellg();            // position at end of file
    long size = end - begin;            // file size in bytes
    char* memblock = new char[size];    // raw buffer, not null-terminated
    file.seekg(0, ios::beg);
    file.read(memblock, size);
    file.close();
    delete [] memblock;
}
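For context, the garbage shows up when I dump the buffer along these lines (only a sketch; the exact inspection code differs, and memblock/size are from the snippet above):
Code:
#include <iostream>

// Sketch: memblock was allocated with new[] and never null-terminated,
// so streaming it as a C string reads until it happens to hit a zero byte.
std::cout << memblock << std::endl;     // may print trailing junk past the data
std::cout.write(memblock, size);        // bounded write of exactly size bytes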
Method three takes, on average, 0.005 seconds, which is good. However, "buffer" again contains garbage at the end. Furthermore, if it were uncommented, "Reading error" would be output. The amount by which "lSize" differs from "result" appears to be exactly the number of newlines in the file; i.e., fread seems to be ignoring newlines!
Code:
#include <cstdio>
#include <cstdint>
#include <cstdlib>

typedef uint8_t using_type;     // element type the file is read as

FILE* pFile = fopen(path, "r");
fseek(pFile, 0, SEEK_END);
long lSize = ftell(pFile) / sizeof(using_type);
rewind(pFile);
using_type* buffer = (using_type*)malloc(sizeof(using_type) * lSize);
if (buffer == NULL) {
    //fputs("Memory error", stderr);
}
// copy the file into the buffer:
size_t result = fread(buffer, 1, lSize, pFile);
if (result != lSize) {
    //fputs("Reading error", stderr);
}
fclose(pFile);
free(buffer);
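One check I intend to run (a sketch only, not yet verified): the same size/read comparison with the file opened in binary mode, in case the "r" text mode is translating line endings on my platform and that is where the difference between lSize and result comes from:
Code:
#include <cstdio>
#include <cstdlib>

// Sketch: open with "rb" so no text-mode newline translation can occur.
// If the counted and read sizes match here, the mismatch above comes from
// the open mode rather than from fread dropping data.
FILE* f = fopen(path, "rb");
fseek(f, 0, SEEK_END);
long bytes = ftell(f);
rewind(f);
char* buf = (char*)malloc(bytes);
size_t got = fread(buf, 1, bytes, f);
printf("bytes=%ld got=%zu\n", bytes, got);
fclose(f);
free(buf);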
It would seem that fread and file.read do not preserve newlines in the output buffer. This is a major problem, and I don't know how to work around it other than by using getline(), which is incredibly slow.
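A related check (again just a sketch; buffer and result are from method three): counting the '\n' bytes actually present in what fread returned, to confirm whether the newlines really are missing rather than the byte count simply coming out smaller than expected:
Code:
#include <algorithm>

// If this count is nonzero, the newlines are still in the buffer and the
// lSize/result mismatch needs a different explanation.
long newline_count = std::count(buffer, buffer + result, '\n');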
Help, please!
Thanks,
Ian