Hello,
I'd like some advice on how to go about reading a data file efficiently - that is, with as few wasted operations as possible.
The files I'm trying to read are in this format :
Code:
# This is a comment.
1 1 0.24452 5.58872 3.54826 1.58262 -1
# Comments can appear anywhere
2 1 7.47274 0.37462 -1.28472 8.27462 1

The columns should appear as either integers or floats, but in some files I have E-notation in all columns, even those constrained to be integers ( a quick test tells me reading this isn't a problem ).
Columns represent entry ID ( int ), structural location ( int ), x, y, z coordinates ( floats ), radius ( float ) and parent node ( int ).
I need to read in all non-comment lines where the second column has a value of 3. I'm not sure what the best way to do this is. Knowing that each line is either a comment or consists of seven numbers, I could potentially read a line into a string using getline and test whether the first character is a hash. If it isn't, I could perhaps use a stringstream to pull the numbers out of the string, though I still haven't worked out how.
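To make the getline idea concrete, here is a rough sketch of what I think the per-line step could look like. It assumes C++17 (for std::optional), and the name parse_line is just something I made up for illustration. It treats a line as a comment if its first non-whitespace character is a hash, and otherwise tries to extract seven doubles with a std::istringstream.

```cpp
#include <array>
#include <optional>
#include <sstream>
#include <string>

// Hypothetical helper: turn one line of the file into seven doubles.
// Returns std::nullopt for comment lines, blank lines, and lines that
// do not contain seven readable numbers.
std::optional<std::array<double, 7>> parse_line(const std::string& line) {
    // Find the first non-whitespace character, so that an indented '#'
    // is still recognised as a comment.
    const auto first = line.find_first_not_of(" \t\r");
    if (first == std::string::npos || line[first] == '#') {
        return std::nullopt;
    }

    std::istringstream fields(line);
    std::array<double, 7> values{};
    for (double& v : values) {
        // operator>> for double also accepts E-notation such as 3.0E+00.
        if (!(fields >> v)) {
            return std::nullopt;  // fewer than seven numbers on the line
        }
    }
    return values;
}
```

Reading every column as a double sidesteps the E-notation issue for the integer columns; they can be converted to int afterwards if needed.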
So far, I'm working with commentless files, and the temptation is simply to read in two doubles from the line, and if the second has a value of 3, then continue reading and use the line appropriately. Obviously, this won't work once I have files with comments.
So, my question is simply - what would be the best way to deal with this ? I'm trying to waste as little computation as possible because the files can get relatively long, and each line with a 3 in the second column needs to be placed in some kind of structure, possibly a vector of double arrays, but that's still undecided. First, I'd love some advice on how you'd go about reading this kind of file, with these constraints in mind. I'm not asking for code, but suggestions on methodology would be fantastic.
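For what it's worth, this is roughly the shape I imagine the whole read loop taking; it is a sketch under my own assumptions, not something I've settled on. The names Row and read_matching_rows are placeholders, and I've picked a std::vector of std::array<double, 7> as the storage. It skips comment lines, reads only the first two columns, and parses the remaining five only when the second column matches.

```cpp
#include <array>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Placeholder type: one data line stored as seven doubles.
using Row = std::array<double, 7>;

// Hypothetical reader: keep only the lines whose second column equals
// wanted_location (3 in my case).
std::vector<Row> read_matching_rows(std::istream& in, int wanted_location = 3) {
    std::vector<Row> rows;
    std::string line;
    while (std::getline(in, line)) {
        // Skip blank lines and comment lines.
        const auto first = line.find_first_not_of(" \t\r");
        if (first == std::string::npos || line[first] == '#') {
            continue;
        }

        std::istringstream fields(line);
        Row row{};
        // Read the first two columns only, then decide whether the rest
        // of the line is worth parsing at all.
        if (!(fields >> row[0] >> row[1]) || row[1] != wanted_location) {
            continue;
        }

        bool complete = true;
        for (std::size_t i = 2; i < row.size(); ++i) {
            if (!(fields >> row[i])) {
                complete = false;  // malformed line: too few numbers
                break;
            }
        }
        if (complete) {
            rows.push_back(row);
        }
    }
    return rows;
}
```

The function takes any std::istream, so it could be called with a std::ifstream opened on the data file. Whether the early exit after two columns saves anything noticeable compared with the cost of reading the file is something I would have to measure.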
Thanks very much,
Quentin