Hello
Just wanted to sound people out on methods of transferring data from a .csv file into a custom-built container which will provide the row and cell values of a spreadsheet. I'm working on a Mac, doing some work in Objective-C and some in C.
The current version simply loads the raw data into dynamic memory (I hope). Initially I parsed the file as I read it, but I decided it might be better to just load the whole thing and parse later. So I do an initial pass to count the characters, the rows and the maximum number of columns, then use the character count to malloc a chunk of memory:
Code:
// Count and set ivars for rows (lines), columns and characters (size of memory needed)
// Note: this assumes the file ends with '\n'; otherwise the last row is never counted
while ((c = fgetc(fp)) != EOF) {
    numChars++;
    if (c == '\n') {
        tempNumColumns++;   // count the field closed by the newline
        if (tempNumColumns > numColumns) {
            numColumns = tempNumColumns;
        }
        tempNumColumns = 0;
        numRows++;
    }
    else if (c == ',') {
        tempNumColumns++;
    }
}
Then:
Code:
rewind(fp);
data = malloc((numChars * sizeof(char)) + 1);
if (data == NULL) {
    // handle allocation failure
}
fread(data, sizeof(char), numChars, fp);
data[numChars] = '\0';
printf("\nData:: \n\n%s", data);
fclose(fp);
Then I have another method/function to parse the buffer. I sense this could be inefficient, but it felt better to get the file read in first; then, depending on the use, I could have different parsing routines.
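To make the "load now, parse later" idea concrete, here is a minimal sketch of one such parsing routine. It splits a single line of the in-memory buffer into cell pointers, in place, by replacing each comma with a NUL terminator. The name `split_cells` is mine, not from the original code, and it assumes fields contain no quoted or escaped commas:

```c
#include <string.h>

/* Hypothetical sketch: split one NUL-terminated line into cell pointers,
   in place, by overwriting each ',' with '\0'. Returns the cell count.
   Handles empty fields (",,") correctly; does not handle quoted commas. */
static int split_cells(char *line, char *cells[], int max_cells)
{
    int n = 0;
    char *p = line;
    while (n < max_cells) {
        cells[n++] = p;                 /* cell starts here */
        char *comma = strchr(p, ',');
        if (comma == NULL)
            break;                      /* last cell on the line */
        *comma = '\0';                  /* terminate this cell in place */
        p = comma + 1;
    }
    return n;
}
```

Because it works in place, the cells stay inside the big malloc'd buffer and no per-cell copies are needed, which sidesteps some of the allocation questions below.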
At the moment I've played around with some array classes from the Mac libraries, but I think I would prefer to do this all in C. Here is the current dilemma: sometimes the data will be text, sometimes floats, sometimes ints, etc. On that basis it would probably be best to treat the whole lot as strings and set up an array (= rows) of arrays (= columns) of char arrays (= entries). If I do this, I could either statically allocate the memory (in which case I'd have to overestimate, which could be wasteful) or dynamically allocate the whole lot (in which case I'd have to calculate the length of each entry before calling malloc, which could be expensive on processing but save memory). My last thought was that dynamically allocating a string which may only contain one character (e.g. if the entry is an integer) might be very wasteful, but equally a char array[50] for every entry would be wasteful too.
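For what the dynamic route would look like per cell, here is a sketch (the name `copy_cell` is mine): measuring the length and then allocating exactly that much is what `strdup` effectively does, and the extra `strlen`/`memcpy` pass is cheap next to the file I/O already done.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch: copy one cell into an exactly-sized heap block.
   len is the cell's length in chars; the block is len + 1 for the NUL. */
static char *copy_cell(const char *start, size_t len)
{
    char *s = malloc(len + 1);          /* exact size, no overestimate */
    if (s == NULL)
        return NULL;                    /* caller handles failure */
    memcpy(s, start, len);
    s[len] = '\0';
    return s;
}
```

The per-cell overhead is then just malloc's own bookkeeping (typically on the order of 16 bytes per block), which is the real cost of many tiny one-character allocations.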
Another thought I had was to parse the data to identify the type of each entry and store it with an accurate data type, probably using a struct. This feels like it would be messier, as I would also have to include checks for the type.
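If it helps weigh up that option, the usual C shape for it is a tagged union: an enum tag plus a union, with the type guessed via `strtol`/`strtod`. This is only a sketch of the idea; the names (`CellType`, `Cell`, `classify_cell`) are mine, not from the original code.

```c
#include <stdlib.h>

/* Hypothetical sketch of a typed cell: a tag plus a union. */
typedef enum { CELL_INT, CELL_FLOAT, CELL_TEXT } CellType;

typedef struct {
    CellType type;
    union {
        long   i;
        double f;
        char  *s;       /* points at the raw text in the big buffer */
    } as;
} Cell;

static Cell classify_cell(char *text)
{
    Cell c;
    char *end;

    c.as.i = strtol(text, &end, 10);
    if (end != text && *end == '\0') {  /* whole string parsed as int */
        c.type = CELL_INT;
        return c;
    }
    c.as.f = strtod(text, &end);
    if (end != text && *end == '\0') {  /* whole string parsed as float */
        c.type = CELL_FLOAT;
        return c;
    }
    c.type = CELL_TEXT;                 /* fall back to raw text */
    c.as.s = text;
    return c;
}
```

The mess you anticipate is real but contained: every consumer of a `Cell` has to switch on `type`, but the classification itself stays in one place.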
What are other people's thoughts? For example:
a) Is it simply better to parse while reading the file in the first place?
b) Is it better to dynamically allocate memory for the whole thing, do a mixture (e.g. is there going to be any noticeable performance difference anyway?), or just overcompensate with statically allocated arrays?
Any thoughts would be greatly appreciated.