-
Thanks for all the replies...
I used the zStrrmv() function from iMac and implemented the changes recommended by Salem. The import time is almost the same (about 50 mins).
I will follow ledov's advice to partially see the time consumed for each part of the import process. It turns out sqlite3 API calls takes time (since they are called frequently for each line).
-
I've got a suggestion for you.
Let's ignore the sqlite3 API calls for now. Take these as "fixed" for the purpose of improving the rest of the program.
Make a dummy input file, that requires say 5 -15 minutes to process, but excludes the sqlite3 API calls.
Post a link to that dummy input file, along with the format you want it to be in upon completion, for optimal results.
Then go off and enjoy the holidays. When you return after Christmas, I'll bet some Arctic Elf will have the input portion of the program streamlined. ;)
'Tis the Season.
-
Agreed. You need to profile the thing to work out where the time is being spent. One way to essentially do that is to comment out a certain part and see how much faster it becomes.
So yes please comment out the sqllite calls and time that.
-
Go to my website and take my csv file parser. It might be fast enough.
-
After one week, we still have *zero* lines of an input file example. Also, we have no example of the final format needed for the sqlite insertion, either.
fnoyan, can you post this info?
-
If, as some of us suspect, the SQL is the elephant in the room, then try this.
Code:
void tryit ( ) {
const char *word = "TestElement";
//!! pick some numbers which represent your data
const int numLines = 1000000, numFields = 100;
for ( int line = 0 ; line < numLines ; line++ ) {
for ( int field = 0 ; field < numFields ; field++ ) {
sqlite3_bind_text(stmt, 1, word, -1, SQLITE_STATIC);
}
if (sqlite3_step(stmt)!=SQLITE_DONE)
return sqlite3_errcode(db);
sqlite3_reset(stmt);
}
}
If this still takes an hour, then you're wasting your effort saving a couple of minutes on improving (although there is some very low hanging fruit to be picked).
Your efforts might be better employed looking for more efficient use of SQL to begin with.
Importing CSV would be so common that there should be an efficient way of doing it.
-
@Adak,
Actually, there is no "final format". Input file is CSV and process is "importing it" into DB. Below is first 4 lines of one example file. You can duplicate last two lines as much as you want (the file I am importing has 20500 lines, not huge!)
Code:
BSCName,BIDIRECTION,RNCID,CELLID,NCELLRNCID,NCELLID,CIOOFFSET,CELLSFORBIDDEN1A,CELLSFORBIDDEN1B,SIB11IND,IDLEQOFFSET1SN,IDLEQOFFSET2SN,SIB12IND,CONNQOFFSET1SN,CONNQOFFSET2SN,TPENALTYHCSRESELECT,TEMPOFFSET1,TEMPOFFSET2,NPRIOFLAG,NPRIO,MBMSNCELLIND
BSC Name,Add Bidirection Neighboring Cell Or Not,RNC ID,Cell ID,RNC ID of a neighboring cell,Neighboring cell ID,Neighboring cell oriented CIO[dB],Affect 1A threshold flag,Affect 1B threshold flag,SIB11 Indicator,IdleQoffset1sn[dB],IdleQoffset2sn[dB],SIB12 Indicator,ConnQoffset1sn[dB],ConnQoffset2sn[dB],HCS Cell Reselect Penalty Timer[s],HCS Cell Reselect TempOffset1[dB],HCS Cell Reselect TempOffset2[dB],The flag of the priority,The priority of neighbor cell,MBMS neighboring cell indicator
O3D1_PXXX00, ,2840,12662,2340,12671,0,AFFECT,AFFECT,TRUE,0,0,FALSE,,,D0,,,FALSE,,TRUE
O3D1_PXXX00, ,2840,12662,2340,12759,0,AFFECT,AFFECT,TRUE,0,0,FALSE,,,D0,,,FALSE,,TRUE
@Salem,
I was thinking so that importing CSV is so common but since sqlite's concept is to keeping things simple, there is no such function (there is one provided with sqlite command line tool, which is open source).
Yes, you are right, first thing is to try a "simple" import to see sqlite performance.
-
I was experimenting with fnprintf's code on the previous page, and a 20,500 line data file, as the OP suggested.
Most assuredly, there is no input bottleneck with this or other reasonable code. fnprintf's program (as tweaked a bit to get input from a a file), handles it in less than half a second.
Code:
//fnprintf's code
#include <stdio.h>
#include <string.h>
#include <time.h>
int main (void)
{
char *p, csv[100];
char rmc[] = ".:;`~*\\?=)(/&%+^'!\"|}][{$ #@\t";
char pstr[30][100];
char ostr[30][100];
int i=0,c=0;
clock_t timer;
FILE *fpIn,*fpOut;
fpIn=fopen("data20500.txt", "r");
fpOut=fopen("dataOut20500.txt","w");
if(!fpIn ) { // || !fpOut) {
printf("Error opening file\n");
return 1;
}
timer=clock();
while(fgets(csv, sizeof(csv),fpIn)) {
i=0;
p = strtok (csv, ",");
while (p)
{
strcpy (pstr[i], p);
i++;
p = strtok (NULL, ",");
}
c = i;
for (i = 0; i < c; i++)
{
p = strtok (pstr[i], rmc);
strcpy (ostr[i], "");
while (p)
{
strcat (ostr[i], p);
strcat (ostr[i], " ");
p = strtok (NULL, rmc);
}
}
for (i = 0; i < c; i++)
{
fprintf (fpOut,"%d: %s\n", i, ostr[i]);
}
}
timer=clock()-timer;
printf("Elapsed time: %f seconds\n",(double)timer/CLOCKS_PER_SEC);
fclose(fpIn);
fclose(fpOut);
return 0;
}