Thread: File import opinion.

  1. #1
    {Jaxom,Imriel,Liam}'s Dad Kennedy's Avatar
    Join Date
    Aug 2006
    Location
    Alabama
    Posts
    1,065

    File import opinion.

    I wrote an embedded system that captures data. Originally, we wanted to have one file per day. This, however, became a problem with the web interface on the embedded machine, so I had to break the captured data up at the hour level.

    There is a way to bring the data from the embedded machine to a desktop computer. I elected to make a *hopefully* small data store on the local computer (until we release the server-side database with web front-end) so that the history of the information can be retained and reviewed locally without the need of the USB storage unit.

    The Question: I'd like to import the data as quickly as possible (if I can, I'd like to make it as fast as or faster than a copy), but at the same time, I need to sort the data into the files that may already exist. My approach so far is:
    1) Look at the file name. If the file name is unchanged from the embedded system, the file name format is "%05i%i%02i%02i%02i.%s": Unit Number, Year, Month, Day, Hour, extension.
    a) From here, I grab all the files with unit, year, month, and day the same and load all the files (should be 24 of these--duh) into one array.
    b) I then grab the potentially existing file.
    c) Next, merge the two files together into one list (updating any record that may have more information).
    d) Sort the list via qsort().
    e) Rewrite the data.
    2) Someone out there will rename one of these files. To handle this, I'll have to process the files one at a time.
    3) I have also made a directory import, or in other words, I have given the user the chance to select a directory that (supposedly) contains these file types, then I parse through the entire directory/tree structure and import all the data sequentially.
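
Steps 1 and 1a-1d above could be sketched roughly as follows. This is only an illustration under stated assumptions: the Record layout, the `fields` counter used to decide which duplicate "has more information", the 4-digit year, and the names parse_name() and merge_records() are all hypothetical, since the thread does not show the real packed binary format.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record; the real packed layout is not given in the thread. */
typedef struct {
    long timestamp;  /* assumed unique key within a day file */
    int  fields;     /* assumed count of populated fields ("more information") */
    int  value;      /* assumed payload */
} Record;

typedef struct {
    int  unit, year, month, day, hour;
    char ext[8];
} FileKey;

/* Parse a name written as "%05i%i%02i%02i%02i.%s", assuming a 4-digit year.
   Returns 1 on success, 0 if the name does not match (e.g. it was renamed). */
int parse_name(const char *name, FileKey *k)
{
    int n = 0;
    if (sscanf(name, "%5d%4d%2d%2d%2d.%7s%n", &k->unit, &k->year,
               &k->month, &k->day, &k->hour, k->ext, &n) != 6)
        return 0;
    if (name[n] != '\0')                      /* trailing junk or long extension */
        return 0;
    if (k->month < 1 || k->month > 12 || k->day < 1 || k->day > 31 ||
        k->hour < 0 || k->hour > 23)
        return 0;
    return 1;
}

/* Order by timestamp ascending, then by fields descending, so the richest
   copy of a duplicated record sorts first. */
static int cmp_record(const void *a, const void *b)
{
    const Record *x = (const Record *)a;
    const Record *y = (const Record *)b;
    if (x->timestamp != y->timestamp)
        return (x->timestamp > y->timestamp) - (x->timestamp < y->timestamp);
    return (y->fields > x->fields) - (y->fields < x->fields);
}

/* Merge existing and incoming records into out (capacity n_old + n_new),
   sort with qsort(), and keep one record per timestamp.
   Returns the number of records left in out. */
size_t merge_records(const Record *old, size_t n_old,
                     const Record *incoming, size_t n_new, Record *out)
{
    size_t total = n_old + n_new, r, w = 0;

    assert(out != NULL);
    if (n_old > 0) memcpy(out, old, n_old * sizeof *out);
    if (n_new > 0) memcpy(out + n_old, incoming, n_new * sizeof *out);
    qsort(out, total, sizeof *out, cmp_record);

    for (r = 0; r < total; r++) {
        if (w == 0 || out[w - 1].timestamp != out[r].timestamp)
            out[w++] = out[r];                /* first copy is the richest one */
    }
    return w;
}
```

A name that fails parse_name() would fall through to the one-file-at-a-time path from step 2. Because qsort() is not guaranteed to be stable, the tie-break on `fields` is what makes the choice between duplicate records deterministic here.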

    Would it be better (keeping in mind that the target platform is Windows 2000 and up) to "fork" each recursive call to the function that does the directory import? I would have to handle shared memory to pass back which file name was created (we are planning to return the LATEST information to be displayed in the program) so the program knows which file to display. I'm not sure how one handles this in Windows, but I could figure that out without much problem. The main thing that I don't know, and would rather not have to find out AFTER I have spent the time required to do all of the above, is: would this gain me anything, or am I still bottlenecked at the device?

    Or, to simplify: if I call Split() in Windows, then attempt to read data in parallel from a USB storage device, will I gain ANY time if the amount of data is large (upwards of 1GB)?

  2. #2
    Yes, my avatar is stolen anonytmouse's Avatar
    Join Date
    Dec 2002
    Posts
    2,544
    Or, to simplify: if I call Split() in Windows, then attempt to read data in parallel from a USB storage device, will I gain ANY time if the amount of data is large (upwards of 1GB)?
    Probably not. If one thread is reading the device at 100% (or near enough), then using two threads is not going to help and may slow down overall performance as many storage devices will be more efficient reading from one location rather than reading from two locations in parallel.
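
One way to check whether a single reader already saturates the device is to time a plain sequential read and compare the result with the throughput the device is rated for. The following is a minimal sketch, not a benchmark tool: read_all() is a hypothetical helper name, and it uses time()/difftime(), which only resolve whole seconds, so it is meaningful only for large files such as the 1GB case discussed here.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Read `path` sequentially in chunks of `bufsize` bytes from one thread.
   Returns the number of bytes read, or -1 on error. If `secs` is not NULL,
   it receives the elapsed wall-clock time in (whole) seconds. */
long long read_all(const char *path, size_t bufsize, double *secs)
{
    FILE *fp;
    char *buf;
    size_t got;
    long long total = 0;
    time_t start;
    int failed;

    if (bufsize == 0)
        return -1;
    fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;
    buf = (char *)malloc(bufsize);
    if (buf == NULL) {
        fclose(fp);
        return -1;
    }

    start = time(NULL);
    while ((got = fread(buf, 1, bufsize, fp)) > 0)
        total += (long long)got;          /* data is discarded; only counted */
    if (secs != NULL)
        *secs = difftime(time(NULL), start);

    failed = ferror(fp);                  /* distinguish EOF from a read error */
    free(buf);
    fclose(fp);
    return failed ? -1 : total;
}
```

Dividing the returned byte count by the elapsed seconds gives a rough throughput figure. Note that a second run over the same file may be served from the operating system's cache rather than the device, so only the first read after plugging the device in is representative.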

    In regard to the rest of your post, it sounds like you will be implementing database functionality. I'd suggest you look into using a database manager (such as SQLite) to manage your data internally. This will handle updating, adding, sorting and retrieval of data, dramatically simplifying your code. You can then retrieve the data from your internal database as needed and write it to file.

  3. #3
    {Jaxom,Imriel,Liam}'s Dad Kennedy's Avatar
    Join Date
    Aug 2006
    Location
    Alabama
    Posts
    1,065
    Quote Originally Posted by anonytmouse
    I'd suggest you look into using a database manager (such as SQLite) to manage your data internally. This will handle updating, adding, sorting and retrieval of data, dramatically simplifying your code. You can then retrieve the data from your internal database as needed and write it to file.
    Too late. I had to write all the basic functionality for the embedded system anyway, so I was able to reuse those functions to do the same work on Windows. I had no choice but to make individual files (on the embedded system), as this is the way the boss wanted it done. . . at least I convinced him to go with a packed binary file. . . he initially wanted flat text files.

    BTW, thanks. That's what I was thinking; however, I hoped I was wrong. It takes a loooooong time to import 1GB of data. . .
