Thread: Move to specified line in CSV file

  1. #1
    Registered User
    Join Date
    Nov 2006
    Posts
    224

    Move to specified line in CSV file

    Hi,

    I have a csv file and I want to move to a specific line and then begin overwriting data from that line onwards.

    I know from a certain variable the specific line I want to move to.

    For example,

    Say I had a csv file with 5 lines of data, I may wish to move to line 3 and then begin to re-write data on line 3 and onwards.

    simple eg) csv file
    1,3,4
    2,3,5,6,7
    1,2,3
    3,4,5
    6,7,8

    Move to the 3rd line (1,2,3) and begin to edit from there onwards. Everything on and after line 3 is irrelevant and will be overwritten, so it may as well just be deleted from that point onwards.

    So, again. I do not know what will be the data on line 3 (in this example) but I do know that line 3 is where I want to start editing from.

    Any ideas/suggestions?

    many thanks!!

  2. #2
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,656
    The easiest (and safest) thing to do is to
    - read the lines you want to keep, then write them to a new temp file
    - write your new data to the new temp file.

    When you're done, then
    - rename the old file to say file.bak
    - rename the temp new file to whatever you want

    The problems with your approach are
    - what happens in case of program/machine failure - your file is partly trashed
    - what happens if you write less data than was already there - you have trash at the end of the file.
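    The steps above could be sketched roughly as follows. This is only a minimal sketch: the function name rewrite_from_line and the ".tmp" / ".bak" suffixes are made up for illustration, and it assumes the whole job fits in one call.

```cpp
#include <cstdio>
#include <fstream>
#include <string>
#include <vector>

// Keep the first `keep` lines of `path`, then append `new_lines`.
// Everything goes to a temporary file first; the original is only
// replaced once the new file is complete. Returns false on failure.
bool rewrite_from_line(const std::string& path, long keep,
                       const std::vector<std::string>& new_lines)
{
    const std::string tmp = path + ".tmp";   // hypothetical naming scheme
    const std::string bak = path + ".bak";
    {
        std::ifstream in(path);
        std::ofstream out(tmp, std::ios::trunc);
        if (!in || !out) return false;

        std::string line;
        for (long i = 0; i < keep && std::getline(in, line); ++i)
            out << line << '\n';              // lines we want to keep
        for (const std::string& l : new_lines)
            out << l << '\n';                 // the new data
        out.flush();
        if (!out) return false;
    }   // both streams are closed here, before any renaming

    std::remove(bak.c_str());                 // fine if no old backup exists
    if (std::rename(path.c_str(), bak.c_str()) != 0) return false;
    return std::rename(tmp.c_str(), path.c_str()) == 0;
}
```

    If the program dies half way through, the original file is still untouched (or sitting there as the .bak copy), which is the whole point of doing it this way.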
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  3. #3
    Registered User
    Join Date
    Nov 2006
    Posts
    224
    Thanks for your reply.

    I take your points.

    However, I do not want to keep a temp file since this particular CSV file is likely to be pretty big (up to 200,000+ lines) and take up some time and space.

    problems
    - You mention a program failure. What would happen in this case to the data already written to the file? Surely I could move back to a specific line and begin to re-write?
    - To overcome the problem with trash at the end of the file, could I not move to a specified line (which I know to be before the end-of-file position) and then delete everything after this point before I begin to re-write?

  4. #4
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,656
    > However, I do not want to keep a temp file since this particular CSV file is likely to be pretty big (up to 200,000+ lines) and take up some time and space.
    A 20MB file for a program running on a machine with processors in the GHz range, with RAM in the GB range and disk space also in the GB range is going to be no more than a blip in the grand scheme of things.

    > - You mention a program failure. What would happen in this case to the data already written to the file?
    Maybe the file system sees the file as corrupt (or at least inconsistent) and just deletes it.

    > Surely I could move back to a specific line and begin to re-write?
    In theory, you can record each tellg() and then read the corresponding line. Later, you can use either seekg() or seekp() to go back to a particular line and then start reading or writing.
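    As a rough illustration of recording the tellg() positions, something like the following could build a table of where each line starts. The name index_line_starts is invented for this sketch, and it assumes the file is opened in binary mode so the offsets are plain byte counts.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Return the byte offset at which each line of `path` starts.
// Binary mode is used so that offsets are plain byte counts.
std::vector<std::streamoff> index_line_starts(const std::string& path)
{
    std::vector<std::streamoff> starts;
    std::ifstream in(path, std::ios::binary);
    std::string line;
    while (in) {
        const std::streamoff pos = in.tellg();  // position before reading the line
        if (!std::getline(in, line)) break;     // nothing left to read
        starts.push_back(pos);
    }
    return starts;
}
```

    With that table, seekg(starts[2]) or seekp(starts[2]) would jump straight to the third line without scanning the file again.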

    > Could I not move to a specified line (which I know to be less than the end of file position) and then delete everything after this point before I begin to re-write?
    Would you know how much to "delete", by say overwriting it with something meaningless like newlines or spaces?
    Some file systems provide a "truncate" feature, which may just work for you (but it isn't portable).
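    For what it's worth, if a C++17 compiler is available (an assumption, not something stated in this thread), std::filesystem::resize_file can do the truncation. A minimal sketch, with truncate_after_lines being a made-up helper name:

```cpp
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <limits>
#include <string>
#include <system_error>

// Cut `path` off so that only its first `keep` lines remain.
// Returns false if the file has fewer than `keep` complete lines.
bool truncate_after_lines(const std::string& path, long keep)
{
    std::streamoff cut = 0;
    {
        std::ifstream in(path, std::ios::binary);
        for (long i = 0; i < keep; ++i)
            in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
        if (!in.good()) return false;   // ran off the end of the file
        cut = in.tellg();               // byte offset just past line `keep`
    }   // close the stream before resizing the file
    std::error_code ec;
    std::filesystem::resize_file(path, static_cast<std::uintmax_t>(cut), ec);
    return !ec;
}
```

    After the call the file can be reopened in append mode and the new rows written at the end, so there is no trailing trash to worry about.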

  5. #5
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    O_o

    How important is your data?

    Very Important: Don't even try to update files "in place"; let your file system and hardware do their jobs by writing out a new file.

    Otherwise: Feel free to do an "in place" update.

    Soma

  6. #6
    Registered User
    Join Date
    Nov 2006
    Posts
    224
    Thanks both for the advice and help!

    I need to re-think my approach: Is there no way to save your files in c++? If I could 'save' them periodically then that would suffice and in the event of a failure I could always have certainty as to where I'm at.

    So, what happens then when we write to a file? Is it 'saved' straight away as soon as the command is given or does it all get written when the file is closed??

    thanks again

  7. #7
    Registered User rogster001's Avatar
    Join Date
    Aug 2006
    Location
    Liverpool UK
    Posts
    1,472
    It is not 'saved' until the filestream is closed
    Thought for the day:
    "Are you sure your sanity chip is fully screwed in sir?" (Kryten)
    FLTK: "The most fun you can have with your clothes on."

    Stroustrup:
    "If I had thought of it and had some marketing sense every computer and just about any gadget would have had a little 'C++ Inside' sticker on it'"

  8. #8
    Registered User
    Join Date
    May 2010
    Posts
    4,633
    > So, what happens then when we write to a file? Is it 'saved' straight away as soon as the command is given or does it all get written when the file is closed??
    The data is not actually written until one of several conditions is met.

    1. You manually flush the stream with flush() or endl (or fflush() for a C FILE*);

    2. The output buffer becomes full.

    3. The file is closed.
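    To illustrate the first condition, here is a small sketch of writing a row and flushing it straight away. The helper name write_row_flushed is only for illustration.

```cpp
#include <fstream>
#include <string>

// Write one CSV row and push it out of the stream's buffer straight away.
// Returns false if the stream is in a failed state afterwards.
bool write_row_flushed(std::ofstream& out, const std::string& row)
{
    out << row << '\n';   // '\n' on its own does not flush
    out.flush();          // hand the buffered bytes to the operating system
    return static_cast<bool>(out);
}
```

    Note that flush() only gets the data as far as the operating system; whether it has reached the disk yet is a separate question.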


    Jim

  9. #9
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,656
    ostream::flush - C++ Reference
    This ensures that any data you have output is consistent with the environment.

    fsync(2) - Linux man page
    This ensures that the environment flushes any data to the real storage.
    It's only after such a call (or closing the file) that the data is likely(*) to be permanently stored.

    If you're just appending data to the file, you should probably try just opening the file in "append" mode, write what you need to and then close it again.



    (*) Hard disks also have caches, but you have to assume that the OS device drivers correctly manage these.

  10. #10
    Registered User
    Join Date
    Nov 2006
    Posts
    224
    Thanks again for replies and advice; it is much appreciated!

    It seems as if the close function just won't do for me... because it is unlikely that the program will be closed in a proper manner; it may well be abruptly terminated.

    That being said, I have tested this and will need to overwrite from a specific point in the file.

    Thus far, I am able to move in the CSV file to the exact location from where I want to overwrite by using:
    Code:
    				  f.seekg(0, ios::beg);  // ignore() reads, so rewind the get pointer
    				  for(long int i = 0; i < temp_v; ++i)
    				  {
    					  f.ignore(numeric_limits<streamsize>::max(),'\n');  // skip one whole line
    				  }
    Doing so gets me to the line that I now want to start overwriting from. But now, my question is, how do I set the stream pointer to this position???

    thanks

  11. #11
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,656
    tellg() tells you where the 'get' (reading) pointer has got to.
    seekp() will then move the 'put' (writing) pointer to that position.
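    Put together, the two calls might look something like this. It is a sketch only: seek_write_to_line is a hypothetical helper, and it assumes the fstream was opened with ios::in | ios::out | ios::binary.

```cpp
#include <fstream>
#include <limits>
#include <string>

// Position `f` at the start of line `line_index` (0-based) for writing.
// `f` must be open with ios::in | ios::out | ios::binary.
// Returns false if the file has fewer than `line_index` complete lines.
bool seek_write_to_line(std::fstream& f, long line_index)
{
    f.clear();                              // drop any stale eof/fail flags
    f.seekg(0, std::ios::beg);
    for (long i = 0; i < line_index; ++i)
        f.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    if (!f.good()) { f.clear(); return false; }
    const std::fstream::pos_type pos = f.tellg();  // where the reading got to
    f.seekp(pos);                           // move the put pointer to the same place
    return f.good();
}
```

    Anything written to f after a successful call lands on top of the old line. Bytes of the old data that the new data does not cover are left in place, which is the trash problem mentioned earlier.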

  12. #12
    Registered User
    Join Date
    Nov 2006
    Posts
    224
    Hi thanks,

    Basically this is what I have. Note: f is declared as fstream


    Code:
    f.open(Name_v, ios::binary|ios::in|ios::out);
    long int posl;
    
    f.seekg(0, ios::beg);
    for(long int i = 0; i < temp_v; ++i)
    {
      f.ignore(numeric_limits<streamsize>::max(),'\n');
    }
    posl = f.tellp();
    f.seekp(posl);
    When I check the file, it does not overwrite from the point it moves to (temp_v).
    Instead, it just continues to append from the end of the file, as if I hadn't done any of the moving to a specific line.

    Any ideas?

    cheers

  13. #13
    Registered User
    Join Date
    May 2010
    Posts
    4,633
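    You must use the correct open mode if you want to overwrite the information. Opening with out on its own truncates the file, and app forces every write to the end of the file; in|out keeps the existing contents and writes at the current put position. See this link for the open modes: ofstream.open. You may want to try ate.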
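    A quick way to see what each open mode does is to write a couple of characters with it and then look at the file. The helper below (write_with_mode, a name made up for this sketch) does just that.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Write `text` to `path` using the given open mode, then return the whole file.
std::string write_with_mode(const std::string& path, std::ios::openmode mode,
                            const std::string& text)
{
    {
        std::fstream f(path, mode | std::ios::binary);
        f << text;
    }   // closed (and flushed) here
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

    Starting from a file containing "AAAA\nBBBB\n", writing "xx" with ios::in | ios::out gives "xxAA\nBBBB\n" (overwrite in place), with ios::out | ios::app the "xx" ends up after the last line, and with ios::out alone the file contains just "xx".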

    Jim

  14. #14
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    O_o

    Just in case you've not been made aware: this is not going to be as easy as your code implies you think it will be.

    To update a file "in place" means lots of bookkeeping and manual buffering.

    For example, with a simple naive attempt, garbage will be left on the next line when the updated line has fewer bytes than the old line.

    As another example, with a simple naive attempt, replacing a line will consume parts of valid data when the updated line has more bytes than the old line.

    [Edit]
    With a note because your posts imply you think there is a performance issue:

    You'll have to rewrite every byte of the file after an updated line that has fewer or more bytes than the old line.

    In other words, if you need to update the first line of the file with data that has only a one-byte difference, you'll still have to read and rewrite the entire file from the point where that difference occurs.
    [/Edit]

    [Edit]
    Also, have you considered just journaling updates and writing multiple updates at once on occasion?

    So, you would write the updated lines to a new file and read that file along with the old file.
    [/Edit]
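
    One possible shape for such a journal, purely as a sketch: each record is "<line index><TAB><new text>" appended to a separate file, and a reader overlays the records on the old file. The record format and both function names are assumptions made for illustration, not anything established in this thread.

```cpp
#include <fstream>
#include <map>
#include <string>
#include <vector>

// Append one update record to the journal: "<line index>\t<new line text>".
bool journal_update(const std::string& journal, long line_index,
                    const std::string& text)
{
    std::ofstream out(journal, std::ios::app);
    out << line_index << '\t' << text << '\n';
    out.flush();
    return static_cast<bool>(out);
}

// Read `base` and overlay the journal; later records win over earlier ones.
std::vector<std::string> read_with_journal(const std::string& base,
                                           const std::string& journal)
{
    std::map<long, std::string> updates;
    std::ifstream j(journal);
    long index;
    std::string text;
    while (j >> index && j.get() == '\t' && std::getline(j, text))
        updates[index] = text;

    std::vector<std::string> lines;
    std::ifstream in(base);
    std::string line;
    for (long i = 0; std::getline(in, line); ++i) {
        const auto it = updates.find(i);
        lines.push_back(it != updates.end() ? it->second : line);
    }
    return lines;
}
```

    This version only replaces lines that already exist in the old file; records for lines past its end are ignored. Every so often the merged result would be written out as a fresh file and the journal emptied.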

    Soma
    Last edited by phantomotap; 06-29-2012 at 11:50 AM.

  15. #15
    Registered User rogster001's Avatar
    Join Date
    Aug 2006
    Location
    Liverpool UK
    Posts
    1,472
    How is this to be used? Are you planning to implement it on a single machine with a known capability? If so, you 'could' do that based on the scope of that machine and forget about some of the other stuff. I mean, you could maybe buffer the whole data and parse it out, but that is still pretty poor in my opinion, and if the data is read from a network connection, then bin that idea. I think you have seen some of the best suggestions: temp files so the source data is only read, etc.
