Thread: once more file operation

  1. #1
    Registered User stormbringer's Avatar
    Join Date
    Jul 2002
    Posts
    90

    once more file operation

    hi

    I have some questions about file operations. I'm working with VC++ on Win2k, in a console application.

    I have a huge text file (5 GB+) and need to change a number on every page. There are several ways I could do this:

    1. Read a certain amount of data into a buffer, work on the buffer, and then fprintf it to a new file. Problem: this takes 10 GB+ of disk space, because two files exist at the same time (one is deleted when the program terminates, of course, but right before that...). The disk may not have that much space left (no, I don't have money for a larger disk :-).
    To solve that I could delete the part of the file that I already have in the buffer. But how?

    2. I could insert the new numbers directly into the file and remove the old ones. But is that possible, and how? That would save me from copying the file and save a lot of time.

    3. I could go over the file and build up a linked list of the file positions where the changes go, then copy the file until the first position is reached, write my data, and continue (after skipping the old data). But there's the 10 GB problem once more.

    I already wrote a similar (not identical) program that copied the file and changed things on the fly, but it took about 12 hours for a 3 GB file and filled up my hard disk with 6 GB. I'm pretty sure this can be solved more efficiently, and I want to plan this next program properly.

    Thanks, I appreciate your help.

  2. #2
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    What kind of number are you changing? Is it the same number of bytes? If so, just open in update mode and overwrite it. If not, you're pretty much screwed, and are going to have to make a few massive files.
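
    A minimal sketch of the same-width case, assuming the byte offset of the number is already known. The helper names overwrite_at and replace_number are made up for illustration, and snprintf is C99. Note that fseek takes a long, which is 32 bits on Win32, so a real 5 GB file would need 64-bit positioning (fgetpos/fsetpos or a compiler-specific 64-bit seek) instead.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Overwrite `len` bytes at byte offset `pos` of an existing file.
 * "r+b" opens for update without truncating, and a write inside the
 * file replaces bytes instead of inserting them, so the size is unchanged.
 * Returns 0 on success, -1 on failure. */
int overwrite_at(const char *path, long pos, const char *text, size_t len)
{
    FILE *fp = fopen(path, "r+b");
    int ok;

    if (fp == NULL)
        return -1;
    ok = fseek(fp, pos, SEEK_SET) == 0
      && fwrite(text, 1, len, fp) == len;
    if (fclose(fp) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Replace a number with a new one of the same width, zero-padded.
 * Fails (returns -1) if the new value needs more digits than `width`. */
int replace_number(const char *path, long pos, int width, unsigned value)
{
    char field[32];
    int n;

    assert(width > 0 && width < (int)sizeof field);
    n = snprintf(field, sizeof field, "%0*u", width, value);
    if (n != width)              /* does not fit: this would be an insert */
        return -1;
    return overwrite_at(path, pos, field, (size_t)width);
}
```

    Calling replace_number("big.txt", 5, 3, 42) on a file that starts with "page 007" would turn it into "page 042" without touching anything else; the file name here is only a placeholder.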

    Your best bet then would be to make a file or list of instructions: "move N bytes through the file, then write this number, then skip over the N bytes of the number I'm replacing, then move N bytes more to reach the next place to write a new value." Rinse, repeat.
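
    The list idea can be sketched roughly as follows. The struct edit layout and the function names are assumptions for illustration, not code from this thread, and the list is a plain sorted array here. It still writes a second file, so it does not solve the disk-space problem, and the long offsets would have to become 64-bit for a file over 2 GB.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One planned change: at byte offset `pos` of the input, drop `old_len`
 * bytes and emit `text` instead. old_len == 0 is a pure insertion. */
struct edit {
    long pos;
    long old_len;
    const char *text;
};

/* Copy exactly `count` bytes from in to out through a fixed buffer. */
static int copy_bytes(FILE *in, FILE *out, long count)
{
    char buf[4096];

    while (count > 0) {
        size_t want = count < (long)sizeof buf ? (size_t)count : sizeof buf;
        size_t got = fread(buf, 1, want, in);
        if (got == 0)
            return -1;               /* input ended early or read error */
        if (fwrite(buf, 1, got, out) != got)
            return -1;
        count -= (long)got;
    }
    return 0;
}

/* Apply edits (sorted by pos, non-overlapping) while streaming in -> out.
 * Returns 0 on success, -1 on any error. */
int apply_edits(FILE *in, FILE *out, const struct edit *edits, size_t n)
{
    char buf[4096];
    size_t got, i;
    long at = 0;                     /* current offset in the input */

    assert(in != NULL && out != NULL);
    for (i = 0; i < n; i++) {
        if (edits[i].pos < at)
            return -1;               /* unsorted or overlapping edits */
        if (copy_bytes(in, out, edits[i].pos - at) != 0)
            return -1;
        if (fputs(edits[i].text, out) == EOF)
            return -1;
        if (fseek(in, edits[i].old_len, SEEK_CUR) != 0)
            return -1;               /* skip the bytes being replaced */
        at = edits[i].pos + edits[i].old_len;
    }
    /* Copy whatever follows the last edit. */
    while ((got = fread(buf, 1, sizeof buf, in)) > 0)
        if (fwrite(buf, 1, got, out) != got)
            return -1;
    return ferror(in) ? -1 : 0;
}
```

    Both streams should be opened in binary mode ("rb" and "wb") so that the offsets count raw bytes.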

    Quzah.
    Hope is the first step on the road to disappointment.

  3. #3
    Registered User stormbringer's Avatar
    Join Date
    Jul 2002
    Posts
    90
    Sometimes I overwrite the same number of bytes; sometimes (after a certain keyword) I insert some bytes just in front of the number.

    So what you're proposing is to read through the whole file once and construct a linked list of file positions (e.g. with a struct that contains from: and to:), then open a new file and copy, changing on the fly.
    One problem there could be that memory fills up. I can handle that by writing out what I've got so far, freeing the list, and possibly adjusting the buffer size. But how can I detect that memory is full? Is it certain that when malloc returns an error, it is because there isn't enough memory? Or can I check how much is available at that very moment? Then I could decide to malloc just, let's say, 75% of it so other programs won't steal it. I could also size the buffer dynamically and decide on the fly whether it is more important to have a smaller buffer or a smaller list.
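
    On the malloc part: standard C reports a failed allocation by returning NULL and offers no portable way to ask how much memory is free, so one common approach is simply to try and then shrink the request. A minimal sketch, where alloc_shrinking is a made-up helper name:

```c
#include <assert.h>
#include <stdlib.h>

/* Try to allocate `want` bytes; on failure halve the request until it
 * succeeds or reaches `min_size`. The size actually obtained is stored
 * in *got. Returns NULL only if even min_size could not be allocated. */
void *alloc_shrinking(size_t want, size_t min_size, size_t *got)
{
    assert(got != NULL && min_size > 0);
    if (want < min_size)
        want = min_size;
    for (;;) {
        void *p = malloc(want);
        if (p != NULL) {             /* success: report the real size */
            *got = want;
            return p;
        }
        if (want == min_size) {      /* nothing smaller left to try */
            *got = 0;
            return NULL;
        }
        want /= 2;
        if (want < min_size)
            want = min_size;
    }
}
```

    The caller then sizes its read loop from *got rather than from the size it first asked for, and releases the buffer with free when done.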

    is that really the most efficient way?

    thanks

  4. #4
    Registered User stormbringer's Avatar
    Join Date
    Jul 2002
    Posts
    90
    Looking good. However, I have a few questions:

    Code:
    void editbuff ( struct buff *buff ) {
               ...
               int len1 = s - buff->buff[i] + 5;   /* start to space after page */ 
               ^-- what exactly do you do here? (I don't understand "to space")
                strncpy( tempbuff.buff[i], buff->buff[i], len1 );
                tempbuff.buff[i][len1] = '\0'; 
                ^--- here you "fill up" the chars that are no longer needed with null, which doesn't consume disk space, right?
    Some other questions:
    1. You fopen the file in r+, so you can read and write the data in the file, meaning that every char in the way is just overwritten (as if you had pressed Insert in a text editor to switch to overwrite mode), right?
    2. How do you ensure the first buffer doesn't catch up?
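
    On the len1 and '\0' lines quoted above, here is a small stand-alone sketch of what they appear to do. It assumes that s is the result of strstr(..., "page"), which the quoted fragment does not show, and copy_through_page is a made-up name. The + 5 covers the four letters of "page" plus the space after it, and the '\0' only ends the string in memory: nothing is written to disk by it, and fputs or fprintf with %s stop at it.

```c
#include <assert.h>
#include <string.h>

/* Copy everything in `line` up to and including "page " into `out`.
 * Returns the number of characters copied, or 0 if "page" is absent
 * or `out` is too small. */
size_t copy_through_page(const char *line, char *out, size_t out_size)
{
    const char *s = strstr(line, "page");
    size_t len1;

    assert(out != NULL && out_size > 0);
    out[0] = '\0';
    if (s == NULL)
        return 0;
    len1 = (size_t)(s - line) + 5;   /* 4 for "page" + 1 for the space */
    if (len1 > strlen(line) || len1 >= out_size)
        return 0;
    strncpy(out, line, len1);        /* copies len1 chars, no terminator */
    out[len1] = '\0';                /* terminate the in-memory string */
    return len1;
}
```

    For the line "- page 12 -" this copies "- page " (7 characters), leaving the old number behind so a new one can be appended.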
    thanks

    stormbringer
    Last edited by stormbringer; 12-20-2002 at 07:29 AM.

  5. #5
    Registered User stormbringer's Avatar
    Join Date
    Jul 2002
    Posts
    90
    I saw. I edited the post right as you were replying :-) What I don't really understand: how does this code ensure that buf1 doesn't catch up?

    Because of the line of code: I saw what changed in the file. However, I first thought you replaced page with @@ (I'm really tired, maybe I should get some sleep :-) )
    Last edited by stormbringer; 12-20-2002 at 07:33 AM.
