Thread: Skipping a few lines while copying from one file to another

  1. #1
    Registered User
    Join Date
    Dec 2011
    Posts
    4

    Skipping a few lines while copying from one file to another

    Hello,

    I am writing a C program to copy all the lines from one file to another, except that I need to skip a few lines while copying from the original file.

    What I mean is that a few lines from the original file should not be copied to the new file.

    I am using the fgets function.

    What is the easiest solution to this?

  2. #2
    Banned
    Join Date
    Aug 2010
    Location
    Ontario Canada
    Posts
    9,547
    When you read in a line you don't want, don't write it out.

  3. #3
    Registered User
    Join Date
    Nov 2011
    Posts
    72
    fgets only takes the first line, and you don't know how many lines there are in the text, so I think you should write your own fgets function to do better operations.
    Note: before calling this function you should open your file, and after calling it you should close the file.

    Code:
    //***********MY FGETS FUNCTION**************
    char *Tomgets(FILE *filex,char *filename)
    {
    char *p;
    char c;
    int i=0;
    int len=-1;            //abc\0 == 0 1 2 3 (their position in array a[0] a[1]... )
    
    
    
    
        for(i=0;c!=EOF;i++)  // from starting of text file to END OF FILE ,the file length will calculated.Why? because if you want to save the content of the text file to an array, you will use this length to open the size of array
            
        {
            c=(fgetc(filex)); //gets all the letters in a loop
            len++;
        }
        fclose(filex);
    //-------------closed and reopened file to start fgetc from beginning to the end-------------
        filex=fopen(filename,"r+"); //file opened for reading
            p=(char *)malloc(len*sizeof(char)); 
        for(i=0;i<len;i++) p[i]=(fgetc(filex)); //saves the content of the file to 'p' array.
            //p[i]='\0';  you can do it to do better.
    return p;  //function returns to the text file content.
    }
    From here, if you find a \n followed by unwanted words etc., you can skip them and continue your loop.

  4. #4
    Banned
    Join Date
    Aug 2010
    Location
    Ontario Canada
    Posts
    9,547
    Quote Originally Posted by alienator View Post
    Hello,
    I am writing a C program to copy all the lines from one file to another, except that I need to skip a few lines while copying from the original file.
    By what criteria... Content? Length? Marker? ???

    I am using the fgets function.
    What is the easiest solution to this?
    Just read the file in a loop...
    Code:
    #define MAX_LINE_LENGTH 80  // change as necessary
    
    char buffer[MAX_LINE_LENGTH];
    
    
    // open input file
    // make sure it's open
    // open output file
    // make sure it's open
    
    while (fgets(buffer,MAX_LINE_LENGTH,infile))
      {
        if (! ExcludeLine(buffer))  // write exclusion routine like this
         fputs(buffer,outfile);
      }
    
    // close input file
    // close output file
    It's not hard...
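    A minimal sketch of the loop above as a complete function. The exclusion rule here (skip lines that begin with "//") is purely hypothetical, as is the name CopyFiltered; substitute whatever criterion applies. It assumes both files are already open and that no line is longer than MAX_LINE_LENGTH - 1 characters.

```c
#include <stdio.h>
#include <string.h>

#define MAX_LINE_LENGTH 256  /* assumed bound; longer lines arrive in pieces */

/* Hypothetical exclusion rule: reject lines that begin with "//". */
static int ExcludeLine(const char *line)
{
    return strncmp(line, "//", 2) == 0;
}

/* Copy every line of infile to outfile except those ExcludeLine() rejects.
   Returns the number of lines written, or -1 on a write error. */
static long CopyFiltered(FILE *infile, FILE *outfile)
{
    char buffer[MAX_LINE_LENGTH];
    long written = 0;

    while (fgets(buffer, sizeof buffer, infile)) {
        if (!ExcludeLine(buffer)) {
            if (fputs(buffer, outfile) == EOF)
                return -1;
            written++;
        }
    }
    return written;
}
```

    The caller still opens both files, checks that each fopen() succeeded, and closes them afterwards, exactly as the comments in the outline say.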
    Last edited by CommonTater; 12-31-2011 at 08:44 PM.

  5. #5
    Banned
    Join Date
    Aug 2010
    Location
    Ontario Canada
    Posts
    9,547
    Quote Originally Posted by rac1 View Post
    fgets only takes the first line, and you don't know how many lines there are in the text, so I think you should write your own fgets function to do better operations.
    1) fgets() reads the file one line at a time... got 25000 lines... use a loop.

    2) Nobody reads files character by character... that's just slow and silly

    3) If the file is already open, why on earth would you close it and re-open it?

    4) If you did #3, why wouldn't you check if it actually opened?

    5) if you want to know how big a file is...
    Code:
    fseek(file,0,SEEK_END);
    size = ftell(file);
    fseek(file,0,SEEK_SET);
    6) In C you don't cast the return value of malloc()... if your compiler complains, it's in C++ mode.

    7) Your routine is horridly inefficient since it has to read the file twice... character by character, no less.

    8) if you simply want to read the whole file into memory...
    Code:
    char *buffer = malloc(size);
    fread(buffer,1,size,file);
    Last edited by CommonTater; 12-31-2011 at 08:50 PM.

  6. #6
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    You missed a few:

    9) c is uninitialised and could therefore easily happen to be initially equal to EOF causing the first loop to be skipped.

    10) the resulting string that is returned is not null-terminated thus the caller has no way to determine the length of the string. Note that the malloc does not even allocate enough room for a null-terminator.

    11) i is unnecessarily given the value of zero twice initially.

    12) There is no check to see that the malloc succeeded.

    13) After calling this function, the file handle that was passed in is invalid. Since filex is not passed out, the file is now stuck open and cannot be closed.

    14) Somewhat poor indentation and use of whitespace.

    I sure hope somebody learnt a few things here!
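    For comparison, here is one way the points in this list and the previous one could be folded into a single routine. It is only a sketch: the name ReadWholeFile and its length parameter are made up for illustration, and it assumes a seekable stream opened in binary mode, where ftell() reports a byte count.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read an already-open file from the start into a newly allocated,
   null-terminated buffer. The caller keeps ownership of `file` (it is
   neither closed nor reopened here) and must free() the result.
   Returns NULL on any failure. */
char *ReadWholeFile(FILE *file, size_t *length)
{
    long size;
    char *buffer;
    size_t got;

    if (fseek(file, 0, SEEK_END) != 0)
        return NULL;
    size = ftell(file);
    if (size < 0 || fseek(file, 0, SEEK_SET) != 0)
        return NULL;

    buffer = malloc((size_t)size + 1);   /* +1 leaves room for the terminator */
    if (buffer == NULL)
        return NULL;

    got = fread(buffer, 1, (size_t)size, file);
    buffer[got] = '\0';
    if (length != NULL)
        *length = got;
    return buffer;
}
```

    The file is read once, in one fread() call, and the stream passed in is still valid when the function returns.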
    Last edited by iMalc; 01-01-2012 at 12:45 AM.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  7. #7
    Registered User
    Join Date
    Dec 2011
    Posts
    4
    @ Common Tater

    >>>When you read in a line you don't want, don't write it out.

    That is right, but the thing is that I am dealing with duplicate lines here. I need to keep only one copy of each line, but I need a way to tell which lines are duplicates.
    Last edited by alienator; 01-06-2012 at 11:43 PM.

  8. #8
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    Usually, getting rid of duplicate strings is easiest by sorting the strings, which will group the duplicates all together. Then in your input of the strings, you check the string behind it ONLY, to see if it's a duplicate. If it is, just don't write it out, and take in the next string - keep comparing until you reach a string that is not the same as your "reference" string.

    Sometimes, you need to do this, but you also need to keep the strings in their original order - that's no problem. You can "sort" them, without actually moving them, by using an integer array (or pointer array), and swapping the integer index instead of swapping the actual strings. You wind up with the original data, unchanged, but now you can remove the duplicates, using the above technique, by referring to the strings through the index array (or pointer array).

    It's harder to describe than it is to code it up, once you know how the logic works.

    Interested?
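
    A rough sketch of the pointer-array variant described above, assuming the lines are already in memory as an array of strings. The names CompareSlots and MarkFirstOccurrences, and the keep[] flag array, are illustrative only. Ties are broken by position so that the first occurrence of each string leads its group.

```c
#include <stdlib.h>
#include <string.h>

/* Order two slots by string content, then by position in the original
   array, so the first occurrence leads each group of duplicates. */
static int CompareSlots(const void *a, const void *b)
{
    const char *const *pa = *(const char *const *const *)a;
    const char *const *pb = *(const char *const *const *)b;
    int c = strcmp(*pa, *pb);

    if (c != 0)
        return c;
    return (pa > pb) - (pa < pb);
}

/* Set keep[i] to 1 for the first occurrence of each string in
   lines[0..n-1] and to 0 for later duplicates. The strings themselves
   are never moved. Returns 0 on success, -1 if memory runs out. */
int MarkFirstOccurrences(const char *const *lines, size_t n, unsigned char *keep)
{
    const char *const **slots;
    size_t i;

    if (n == 0)
        return 0;
    slots = malloc(n * sizeof *slots);
    if (slots == NULL)
        return -1;
    for (i = 0; i < n; i++)
        slots[i] = &lines[i];
    qsort(slots, n, sizeof *slots, CompareSlots);

    for (i = 0; i < n; i++) {
        /* A slot is a duplicate if it equals the one sorted just before it. */
        int dup = i > 0 && strcmp(*slots[i], *slots[i - 1]) == 0;
        keep[slots[i] - lines] = (unsigned char)!dup;
    }
    free(slots);
    return 0;
}
```

    Writing out every lines[i] whose keep[i] is 1, in index order, then gives the file with duplicates removed and the original order preserved.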

  9. #9
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    If you have some spare memory (trade-off), a potentially more efficient way is to hash each string, and go through them one by one, and for each one, if the hash is already in a hash table, skip the line, otherwise add the hash to the hash table. Only keys are used in the hash table.

    If the hash table is big enough and the hashing algorithm is good, this is O(n).

    This is similar in concept to bucket sort.
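
    A small sketch of the hash-table idea, under some stated assumptions: a fixed-size open-addressing table (TABLE_SLOTS is an arbitrary capacity), a placeholder multiply-and-add hash, and made-up names LineSet and LineSetSeen. Unlike the description above, it stores a copy of each line rather than only its hash, so that two different lines with the same hash are not mistaken for duplicates.

```c
#include <stdlib.h>
#include <string.h>

#define TABLE_SLOTS 4096u   /* assumed capacity; must exceed the number of distinct lines */

typedef struct {
    char *slot[TABLE_SLOTS];   /* NULL means empty */
    size_t used;
} LineSet;

/* Placeholder hash (multiply by 33 and add); any good string hash works. */
static unsigned long HashLine(const char *s)
{
    unsigned long h = 5381;

    while (*s != '\0')
        h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Returns 1 if line was already in the set, 0 if it was newly added,
   -1 if the table is full or memory ran out. */
int LineSetSeen(LineSet *set, const char *line)
{
    size_t i = HashLine(line) % TABLE_SLOTS;
    size_t len;

    while (set->slot[i] != NULL) {
        if (strcmp(set->slot[i], line) == 0)
            return 1;
        i = (i + 1) % TABLE_SLOTS;       /* linear probing */
    }
    if (set->used + 1 >= TABLE_SLOTS)    /* keep one slot empty so probing stops */
        return -1;
    len = strlen(line) + 1;
    set->slot[i] = malloc(len);
    if (set->slot[i] == NULL)
        return -1;
    memcpy(set->slot[i], line, len);
    set->used++;
    return 0;
}
```

    With a zero-initialised LineSet, the copy loop would write a line only when LineSetSeen() returns 0, which keeps the first copy of each line in its original position.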

  10. #10
    Banned
    Join Date
    Aug 2010
    Location
    Ontario Canada
    Posts
    9,547
    I was thinking along similar lines to cyberfish... hashing is one technique, so is checksumming (similar but easier)... just make a checksum value for each string as you load it... stuff it into an array or linked list and check if there's any matches as you load more strings... The advantage is that it reduces 64+ byte strings down to 4 bytes... Checksums are very easily produced, just add up the character values in each string then multiply by the number of characters... Not perfect but unique enough to give you a 90% solution.
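
    The checksum described here (sum of the character values, multiplied by the number of characters) might look like the sketch below; the name LineChecksum is made up. Because addition ignores order, strings such as "ab" and "ba" produce the same value, so a matching checksum should be confirmed with strcmp() before a line is dropped.

```c
#include <stddef.h>

/* Checksum as described above: the sum of the character values
   multiplied by the number of characters. */
unsigned long LineChecksum(const char *s)
{
    unsigned long sum = 0;
    size_t len = 0;

    for (; s[len] != '\0'; len++)
        sum += (unsigned char)s[len];
    return sum * len;
}
```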

  11. #11
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    A checksum is really nothing but a poor hash. FNV-1a is about as simple, yet far better, and it is very widely used and documented.
    Personally I tend to use a CRC for hashing.
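
    For reference, a sketch of the 32-bit variant of FNV-1a using its published offset basis and prime; the function name Fnv1a32 is illustrative. Unlike a plain sum, the result depends on the order of the bytes.

```c
#include <stdint.h>

/* 32-bit FNV-1a: XOR in each byte, then multiply by the FNV prime. */
uint32_t Fnv1a32(const char *s)
{
    uint32_t h = 2166136261u;            /* FNV offset basis */

    while (*s != '\0') {
        h ^= (unsigned char)*s++;
        h *= 16777619u;                  /* FNV prime */
    }
    return h;
}
```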
