Thread: Problems while reading a file line by line

  1. #1
    Registered User
    Join Date
    May 2014
    Posts
    69

    Problems while reading a file line by line

    Hello! I have several problems with my method and I really don't see where my mistakes are, so maybe someone could help me!

    I am trying to read a file line by line and then do something with the informations, so my method looks like this:

    Code:
    void open_file(char *link) {
    
    
        FILE *file = fopen(link, "r");
    
    
        if (file == NULL) {
    
    
            fprintf(stderr, "Could not open file.\n");
            exit(EXIT_FAILURE);
        }
    
    
        int ge, head_l, seq_num, seq_i, i, read_ll;
        char *line, **head, **seqs;
        BOOL head_s = FALSE;
        rep_sequences *rep_s;
        ge = 56000;
        head_l = 100;
        seq_num = 10;
    
    
        ALLOC(line, char, ge);
        ALLOC(head, char*, seq_num);
        ALLOC(seqs, char*, seq_num);
        for (seq_i = 0; seq_i < seq_num; seq_i++) {
            ALLOC(seqs[seq_i], char, ge);
            ALLOC(head[seq_i], char, head_l);
        }
        seq_i = 0;
    
        while (fgets(line, ge, file) != NULL) {
         
            read_ll = strlen(line);
            if (line[read_ll - 1] == '\n'
                    || (line[read_ll - 1] != '\n' && feof(file))) {
    
    
                if (line[0] == '>') {
    
    
                    if (head_l < read_ll - 1) {
    
    
                        while (head_l < read_ll - 1) {
    
    
                            head_l *= 2;
                            REALLOC(head[seq_i], char, head_l);
                        }
                    }
                    
                    memcpy(&head[seq_i], &line, read_ll-1);
                    head[seq_i][read_ll-1] = '\0';
    
    //realloc to shrink the memory
                    head[seq_i] = realloc(sizeof(char)*read_ll);
                    seq_i++;
                    head_s = TRUE;
                    
     
                } else {
    
    
                    if (!head_s) {
                        exit(EXIT_FAILURE);
                    }
    
    
                    strncpy(seqs[seq_i - 1], line, read_ll - 1);
                    seqs[seq_i - 1][read_ll] = '\0';
                    head_s = FALSE;
                }
            } else {
    
    
                ge *= 2;
                line= realloc(sizeof(char)*ge);
            }
        }
    /*do something... */
    
    
        FREE(line);
        fclose(file);
        for (i = 0; i < seq_num; i++) {
    
    
            FREE(head[i]);
            FREE(seqs[i]);
        }
        FREE(head);
        FREE(seqs);
    }
    1) The first complain of valgrind is at the line where I use fgets and its telling me (invalid write of size x), but I have allocated my line to 56000 and the read line is shorter, why is there a write size error then :S?

    2) at the line where I realloc where I try to shrink the space he's telling me: Address .... is 0 bytes inside a block of size 56000,
    But I know i need only this space so why is there a write over space error :S??

    I am getting really desperate, since there are a lot of more and I don't see where but i am apparently doing everything wrong
    Last edited by tinchi; 08-17-2014 at 09:00 AM.

  2. #2
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    You use genome in the fgets call, but the declaration of genome was not shown. Your call to realloc is simply invalid because it has missing arguments.

    EDIT:
    You should write functions that do one thing and do it well. You're probably trying to take the advice to simplify your code for posting, and that's good, but because your functions are too long and do too much to begin with, simplifying them is a difficult task.
    Last edited by laserlight; 08-17-2014 at 08:59 AM.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  3. #3
    Registered User
    Join Date
    May 2014
    Posts
    69
    OH no sorry that is a typo! It is ge instead of genome - so it is a copy error here, in my code it is correct

  4. #4
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,666
    > while (fgets(line, genome, file) != NULL)
    How (or where) is genome declared?
    Why aren't you passing ge, which is the variable you used to allocate memory.

    How are your ALLOC etc macros implemented?
    Why are you using them to begin with - all they seem to do is obfuscate the issue.
    Worse still, you flip between your macros and the real calls. If you're going to use the macros, at least be consistent about it.


    > (line[read_ll - 1] != '\n' && feof(file))
    feof() will never be true inside the loop. If fgets() succeeds, then you are NOT at the end of the file.

    > void *realloc(void *ptr, size_t size);
    vs.
    > line= realloc(sizeof(char)*ge);
    How on earth are you getting away with running a program with such an obvious compilation warning in the code?

    Is your program warning free when compiled with the -Wall option?
    If not, come back when it is.

    This should be your test program for your valgrind tests (at least for now).
    Code:
    int main ( ) {
        open_file("file.txt");
        return 0;
    }
    EDIT
    > so it is a copy error here, in my code it is correct
    Stop wasting everyone's time then!
    Post your actual code - we've got better things to do that second-guess your finger troubles.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  5. #5
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by tinchi
    I am trying to read a file line by line and then do something with the informations
    What is the file format? What exactly are you trying to do with what is read from file?

    Consider this from the perspective of people trying to help you: you give a vague description of what you are trying to do, mention an error message from Valgrind that seems to have been edited ("invalid write of size x" - x?), then provide code that isn't what you compiled. That's like driving your neighbour's car to the mechanic, telling him/her that there's a problem without specifying what, and then expecting him/her to fix your car.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  6. #6
    Registered User
    Join Date
    May 2014
    Posts
    69
    I don't get this forums really, I thought it is supposed to be for people who try to learn something and then have problems. Or do you think I sit here and put typos on purpose in my code? I think it is more in my interest to have progress and solve the problems then for you. But thank you really much for you advice

  7. #7
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by tinchi
    I don't get this forums really, I thought it is supposed to be for people who try to learn something and then have problems. Or do you think I sit here and put typos on purpose in my code?
    No, but if there are typos in your code, how do we know that the problem isn't caused by something that appears to be a typo when you posted versus something that is a typo in the code that you compiled? Therefore, we must act on the typos as if they are really in your actual code, because they might be.

    What I can suggest is for you to start small. Instead of writing a long function, finding problems, and then try to simplify it and come here for help, write a small problem and test it. At that point, if you find a problem, you can come here and post the function in full without simplifying it. This way, there will be no typos, hence what we address will be the problems in your actual code. Slowly but surely, you can add more small functions, building on them to write a complete and well tested program.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  8. #8
    Registered User
    Join Date
    May 2014
    Posts
    69
    Ok so I will try one more time, with copying everything without changing even a variables name:

    Makros:

    Code:
    #define ALLOC(P,T,S) if(((P)=(T*)malloc(sizeof(T)*(S))) == NULL) {\
          fprintf(stderr,"malloc failed");\
          exit(EXIT_FAILURE);\
            }
    #define REALLOC(P,T,S) if(((P)=(T*)realloc(P,sizeof(T)*(S))) == NULL) {\
              fprintf(stderr,"realloc failed");\
              exit(EXIT_FAILURE);\
            }
    My nice not working method.
    Code:
    void open_file(char *link) {
    
    
        FILE *file = fopen(link, "r");
    
    
        if (file == NULL) {
    
    
            fprintf(stderr, "Could not open file\n");
            exit(EXIT_FAILURE);
        }
        int ge, head_l, seq_num, seq_i, i, read_ll;
        char *line, **head, **seqs;
        BOOL head_s = FALSE;
        ge = 56000;
        head_l = 100;
        seq_num = 10;
    
    
        ALLOC(line, char, ge);
        ALLOC(head, char*, seq_num);
        ALLOC(seqs, char*, seq_num);
        for (seq_i = 0; seq_i < seq_num; seq_i++) {
            ALLOC(seqs[seq_i], char, ge);
            ALLOC(head[seq_i], char, head_l);
        }
        seq_i = 0;
    
    
        while (fgets(line, ge, file) != NULL) {
    
    
            read_ll = strlen(line);
            if (line[read_ll - 1] == '\n'
                    || (line[read_ll - 1] != '\n' && feof(file))) {
    
    
                if (line[0] == '>') {
    
    
                    if (head_l < read_ll - 1) {
    
    
                        while (head_l < read_ll - 1) {
    
    
                            head_l *= 2;
                            REALLOC(head[seq_i], char, head_l);
                        }
                    }
                    
                    memcpy(&head[seq_i], &line, read_ll-1);
    
    
                    // strncpy(head[seq_i], line, read_ll - 1);
                    head[seq_i][read_ll-1] = '\0';
                    printf("%s \n", head[seq_i]);
                    REALLOC(head[seq_i], char, read_ll);
                    seq_i++;
                    head_s = TRUE;
                    
     
                } else {
    
    
                    if (!head_s) {
    
    
                        fprintf(stderr, "No header was saved before. Wrong format!\n");
                        exit(EXIT_FAILURE);
                    }
    
    
                    strncpy(seqs[seq_i - 1], line, read_ll - 1);
                    seqs[seq_i - 1][read_ll] = '\0';
                    head_s = FALSE;
                }
            } else {
    
    
                ge *= 2;
                REALLOC(line, char, ge);
            }
        }
    
    
    /*only deleted part, because It is just doing some stuff to my data*/
    
    
    
    
        FREE(line);
        fclose(file);
        for (i = 0; i < seq_num; i++) {
            printf("%p \n", head[i]);
            printf("%p \n", &head[i]);
            FREE(head[i]);
            FREE(seqs[i]);
        }
        FREE(head);
        FREE(seqs);
    
    
    }
    I am trying to read text file line by line, and extract the first line separately from the other...

  9. #9
    Registered User
    Join Date
    May 2014
    Posts
    69
    But yes, maybe you are right, i Should split it into several small methods and test step by step. So I go back to work..thanks

  10. #10
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,666
    I suppose one last chance at giving you some advice, just to see how much you ignore (I'm pointing at the feof() thing you still have there).

    > while (head_l < read_ll - 1)
    Why don't you cut the waste and just go with
    head_l = read_ll;
    and be done with it.

    For that matter, get rid of these as well
    Code:
        for (seq_i = 0; seq_i < seq_num; seq_i++) {
            ALLOC(seqs[seq_i], char, ge);
            ALLOC(head[seq_i], char, head_l);
        }
    and
    Code:
    //realloc to shrink the memory
                    head[seq_i] = realloc(sizeof(char)*read_ll);
    and just go with a simple line to malloc exactly what you need on a case by case basis. No messing around with guessed initial mallocs, no messing around with guessing reallocs. Just malloc what you need and be done with it.

    > memcpy(&head[seq_i], &line, read_ll-1);
    Now whilst line and &line get you the same value, when line is an array, when line is a char*, the use of & is simply BROKEN!
    You took the address of your pointer, and proceeded to copy a random amount of your stack.

    > printf("%s \n", head[seq_i]);
    Did your printf() actually print anything useful, or was it garbage?

    Code:
            } else {
                ge *= 2;
                REALLOC(line, char, ge);
            }
    Great - and what are you going to do with the partial line that fgets() has already stored in the first 'ge' bytes of memory? Yes they're preserved, but the next call to fgets() is just going to overwrite them.

    > seq_i++;
    How do you keep this less than seq_num?
    If this overflows, valgrind is the least of your worries.

    I'll save explaining the flaw in your REALLOC macro for another day.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Replies: 6
    Last Post: 06-07-2012, 02:50 AM
  2. Replies: 7
    Last Post: 12-13-2010, 02:13 PM
  3. reading words line by line from a file
    By -EquinoX- in forum C Programming
    Replies: 3
    Last Post: 05-04-2008, 12:34 AM
  4. Reading Input from a file, line-by-line
    By Acolyte in forum C Programming
    Replies: 8
    Last Post: 09-30-2007, 01:03 PM
  5. reading a file line by line and char *
    By odysseus.lost in forum C Programming
    Replies: 8
    Last Post: 05-31-2005, 09:47 AM