Thread: reading a txt file line by line and checking for an invalid character

  1. #1
    Registered User
    Join Date
    Apr 2018
    Posts
    43

    reading a txt file line by line and checking for an invalid character

    I wrote a code that reads a txt file line by line and checks if each line has the invalid character ' ' in it.

    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #define NCHAR 64
    
    char *readline (FILE *fp, char **buffer);
    
    int main (int argc, char **argv) {
    
        char *line = NULL;
        size_t idx = 0;
        FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
        if (!fp) {
            fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
            return 1;
        }
    
        while (readline (fp, &line)) {  /* read each line in 'fp' */
            printf (" line[%2zu] : %s\n", idx++, line);
            free (line);
            line = NULL;
        }
        if (fp != stdin) fclose (fp);
    
        return  0;
    }
    
    /* read line from 'fp' allocate *buffer NCHAR in size
     * realloc as necessary. Returns a pointer to *buffer
     * on success, NULL otherwise.
     */
    char *readline (FILE *fp, char **buffer) 
    {
        int ch;
        size_t buflen = 0, nchar = NCHAR;
        size_t n;
        char *invalid_character = " ";
    int i = 0;
        *buffer = malloc (nchar);    /* allocate buffer nchar in length */
        if (!*buffer) {
            fprintf (stderr, "readline() error: virtual memory exhausted.\n");
            return NULL;
        }
    
        while ((ch = fgetc(fp)) != '\n' && ch != EOF) 
        {
            (*buffer)[buflen++] = ch;
    
            if (buflen + 1 >= nchar) {  /* realloc */
                char *tmp = realloc (*buffer, nchar * 2);
                if (!tmp) {
                    fprintf (stderr, "error: realloc failed, "
                                    "returning partial buffer.\n");
                    (*buffer)[buflen] = 0;
                    return *buffer;
                }
                *buffer = tmp;
                nchar *= 2;
            }
        }
        (*buffer)[buflen] = 0;           /* nul-terminate */
        
       if (invalid_characters[n = strspn(invalid_characters, buffer[i])] == '\0')
        {
            puts(" valid characters");
            i++;
        } 
    
        if (buflen == 0 && ch == EOF) {  /* return NULL if nothing read */
            free (*buffer);
            *buffer = NULL;
        }
    
        return *buffer;
    }
    The text file looks something like this:
    programming //valid characters
    loves //valid characters
    programming cool //this is invalid because the line has a space in between the words it should not print valid characters

    Does someone know what I could be doing wrong?

  2. #2
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    You should be using strcspn instead of strspn, and you should be calling it separately in order to compare it with strlen, i.e., if the span that doesn't contain the invalid characters is equal to the length of the entire string, then the string is valid.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  3. #3
    Registered User
    Join Date
    Dec 2017
    Posts
    1,626
    buffer[i] makes no sense here.
    In fact, i itself makes no sense.
    What are you trying to do here?

    And, technically, invalid_characters should be const. (And spelled consistently!)
    Last edited by john.c; 12-08-2019 at 05:36 PM.
    A little inaccuracy saves tons of explanation. - H.H. Munro

  4. #4
    misoturbutc Hodor's Avatar
    Join Date
    Nov 2013
    Posts
    1,791
    In addition to what @laserlight said, I think you have the order of your parameters incorrect for strcspn(); i.e. invalid_characters should be second, not first

    Edit: Err nevermind (ignore the above), I see what you're trying to do now. I'd do it differently
    Last edited by Hodor; 12-08-2019 at 05:34 PM.

  5. #5
    Registered User
    Join Date
    Apr 2018
    Posts
    43
    Thanks for the reply.
    I don't quite understand what you mean when you said I should be calling strcspn separately
    Where should I call it?

    I changed the strspn to
    Code:
    if (strcspn(buffer[i],invalid_character) == '\0')
        {
            puts(" valid characters");
            i++;
        }
    but the result is still the same

  6. #6
    Registered User
    Join Date
    Apr 2018
    Posts
    43
    buffer[i] should have the first string in the file if i = 0
    or is it pointing to the first character instead of the first string?

  7. #7
    Registered User
    Join Date
    Apr 2018
    Posts
    43
    what would you change about my approach?

  8. #8
    Registered User
    Join Date
    Dec 2017
    Posts
    1,626
    The program is badly structured.
    Your "readline" routine should just, you know, read a line.
    Then you can pass it to another function to do whatever it is you are trying to do (which I can't quite figure out).

    Also it's a very bad habit to declare all your variables at the top of a function.
    A little inaccuracy saves tons of explanation. - H.H. Munro

  9. #9
    misoturbutc Hodor's Avatar
    Join Date
    Nov 2013
    Posts
    1,791
    Quote Originally Posted by Obvious Guy View Post
    what would you change about my approach?
    I dunno. To me, I'd be looking for the invalid character in the buffer, not in the invalid_character string. Dunno why It's just how I was reading it based on your description. I'd also do what @laserlight suggested.

    Maybe something along the lines of
    Code:
    /* ... */
    
        (*buffer)[buflen] = '\0';           /* nul-terminate */
    
        if (buflen == 0 && ch == EOF) {  /* return NULL if nothing read */
            free (*buffer);
            *buffer = NULL;
        } else {
            n = strcspn(*buffer, invalid_characters);
            if (n == strlen(*buffer)) {
                puts(" valid characters");
                i++;
           } else {
                printf("Invalid character: '%c'\n", (*buffer)[n]);
           }
        }
    
    /* ... */
    Edit: Note, I deliberately didn't use the variable 'i' because I don't really understand what you were trying to do with it. Adjust as necessary.
    Last edited by Hodor; 12-08-2019 at 05:48 PM.

  10. #10
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by Obvious Guy
    I don't quite understand what you mean when you said I should be calling strcspn separately
    Where should I call it?
    That's because I had in mind comparing with strlen: it would be easier to see that way since you can name the variable so the reader knows why it is compared to strlen. But if you're comparing with the null character, which indeed would be more efficient, embedding it into the index notation is fine.

    Looking at Hodor's edit, I think you were trying to use strspn for the other way: check to see if the list of invalid characters matches with an input character.

    Quote Originally Posted by Obvious Guy
    buffer[i] should have the first string in the file if i = 0
    You were using *buffer all the way; why use buffer[i] now?

    My suggestion is to rename the buffer parameter to output. Then, declare a local variable named buffer of type char* instead of char**. This will help you think more clearly in terms of a buffer of char. At the end of the function, just assign buffer to *output.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  11. #11
    misoturbutc Hodor's Avatar
    Join Date
    Nov 2013
    Posts
    1,791
    Quote Originally Posted by laserlight View Post
    Looking at Hodor's edit, I think you were trying to use strspn for the other way: check to see if the list of invalid characters matches with an input character.
    Yeah, I might have got things backwards (I adjusted my first reply appropriately)

    Edit: Also I might be missing something completely because I don't understand why a pointer to pointer is being passed (in addition to the use of 'i') when the buffer is being returned anyway. The code would be a lot simpler if it was, e.g.

    Code:
    /* ... */
    while ((line = readline (fp)) != NULL) {  /* read each line in 'fp' */
            printf (" line[%2zu] : %s\n", idx++, line);
    
    /* ... */
    
    char *readline (FILE *fp)
    
    /* ... */
    Unless the code is meant to be building an "array" of character buffers... but it's not doing that at the moment, hence my confusion regarding 'i'
    Last edited by Hodor; 12-08-2019 at 06:04 PM.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Help Reading File Line By Line On Windows?
    By new2c95 in forum C Programming
    Replies: 13
    Last Post: 10-14-2014, 09:39 AM
  2. Replies: 6
    Last Post: 06-07-2012, 02:50 AM
  3. Checking for a Blank Line When Reading From a File
    By jeanermand in forum C Programming
    Replies: 6
    Last Post: 02-07-2012, 06:50 PM
  4. Replies: 7
    Last Post: 12-13-2010, 02:13 PM
  5. reading words line by line from a file
    By -EquinoX- in forum C Programming
    Replies: 3
    Last Post: 05-04-2008, 12:34 AM

Tags for this Thread