Thread: Spell checker help

  1. #1
    Registered User
    Join Date
    Nov 2009
    Posts
    19

    Spell checker help

    Hi, I need to make this spell checker for homework and I can't figure out why it won't work. I send the array of words into the function and it returns all the words as misspelled. Can anyone give me some ideas?

    I also get this warning for some reason: Dictionary.c:53: warning: implicit declaration of function ‘getline’

    If the code is hard to read, I can insert comments.

    Code:
    #include <stdlib.h>
    #include <stdio.h>
    #include <string.h>
    
    typedef struct {
        FILE *fp, *wfp;
    } Dict;
    
    int inDict(FILE*, char*);
    
    int main(int argc, char **argv)
    {
        int size = (1000 * sizeof(int));
        Dict words;
        char *word, *buf;
        char *delim = "\n\t\",. ";
        words.fp = fopen("/usr/share/dict/words", "r");
        words.wfp = fopen(argv[1], "r");
    
        if ((buf = (char*)malloc(1000 * sizeof(char*))) == NULL) {
            printf("Allocation error");
        }
    
        while (fgets(buf, size, words.wfp)) {
            for (word = strtok(buf, delim); word; word = strtok(0, delim)) {
                if (inDict(words.fp, word)) {
                    printf("%s  OK\n", word);
                }
                else {
                    printf("%s  MISSPELLED\n", word);
                }
            }
        }
    
        free(buf);
        free(word);
        fclose(words.fp);
        fclose(words.wfp);
    
      return 0;
    }
    
    int inDict(FILE* d, char* word) {
        int len = 0;
        char *dict;
        rewind(d);
        while (getline(&dict, &len, d) > 0) {
            if(strcasecmp(word,dict) == 0) {
                return 1;
            }
        }
        return 0;
    }

  2. #2
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    Getline is C++, not C. Perhaps you want fgets() ?

    You'll also need strcmp() or stricmp() for comparing strings. The first is case sensitive, the latter is insensitive.

  3. #3
    Registered User
    Join Date
    Nov 2009
    Posts
    19
    Didn't know getline() was a C++ function; it was in the man pages for C.

    This is my new if statement, but even if the words are the same the statement is not hit. I passed in "apple", which is a word that appears in the dictionary, and it was also reported as misspelled.

    Code:
    if((strcasecmp(word,dict) == 0) && (strcmp(word,dict) == 0)) {
          return 1;
    }

  4. #4
    Registered User
    Join Date
    Nov 2009
    Posts
    60
    I think getline() is a C function provided by the GNU C library (glibc) rather than by gcc itself, and it isn't standard C.

    Here is a link to the GNU C Programming tutorial

    According to this site, "the getline function is the preferred method for reading lines of text from a stream, including standard input. The other standard functions, including gets, fgets, and scanf, are too unreliable."
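For reference, getline() is not part of ISO C: it started as a GNU extension and was later standardized in POSIX.1-2008. The "implicit declaration" warning from post #1 usually means the feature-test macro that exposes it was not defined before <stdio.h> was included. Below is a minimal sketch, assuming a POSIX.1-2008 system such as glibc; the helper name line_in_stream is hypothetical and does not come from the thread. Note that the length argument has to be a size_t (not an int) and the line pointer has to start out as NULL or as a malloc'd buffer.

```c
#define _POSIX_C_SOURCE 200809L  /* must precede every #include to expose getline() */
#include <stdio.h>
#include <stdlib.h>
#include <strings.h>     /* strcasecmp (POSIX) */
#include <sys/types.h>   /* ssize_t */

/* Hypothetical helper: returns 1 if some line of fp equals target
 * (ignoring case), 0 otherwise. */
int line_in_stream(FILE *fp, const char *target)
{
    char *line = NULL;   /* getline() allocates the buffer when this is NULL */
    size_t cap = 0;      /* capacity is a size_t, not an int */
    ssize_t n;
    int found = 0;

    rewind(fp);
    while ((n = getline(&line, &cap, fp)) != -1) {
        if (n > 0 && line[n - 1] == '\n')
            line[n - 1] = '\0';   /* getline() keeps the newline; drop it */
        if (strcasecmp(line, target) == 0) {
            found = 1;
            break;
        }
    }
    free(line);          /* freed once, on every path out of the loop */
    return found;
}
```

With a stream opened on a word list, line_in_stream(fp, "apple") would report whether "apple" appears as a whole line, whatever its capitalization.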

  5. #5
    Registered User
    Join Date
    Nov 2009
    Posts
    19
    I took out getline() completely, but the main problem still remains. All my words keep coming out misspelled.

    Code:
    int inDict(FILE* d, char* word) {
        int len = (1000 * sizeof(int)), i = 0;
        char *dict;
        if ((dict = (char*)malloc(1000 * sizeof(char*))) == NULL) {
            printf("Allocation error");
        }
        rewind(d);
        while (fgets(dict, len, d)) {
            if((strcasecmp(word,dict) == 0) && (strcmp(word,dict) == 0)) {
                return 1;
            }
        }
        free(dict);
        return 0;
    }

  6. #6
    C++ Witch laserlight
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    This is wrong:
    Code:
    dict = (char*)malloc(1000 * sizeof(char*))
    It should be:
    Code:
    dict = malloc(1000 * sizeof(*dict))
    But if you know for sure that dict is a pointer to char, and since sizeof(char) == 1, you could simplify further:
    Code:
    dict = malloc(1000)
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way
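To put the allocation advice above in context: in posts #1 and #5 the buffer size (1000 * sizeof(char*)) and the limit handed to fgets (1000 * sizeof(int)) are computed separately, so nothing keeps them in step, and the code carries on after a failed malloc. A minimal sketch of one way to tie them together, assuming a single named capacity; LINE_CAP and count_reads are hypothetical names, not from the thread.

```c
#include <stdio.h>
#include <stdlib.h>

enum { LINE_CAP = 1000 };   /* hypothetical capacity, in chars */

/* Counts how many times fgets() returns data from fp (one per line,
 * provided every line is shorter than LINE_CAP). Returns -1 if the
 * buffer cannot be allocated. */
long count_reads(FILE *fp)
{
    char *buf = malloc(LINE_CAP * sizeof *buf);   /* size follows the pointee type */
    long reads = 0;

    if (buf == NULL) {
        fprintf(stderr, "Allocation error\n");
        return -1;           /* stop here rather than read into a NULL pointer */
    }
    while (fgets(buf, LINE_CAP, fp) != NULL)      /* same capacity as the malloc */
        reads++;
    free(buf);
    return reads;
}
```

Because the element count appears once and sizeof *buf tracks the pointer's type, changing either the capacity or the element type cannot leave the two sizes out of sync.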

  7. #7
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    You're still using a string comparison function that I'm not familiar with in C, and you haven't explained why.

    You shouldn't need two string comparison functions. Either the strings are to be compared case sensitive ( strcmp() ), or they are to be compared case insensitive ( stricmp() ).

    One or the other, but not both, should serve nicely.

    The usual problem with comparing strings is that one will have the end of string marker char '\0' on the end of it, and the other one won't. So check your input string, and your dictionary words, and see what convention you want to follow.
    Last edited by Adak; 11-25-2009 at 01:24 AM.

  8. #8
    C++ Witch laserlight
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by Adak
    You're still using a string comparison function that I'm not familiar with in C, and you haven't explained why.

    You shouldn't need two string comparison functions. Either the strings are to be compared case sensitive ( strcmp() ), or they are to be compared case insensitive ( stricmp() ).
    If it is the one I have in mind, then strcasecmp provides case insensitive comparison. Both strcasecmp and stricmp are non-standard anyway.

  9. #9
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    No stricmp() in the standard?

    What were they thinking??

    Well anyway, use one comparison function or the other, but both are not needed.
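Since neither strcasecmp() nor stricmp() is part of ISO C, a case-insensitive comparison can also be written with nothing but <ctype.h>. The following is a minimal sketch, not a drop-in replacement for either library function; the name compare_ignore_case is hypothetical, and it follows the current C locale only through tolower().

```c
#include <ctype.h>

/* Hypothetical helper: case-insensitive comparison using only ISO C.
 * Returns a negative value, 0, or a positive value, like strcmp(). */
int compare_ignore_case(const char *a, const char *b)
{
    for (;; a++, b++) {
        /* Cast to unsigned char: tolower() is undefined for negative values. */
        int ca = tolower((unsigned char)*a);
        int cb = tolower((unsigned char)*b);

        /* Stop at the first difference, or when both strings end together. */
        if (ca != cb || ca == '\0')
            return ca - cb;
    }
}
```

Like strcmp(), it compares whole strings, so "apple" and "apple\n" still compare as different.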

  10. #10
    C++ Witch laserlight
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by Adak
    No stricmp() in the standard?

    What were they thinking??
    Dunno, but at least strcasecmp appears to be part of the POSIX standard. I cannot figure out if stricmp is part of any standard, or was deprecated in POSIX, or simply just a standard library extension.

  11. #11
    Registered User
    Join Date
    Nov 2009
    Posts
    19
    Quote Originally Posted by Adak
    You're still using a string comparison function that I'm not familiar with in C, and you haven't explained why.

    You shouldn't need two string comparison functions. Either the strings are to be compared case sensitive ( strcmp() ), or they are to be compared case insensitive ( stricmp() ).

    One or the other, but not both, should serve nicely.

    The usual problem with comparing strings is that one will have the end of string marker char '\0' on the end of it, and the other one won't. So check your input string, and your dictionary words, and see what convention you want to follow.
    I looked at strcasecmp() again, and you are right, I only need one function to compare. I was under the impression that it only compares the 1st char of the string, but it compares the whole string.

    Good call on the '\0' thing, that is why it won't work. I am trying to slap a '\n' to the end of each word in the file so it comes out looking like the dict file.

    I passed in the string "apple\n" and it worked, so that was my problem.
    Thanks for everyone's help. If anyone has any idea how I can add a '\n' to the end of word, I would love to hear it.
    Last edited by MyNameIs..; 11-25-2009 at 11:19 AM.

  12. #12
    Registered User slingerland3g
    Join Date
    Jan 2008
    Location
    Seattle
    Posts
    603
    I would think it best to strip that '\n' (newline) from your dictionary words. That way neither your users nor this file have to include a newline with each word to be checked, which would be odd. If you must keep it, then the only way I see is to store each input word in an array, tack the extra '\n' onto the end, and pass that off to be checked against your dictionary.
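The mismatch discussed above comes from fgets() keeping the trailing '\n' of each dictionary line, while the words produced by strtok() in main have none; both strings are '\0'-terminated either way. A minimal sketch of stripping the newline on the dictionary side, assuming dictionary lines shorter than 1000 characters; strip_newline and in_dict are hypothetical names, with in_dict standing in for the thread's inDict.

```c
#include <stdio.h>
#include <string.h>
#include <strings.h>   /* strcasecmp (POSIX, as used in the thread) */

/* Hypothetical helper: cut the string at its first '\n', if it has one. */
void strip_newline(char *s)
{
    s[strcspn(s, "\n")] = '\0';
}

/* Sketch of the lookup: returns 1 if word matches a dictionary line
 * (ignoring case), 0 otherwise. */
int in_dict(FILE *d, const char *word)
{
    char line[1000];   /* stack buffer: nothing to leak on the early return */

    rewind(d);
    while (fgets(line, sizeof line, d) != NULL) {
        strip_newline(line);           /* "apple\n" becomes "apple" */
        if (strcasecmp(word, line) == 0)
            return 1;
    }
    return 0;
}
```

strcspn() returns the index of the first '\n', or the string's length when there is none, so the assignment is safe even for a final line that lacks a newline.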

  13. #13
    Registered User
    Join Date
    Nov 2009
    Posts
    19
    Quote Originally Posted by slingerland3g
    I would think it best to strip that '\n' (newline) from your dictionary words. That way neither your users nor this file have to include a newline with each word to be checked, which would be odd. If you must keep it, then the only way I see is to store each input word in an array, tack the extra '\n' onto the end, and pass that off to be checked against your dictionary.
    I am trying to use strtok() to divide the dictionary file up, so it puts '\0' where there was a '\n'. It works fine, but it messes up my words file. The words file is divided by 2 '\n' symbols and there are maybe 5 or 6 words per line. When I run it, it only compares the first word on each line. It checks that word for correct spelling, but does not access the rest of the line. Any ideas why that is?

    My code:
    Code:
    #include <stdlib.h>
    #include <stdio.h>
    #include <string.h>
    
    typedef struct {
        FILE *fp, *wfp;
    } Dict;
    
    int inDict(FILE*, char*);
    
    int main(int argc, char **argv)
    {
        int size = (1000 * sizeof(int)), i=0;
        Dict words;
        char *word, *buf;
        char *delim = "\n\t\",. ";
        words.fp = fopen("/usr/share/dict/words", "r");
        words.wfp = fopen(argv[1], "r");
    
        if ((buf = (char*)malloc(1000 * sizeof(char*))) == NULL) {
            printf("Allocation error");
        }
    
        while (fgets(buf, size, words.wfp)) {
            for (word = buf; word; word = NULL) {
                word = strtok(buf, delim);
                if (inDict(words.fp, word)) {
                    printf("%s  OK\n", word);
                }
                else {
                    printf("%s  MISSPELLED\n", word);
                }
            }
        }
    
        free(buf);
        fclose(words.fp);
        fclose(words.wfp);
    
      return 0;
    }
    
    int inDict(FILE* d, char* word) {
        int len = (1000 * sizeof(int)), i=0;
        char *dict, *buffer;
    
        if ((dict = (char*)malloc(1000 * sizeof(char*))) == NULL) {
            printf("Allocation error");
        }
    
        if ((buffer = (char*)malloc(1000 * sizeof(char*))) == NULL) {
            printf("Allocation error");
        }
        
        while (fgets(buffer, len, d)) {
            for (dict = strtok(buffer, "\n"); dict; dict = strtok(0, "\n")) {
                if((strcasecmp(word,dict) == 0)) {
                    return 1;
                }
            }
        }
        rewind(d);
        free(dict);
        return 0;
    }
    Last edited by MyNameIs..; 11-25-2009 at 12:14 PM.

  14. #14
    Registered User slingerland3g
    Join Date
    Jan 2008
    Location
    Seattle
    Posts
    603
    For one, your delimiter is off; you should set it to "\n\n". Also, this strictly assumes words are separated by this delimiter only, with no spaces.

    word1\n\nword2\n\nword3\n\nword4\n\n...


    Also I am not entirely understanding your logic with this for loop

    Code:
       for (word = buf; word; word = NULL)

  15. #15
    Registered User
    Join Date
    Nov 2009
    Posts
    19
    Quote Originally Posted by slingerland3g
    For one, your delimiter is off; you should set it to "\n\n". Also, this strictly assumes words are separated by this delimiter only, with no spaces.

    word1\n\nword2\n\nword3\n\nword4\n\n...


    Also I am not entirely understanding your logic with this for loop

    Code:
       for (word = buf; word; word = NULL)
    It's the same logic as my previous for loop; I just took it right from the example in the man pages.

    I changed the delimiter and it still does the same thing, first word from each line.
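Two details of strtok() seem relevant to the "first word only" symptom. First, in post #13 the loop's step is word = NULL and its condition is word, so the body runs once per line, and strtok(buf, delim) is a fresh first call each time. Second, the delimiter argument is a set of characters, so "\n\n" behaves exactly like "\n". Separately, strtok() keeps one hidden position, so calling it again inside inDict would disturb a tokenization still in progress in main. Below is a minimal sketch of a loop that walks every word, assuming a POSIX system for strtok_r(); split_words is a hypothetical name.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes strtok_r() on POSIX systems */
#include <string.h>

/* Hypothetical helper: splits line in place and stores up to max token
 * pointers in out. Returns the number of tokens stored. */
size_t split_words(char *line, const char *delim, char **out, size_t max)
{
    char *save = NULL;   /* strtok_r() keeps its position here, not in hidden state */
    size_t n = 0;
    char *word;

    for (word = strtok_r(line, delim, &save);    /* first call gets the buffer */
         word != NULL && n < max;
         word = strtok_r(NULL, delim, &save)) {  /* later calls get NULL */
        out[n++] = word;
    }
    return n;
}
```

Each returned pointer refers into the original line buffer, so the words stay valid only until that buffer is overwritten by the next fgets().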

    Last Post: 12-03-2002, 02:35 AM