Thread: seg faulting

  1. #1
    Registered User
    Join Date
    Apr 2007
    Posts
    1

    seg faulting

    Hey guys

    I don't really consider myself the best of a programmer but i'm currently working on a difficult program, or what i consdier difficult.

    There's one hitch though, i run this through the command line ./list -t words.txt < awords.txt. Basically it gets awords.txt and enters it into a trie structure and compares it with words.txt to see if their present or not.

    If i compare them both and they don't have any words, it seg faults and obviously crashes... i'm really worn out and can't think straight right now so i would greatly appreciate the help. From what i understand it seg faults in trie_find.

    If you want any more code, just ask. Cheers

    Code:
    letterTypePtr trie_insert(char *word,ListTypePtr list, letterTypePtr node, int len)
    {	
        char character[2];
        letterTypePtr prevLetter = NULL;
        
        --len;
        if(len != 0)
        {
            node = trie_insert(word,list, node, len);
            prevLetter = node;
            node = node->downLet;
        }
        
        character[0] = word[len];
        character[1] = '\0';
        
        if(node == NULL && prevLetter == NULL)
        {	    
            list->head = add_node(character);
            return list->head;
        }
        
        if(prevLetter != NULL && node == NULL)
        {
            prevLetter->downLet = add_node(character);
            return prevLetter->downLet;
        }
        
        if(prevLetter != NULL && node != NULL)
        {
           while (node != NULL)
           {
                if (strcmp(node->character, character) == 0 )
                    return node;
                prevLetter = node;   
                node = node->nextLet;
           }
           
           prevLetter->nextLet = add_node(character);
           return prevLetter->nextLet;       
        }
        
        if(prevLetter == NULL && node != NULL)
        {
            while (node != NULL) 
            {
                if (strcmp(node->character, character) == 0 )
                    return node;
                
                prevLetter = node;   
                node = node->nextLet;
              
           }
           
           prevLetter->nextLet = add_node(character);
           return prevLetter->nextLet;       
        }
        
    
    }
    
    /* just gets memory for the node */
    letterTypePtr add_node(char *word)
    {
        letterTypePtr newLetter;
        
        newLetter = malloc(sizeof(letterType));
                 assert(newLetter);
        newLetter->character = malloc(sizeof(char) * (strlen(word) + 1));
        strcpy(newLetter->character, word);
        newLetter->downLet = NULL;
        newLetter->nextLet = NULL;	
        
        return newLetter;
    }
    
    /* trie find */
    letterTypePtr trie_find(char *word, letterTypePtr node, int len)
    {   
        letterTypePtr prevLetter;
        char character[2];
    
        --len;
        if(len != 0)
        {
            node = trie_find(word, node, len);
            prevLetter = node;
            node = node->downLet;
        }
        
        character[0] = word[len];
        character[1] = '\0';
        
        if (node == NULL) 
        {
           return NULL;
        }
          
        while (node != NULL) 
        {
            if (strcmp(node->character, character) == 0) 
            {
                return node;
            }
            
            node = node->nextLet;
        }
        
        return NULL;
        
    }
    
    /* loadWords
    */
    int loadWords(ListTypePtr list)
    {
    
       char word[50];
       while (fscanf(stdin,"%s",word) != EOF)
       {
            trie_insert(word,list, list->head, strlen(word));
           
            list->count++;
       }
       return 1;
    }

  2. #2
    Registered User
    Join Date
    Apr 2007
    Posts
    45
    At a first glance, it looks like you've coded your find() routine to search down the list to the depth level equalling the length of the string passed :

    Code:
        --len;
        if(len != 0)
        {
            node = trie_insert(word,list, node, len);
            prevLetter = node;
            node = node->downLet;
        }
    I'm not really sure what order you're doing things in or if this is specifically causing your problem, but presuming you're calling trie_insert() to fill your initial linked list, this will certainly fail as the nodes won't have all been created already and you'll potentially be accessing data you're not priviledged to access by the OS (access violation).

    It would be worth your while putting a load of printf() statements in there to try and identify exactly where the problem is and working from there... (or using a symbollic debugger like GNU's gdb to pinpoint the problem - very easy to use, do a 'man gdb').
    we are one

  3. #3
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    I noticed a few things:
    1. sizeof(char) is always 1, so you don't need to use it in your malloc() sizes.
    2. fscanf(stdin, ...) is the same as scanf(...).
    3. You have a potential buffer overrun with your fscanf().
    4. load_words() always returns 1.

    Anyway, I think this is your problem:
    Code:
    letterTypePtr trie_find(char *word, letterTypePtr node, int len)
    {   
        letterTypePtr prevLetter;
        char character[2];
    
        --len;
        if(len != 0)
        {
            node = trie_find(word, node, len);
            prevLetter = node;
            node = node->downLet;
        }
        
        character[0] = word[len];
        character[1] = '\0';
        
        if (node == NULL) 
        {
           return NULL;
        }
    In the red code, you assume that node != NULL, but you only check if this is true in the blue code later on.
    dwk

    Seek and ye shall find. quaere et invenies.

    "Simplicity does not precede complexity, but follows it." -- Alan Perlis
    "Testing can only prove the presence of bugs, not their absence." -- Edsger Dijkstra
    "The only real mistake is the one from which we learn nothing." -- John Powell


    Other boards: DaniWeb, TPS
    Unofficial Wiki FAQ: cpwiki.sf.net

    My website: http://dwks.theprogrammingsite.com/
    Projects: codeform, xuni, atlantis, nort, etc.

  4. #4
    Registered User
    Join Date
    Sep 2006
    Posts
    835
    > sizeof(char) is always 1, so you don't need to use it in your malloc() sizes.
    It's useful as a comment, though, to see at a glance what kind of array it's being used as (much better than a cast IMO, since always putting the exact type in the sizeof() takes up on average much less extra space).

  5. #5
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by robatino View Post
    > sizeof(char) is always 1, so you don't need to use it in your malloc() sizes.
    It's useful as a comment, though, to see at a glance what kind of array it's being used as (much better than a cast IMO, since always putting the exact type in the sizeof() takes up on average much less extra space).
    I respectfully disagree -- assuming sizeof(char) == 1 is idiomatic. Most experienced people will immediately understand that a char is intended.

    Like the universe, C is built on "fundamental axioms," one of which is that sizeof(char) is always 1.

  6. #6
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    Quote Originally Posted by robatino View Post
    > sizeof(char) is always 1, so you don't need to use it in your malloc() sizes.
    It's useful as a comment, though, to see at a glance what kind of array it's being used as (much better than a cast IMO, since always putting the exact type in the sizeof() takes up on average much less extra space).
    Yes, well, in that code strlen() is being used at the same place, so I think you could figure out the type of the array.

    That's an interesting point, though. I don't think it ever came up in any of the discussions (okay, arguments) about casting malloc(). I'll be sure to mention it next time.

    [edit] Looks like this is going to turn into another "discussion" . . .
    I respectfully disagree -- assuming sizeof(char) == 1 is idiomatic. Most experienced people will immediately understand that a char is intended.
    What about for sizeof(int), for example? Would you agree then? (I agree with you, by the way. No size is chars.) [/edit]
    Last edited by dwks; 04-18-2007 at 12:33 PM.
    dwk

    Seek and ye shall find. quaere et invenies.

    "Simplicity does not precede complexity, but follows it." -- Alan Perlis
    "Testing can only prove the presence of bugs, not their absence." -- Edsger Dijkstra
    "The only real mistake is the one from which we learn nothing." -- John Powell


    Other boards: DaniWeb, TPS
    Unofficial Wiki FAQ: cpwiki.sf.net

    My website: http://dwks.theprogrammingsite.com/
    Projects: codeform, xuni, atlantis, nort, etc.

  7. #7
    Registered User
    Join Date
    Sep 2006
    Posts
    835
    Quote Originally Posted by brewbuck View Post
    I respectfully disagree -- assuming sizeof(char) == 1 is idiomatic. Most experienced people will immediately understand that a char is intended.

    Like the universe, C is built on "fundamental axioms," one of which is that sizeof(char) is always 1.
    But which kind? signed char, unsigned char, or just plain char (which can be thought of as a third type since it's implementation-dependent whether it's signed or unsigned)? Without specifying it's not clear (unless you take the default to be plain char, and specify the other two, but I don't think that's a good idea).

  8. #8
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    Quote Originally Posted by robatino View Post
    But which kind? signed char, unsigned char, or just plain char (which can be thought of as a third type since it's implementation-dependent whether it's signed or unsigned)? Without specifying it's not clear (unless you take the default to be plain char, and specify the other two, but I don't think that's a good idea).
    Does it matter? What if you change the type of the array? Will you remember to change the malloc() calls? After all, this still works but is misleading:
    Code:
    char *p;
    p = malloc(sizeof(unsigned char) * 10);
    Dave_Sinkula and others actually recommend always using the variable in sizes for memory allocation calls:
    Code:
    char *p;
    p = malloc(sizeof(*p) * 10);
    That way, if you change the type of p, the malloc statement still stands correct. But that defeats your purpose since it doesn't specify the type of p.
    dwk

    Seek and ye shall find. quaere et invenies.

    "Simplicity does not precede complexity, but follows it." -- Alan Perlis
    "Testing can only prove the presence of bugs, not their absence." -- Edsger Dijkstra
    "The only real mistake is the one from which we learn nothing." -- John Powell


    Other boards: DaniWeb, TPS
    Unofficial Wiki FAQ: cpwiki.sf.net

    My website: http://dwks.theprogrammingsite.com/
    Projects: codeform, xuni, atlantis, nort, etc.

  9. #9
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by dwks View Post
    What about for sizeof(int), for example? Would you agree then? (I agree with you, by the way. No size is chars.)
    For anything but a char, all bets are off. The standard only specified that a short is AT LEAST as long as a char, and a long is AT LEAST as long as a short, etc. However, there is plenty of code under the sun that assumes that sizeof(int) is equal to 4. I tend not to assume such things :-)

  10. #10
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by robatino View Post
    But which kind? signed char, unsigned char, or just plain char (which can be thought of as a third type since it's implementation-dependent whether it's signed or unsigned)? Without specifying it's not clear (unless you take the default to be plain char, and specify the other two, but I don't think that's a good idea).
    Signed and unsigned never make any difference to the size of the type. So it really doesn't matter in this aspect.

  11. #11
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    As for sizeof(int), I meant, would you put this
    Code:
    int *p = malloc(sizeof(int) * 10);
    or this?
    Code:
    int *p = malloc(sizeof(*p) * 10);
    (With or without parentheses. )
    dwk

    Seek and ye shall find. quaere et invenies.

    "Simplicity does not precede complexity, but follows it." -- Alan Perlis
    "Testing can only prove the presence of bugs, not their absence." -- Edsger Dijkstra
    "The only real mistake is the one from which we learn nothing." -- John Powell


    Other boards: DaniWeb, TPS
    Unofficial Wiki FAQ: cpwiki.sf.net

    My website: http://dwks.theprogrammingsite.com/
    Projects: codeform, xuni, atlantis, nort, etc.

  12. #12
    Registered User
    Join Date
    Sep 2006
    Posts
    835
    Quote Originally Posted by dwks View Post
    Does it matter? What if you change the type of the array? Will you remember to change the malloc() calls? After all, this still works but is misleading:
    Code:
    char *p;
    p = malloc(sizeof(unsigned char) * 10);
    Dave_Sinkula and others actually recommend always using the variable in sizes for memory allocation calls:
    Code:
    char *p;
    p = malloc(sizeof(*p) * 10);
    That way, if you change the type of p, the malloc statement still stands correct. But that defeats your purpose since it doesn't specify the type of p.
    I think whether that's a good idea depends on how likely it is that the type of *p will get changed - if likely, then sizeof(*p) is preferable, otherwise hardcoding the type is since it acts as a comment. If one chooses to go with the "comment" and hardcode the type, then it's better to make it exact, which means writing sizeof(unsigned int) instead of sizeof(int) for an unsigned int array, for example, even though they're equal, and using sizeof(char), sizeof(unsigned char), or sizeof(signed char), even though they all equal 1. It's harmless anyway, since the compiler optimizes away these expressions.

  13. #13
    Registered User
    Join Date
    Sep 2006
    Posts
    835
    Quote Originally Posted by brewbuck View Post
    Signed and unsigned never make any difference to the size of the type. So it really doesn't matter in this aspect.
    I know, that's my point. If you leave out sizeof(char) since it's always 1, then you have to leave out sizeof(signed char) and sizeof(unsigned char) for the same reason. Then if one sees something like malloc(100), there are 3 possibilities for the type, so the expression no longer serves as a comment. Of course, it's a matter of personal preference whether one uses the sizeof() inside malloc() as a comment, but if one does, then the exact type needs to be there to serve that purpose.

  14. #14
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by robatino View Post
    I know, that's my point. If you leave out sizeof(char) since it's always 1, then you have to leave out sizeof(signed char) and sizeof(unsigned char) for the same reason. Then if one sees something like malloc(100), there are 3 possibilities for the type, so the expression no longer serves as a comment. Of course, it's a matter of personal preference whether one uses the sizeof() inside malloc() as a comment, but if one does, then the exact type needs to be there to serve that purpose.
    There is only one possibility for the type: it's the type of the pointer the value of malloc() is assigned to. The allocation of memory isn't the part that needs commenting, anyway.

  15. #15
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by dwks View Post
    As for sizeof(int), I meant, would you put this
    Code:
    int *p = malloc(sizeof(int) * 10);
    or this?
    Code:
    int *p = malloc(sizeof(*p) * 10);
    (With or without parentheses. )
    I would almost always write the latter. But I would write it as sizeof(p[0]), not sizeof(*p). Matter of convention -- when allocating a single object I use a star, when allocating an array of objects I array-dereference it.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Replies: 1
    Last Post: 05-31-2009, 04:02 PM
  2. Seg Fault in Compare Function
    By tytelizgal in forum C Programming
    Replies: 1
    Last Post: 10-25-2008, 03:06 PM
  3. seg fault at vectornew
    By tytelizgal in forum C Programming
    Replies: 2
    Last Post: 10-25-2008, 01:22 PM
  4. pointers seg faulting
    By snareguy in forum C Programming
    Replies: 1
    Last Post: 03-21-2004, 12:33 PM
  5. Seg Fault Problem
    By ChazWest in forum C++ Programming
    Replies: 2
    Last Post: 04-18-2002, 03:24 PM