Thread: Smells like undefined spirit

  1. #1
    Charming Registered User
    Join Date
    May 2011
    Posts
    49

    Smells like undefined spirit

    Hello again, I struggled for a few hours with this one but I can't seem to find what I am looking for, so thank you in advance for any help. I am trying to do the following exercise:

    Write a program that sorts a series of words entered by the user. Assume that each word is no more that 20 characters long. Stop reading when the user enters an empty word. Store each word in a dynamically allocated string, using any array of pointers to keep track of the strings. After all words have been read, sort the array (using any sorting technique) and then use a loop to print the words in sorted order.

    Here is what I have written:
    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <ctype.h>
    #include <string.h>
    
    #define MAX_WORD 20
    #define NUM_WORDS 10
    
    int read_word(char *, int);
    
    int main(int argc, char **argv) {
    	char *words_vector[NUM_WORDS];
    	char word[MAX_WORD + 1];
    	int i, j, word_len, index = 0;
    	char *temp;
    	
    	for(;;) {
    		if(index == NUM_WORDS) {
    			printf("\n---word vector is full---\n");
    			break;
    		}
    		
    		printf("Enter word: ");
    		word_len = read_word(word, MAX_WORD);
    		if(word_len == 0)
    			break;
    		
    		*(words_vector + index) = malloc(word_len);
    		if(*(words_vector + index) == NULL) {
    			printf("\nmalloc failed in element %d of words_vector!\n", index);
    			exit(EXIT_FAILURE);
    		}
    		strcpy(*(words_vector + index), word);
    		++index;
    	}
    	
    	printf("\nIn unsorted order:");
    	for(i = 0; i < index; ++i)
    		printf(" %s", *(words_vector + i));
    	
    	for(i = 0; i < index - 1; ++i)
    		for(j = i + 1; j < index; ++j)
    			if(strcmp(*(words_vector + i), *(words_vector + j)) > 0) {
    				temp = *(words_vector + j);
    				*(words_vector + j) = *(words_vector + i);
    				*(words_vector + i) = temp;
    			}
    	
    	printf("\nIn sorted order:");
    	for(i = 0; i < index; ++i) {
    		printf(" %s", *(words_vector + i));
    		free(*(words_vector + i));
    	}
    			
    	return 0;
    }
    
    int read_word(char *str, int n) {
    	int c, i = 0;
    	
    	while(!isspace(c = getchar()))
    		if(i < n)
    			*(str + i++) = c;
    	
    	while(c != '\n')
    		c = getchar();
    		
    	*(str + i) = '\0';
    	return i;
    }
    The program "seems" to work fine, but 1 out of 20 crashes. That made me think that I definitely messed up somewhere. I spend a few hours experimenting but I can't see my error. The time the program crashes, is after I enter some word (random from the 1rd to 10th) and same input words, doesn't seem to be the issue.

    Any hint to point my error, will be valuable.

    Also do not mind my sorting, I intend to rewrite the same program using qsort and then again with a "hand made" implementation of the quicksort algorithm. But if you have any suggestion about how to better write my code will be much appreciated.

    Cheers
    If you are the smartest person in the room, you are in the wrong room.

  2. #2
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,909
    I haven't been able to replicate your problem yet, but I think your problem is that you only allocate word_len bytes. You need to allocate word_len + 1 bytes, to leave room for the null.

    I also suggest you use array notation instead of pointer arithmetic. Everytime you use *(array_name + i), change it to array_name[i]. It's much easier to read and keep track of what you're accessing, so you can more easily avoid pointer bugs.

  3. #3
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,612
    One problem pointed out by my debugger is the way that you are copying words.

    Code:
    (gdb) step
    28              *(words_vector + index) = malloc(word_len);
    (gdb) print word_len
    $5 = 3
    (gdb) print word
    $6 = "fry\000lant\000\b\000\000\000/N├w)N├w"
    I typed in various foods when I tested your program, and the current word is "fry". You are allocating with malloc only three bytes for four bytes that you need to copy -- the length of the string plus the terminating zero. When strcpy() is trying to copy word into the words_vector, it oversteps the array boundary while copying. This results in undefined behavior. To fix it you simply need to ask for enough memory.

    Code:
    *(words_vector + i) = malloc(word_len + 1);
    edit -- just for edification, this is what my debugger eventually ended up saying:
    Code:
    In sorted order: applewarning: HEAP[a.exe]:
    warning: Heap block at 003E24B0 modified at 003E24BD past requested size of 5
    
    Program received signal SIGTRAP, Trace/breakpoint trap.
    0x7c90120f in ntdll!DbgUiConnectToDbg () from C:\WINDOWS\system32\ntdll.dll
    Last edited by whiteflags; 11-01-2011 at 06:17 PM.

  4. #4
    Charming Registered User
    Join Date
    May 2011
    Posts
    49
    Thank you anduril

    About the pointer arithmetic usage, I am writing this way, just to start feeling comfortable with pointer arithmetic, sorry if the code was a bit cloudy.

    Now about my error, I think that my read_word() function is actually returning the word's characters, plus the \0 and in fact I have experienced the crash with words as short as 5 characters. I have made the correction you suggest and after this post I will test but if this is my error, I still do not understand it clearly.

    I will report back as soon as I test enough with the change you suggest anduril.
    If you are the smartest person in the room, you are in the wrong room.

  5. #5
    Charming Registered User
    Join Date
    May 2011
    Posts
    49
    Thank you whiteflags, you point out, the same error anduril suggested and in fact after I changed to malloc(word_len + 1) the crashing is not occuring anymore. But my confusion remains. I mean my read_word() function for word "fry" returns 4, which are enough bytes to allocate for storing the word plus the \0.

    I do not say that you are wrong, only that I didn't understand my error. Is the read_word() written bad? Or the malloc needs (characters + 2) to store the string?

    Apologies for wasting your time by not getting it right away, I am in debt nevertheless.

    ---edit---

    Ok now it is clear and I feel really stupid for not seeing that before. The read_word() is fine and returns 3 for the word "fry" which are enough for the strcpy since a string counts from 0, but not enough for malloc which we should tell that need 4 bytes allocated for the same word. Thank you guys again for your help and your time.
    Last edited by ardavirus; 11-01-2011 at 06:42 PM.
    If you are the smartest person in the room, you are in the wrong room.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. loop with spirit
    By grygus in forum C++ Programming
    Replies: 10
    Last Post: 05-19-2009, 08:12 AM
  2. Is this undefined?
    By carrotcake1029 in forum C Programming
    Replies: 3
    Last Post: 01-28-2009, 11:42 AM
  3. boost::spirit, abstract syntax tree question
    By Raven Arkadon in forum C++ Programming
    Replies: 1
    Last Post: 08-18-2007, 07:31 PM
  4. Christmas spirit
    By Mercadogs in forum C Programming
    Replies: 8
    Last Post: 09-20-2006, 03:18 PM
  5. Come in, show your spirit!
    By RoD in forum A Brief History of Cprogramming.com
    Replies: 7
    Last Post: 12-24-2002, 11:04 PM