Thread: How come my code works? Seriously. :)

  1. #1
    Registered User
    Join Date
    Jul 2010
    Posts
    21

    Hello.

    The following code asks the user to enter a couple of words and then sorts them alphabetically. It isn't much, but it does the job. However, just to see what happens, I didn't leave room for the null character when declaring the word array or when calling malloc. Still, everything seems to work properly, and I wonder why.

    Code:
    #include <stdio.h>
    #include <string.h>
    #include <stdlib.h>
    
    #define LIMIT 8
    
    int wcompare(const void *p, const void *q);
    int rline(char rword[], int i);
    
    int main(void)
    {
        int i, n = 0;
    
        char *words[LIMIT];
        char word[LIMIT];                           // Should have been: LIMIT + 1?
    
        for (;;) {
            printf("Word: ");
            rline(word, LIMIT);
            if (*word == '\0')
                break;
            words[n] = malloc(strlen(word));        // Should have been: strlen(word) + 1?
            strcpy(words[n++], word);
        }
    
        qsort(words, n, sizeof(char *), wcompare);
    
        printf("Words: ");
    
        for(i = 0; i < n; i++)
            printf("%s ", words[i]);
    
        printf("\n");
    
        return 0;
    }
    int wcompare(const void *p, const void *q)
    {
        return strcmp(*(char **)p, *(char **)q);
    }
    int rline(char rword[], int i)
    {
        int ch, j = 0;
    
        while ((ch = getchar()) != '\n')
            if (j < i)
                rword[j++] = ch;
        rword[j] = '\0';
    
        return j;
    }
    A typical session might look like this:

    Code:
    assiduus@ubuntu:~$ cc assiduus.c
    assiduus@ubuntu:~$ ./a.out
    Word: assiduus
    Word: tmptmptmp
    Word: blahblahblah
    Word:
    Words: assiduus blahblah tmptmptm
    Where I'd expect the program to crash is within the rline function, when "rword[j] = '\0';" assigns \0 to the *ninth* element of the word array. The second questionable piece of code is the malloc call, which doesn't allocate enough memory, and yet the code still works flawlessly. Btw, I have no *real* programming experience, so there's probably something obvious I'm missing, but I just don't get it at the moment.

  2. #2
    Gawking at stupidity
    Join Date
    Jul 2004
    Location
    Oregon, USA
    Posts
    3,218
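    Writing to memory you don't own doesn't specifically result in a crash. It specifically results in "undefined behavior", which means anything may happen, including the program appearing to work. It just so happens that when you tried it, it worked without problems. Trying to figure out why it works is pointless, because the result can change with a different compiler, different options, or different input. You simply cannot define what is by design undefined.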
    If you understand what you're doing, you're not learning anything.

  3. #3
    and the hat of int overfl Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    If you're unsure about malloc, then use valgrind.
    Code:
    $ gcc -g bar.c
    # Compile with debug info
    $ valgrind ./a.out
    ==5695== Memcheck, a memory error detector
    ==5695== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al.
    ==5695== Using Valgrind-3.6.0.SVN-Debian and LibVEX; rerun with -h for copyright info
    ==5695== Command: ./a.out
    ==5695== 
    Word: hello
    ==5695== Invalid write of size 1
    ==5695==    at 0x4026107: strcpy (mc_replace_strmem.c:311)
    ==5695==    by 0x804866F: main (bar.c:23)
    # Where strcpy was called from
    ==5695==  Address 0x419802d is 0 bytes after a block of size 5 alloc'd
    # What strcpy did (write just off the end of the array)
    ==5695==    at 0x4024F20: malloc (vg_replace_malloc.c:236)
    ==5695==    by 0x804864E: main (bar.c:22)
    # Where that block of memory was allocated
    ==5695== 
    Word: world
    Word: fred
    Word: barney
    Word: 
    ==5695== Invalid read of size 1
    ==5695==    at 0x407C50B: vfprintf (vfprintf.c:1614)
    ==5695==    by 0x408315F: printf (printf.c:35)
    ==5695==    by 0x804868D: main (bar.c:31)
    ==5695==  Address 0x41980d6 is 0 bytes after a block of size 6 alloc'd
    ==5695==    at 0x4024F20: malloc (vg_replace_malloc.c:236)
    ==5695==    by 0x804864E: main (bar.c:22)
    ==5695== 
    Words: barney fred hello world 
    ==5695== 
    ==5695== HEAP SUMMARY:
    ==5695==     in use at exit: 20 bytes in 4 blocks
    ==5695==   total heap usage: 4 allocs, 0 frees, 20 bytes allocated
    # You need to add some calls to free()
    ==5695== 
    ==5695== LEAK SUMMARY:
    ==5695==    definitely lost: 20 bytes in 4 blocks
    ==5695==    indirectly lost: 0 bytes in 0 blocks
    ==5695==      possibly lost: 0 bytes in 0 blocks
    ==5695==    still reachable: 0 bytes in 0 blocks
    ==5695==         suppressed: 0 bytes in 0 blocks
    ==5695== Rerun with --leak-check=full to see details of leaked memory
    ==5695== 
    ==5695== For counts of detected and suppressed errors, rerun with: -v
    ==5695== ERROR SUMMARY: 8 errors from 2 contexts (suppressed: 13 from 8)
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  4. #4
    Registered User
    Join Date
    Jul 2010
    Posts
    21
    Thanks for valuable pieces of info. I really appreciate it.

  5. #5
    Registered User
    Join Date
    Jul 2010
    Posts
    21
    As long as I'm working with strings, I've got one more (silly?) question:

    Code:
    #include <stdio.h>
    #include <string.h>
    
    int main(void)
    {
        char s[8] = "assiduus";
        char t[] = "assiduus";
        char u[] = { 'a', 's', 's', 'i', 'd', 'u', 'u', 's' };
    
        printf("%zu, %d\n", strlen(s), strcmp(s, t));
        printf("%zu, %d\n", strlen(u), strcmp(u, t));
    
        return 0;
    }
    The session:

    Code:
    assiduus@ubuntu:~$ cc tmp.c
    assiduus@ubuntu:~$ ./a.out
    8, 0
    16, 97
    As you can see, there's no place to add the null character as far as "s" is concerned and, therefore, the compiler doesn't add one. From what I understand, it means that "s" cannot be treated as a string and dealt with through string-related functions. And yet, when "measured" and compared with "t" everything works just fine. How does strlen know where "s" ends? And when I try to recreate the same situation by hand with "u", it fails. What's the secret?

  6. #6
    and the hat of int overfl Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    > How does strlen know where "s" ends?
    It doesn't, you just got lucky.

    > And when I try to recreate the same situation by hand with "u", it fails. What's the secret?
    That would be you being unlucky.

    You can still use strn... functions on such strings, but you certainly can't use anything expecting a \0 to be at the end.

  7. #7
    Bit Fiddler
    Join Date
    Sep 2009
    Posts
    79
    s is null terminated. A char array of [8] got nine slots, (0 - 8). Eight for your characters and one for the terminating null.

    EDIT: Never mind... I'll go get some sleep.
    Last edited by Fader_Berg; 02-10-2011 at 08:16 AM.

  8. #8
    Banned
    Join Date
    Aug 2010
    Location
    Ontario Canada
    Posts
    9,547
    Quote Originally Posted by Fader_Berg
    s is null terminated. A char array of [8] got nine slots, (0 - 8). Eight for your characters and one for the terminating null.
    Ummm... not to be too picky here, but it's not automatically null terminated at all.

    An array of char [8] has 8 elements numbered 0 to 7... element 8 is out of bounds.

    If you put 8 characters in it, it is not null terminated... the null will be a 9th element which will be written out of bounds, launching you into the realm of undefined behavior.

  9. #9
    Banned
    Join Date
    Aug 2010
    Location
    Ontario Canada
    Posts
    9,547
    Quote Originally Posted by assiduus
    As you can see, there's no place to add the null character as far as "s" is concerned and, therefore, the compiler doesn't add one.
    The compiler adds a null only when it processes a string literal and there is room for it; it does not add nulls to anything you build yourself. Nulls at the end of strings created at run time are added by the various string handling functions in the library. If you are writing your own string functions you need to add the trailing nulls yourself. When defining memory buffers to hold strings you always have to allow space for the null.

    char S[10]; can only hold 9 visible characters, the extra 10th space is for the null.

    From what I understand, it means that "s" cannot be treated as a string and dealt with through string-related functions. And yet, when "measured" and compared with "t" everything works just fine. How does strlen know where "s" ends? And when I try to recreate the same situation by hand with "u", it fails. What's the secret?
    strlen simply finds the null at the end and returns its offset from the beginning of the string.

    Actually, in an example as simple as yours, the answer is that you got lucky: no null was ever stored in s, but the byte just past the end of the array happened to be zero, so strlen stopped there. You'll find out how important that null is the first time you go to print a 5 or 10 character string and your screen fills with page after page of garbage.

    This is called "undefined behavior" and, as several here have pointed out, trusting it most often results in programs that work just fine until you are demonstrating them to the customer...

    One of the hardest lessons a C programmer learns is just how stupid the language really is... C is an entirely obedient idiot, you have to tell it everything in exactly the right order and, if you tell it to do something wrong, it will do that just as happily. It is entirely up to you to make sure your code keeps it within the bounds of reasonable behavior.
    Last edited by CommonTater; 02-10-2011 at 08:43 AM.

  10. #10
    Bit Fiddler
    Join Date
    Sep 2009
    Posts
    79
    Quote Originally Posted by CommonTater
    Ummm... not to be too picky here, but it's not automatically null terminated at all.

    An array of char [8] has 8 elements numbered 0 to 7... element 8 is out of bounds.

    If you put 8 characters in it, it is not null terminated... the null will be a 9th element which will be written out of bounds, launching you into the realm of undefined behavior.
    I know, I know... I don't know where that came from.

  11. #11
    Banned
    Join Date
    Aug 2010
    Location
    Ontario Canada
    Posts
    9,547
    Quote Originally Posted by Fader_Berg
    I know, I know... I don't know where that came from.
    No worries...

    Normally I'd just raise an eyebrow and let that go by. But in the context of someone trying to understand char arrays, it really did need correction.

  12. #12
    Registered User
    Join Date
    Jul 2010
    Posts
    21
    What I meant by "the compiler doesn't add one" was that if your initializer is shorter than the string variable, the compiler adds extra null characters and if it (the initializer) has exactly the same length (not counting the null character), the compiler won't try to store a null character. Adding or not adding \0 at the end of the string by string-reading functions is something I'm aware of (hence my "own" reading function). Am I right with the above? If not I must have seriously misunderstood the chapter on string variables by K.N. King. I am, of course, thankful for all your previous answers.

  13. #13
    and the hat of int overfl Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    Code:
    char a[4] = "a";     // a + 3 \0
    char b[4] = "abcd";  // abcd and NO \0
    char c[] = "abcd";   // compiler counts chars and adds a \0
    Your understanding seems to be correct.

  14. #14
    Gawking at stupidity
    Join Date
    Jul 2004
    Location
    Oregon, USA
    Posts
    3,218
    Quote Originally Posted by assiduus
    What I meant by "the compiler doesn't add one" was that if your initializer is shorter than the string variable, the compiler adds extra null characters and if it (the initializer) has exactly the same length (not counting the null character), the compiler won't try to store a null character. Adding or not adding \0 at the end of the string by string-reading functions is something I'm aware of (hence my "own" reading function). Am I right with the above? If not I must have seriously misunderstood the chapter on string variables by K.N. King. I am, of course, thankful for all your previous answers.
    Yes, and this is why:

    ISO C99 6.7.8, paragraph 21:

    If there are fewer initializers in a brace-enclosed list than there are elements or members
    of an aggregate, or fewer characters in a string literal used to initialize an array of known
    size than there are elements in the array, the remainder of the aggregate shall be
    initialized implicitly the same as objects that have static storage duration.

  15. #15
    Registered User
    Join Date
    Jul 2010
    Posts
    21
    Thanks again.
