String overflow behaviour


  1. #1
    Registered User
    Join Date
    Oct 2002
    Posts
    98

    String overflow behaviour

    I wanted to check out what strlen did if the string it was acting upon overflowed...

    Code:
    #include <stdio.h>
    #include <string.h>
    
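    int
    main(void)
    {
      char str[10];   /* room for 9 characters plus the terminating '\0' */
    
      /* strlen() returns size_t, so cast it to match the %lu conversion */
      sprintf( str, "four" );
      printf( "\n%s - %lu", str, (unsigned long)strlen( str ));
    
      strcat( str, "four" );
      printf( "\n%s - %lu", str, (unsigned long)strlen( str ));
    
      /* 12 characters plus the terminator need 13 bytes: this overflows str */
      strcat( str, "four" );
      printf( "\n%s - %lu", str, (unsigned long)strlen( str ));
    
      printf( "\n" );
    
      return 0;
    }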
    gave this output:

    four - 4
    fourfour - 8
    fourfourfour - 12

    Is this behaviour predictable, i.e. can I check that a string has not overflowed by using strlen to ensure that the length of the string is less than the space originally allocated to it?

    Thanks.
    There is no such thing as a humble opinion

  2. #2
    twm
    root
    Join Date
    Sep 2003
    Posts
    232
    Once you've written outside the string's boundaries it's too late; the behavior is unpredictable.
    The information given in this message is known to work on FreeBSD 4.8 STABLE.
    *The above statement is false if I was too lazy to test it.*
    Please take note that I am not a technical writer, nor do I care to become one.
    If someone finds a mistake, gleaming error or typo, do me a favor...bite me.
    Don't assume that I'm ever entirely serious or entirely joking.

  3. #3
    Registered User
    Join Date
    Sep 2003
    Posts
    23
    If the string has overflowed, it's too late in most cases

    That is generally not a good idea. The declaration char str[10]; gives you an array of 10 chars and the memory for them, normally 10 bytes. If you try to store more, as with the "fourfourfour" string, you get a buffer overflow. Under Linux that can produce a 'Segmentation fault' message, or the program can simply crash or do something unpredictable, because you are writing to memory that is not yours: you have only 10 chars and you are trying to use 13 (12 characters plus the terminating '\0'). And if the character '\0' (which marks the end of the string) does not appear within your ten bytes, strlen() will also read past the array, into memory that holds garbage or your other data (other variables).

    Also, if you use the array as a plain array of characters rather than as a string, all ten elements can legitimately be something other than '\0'. Your method would then report an overflow when there isn't one.

  4. #4
    Been here, done that.
    Join Date
    May 2003
    Posts
    1,161
    Most likely the overflow is being written into other variables in your program.

    Try this:
    Code:
    #include <stdio.h>
    #include <string.h>
    
    int main()
    {
        char t2[10];
        char t1[10];
        
        strcpy(t1, "four");
        strcpy(t2, "abcdefghi");
        printf("t1:[%s]  t2:[%s] \n", t1, t2);
    
        strcat(t1, "five");
        printf("t1:[%s]  t2:[%s] \n", t1, t2);
    
        strcat(t1, "six");
        printf("t1:[%s]  t2:[%s] \n", t1, t2);
    
        strcat(t1, "seven");
        printf("t1:[%s]  t2:[%s] \n", t1, t2);
    
        return 0;
    }
    My compiler (Borland 5.5) displayed:
    t1:[four] t2:[abcdefghi]
    t1:[fourfive] t2:[abcdefghi]
    t1:[fourfivesix] t2:[abcdefghi]
    t1:[fourfivesixseven] t2:[even]

    Notice that t2 is still intact after "six" is appended, even though t1 has already overflowed? My compiler starts each variable on a 4-byte boundary, so there are a couple of padding bytes between t1 and t2 that are thrown in for ease of processing. The first two overflow bytes land there, and t2 only gets hit by "seven", ending up as "even".

    In your case, you didn't overwrite far enough to cause any noticeable impact. Try a couple more strcat() calls and you may blow up. I did while testing the above.

    Also (something I didn't realize), the variables are laid out in memory in the reverse order of their declaration, at least with this compiler.
    Last edited by WaltP; 09-17-2003 at 12:18 PM.
    Definition: Politics -- Latin, from
    poly meaning many and
    tics meaning blood sucking parasites
    -- Tom Smothers

  5. #5
    DeepBlackMagic
    That Creepy Network Guy
    Join Date
    Feb 2003
    Posts
    265
    That's interesting behavior, how t2 is declared first but ends up after t1 in memory. Is this behavior compiler-specific, OS-specific, or completely unpredictable? If memory is always allocated in reverse order of variable declarations, why not prevent buffer overflows in popular software by sticking in a huge char no_more_buffer_overflows[65535]; so when some st00p1d skr1pt k1ddi3 fills up your login buffer with 5000 0x7F's he gets nowhere fast? Take THAT you stupid EAX pointer!

  6. #6
    Hammer
    End Of Line
    Join Date
    Apr 2002
    Posts
    6,231
    >>by sticking a huge char no_more_buffer_overflows[65535];
    Why not just write good code in the first place?

    The behaviour is undefined, therefore the results are undefined. End of story.
    When all else fails, read the instructions.
    If you're posting code, use code tags: [code] /* insert code here */ [/code]

  7. #7
    DeepBlackMagic
    That Creepy Network Guy
    Join Date
    Feb 2003
    Posts
    265
    Originally posted by Hammer
    Why not just write good code in the first place?
    A few reasons. First, I'm overworked, underpaid, poorly treated, and completely unloved in my work. Second, I'm lazy. Laziness is an important trait in good programmers. Diligent programmers code for every possible situation, using thousands of unnecessary lines of code and going over budget and over time for the project. Lazy coders write one function that handles 99% of the problems, and clean up around the edges. The beauty of that 99% function is what's important, because it will probably be reused over and over again =P (note how every version of Windows seems to be vulnerable to a few things).

    Finally, the two things every project involves: Time and Money (tm). For a project to exist, two conditions must be met: there must never be enough time to complete it (time is calculated by managerial monkeys), and there must never be enough money to keep it going past deadline. Note the part of the word that matters: DEADline. After this line in time, the project is DEAD. This wonderful business strategy is what drives our current research and development cycle in the USA. There are hundreds of unfinished or impossible projects that never get done because they get ditched the second they are perceived as unprofitable by the morons that dreamed them up in the first place.

  8. #8
    Thantos
    & the hat of GPL slaying
    Join Date
    Sep 2001
    Posts
    5,681
    Just had to respond after reading DeepBlackMagic's post. Heard this a long time ago and it's so damn true. With any project there are three ways to have it done:

    1) Done fast
    2) Done cheaply
    3) Done well

    You can have 2, but you can never have all 3.

  9. #9
    DeepBlackMagic
    That Creepy Network Guy
    Join Date
    Feb 2003
    Posts
    265
    Seems like just about nobody cares about option 3 these days.

  10. #10
    Hammer
    End Of Line
    Join Date
    Apr 2002
    Posts
    6,231
    Yes, but... in the context of this thread, there is a difference between writing code that can be broken by adverse user input (for example) and writing code that breaks itself without intervention, by relying on undefined behaviour.

  11. #11
    Registered User
    Join Date
    Oct 2002
    Posts
    98
    OK thanks.

    What I was originally trying to do was strcat strings into a buffer with a fixed maximum length. I don't know how many strings are going to be written, or how long they are (though they are likely to be less than the maximum length).

    I would like to know if I can check to see if the length has been exceeded without resorting to summing the lengths of each string as it is added to the buffer...

  12. #12
    Hammer
    End Of Line
    Join Date
    Apr 2002
    Posts
    6,231
    You could always just use strncat() to be safe. Like so:
    Code:
    #include <stdio.h>
    #include <string.h>
    
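    int main(void)
    {
      char buf[5];
      size_t room;
      
      /* Sentinel: strncpy overwrites this byte with a non-zero
         character only if the source is too long for the buffer */
      buf[sizeof(buf)-1] = '\0';
      strncpy (buf, "12", sizeof(buf));
      if (buf[sizeof(buf)-1] != '\0')
      {
        puts("strncpy failed to copy all the data");
        /*
         Handle error here
         */
      }
      else
      {
        /* strncat's count is the maximum number of characters to append,
           not the buffer size, so pass the room left before the terminator */
        room = sizeof(buf) - strlen(buf) - 1;
        if (strlen("3456") > room)
        {
          puts("strncat failed to copy all the data");
          /*
           Handle error here
           */
        }
        strncat (buf, "3456", room);
      }
      
      return(0);
    }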

  13. #13
    Registered User
    Join Date
    Oct 2002
    Posts
    98
    Oh yeah. Good idea. I like the way that the best solutions are often the simplest!

  14. #14
    Sayeh
    Visionary Philosopher
    Join Date
    Aug 2002
    Posts
    212
    Why not just write good code in the first place?
    Thank you, Hammer. Well said.

    No excuse not to write good code when you're being paid for it. If you can't write good code fast enough, you're underskilled. There is _NO_ excuse for any commercial program to have buffer-overflow problems.

    The truth is that few programmers bother to range check anything any more.

    My hat's off to those of you who do in every piece of code you write.
    It is not the spoon that bends, it is you who bends around the spoon.

  15. #15
    Registered User
    Join Date
    Oct 2002
    Posts
    98
    The original experiment has nothing to do with writing formal commercial code. I was playing games with C, just doing something unusual in code and seeing what would happen next. The stuff about how variables are ordered in memory is an interesting outcome of that first thought; a sermon about other people's bad coding standards isn't.
