Thread: Playing around with strings

  1. #1
    Registered User
    Join Date
    Apr 2016
    Posts
    17

    Playing around with strings

    So I'm gonna show you guys some code samples along with the results I get, in the hope that you will help me understand why I'm seeing what I'm seeing. (Assume all required header files are included.)

    Sample 1:
    Code:
    #define NUMBERS "/tmp/numbers"
    int main() {
       char path[108];
       strncpy(path, NUMBERS, sizeof(path) - 1);
       return 0;
    }
    So here my question is: why do we need to copy specifically sizeof(path)-1 bytes, i.e. 107 bytes?

    Sample 2(I wrote this myself):
    Code:
    #define NUMBERS "/tmp/numbers"
    int main() {
       char path[20];
       strncpy(path, NUMBERS, sizeof(NUMBERS)-5);
       printf("length of path after: %lu\n", strlen(path));
       printf("%s\n", path);
       printf("%d\n", path[strlen(path)]);
       printf("%d\n", path[strlen(path)+1]);
       printf("%d\n", path[strlen(path)+2]);
       printf("%d\n", path[strlen(path)+3]);
       return 0;
    }
    When I run this I get:
    size of NUMBERS: 13
    length of path after: 11
    /tmp/num@@
    0
    0
    0
    0
    Shouldn't path's length be 8 now?
    And what's with all the zeros?
    I had expected the rubbish after num since there's no terminating 0, but I don't get why the characters after that are 0s.

    Sample 3(I wrote this myself):
    Code:
    #define NUMBERS "/tmp/numbers"
    int main() {
      
      char path[10];
      printf("size of NUMBERS: %lu\n", sizeof(NUMBERS));
      strncpy(path, NUMBERS, sizeof(NUMBERS));
      printf("length of path after: %lu\n", strlen(path));
      printf("%s\n", path);
      printf("%d\n", path[strlen(path)]);
      printf("%d\n", path[strlen(path)+1]);
      printf("%d\n", path[strlen(path)+2]);
      printf("%d\n", path[strlen(path)+3]);
      return 0;
    }
    Here I get:
    size of NUMBERS: 13
    length of path after: 12
    /tmp/numbers
    0
    0
    0
    0
    I don't understand how path could have grown. Isn't it limited to 10 bytes?
    And again the same thing with the zeros at the end.
    Also, is path now 0 terminated?

    Thanks in advance guys

  2. #2
    Lurking whiteflags
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,612
    One would think you should use the size of path instead. You're just writing past the end of your array and getting lucky there is no crash.

    The manual says:
    The strncpy() function is similar, except that at most n bytes of src are copied. Warning: If there is no null byte among the first n bytes of src, the string placed in dest will not be null-terminated.
    Since there is a chance the string won't be null-terminated, we have to calculate the size to make sure that it actually is terminated.

    Code:
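    #include <stdio.h>
    #include <string.h>

    int main()
    {
        char path[10];
        /* copies at most sizeof path - 1 (9) bytes; no '\0' is written if src has at least that many characters */
        strncpy(path, "tmp/numbers", sizeof path - 1);
        puts(path);
        return 0;
    }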
    This has been rolled into a function called strlcpy() that is easier to use, though it is a BSD extension and not part of standard C. You can call it where it is available or write your own if you are looking for alternatives.
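
    For the "write your own" route, here is a minimal sketch of a truncating copy that always terminates the destination. The name copy_truncate is made up for illustration and is not a library function; the return convention (length of src, so the caller can spot truncation) is borrowed from how strlcpy() is documented to behave.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not a library function.
   Copies src into dest (capacity dest_size bytes), truncating if needed,
   and always writes a terminating '\0' when dest_size > 0.
   Returns strlen(src); a result >= dest_size means the copy was truncated. */
size_t copy_truncate(char *dest, const char *src, size_t dest_size)
{
    size_t src_len = strlen(src);

    if (dest_size > 0) {
        /* keep one byte free for the terminator */
        size_t n = (src_len < dest_size - 1) ? src_len : dest_size - 1;
        memcpy(dest, src, n);
        dest[n] = '\0';
    }
    return src_len;
}
```

    Called as copy_truncate(path, NUMBERS, sizeof path) with a char path[10], this should leave a 9-character string in path and return 12, which tells the caller that something was cut off.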

  3. #3
    Registered User
    Join Date
    Apr 2016
    Posts
    17
    Why are we specifying sizeof(path)-1 and not simply sizeof(path)?

  4. #4
    Lurking whiteflags
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,612
    Because I was trying to save room for the null character at the end.

    I think I did it wrong anyway... the typical implementation of strncpy is an all or nothing proposition:
    Code:
    char *
    strncpy(char *dest, const char *src, size_t n)
    {
        size_t i;

        for (i = 0; i < n && src[i] != '\0'; i++)
            dest[i] = src[i];
        for ( ; i < n; i++)
            dest[i] = '\0';

        return dest;
    }
    i.e. no matter what, strncpy() will only work if the buffer is big enough.
    Code:
    if (strlen(NUMBERS) < sizeof(path))
       strncpy(path, NUMBERS, sizeof(path));
    strncat() can be used if truncating the string is okay, since strncat is guaranteed to null-terminate the destination.
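
    A rough sketch of the strncat() approach, assuming truncation is acceptable. The wrapper name copy_with_strncat is invented for this example. The destination has to start out as an empty string, because strncat() appends to whatever string is already there.

```c
#include <string.h>

/* Hypothetical helper: truncating copy built on strncat(). */
void copy_with_strncat(char *dest, size_t dest_size, const char *src)
{
    if (dest_size == 0)
        return;                          /* no room even for a terminator */

    dest[0] = '\0';                      /* strncat() needs a valid string to append to */
    strncat(dest, src, dest_size - 1);   /* appends at most dest_size - 1 chars, then a '\0' */
}
```

    The dest_size - 1 matters here: strncat() writes the terminator in addition to the n characters it appends, so passing the full size could write one byte past the end.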

  5. #5
    Registered User
    Join Date
    Apr 2016
    Posts
    17
    By buffer you mean dest, right?
    Also, sample one was written by my professor, so you can't BOTH be wrong. But it still didn't make sense to me to specifically copy sizeof(path)-1 bytes if you're not gonna explicitly fill the very last byte with a '\0'. That means he was trying to copy at most sizeof(path)-1 bytes, leaving that last byte to be filled by the terminating null, which, however, he never does. So, for samples 2 and 3, do you have any idea why I'm getting these zeros when I'm trying to access the elements of path, which in sample 2 were not filled at all and in sample 3 are out of bounds?

  6. #6
    Registered User
    Join Date
    Apr 2016
    Posts
    17
    Oh also, I'm still confused as to why sample 2 tells me that path is 11 chars long even though only 8 bytes were copied.
    Code:
    strncpy(path, NUMBERS, sizeof(NUMBERS)-5); //sizeof(NUMBERS)-5) = 8
    And I'm assuming that the @@ are caused by the lack of a terminating null.

  7. #7
    Registered User
    Join Date
    May 2010
    Posts
    4,632
    Without the end-of-string terminator, the result of strlen() is undefined. You can't know where, when, or even if strlen() will ever encounter a value equal to the end-of-string character. Remember, strlen() requires a properly terminated C-string.
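
    If you need a length from a buffer that might not be terminated, one option is to bound the search by the buffer size. This is only a sketch; bounded_length is a made-up name, and it does roughly what POSIX strnlen() does, using memchr() from standard C.

```c
#include <string.h>

/* Hypothetical helper: length of the string stored in buf, or buf_size
   if no '\0' occurs within the first buf_size bytes. Never reads past
   the end of the buffer, unlike strlen() on an unterminated array. */
size_t bounded_length(const char *buf, size_t buf_size)
{
    const char *end = memchr(buf, '\0', buf_size);
    return end ? (size_t)(end - buf) : buf_size;
}
```

    A result equal to buf_size is the signal that the buffer holds no terminator and must not be passed to strlen(), printf("%s") and friends.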

    Jim

  8. #8
    Lurking whiteflags
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,612
    So, for samples 2 and 3, do you have any idea why I'm getting these zeros when I'm trying to access the elements of path, which in sample 2 were not filled at all and in sample 3 are out of bounds?
    The idea behind strncpy() is that you copy not only some bytes of the string, but also pad with nulls any remaining space up to n. Look at the second loop in the implementation in post 4.
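
    To see when that padding actually happens, here is a small experiment, written as a sketch. The helper name count_nulls_after_strncpy is invented. It pre-fills the destination with '#' so that any '\0' found afterwards must have been written by strncpy() itself.

```c
#include <string.h>

/* Hypothetical helper: fills the first n bytes of dest with a marker,
   runs strncpy(dest, src, n), and counts how many of those n bytes are
   '\0' afterwards. dest must have room for at least n bytes. */
size_t count_nulls_after_strncpy(char *dest, const char *src, size_t n)
{
    size_t i, nulls = 0;

    memset(dest, '#', n);     /* marker, so leftover memory is never mistaken for padding */
    strncpy(dest, src, n);

    for (i = 0; i < n; i++)
        if (dest[i] == '\0')
            nulls++;
    return nulls;
}
```

    With the 12-character "/tmp/numbers" as src, n = 20 should give 8 nulls (terminator plus padding), n = 13 gives 1, and n = 8 gives 0, which is the unterminated case from sample 2.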

  9. #9
    Registered User
    Join Date
    Oct 2015
    Posts
    28
    strncpy is for copying strings to zero-padded fields of a fixed length. It is meant to be used for serialising strings to files/networks, not for copying them inside your program.

    To copy a string, just use strcpy,

    Code:
    char path[sizeof NUMBERS];
    strcpy(path, NUMBERS);
    Alternatively, as NUMBERS is a macro, it may be used as an initialiser for the array,

    Code:
    char path[] = NUMBERS;
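
    For contrast, this is roughly what the fixed-length-field use of strncpy looks like. It is a sketch only: struct record, NAME_FIELD and both function names are hypothetical, and the field width of 8 is arbitrary. The field is zero-padded but is not a C string when the name fills it completely, so reading it back needs its own terminator.

```c
#include <string.h>

#define NAME_FIELD 8   /* hypothetical field width */

struct record {
    char name[NAME_FIELD];   /* zero-padded field, NOT necessarily null-terminated */
};

/* Store name in the fixed-width field: short names are padded with '\0',
   long names are cut to NAME_FIELD bytes with no terminator. */
void record_set_name(struct record *r, const char *name)
{
    strncpy(r->name, name, sizeof r->name);
}

/* Copy the field out into a real C string; out needs NAME_FIELD + 1 bytes. */
void record_get_name(const struct record *r, char *out)
{
    memcpy(out, r->name, NAME_FIELD);
    out[NAME_FIELD] = '\0';
}
```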

  10. #10
    Registered User
    Join Date
    Apr 2016
    Posts
    17
    Yes, but only if we're trying to copy more bytes than the length of src i.e. if n>sizeof(src) then n-sizeof(src) 0s will be padded to the end of dest. I can't see that happening in sample 2 and 3 since in sample 2 we're only copying 8 bytes where src is 13 bytes long and in sample 3 we're copying exactly the amount of bytes in src. So in both cases there should be no padding. Could it be that in sample 2 the array was initialized with zeros and in sample 3 (where I'm actually accessing elements that are out of bounds) it just so happens that there are zeros saved at those specific spots in memory?

  11. #11
    Lurking whiteflags
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,612
    Yes, but only if we're trying to copy more bytes than the length of src i.e. if n>sizeof(src) then n-sizeof(src) 0s will be padded to the end of dest.
    Actually no, this is wrong. The manual says:
    The strncpy() function is similar, except that at most n bytes of src are copied. Warning: If there is no null byte among the first n bytes of src, the string placed in dest will not be null-terminated.
    If the length of src is less than n, strncpy() writes additional null bytes to dest to ensure that a total of n bytes are written.
    sizeof() only tells you how many bytes an object is. The length of a string stored in a very large array can be very different:

    Code:
    // c99:
    #include <stdio.h>
    #include <string.h>
    
    int main(void)
    {
       char foo[1000] = "&#$";
       printf("length of %s = %zu, sizeof(foo) = %zu\n", foo, strlen(foo), sizeof(foo));
    }
    Could it be that in sample 2 the array was initialized with zeros and in sample 3 (where I'm actually accessing elements that are out of bounds) it just so happens that there are zeros saved at those specific spots in memory?
    It is possible. If you learn anything from this experience, it should be that uninitialized variables are unpredictable and that they can impact the correctness of the program.

    If we look at Sample 2 again, we see that for some reason there is @@ in the string which is not correct. Elements after that happen to be 0. In Sample 3 you happen to see what you expect to see.

    When I tested it, I got completely different results from you, and only fun1() was consistently correct.
    Code:
    #include <stdio.h>
    #include <string.h>
    #include <stdlib.h>
    
    #define NUMBERS "/tmp/numbers"
    
    void fun1(void);
    void fun2(void);
    void fun3(void);
    
    int main(void)
    {
        fun1();
        fun2();
        fun3();
        return 0;
    }
    
    void fun1()
    {
         char path[108];
         strncpy(path, NUMBERS, sizeof(path) - 1);
         printf("%s\n", path); /* I forgot to include this line initially */
    }
    
    
    void fun2()
    {
        char path[20];
        strncpy(path, NUMBERS, sizeof(NUMBERS)-5);
        printf("length of path after: %lu\n", strlen(path));
        printf("%s\n", path);
        printf("%d\n", path[strlen(path)]);
        printf("%d\n", path[strlen(path)+1]);
        printf("%d\n", path[strlen(path)+2]);
        printf("%d\n", path[strlen(path)+3]);
    }
    
    
    void fun3()
    {
        char path[10];
        printf("size of NUMBERS: %lu\n", sizeof(NUMBERS));
        strncpy(path, NUMBERS, sizeof(NUMBERS));
        printf("length of path after: %lu\n", strlen(path));
        printf("%s\n", path);
        printf("%d\n", path[strlen(path)]);
        printf("%d\n", path[strlen(path)+1]);
        printf("%d\n", path[strlen(path)+2]);
        printf("%d\n", path[strlen(path)+3]);
    }

    C:\Users\jk\Desktop>stringsfun
    length of path after: 9
    /tmp/num$
    0
    0
    0
    -124
    size of NUMBERS: 13
    length of path after: 12
    /tmp/numbers
    0
    0
    -128
    18


    C:\Users\jk\Desktop>stringsfun
    /tmp/numbers
    length of path after: 8
    /tmp/num
    0
    0
    0
    0
    size of NUMBERS: 13
    length of path after: 12
    /tmp/numbers
    0
    0
    20
    0

    I mean, you might assume or even expect memory to be zero, but just look at how different it is for me.

    I won't lie, I probably handled compiling differently from you:
    gcc -ansi -O3 -s -o stringsfun.exe stringsfun.c

    But in a perfect world, and with robust code, it's not supposed to make a difference if I optimize or not, and it does.
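
    As a sketch of how to take the luck out of sample 2: if the array is explicitly initialized, the bytes that strncpy() does not touch are guaranteed to be zero rather than whatever happened to be on the stack. The function name length_after_partial_copy is made up for this example.

```c
#include <string.h>

/* Hypothetical rework of sample 2 with an initialized buffer. */
size_t length_after_partial_copy(void)
{
    char path[20] = {0};                 /* every element starts out as '\0' */

    strncpy(path, "/tmp/numbers", 8);    /* copies "/tmp/num"; writes no '\0' itself */
    return strlen(path);                 /* well-defined: path[8] is still '\0' */
}
```

    This should return 8 regardless of optimization level, because nothing in it depends on uninitialized memory.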
    Last edited by whiteflags; 06-18-2016 at 09:25 PM.

  12. #12
    Registered User
    Join Date
    Apr 2016
    Posts
    17
    One last thing though. Take a look at this code. It's a modified version of sample 1.
    Code:
    #define NUMBERS "/tmp/numbers"
    int main() {
        char path[10];
        strncpy(path, NUMBERS, sizeof(path) - 1);
        return 0;
    }
    So here my dest is 10 bytes in size. I'm gonna attempt to copy maximum 9 bytes in there to make sure that at least the 10th byte is a zero. This makes sense, since I don't know how big my source is and whether or not it's null terminated. In this example, source is bigger than 9 bytes which means there is no way the first 9 bytes of dest are gonna contain a null, since only the first 9 bytes of source are gonna be copied. However, we don't explicitly fill the 10th byte of dest with a 0. Meaning that in this example dest is still not null terminated. That was my source of confusion regarding sizeof(path)-1. Is my confusion in any way justified?
    Last edited by Omar Sharaki; 06-26-2016 at 06:40 AM.

  13. #13
    and the hat of int overfl Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,661
    You are correct.
    If you don't explicitly set a \0 when using strncpy, then you can't use that string with any function which expects a \0 to terminate the string.
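
    A minimal sketch of that explicit step, applied to the sizeof(path)-1 pattern from the samples. The wrapper name copy_and_terminate is invented for illustration.

```c
#include <string.h>

/* Hypothetical helper: strncpy() plus the explicit terminator. */
void copy_and_terminate(char *dest, size_t dest_size, const char *src)
{
    if (dest_size == 0)
        return;                            /* nothing can be stored */

    strncpy(dest, src, dest_size - 1);     /* may leave dest without a '\0' */
    dest[dest_size - 1] = '\0';            /* the step missing from the samples */
}
```

    With char path[10] and NUMBERS as the source, path should end up holding the first 9 characters followed by the terminator, so it is safe to hand to strlen() or printf("%s").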
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.
