Thread: concatenate string || better way

  1. #1
    Registered User
    Join Date
    Oct 2013
    Posts
    87

    concatenate string || better way

    Hi there,

    I am new to C from biology background.
    I have two char arrays I would like to concatenate. The expected output is two different strings. For example inputs are john and abbey.

    The outputs are:

    john*abbey
    abbey*john

    That is I need to add an '*' between the two strings.

    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <malloc.h>
    #include <string.h>
    
    
    #define MAX_LEN 256
    #define index_length 5 //decide size of array 
    
    oid test_cont(char *p1, char *p2)
    {
        int test = 0;
        char string1[200]; //store copy of john*abbey
        char *s1; //iterate over the array pointers to not to make any change in the original ones
        
        char string2[200]; //store copy of abbey*john
    
    
        //for loop
        for (s1 = p1; *s1 != '\0'; s1++)
        {
            string1[test] = *s1; //store value 
            test++;
        }
        string1[test] = '*'; //add an asterix 
    
    
        test = test + 1; //increase by one
    
    
        for (s1 = p2; *s1 != '\0'; s1++)
        {
            string1[test] = *s1;
            test++;
        }
        string1[test] = '\0'; //terminate string by null
    
    
        ////////////////////////////////////////////////////
        test = 0; //start again 
    
    
        for (s1 = p2; *s1 != '\0'; s1++)
        {
            string2[test] = *s1;
            test++;
        }
        //store first string 
        
        string2[test] = '*'; //add an asterix 
        test = test + 1;
    
    
        for (s1 = p1; *s1 != '\0'; s1++) 
        {
            string2[test] = *s1;
            test++;
        }
           string2[test] = '\0'; //terminate string by null
        printf("The two strings are %s %s\n", string1, string2);
    }
    
    
    
    
    int main(int argc, char *argv[])
    {
        char names[index_length][10] = {"john", "abbey", "mouse", "micky", "minney"};
        test_cont(names[0], names[1]);
        printf("name is %s\n", names[0]);
        
        return 0;
    }
    The code works. There are four for loops, with manual resetting of iterator, setting of null value in the strings.

    However, I would like to improve code, either by strcat or strcpy.
    Last edited by deathmetal; 02-27-2021 at 06:23 PM.

  2. #2
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    I might suggest something like this:
    Code:
    #include <stdio.h>
    #include <string.h>
    
    char *join_by_char(char *dest, size_t dest_maxlen, const char *left, char glue, const char *right)
    {
        size_t left_len = strlen(left);
        dest[0] = '\0';
        strncat(dest, left, dest_maxlen);
        if (dest_maxlen > left_len)
        {
            dest[left_len] = glue;
            dest[left_len + 1] = '\0';
            if (dest_maxlen > left_len + 1)
            {
                strncat(dest + left_len + 1, right, dest_maxlen - left_len - 1);
            }
        }
        return dest;
    }
    
    int main(void)
    {
        char names[][100] = {"john", "abbey", "mouse", "micky", "minney"};
        char result[200];
        for (size_t i = 0; i < sizeof(names) / sizeof(names[0]) - 1; ++i)
        {
            printf("name is %s\n", join_by_char(result, sizeof(result) - 1, names[i], '*', names[i + 1]));
            printf("name is %s\n", join_by_char(result, sizeof(result) - 1, names[i + 1], '*', names[i]));
        }
        return 0;
    }
    It actually does more work than your solution because it computes the length of the left input string, does length checking, and potentially appends the null character thrice instead of once. On the other hand, it is not constrained by the hard-coded array sizes and avoids buffer overflow, plus it is more reusable.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  3. #3
    Registered User awsdert's Avatar
    Join Date
    Jan 2015
    Posts
    1,733
    Quote Originally Posted by laserlight View Post
    I might suggest something like this:
    Code:
    #include <stdio.h>
    #include <string.h>
    
    char *join_by_char(char *dest, size_t dest_maxlen, const char *left, char glue, const char *right)
    {
        size_t left_len = strlen(left);
        dest[0] = '\0';
        strncat(dest, left, dest_maxlen);
        if (dest_maxlen > left_len)
        {
            dest[left_len] = glue;
            dest[left_len + 1] = '\0';
            if (dest_maxlen > left_len + 1)
            {
                strncat(dest + left_len + 1, right, dest_maxlen - left_len - 1);
            }
        }
        return dest;
    }
    
    int main(void)
    {
        char names[][100] = {"john", "abbey", "mouse", "micky", "minney"};
        char result[200];
        for (size_t i = 0; i < sizeof(names) / sizeof(names[0]) - 1; ++i)
        {
            printf("name is %s\n", join_by_char(result, sizeof(result) - 1, names[i], '*', names[i + 1]));
            printf("name is %s\n", join_by_char(result, sizeof(result) - 1, names[i + 1], '*', names[i]));
        }
        return 0;
    }
    It actually does more work than your solution because it computes the length of the left input string, does length checking, and potentially appends the null character thrice instead of once. On the other hand, it is not constrained by the hard-coded array sizes and avoids buffer overflow, plus it is more reusable.
    Slow and inefficient, at work so can't elaberate but there's a faster way with less calls

  4. #4
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by awsdert
    Slow and inefficient, at work so can't elaberate but there's a faster way with less calls
    You must have missed the part where I explained that it is less efficient than deathmetal's original code, which itself is not too bad where efficiency is concerned. If the aim is to be efficient while retaining some measure of buffer flow avoidance and reusability, I would go for something like this instead:
    Code:
    #include <assert.h>
    #include <stdio.h>
    #include <string.h>
    
    char *join_by_char(char *dest, size_t dest_maxlen, const char *left, char glue, const char *right, size_t left_len, size_t right_len)
    {
        assert(left_len + right_len < dest_maxlen);
        memcpy(dest, left, left_len);
        dest[left_len] = glue;
        memcpy(dest + left_len + 1, right, right_len);
        dest[left_len + right_len + 1] = '\0';
        return dest;
    }
    
    int main(void)
    {
        char names[][100] = {"john", "abbey", "mouse", "micky", "minney"};
        char result[200];
        for (size_t i = 0; i < sizeof(names) / sizeof(names[0]) - 1; ++i)
        {
            size_t len1 = strlen(names[i]);
            size_t len2 = strlen(names[i + 1]);
            printf("name is %s\n", join_by_char(result, sizeof(result) - 1, names[i], '*', names[i + 1], len1, len2));
            printf("name is %s\n", join_by_char(result, sizeof(result) - 1, names[i + 1], '*', names[i], len2, len1));
        }
        return 0;
    }
    But it lacks the kind of flexibility as with my earlier example where dest_maxlen can be used to limit the destination string rather than merely function as an assertion check. It's possible to re-insert the limit functionality, but then it'll basically be replacing the strncat calls in my earlier example with memcpy + the input string lengths, hence avoiding the extra length checking, null character search, and null character appending. Plausible, but ultimately demonstrating the use of strncat may be better for general use with null terminated strings.

    Of course, if you have an even more efficient solution, you should go ahead and let deathmetal know about it. After all, what I posted above does not result in fewer function calls.
    Last edited by laserlight; 02-28-2021 at 07:16 AM.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  5. #5
    Registered User
    Join Date
    Feb 2019
    Posts
    1,078
    Code:
    char str1[] = "john";
    char str2[] = "abbey";
    
    char *r1, *r2;
    
    // FIXME: Should test if asprintf returns negative value!
    // Must remember to free() the buffers afterwards.
    asprintf( &r1, "%s*%s", str1, str2 );
    asprintf( &r2, "%s*%s", str2, str1 );
    For Windows (which don't have asprintf):
    Code:
    #if defined(__WINNT__) || defined(_WIN32)
    int asprintf( char **pp, char *fmt, ... )
    {
        char *p;
        int size;
        va_list args;
    
        va_start( args, fmt );
    
        // Just get the string size.
        if ( ( size = vsnprintf( NULL, 0, fmt, args ) ) < 0 )
        {
            va_end( args );
            return -1;
        }
    
        if ( ! ( p = malloc( size + 1 ) ) )
        {
            va_end( args );
            return -1;
        }        
    
        // FIX: Must reset va_list.
        va_end( args );
        va_start( args, fmt );
    
        vsprintf( p, fmt, args );
        *pp = p;
    
        va_end( args );
    
        return size;
    }
    #endif

  6. #6
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Since deathmetal appears to want to loop and seems to have a maximum length in mind, wouldn't it make more sense to use snprintf if you're taking that approach rather than just concatenating?
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  7. #7
    Registered User
    Join Date
    Feb 2019
    Posts
    1,078
    Quote Originally Posted by laserlight View Post
    Since deathmetal appears to want to loop and seems to have a maximum length in mind, wouldn't it make more sense to use snprintf if you're taking that approach rather than just concatenating?
    My approach is only to show that he doesn't need to use strlen(), strcpy(), strcat() approach and to limit the size of output string... But, of course, if both original strings are limited to 100 chars (what will happen if a name has 101 chars? no NUL char!) and output string is limited to 200 chars, in that case, snprintf() is a good choice... Remembering, if needed, that snprintf() will NOT return the number of chars writen.

  8. #8
    Registered User awsdert's Avatar
    Join Date
    Jan 2015
    Posts
    1,733
    Quote Originally Posted by laserlight View Post
    You must have missed the part where I explained that it is less efficient than deathmetal's original code, which itself is not too bad where efficiency is concerned. If the aim is to be efficient while retaining some measure of buffer flow avoidance and reusability, I would go for something like this instead:
    Code:
    #include <assert.h>
    #include <stdio.h>
    #include <string.h>
    
    char *join_by_char(char *dest, size_t dest_maxlen, const char *left, char glue, const char *right, size_t left_len, size_t right_len)
    {
        assert(left_len + right_len < dest_maxlen);
        memcpy(dest, left, left_len);
        dest[left_len] = glue;
        memcpy(dest + left_len + 1, right, right_len);
        dest[left_len + right_len + 1] = '\0';
        return dest;
    }
    
    int main(void)
    {
        char names[][100] = {"john", "abbey", "mouse", "micky", "minney"};
        char result[200];
        for (size_t i = 0; i < sizeof(names) / sizeof(names[0]) - 1; ++i)
        {
            size_t len1 = strlen(names[i]);
            size_t len2 = strlen(names[i + 1]);
            printf("name is %s\n", join_by_char(result, sizeof(result) - 1, names[i], '*', names[i + 1], len1, len2));
            printf("name is %s\n", join_by_char(result, sizeof(result) - 1, names[i + 1], '*', names[i], len2, len1));
        }
        return 0;
    }
    But it lacks the kind of flexibility as with my earlier example where dest_maxlen can be used to limit the destination string rather than merely function as an assertion check. It's possible to re-insert the limit functionality, but then it'll basically be replacing the strncat calls in my earlier example with memcpy + the input string lengths, hence avoiding the extra length checking, null character search, and null character appending. Plausible, but ultimately demonstrating the use of strncat may be better for general use with null terminated strings.

    Of course, if you have an even more efficient solution, you should go ahead and let deathmetal know about it. After all, what I posted above does not result in fewer function calls.
    Yeah I did miss that part, trying to read in forums while working doesn't go well, anyways my computer decided to ........ itself while updating so now I'm struggling to get linux to boot, anyways here's the best psuedocode I can do on my phone:
    Code:
    join_by_char( dst, dlen, cap, first, flen, glue, after, alen)
      nxt = dlen + 1
      nxt = ((nxt >= cap) * (cap-1)) + ((nxt < cap) * nxt)
      dst[dlen] = glue
      dst[nxt] = 0
      strncat( dst, first, cap )
      nxt += len
      nxt = ((nxt >= cap) * (cap - 1)) + ((nxt < cap) * nxt)
      dst[nxt] = glue
      strncat( dst, after, cap )
      nxt += alen
      nxt = ((nxt >= cap) * (cap -1)) + ((nxt < cap) * nxt)
      dst[nxt] = 0
    More often than not the caller already has the needed lengths so just provide parameters for them, typing the above was a pain so I skipped the nxt + 1 that would've been after the 2nd glue placement, just copy the code pattern for that. the longer statements tell the compiler that there's no need to branch, just use math trickery to nullify the unwanted resulted from being placed in nxt.

    I haven't checked the whole thread so I don't know if unable to modify the function parameters but I'm sure you understand why this currently isn't my priority.

  9. #9
    Registered User
    Join Date
    Dec 2017
    Posts
    1,632
    Quote Originally Posted by flp1969 View Post
    snprintf() will NOT return the number of chars writen.
    What do you mean? snprintf() always returns the number of characters that were (or would be) written. Why wouldn't it?
    A little inaccuracy saves tons of explanation. - H.H. Munro

  10. #10
    Registered User
    Join Date
    Feb 2019
    Posts
    1,078
    Quote Originally Posted by john.c View Post
    What do you mean? snprintf() always returns the number of characters that were (or would be) written. Why wouldn't it?
    "would be"...

  11. #11
    Registered User
    Join Date
    Oct 2013
    Posts
    87
    Thank you laserlight.

    I have folowing questions:
    1) I cannot understand why dest[0] = '\0'; is needed in the first place.

    2) is it OK to mix size_t with C code? As far as I remember C doesn't allow initializing a variable in the for loop.
    for (size_t i = 0; i < sizeof(names) / sizeof(names[0]) - 1; ++i)


    I have no idea what sprintf in windows is. I compile code as:

    gcc -Wextra -Wall get_2Darrays.c -o array

    Sorry for the late reply, after hitting reply the page keeps on loading.



    Quote Originally Posted by laserlight View Post
    I might suggest something like this:
    Code:
    #include <stdio.h>
    #include <string.h>
    
    char *join_by_char(char *dest, size_t dest_maxlen, const char *left, char glue, const char *right)
    {
        size_t left_len = strlen(left);
        dest[0] = '\0';
        strncat(dest, left, dest_maxlen);
        if (dest_maxlen > left_len)
        {
            dest[left_len] = glue;
            dest[left_len + 1] = '\0';
            if (dest_maxlen > left_len + 1)
            {
                strncat(dest + left_len + 1, right, dest_maxlen - left_len - 1);
            }
        }
        return dest;
    }
    
    int main(void)
    {
        char names[][100] = {"john", "abbey", "mouse", "micky", "minney"};
        char result[200];
        for (size_t i = 0; i < sizeof(names) / sizeof(names[0]) - 1; ++i)
        {
            printf("name is %s\n", join_by_char(result, sizeof(result) - 1, names[i], '*', names[i + 1]));
            printf("name is %s\n", join_by_char(result, sizeof(result) - 1, names[i + 1], '*', names[i]));
        }
        return 0;
    }
    It actually does more work than your solution because it computes the length of the left input string, does length checking, and potentially appends the null character thrice instead of once. On the other hand, it is not constrained by the hard-coded array sizes and avoids buffer overflow, plus it is more reusable.

  12. #12
    Registered User
    Join Date
    Oct 2013
    Posts
    87
    I have never used strncat. I think the third parameter is for the number of characters to append. That makes me ask
    in dest_maxlen - left_len - 1

    is -1 at the end for the null character?

    Thank you,


    Quote Originally Posted by laserlight View Post
    I might suggest something like this:
    Code:
            if (dest_maxlen > left_len + 1)
            {
                strncat(dest + left_len + 1, right, dest_maxlen - left_len - 1

  13. #13
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by awsdert
    anyways my computer decided to ........ itself while updating so now I'm struggling to get linux to boot
    Ugh, hope you get that figured out. Between my personal laptop and my work laptop I'm effectively triple booting Mac, Linux, and Windows hahaha

    Quote Originally Posted by awsdert
    More often than not the caller already has the needed lengths so just provide parameters for them
    That's a good point, but it does involve a trade-off for those times when the caller doesn't have the needed lengths. Furthermore, if you're using a two strncat solution rather than a two memcpy solution, you only need the length of one of the input strings, i.e., to figure out where to insert the glue character. If you look at my code from post #2, you'll see that it takes advantage of this to avoid computing the input string lengths twice, hence it is not less efficient than your code in this respect.

    If you're going to require both the input string lengths, then you might as well use memcpy since the point of strncat is to avoid buffer overflow by accounting for both the destination string's maximum length and terminating at the first null character of the input string, but if you already know the input string length, then you don't need strncat to search for that null character while copying.

    Quote Originally Posted by awsdert
    the longer statements tell the compiler that there's no need to branch, just use math trickery to nullify the unwanted resulted from being placed in nxt.
    That's where we disagree: this is premature optimisation since we don't have deathmetal's actual code to work with, thus we have no way to measure to figure out if this is a bottleneck that should be highly optimised, and if manual branch elimination will actually be a significant improvement in the first place. It's plausible that this is a bottleneck, of course, since deathmetal appears to want to use this in a loop, but until we have the code to work with, it is better to go for what is easy to understand than "math trickery" that requires extra effort to verify and which might not be an overall win.

    Quote Originally Posted by awsdert
    I haven't checked the whole thread so I don't know if unable to modify the function parameters
    You should read the whole thread though, otherwise it can be difficult to uh, untangle the threads of discussion
    I was the one who came up with the function as an example, so of course we're free to change whatever we want.

    Quote Originally Posted by john.c
    What do you mean? snprintf() always returns the number of characters that were (or would be) written. Why wouldn't it?
    flp1969 was being extremely precise

    Quote Originally Posted by deathmetal
    I have folowing questions:
    1) I cannot understand why dest[0] = '\0'; is needed in the first place.
    It's because I chose to use strncat rather than strncpy. strncat's original purpose is to concatenate to the end of an existing string, hence the existing string must exist, so dest needs to be null terminated to become an empty string. If you use strncpy, you don't need to do this, but the catch with strncpy is that you have to be careful to null terminate if dest_maxlen == left_len as strncpy will not do so for you. So you could say it is a matter of picking your poison.

    Quote Originally Posted by deathmetal
    2) is it OK to mix size_t with C code? As far as I remember C doesn't allow initializing a variable in the for loop.
    This becomes a question of what version of the C standard are you compiling against. If you have to work with an earlier version of C, then you just need to adjust your code accordingly.

    Quote Originally Posted by deathmetal
    I have no idea what sprintf in windows is. I compile code as:

    gcc -Wextra -Wall get_2Darrays.c -o array
    You should be able to use sprintf with any C standard library implementation. The problem comes if you're trying to use snprintf or asprintf: the former might not be supported if your standard library version is too old, and the latter is an extension to the standard library hence flp1969's comment about Windows not supporting it. You're not specifying any particular C standard in the options to gcc, so the question of standard library version becomes a matter of what version of gcc you're using.

    Quote Originally Posted by deathmetal
    I have never used strncat. I think the third parameter is for the number of characters to append. That makes me ask
    in dest_maxlen - left_len - 1

    is -1 at the end for the null character?
    No, it is for the glue character, i.e., the asterisk. The idea is that you only want to append up to the maximum length of the destination string, so you have to exclude what is already in the destination string, i.e., the left string and the glue character.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  14. #14
    Registered User
    Join Date
    Oct 2013
    Posts
    87
    Thank you laser~!

  15. #15
    Registered User
    Join Date
    Oct 2013
    Posts
    87
    I would like to store the returned in the result array, so I make changes as below after //
    It gets annoyed with an error:


    Quote Originally Posted by laserlight View Post
    Code:
    #include <stdio.h>
    #include <string.h>
    
    char *join_by_char(char *dest, size_t dest_maxlen, const char *left, char glue, const char *right)
    {
        size_t left_len = strlen(left);
        dest[0] = '\0';
        strncat(dest, left, dest_maxlen);
        if (dest_maxlen > left_len)
        {
            dest[left_len] = glue;
            dest[left_len + 1] = '\0';
            if (dest_maxlen > left_len + 1)
            {
                strncat(dest + left_len + 1, right, dest_maxlen - left_len - 1);
            }
        }
        return dest;
    }
    
    int main(void)
    {
        char names[][100] = {"john", "abbey", "mouse", "micky", "minney"};
        char result[200];
        for (size_t i = 0; i < sizeof(names) / sizeof(names[0]) - 1; ++i)
        {
    //        printf("name is %s\n", join_by_char(result, sizeof(result) - 1, names[i], '*', names[i + 1]));
    
    //make change as
    
      result=join_by_char(result, sizeof(result) - 1, names[i], '*', names[i + 1]) ;
    
            printf("name is %s\n", join_by_char(result, sizeof(result) - 1, names[i + 1], '*', names[i]));
        }
        return 0;
    }

    if I make above change I get error as
    error: assignment to expression with array type
    I cannot understand why:
    1) result is a char (pointer).
    2) join_by_char returns a char pointer.

    One way could be to strcpy:
    Code:
         strcpy( result,join_by_char(result, sizeof(result) - 1, names[i], '*', names[i + 1]) );
    This works completely.

    However, I would like to know why returning doesn't work in the first place. My guess would be that the pointer pointing to the array dies as the function ends being local?
    Thank you
    Last edited by deathmetal; 03-01-2021 at 05:17 PM.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Replies: 3
    Last Post: 12-25-2020, 09:11 AM
  2. I can't concatenate a string
    By qqsszz in forum C Programming
    Replies: 9
    Last Post: 06-29-2018, 03:55 PM
  3. How do you concatenate a char to a string
    By sigur47 in forum C++ Programming
    Replies: 15
    Last Post: 02-20-2012, 01:45 PM
  4. Concatenate string and int
    By millsy5 in forum C Programming
    Replies: 1
    Last Post: 01-28-2010, 04:43 AM
  5. Concatenate a String?
    By sharkbate24 in forum C++ Programming
    Replies: 7
    Last Post: 01-02-2009, 05:16 PM

Tags for this Thread