Thread: Escaping non-printable characters in strings

  1. #1
    Registered User
    Join Date
    Apr 2011
    Posts
    1

    Question Escaping non-printable characters in strings

    I am trying to escape all of the non-printable characters in a given string.
    For example:
    input to function -"\x01 foo \xFE"
    Should output -"#01 foo #FE"

    However the output I receive is "ミ>#01 foo #FFFFFFFE", so basically it outputs garbage at the beginning and end.
    I have the feeling that the problem is related to my string handling but I am not sure.
    Here is my code:
    Code:
    #include <stdlib.h>
    #include <stdio.h>
    #include <string.h>
    #include <ctype.h>
    #include <assert.h>
    
    char *escape_string(char *value)
    {
        /* Escapes a string */
        
        char *result = (char*)malloc(sizeof(char) * strlen(value));
        int i;
        for(i = 0; i < strlen(value); i++)
        {
            char tempval = value[i];
            //If character is printable
            if(isprint(tempval) != 0)
            {
                strncat(result, &tempval, sizeof(tempval)); 
                printf("Contents: %s\n", &tempval);
            }
            else
            {
                char *buf;
                size_t sz = 0;
                sz = snprintf(NULL, 0, "#%02X", tempval);
                buf = malloc(sizeof(tempval) * sz+1);
                snprintf(buf, sz+1, "#%02X", tempval);
                
                strncat(result, buf, strlen(buf)-1);
                printf("Contents: %s\n", buf);
                free(buf);
            }
        }
        char *buffer = (char*)malloc(sizeof(char) * strlen(result));
        strncat(buffer, result, strlen(result));
        free(result);
        printf("Final buffer contents: %s\n", buffer);
        return buffer;
    }
    
    int main()
    {
        char *testvalue = "\x01 foo \xFE";
        char *testescape = "#01 foo #FE";
        assert(escape_string(testvalue) == testescape);
        return 0;
    }
    It would be greatly appreciated if someone could point out what 'is' or 'could' be wrong with my code.

    Thanks in advance
    Last edited by HAKUtora; 04-18-2011 at 08:24 AM.

  2. #2
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,612
    Code:
    strncat(result, &tempval, sizeof(tempval));
    That's very wrong.

    strncat is for strings. You should not take the address of a mundane character and expect it to work because the source should be zero terminated.

    If value[i] is printable and part of the result, you should do

    Code:
    *result++ = tempval;
    Unless you want to make tempval a zero terminated string. If you do make tempval a string, then the last argument to strncat will be the length of the appended string, just as it always is. sizeof is almost never the right thing here. It definitely isn't in this case, and if it happens to be right on some other occasion, it is a coincidence. Note that *result++ advances result, so keep a copy of the original pointer to return.

    As for the other case, when tempval is not printable, I have to wonder why you need to do this so often:
    Code:
    sz = snprintf(NULL, 0, "#%02X", tempval);
    buf = malloc(sizeof(tempval) * 4);
    You should do this exactly once, not once for every non-printable character. And do it right:

    Code:
    sz = strlen(value);
    buf = malloc(sz * 4 + 1);
    Assuming malloc succeeds, buf then points to a buffer 4 times the length of the value argument, plus 1 for the zero character.
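
    Putting the two suggestions above together (allocate once, then write through a pointer), a minimal sketch might look like the following. The name escape_once and the separate write cursor are illustrative assumptions, not part of the original code, and isprint() is assumed to be running in the default "C" locale.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: escapes non-printable bytes as #XX.
   The caller frees the result. Returns NULL if malloc fails. */
char *escape_once(const char *value)
{
    assert(value != NULL);

    size_t len = strlen(value);
    char *buf = malloc(len * 4 + 1);    /* one allocation, sized as suggested above */
    if (buf == NULL)
        return NULL;

    char *out = buf;                    /* write cursor; buf keeps the start */
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)value[i];
        if (isprint(c))
            *out++ = value[i];          /* printable: copy the byte as-is */
        else
            out += sprintf(out, "#%02X", (unsigned int)c);  /* 3 chars written */
    }
    *out = '\0';                        /* terminate the result */
    return buf;
}
```

    Keeping buf untouched and advancing only out means the function can still return (and the caller can still free) the start of the allocation.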

  3. #3
    Registered User
    Join Date
    Sep 2007
    Posts
    1,012
    Code:
    char *result = (char*)malloc(sizeof(char) * strlen(value));
    There is not enough space here to hold your string. First, if you were just copying a string, you'd need strlen(value)+1 bytes; the 1 is required because strings must be terminated by a null character, which strlen() does not count. Next, you're expanding the string. \x01 is a single byte, but you're replacing it with #01, which is three bytes. Your target buffer, worst case, needs to be three times the size of the original buffer (strlen()*3 + 1, to be exact).
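
    To make the size arithmetic concrete, here is a small sketch that counts the exact number of bytes the escaped string needs. The function name escaped_size is hypothetical, and it assumes the #XX format from the question and the default "C" locale for isprint().

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: bytes needed for the escaped copy of value,
   including the terminating null character. */
size_t escaped_size(const char *value)
{
    assert(value != NULL);

    size_t n = 1;                                   /* the null character */
    for (const char *p = value; *p != '\0'; p++)
        n += isprint((unsigned char)*p) ? 1 : 3;    /* "#XX" is three bytes */
    return n;
}
```

    The result never exceeds strlen(value)*3 + 1, which is the worst case described above.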
    Code:
    strncat(result, &tempval, sizeof(tempval));
    strncat() requires a string as the first argument. A string means there must be a null character. This is required because, otherwise, strncat() would not know where to start appending the new string. When you call malloc(), the contents of the memory are unspecified; they need not be zeroes, which means you can't use it as the target of a strncat(). You can fix this easily enough by setting the first byte of “result” to zero.
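
    As an illustration of that fix, the sketch below turns a freshly allocated buffer into an empty string before appending to it. The helper name join_two is made up for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated "a followed by b".
   The caller frees the result. Returns NULL if malloc fails. */
char *join_two(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    char *result = malloc(strlen(a) + strlen(b) + 1);
    if (result == NULL)
        return NULL;

    result[0] = '\0';       /* now result is a valid (empty) string */
    strcat(result, a);      /* safe: strcat can find the end of result */
    strcat(result, b);
    return result;
}
```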
    Code:
    sz = snprintf(NULL, 0, "#%02X", tempval); 
    buf = malloc(sizeof(tempval) * 4);
    snprintf(buf, sz+1, "#%02X", tempval);
    If you're going to use snprintf() to determine how much space is required to store the resulting string, you ought to pass that value to malloc().
    As for why you're getting FFFFFFFE: %X expects an unsigned int. If chars are signed, \xfe very likely has the value -2. When this is converted to unsigned int, you get a very large number (0xFFFFFFFE with a 32-bit int). You should cast tempval to an unsigned char, forcing it into the range 0 to 255 (assuming an 8-bit char). In all reality, this result should probably then be converted to an unsigned int, but in practice it will work fine as-is.
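
    A minimal sketch of the cast described above, assuming an 8-bit char. The helper name format_byte and its parameters are illustrative only. The same cast is also what isprint() needs, since passing it a negative char value (other than EOF) is undefined behavior.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes "#XX" for one byte into out,
   which must have room for at least 4 bytes. Returns snprintf's result. */
int format_byte(char c, char *out, size_t outsize)
{
    assert(out != NULL && outsize >= 4);

    /* Cast to unsigned char first so a negative char becomes 128..255,
       then to unsigned int because that is what %X expects. */
    return snprintf(out, outsize, "#%02X", (unsigned int)(unsigned char)c);
}
```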
    Code:
    strncat(result, buf, strlen(buf)-1);
    Because strlen() doesn't count the null character, you're chopping off the last (non-null) byte of your string. Don't subtract 1 here. There's no need for strncat(), either. Just use strcat().
    Code:
    char *buffer = (char*)malloc(sizeof(char) * strlen(result));
    strncat(buffer, result, strlen(result));
    free(result);
    printf("Final buffer contents: %s\n", buffer);
    return buffer;
    Again, strncat() requires the target to be a string. But you don't want strncat() here. You want strcpy().

    However, I'm not sure why you're copying the string into a new buffer. You could just return “result”; or if you're trying to save space by returning a buffer of exactly the proper size, you can just use realloc() to re-size “result”.
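
    If trimming the buffer is the goal, the realloc() approach could be sketched like this. The name shrink_to_fit is hypothetical, and the function assumes s came from malloc() and holds a null-terminated string.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: shrinks a malloc'd string to exactly the size it needs. */
char *shrink_to_fit(char *s)
{
    assert(s != NULL);

    char *t = realloc(s, strlen(s) + 1);
    /* If realloc fails, the original block is still valid, so keep using it. */
    return t != NULL ? t : s;
}
```

    Always use the returned pointer afterwards, since realloc() is allowed to move the block.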
    Code:
    assert(escape_string(testvalue) == testescape);
    You cannot compare strings with ==. Use strcmp().
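
    For example, a content comparison might be wrapped like this (same_text is a made-up name for illustration):

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true when both strings hold the same characters. */
bool same_text(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    /* strcmp compares contents; a == b would only compare the two addresses. */
    return strcmp(a, b) == 0;
}
```

    The check in main() would then read assert(same_text(escape_string(testvalue), testescape)); or, without the helper, assert(strcmp(escape_string(testvalue), testescape) == 0);.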
