Thread: Weird realloc issue

  1. #1
    Registered User
    Join Date
    Apr 2010
    Posts
    10

    Weird realloc issue

    Hey everyone. Been a while since I posted last. I decided to brush up on my C programming skills and try to code a little program for me and my friends to use for a role playing game that we play. Anyways, here's the problem:

    Basically what I'm trying to do is write my own version of sprintf that will dynamically allocate memory for the output string based on the length of the input strings. The purpose for this is to make building SQL statements easier; I don't want to have to worry about how long some of the table name, and other values are and allocate unnecessarily large spaces for each statement.

    The my_sprintf function works by first reallocating space for the first input string into the variable str. When it encounters a % character it reallocates space for the first and second input strings and replaces the % character with the second input string. When it encounters another % character it reallocates space for the (newly lengthened) str and the third input string and replaces the % character with the third input string. And so on.

    The function is only intended to work with strings for simplicity. I'll just use itoa() when any integers get involved. The program below should print out "SELECT 'John Doe' FROM 'Company'".

    The problem is that realloc is returning a NULL pointer the next time it is run in the for loop. I can't for the life of my figure out why. Also, this is my first time creating a function with a variable number of inputs. If I've made a mistake in that area, I wouldn't be surprised.

    I apologize for that lack of comments in the code. I've been meaning to go back and comment it, but haven't gotten around to it. And personally, I find short bits of code like this easier to read when there isn't a comment every other line.
    Code:
    int my_sprintf(char **str, char *format, ...);
    
    int main() {
        char *sql_string=malloc(sizeof(char));
        char name[]="John Doe";
        char people[]="Company";
    
        my_sprintf(&sql_string,"SELECT '%' FROM '%'",name,people);
        printf("%s\n",sql_string);
    
        free(sql_string);
        return 0;
    }
    
    int my_sprintf(char **str, char *format, ...) {
        va_list ap; 
        char *first_string;
        char *next_arg;
        
        //Allocate space for "SELECT '%' FROM '%'" and write it into str
        va_start(ap, format);
        first_string=malloc((strlen(format)+1)*sizeof(char));
        (*str)=realloc((*str),(strlen(format)+1)*sizeof(char));
        for (int i=0;i<=strlen(format);i++) {
            first_string[i]=format[i];
        }
        //i is the position in the first string
        //j is the position in the entire output string including arguments that are inserted in place of %'s
        //k is the position in each of the arguments that will be replacing the %'s
        int i=0;
        int j=0;
        for (;i<=strlen(first_string);i++, j++) {
            if (first_string[i]=='%') {
                //The first run will allocate space for "SELECT 'John Doe' FROM '%'"
                //The second run will allocate space for "SELECT 'John Doe' FROM 'Company'"
                next_arg=va_arg(ap, char *);
                (*str)=realloc((*str),(strlen((*str))+strlen(next_arg)+1)*sizeof(char)); //The problem is right here. realloc returns NULL the second time the for loop runs.
                for (int k=0;k<strlen(next_arg);k++, j++) {
                    (*str)[j]=next_arg[k];
                    (*str)[j+1]='\0';
                }
                j--;
            }
            else {
                //For all other non-% characters, write the next character from the first string into str
                (*str)[j]=first_string[i];
            }
        }
        va_end(ap);
        
        free(first_string);
        return 0;
    }

  2. #2
    Banned
    Join Date
    Aug 2010
    Location
    Ontario Canada
    Posts
    9,547
    Wouldn't it be a lot simpler to start off with a more than large enough string buffer, use sprintf() to format your text into it, then realloc based on the resulting strlen() after the fact?

  3. #3
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    It works^H^H^H^H^Hgives the expected output here, which doesn't help you at all unfortunately. Can you check what strlen(next_arg) works out to be when it fails?

    (Also: why are you reimplementing strcpy with your loop that puts format into first_string?)

  4. #4
    VIM addict
    Join Date
    May 2009
    Location
    Chennai, India
    Posts
    43
    I can see two problem in your code

    1. Why are you appending '\0' after every byte assignment. It makes no sense
    Code:
    for (int k=0;k<strlen(next_arg);k++, j++) {
                    (*str)[j]=next_arg[k];
                    (*str)[j+1]='\0';
                }
    2. Because of the above code the strlen(*str) value won't change between loops. Try printing its *str and its strlen value
    Code:
    (*str)=realloc((*str),(strlen((*str))+strlen(next_arg)+1)*sizeof(char));

  5. #5
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    I think you have that backwards -- the first line you marked in red is there precisely so that the strlen that you marked in red in the second bit works. (Admittedly he could do it after the loop instead of inside the loop, right before the j--. Also admittedly, j is strlen so it doesn't need to be called.)

  6. #6
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    Your inner realloc is using the wrong strings.

    *str is a short string, so if you have ABC%DEF you'll allocate enough for ABC and the argument. You're buffer overrunning when you get to DEF

    This code keeps track of
    - the length of the format
    - the length of ALL the arguments already added to the string.

    If you want to be pedantic, the amount of memory allocated is over what is needed by the number of % characters in the format string.

    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <stdarg.h>
    
    int my_sprintf(char **str, char *format, ...);
    
    int main() {
        char *sql_string=NULL;
        char name[]="John Doe";
        char people[]="Company";
    
        my_sprintf(&sql_string,"SELECT '%' FROM '%'",name,people);
        printf("%s\n",sql_string);
    
        free(sql_string);
        return 0;
    }
    
    int my_sprintf(char **str, char *format, ...) {
        va_list ap;
        char *next_arg;
        size_t  formatLen = strlen(format); // length of format
        size_t  allArgLen = 0;  // length of ALL arguments added so far
    
        va_start(ap, format);
    
        //Allocate space for "SELECT '%' FROM '%'"
        (*str)=realloc((*str),(formatLen+1)*sizeof(char));
        (*str)[0] = '\0';
    
        //i is the position in the first string
        //j is the position in the entire output string including arguments that are inserted in place of %'s
        //k is the position in each of the arguments that will be replacing the %'s
        int i=0;
        int j=0;
        for (i=0;i<formatLen;i++) {
            if (format[i]=='%') {
                //The first run will allocate space for "SELECT 'John Doe' FROM '%'"
                //The second run will allocate space for "SELECT 'John Doe' FROM 'Company'"
                next_arg=va_arg(ap, char *);
                size_t  nextArgLen = strlen(next_arg);
                (*str)=realloc((*str),(formatLen+allArgLen+nextArgLen+1)*sizeof(char));
                for (int k=0;k<nextArgLen;k++) {
                    (*str)[j++]=next_arg[k];
                }
                allArgLen += nextArgLen;
            }
            else {
                //For all other non-% characters, write the next character from the first string into str
                (*str)[j++]=format[i];
            }
        }
        (*str)[j]='\0';  // no longer strlen this at all, so we can \0 at the end
        va_end(ap);
    
        return 0;
    }
    
    
    
    $ gcc -std=c99 -Wall -g foo.c
    $ valgrind ./a.out
    ==2974== Memcheck, a memory error detector
    ==2974== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al.
    ==2974== Using Valgrind-3.6.0.SVN-Debian and LibVEX; rerun with -h for copyright info
    ==2974== Command: ./a.out
    ==2974== 
    SELECT 'John Doe' FROM 'Company'
    ==2974== 
    ==2974== HEAP SUMMARY:
    ==2974==     in use at exit: 0 bytes in 0 blocks
    ==2974==   total heap usage: 3 allocs, 3 frees, 83 bytes allocated
    ==2974== 
    ==2974== All heap blocks were freed -- no leaks are possible
    ==2974== 
    ==2974== For counts of detected and suppressed errors, rerun with: -v
    ==2974== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 4 from 4)
    Oh, and for extra bonus points, the realloc should always be done as
    Code:
    void *temp = realloc( p, size );
    if ( temp != NULL ) {
      p = temp;
    } else {
      // panic!
      free( p );
    }
    Trashing your actual pointer results in a memory leak if realloc happens to fail.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  7. #7
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by jtay View Post
    Basically what I'm trying to do is write my own version of sprintf that will dynamically allocate memory for the output string based on the length of the input strings. The purpose for this is to make building SQL statements easier; I don't want to have to worry about how long some of the table name, and other values are and allocate unnecessarily large spaces for each statement.
    Minimizing mem usage is a good thing, but not when taken to this extreme, because what you will really be doing with this is wasting a lot of processor cycles on nothing.

    I suppose SQL statements and table names can be of any length, but realistically, one statement is unlikely to be more than 1kB long, and certainly less than 4kB unless you are completely deranged.

    The system allocates memory in "page" size blocks, on most desktops 4096 bytes. It also probably gives you default stack space well in excess of a single page, more like megabytes, most of which is unused.

    Ie, the idea that you can shave a few hundred or even a few thousand bytes off your mem usage is false. Unless you know there are going to be tables with paragraph long names, just use a 1kB local stack buffer. If you are really worried, you could check the arg lengths first and return an error -- the point at which your SQL queries have hit 1024 characters might be the point at which you should consider why that is necessary at all.
    Last edited by MK27; 06-30-2011 at 09:18 AM.
    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

  8. #8
    Registered User
    Join Date
    Apr 2010
    Posts
    10
    Thanks for all the help everyone. I'll definitely try out your code, Salem, when I get home today.

    I guess I heard a lot of answers that I more or less expected to hear. I realize that allocating a large buffer size for each string and calling it a day might have been the best option for this project. I just got so attached to the whole dynamic reallocation thing. Plus, sometime in the future I might try and write my text game again, in which case there will be paragraph sized entries in table fields (for room descriptions), so this will be a useful bit of code for me.

  9. #9
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by jtay View Post
    I realize that allocating a large buffer size for each string and calling it a day might have been the best option for this project.
    Just to re-iterate, 1000 or 2000 or 4000 characters is larger than a tweet, but don't consider it "a large buffer size". 4kB is a small amount of memory.

    I just got so attached to the whole dynamic reallocation thing. Plus, sometime in the future I might try and write my text game again, in which case there will be paragraph sized entries in table fields (for room descriptions), so this will be a useful bit of code for me.
    Ah, for adding data. Good point, silly me. However, you still don't have to do this, unless you want just one function for all possible statements -- inserts, selects, etc -- which is a bit silly. Consider:

    Code:
    int SQLinsertText (char *table, char *field, char *data);  // prototype
    char data[] = "blah blah blah as much as you like";
    SQLinsertText("some_table", "some_field", data);
    If you need to concat those strings inside the function for a call to your SQL API, you can just use strlen to get the total size of the arguments and malloc a buffer for that. I think you will find this much simpler; there is no need to spend a lot of time on unnecessary complications
    Last edited by MK27; 06-30-2011 at 10:45 AM.
    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. bandwidth issue / network issue with wireless device communication
    By vlrk in forum Networking/Device Communication
    Replies: 0
    Last Post: 07-05-2010, 11:52 PM
  2. Some weird issue.
    By dbzx in forum C Programming
    Replies: 7
    Last Post: 04-12-2009, 04:10 PM
  3. use of realloc
    By alzar in forum C Programming
    Replies: 3
    Last Post: 09-17-2007, 12:18 PM
  4. weird pointer issue
    By Chaplin27 in forum C++ Programming
    Replies: 5
    Last Post: 08-01-2006, 09:20 AM
  5. Weird gethostbyname() issue
    By Xterria in forum Networking/Device Communication
    Replies: 1
    Last Post: 03-26-2006, 09:44 PM