Thread: sprintf and arrays

  1. #1
    Registered User
    Join Date
    Aug 2008
    Posts
    129

    sprintf and arrays

    I have a function which takes a format string and an array of unknown length, then needs to format the array into the string. For example, the calling code could look like this:
    Code:
    char *result = malloc(21);
    char *format = "%s is not %s.";
    char *array[2] = {"Foo", "bar"};
    asprintf(result, format, array);
    The first way I thought of to accomplish this was to loop through the array and replace one format field at a time. But this only works for the first field; all the rest are replaced with "(null)" in GCC. The standard doesn't define any behavior for this case, which seems really weird to me; obviously, the language should do what would make my project easier. But seriously, I see no use for putting "(null)" as a placeholder everywhere; I think the standard should say to ignore excess fields...

    Anyway, what else can I try? I thought of splitting the string just before each percent-sign and formatting each piece individually before reassembling the whole thing, but that's a nasty hack.

    EDIT: I just thought of another way... I can double each successive percent-sign, so the above string becomes "%s is not %%s." But then I wonder whether I should do that with a loop or recursion...

    Thanks.
    Last edited by Jesdisciple; 07-14-2009 at 11:37 PM.

  2. #2
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    Code:
    foo( s, f, a )
        x = 0
        while *f
            if *f is not %
                *s++ = *f++
            else
                strcat s, a[ x ]
                s += strlen a[ x++ ]
                f+=2
    That should do it. Although you might want to allow for the possibility of %%.
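
    For reference, here is a minimal C sketch of the loop above for the string-only case. The name expand_strings and the NULL-terminated argument array are assumptions made for the example, not something posted in the thread. It treats every field as two characters (such as %s), copies %% through as a single %, and leaves a field empty once the arguments run out. The caller has to make dst large enough.

```c
#include <string.h>

/* Hypothetical helper: copy fmt into dst, replacing each two-character
   field such as %s with the next string from the NULL-terminated array.
   "%%" produces a single '%'. dst must be large enough for the result. */
char *expand_strings(char *dst, const char *fmt, char **args)
{
    char *out = dst;
    while (*fmt) {
        if (*fmt != '%') {
            *out++ = *fmt++;
        } else if (fmt[1] == '%') {
            *out++ = '%';            /* escaped percent sign */
            fmt += 2;
        } else if (fmt[1] == '\0') {
            *out++ = *fmt++;         /* lone trailing '%': copy as-is */
        } else {
            if (*args) {             /* out of arguments: field stays empty */
                size_t n = strlen(*args);
                memcpy(out, *args++, n);
                out += n;
            }
            fmt += 2;                /* skip '%' and the conversion character */
        }
    }
    *out = '\0';
    return dst;
}
```

    memcpy is used instead of strcat here because the output is not null-terminated while it is being built.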


    Quzah.
    Hope is the first step on the road to disappointment.

  3. #3
    Guest Sebastiani's Avatar
    Join Date
    Aug 2001
    Location
    Waterloo, Texas
    Posts
    5,708
    I actually thought this would be a lot simpler than it turned out to be. Oh well, it was a fun exercise, anyway.

    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    char* foo( char* dst, char* fmt, char** argv, size_t max )
    {
    	char
    		* buf,
    		* ptr, 
    		** args,
    		pct = '%',
    		tag = 0x3, // randomly chosen
    		sav = 0x4; // randomly chosen
    	size_t
    		len = strlen( fmt ) + 1;
    	for( args = argv; *args; ++args )
    		len += strlen( *args );
    	if( len > max )
    		return 0;
    	buf = malloc( len );
    	if( !buf )
    		return 0;
    	strcpy( dst, fmt );
    /*
    	Hide '%' symbols
    */	
    	for( ptr = dst; *ptr; ++ptr )
    	{		
    		if( *ptr == pct )
    		{
    			if( *( ptr + 1 ) == pct )
    				*ptr = *( ptr + 1 ) = sav;
    			else
    				*ptr = tag;
    		}		
    	}
    	while( *argv )
    	{
    	/*
    		Reveal next '%' symbol
    	*/	
    		for( ptr = dst; *ptr; ++ptr )
    		{		
    			if( *ptr == tag )
    			{
    				*ptr = pct;
    				break;
    			}
    		}	
    		sprintf( buf, dst, *argv++ );
    		strcpy( dst, buf );
    	}
    	free( buf );	
    /*
    	Convert remaining hidden symbols that were originally 
    	escaped in the format string into a single '%' symbol
    */	
    	for( ptr = buf = dst; *buf; ++buf, ++ptr )
    	{	
    		if( *buf == sav )
    		{
    			*ptr = pct;
    			++buf;
    		}	
    		else
    			*ptr = *buf;
    	}	
    	*ptr = 0;	
    	return dst;
    }
    Example:

    Code:
    int main( void )
    {	
    	const size_t
    		max = 1024;
    	char
    		buf[ max ],
    		* args[ ] = 
    	{
    		"one", "two", "three", NULL
    	};		
    	foo( buf, "one: %s, none: %%s, two: %s, three: %s\n", args, max );
    	puts( buf );	
    	return 0;
    }

  4. #4
    Registered User
    Join Date
    Aug 2008
    Posts
    129
    I didn't mean to imply that strings are all the function takes; I would rather allow any data type, probably by using the real sprintf. But both of you interpreted me that way, so I obviously messed up in my description. (I also didn't expect to get actual code, but at least it leaves no room for me to misinterpret you.)

    Quzah: I thought I was looking at Python for a second. I was able to add support for %% rather easily.

    Sebastiani: I wonder if that could be modified to not use str* functions so argv can be an array of void pointers without causing a segfault?

    Initially, I was thinking of iterating through each format field with i=0...n-1 and multiplying the number of % signs in each by 2^i, then looping through the arguments to populate the new format string. But this causes the length of the string to literally grow exponentially, which isn't a great idea in any language, let alone in C. Sebastiani's approach was definitely better than this.

    Maybe the best thing would be to convert each argument to a string with its corresponding format field.
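
    A first step toward that would be finding where each format field ends. Here is a rough sketch, assuming the usual printf layout of flags, width, precision, length modifier and one conversion character. The name field_length is made up for the example, and it does not validate the order of the pieces.

```c
#include <string.h>

/* Hypothetical helper: if p points at a '%' that starts a conversion,
   return the length of the whole field (e.g. 4 for "%.2f"); return 0
   for "%%", for plain text, or for a field with no conversion character. */
size_t field_length(const char *p)
{
    size_t n;
    if (p[0] != '%' || p[1] == '%')
        return 0;
    /* skip flags, width, precision and length modifiers */
    n = 1 + strspn(p + 1, "-+ #0123456789.*hlLjzt");
    /* then expect exactly one conversion character */
    if (p[n] != '\0' && strchr("diouxXeEfFgGaAcspn", p[n]))
        return n + 1;
    return 0;
}
```

    Each field found this way could then be copied into a small buffer and handed to sprintf together with its one argument.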

  5. #5
    Registered User
    Join Date
    Sep 2008
    Location
    Toronto, Canada
    Posts
    1,834
    How about something like this:
    Code:
    #include <stdarg.h>
    #include <stdio.h>

    void asprintf(char *result, char *format, ...) {
        va_list args;
        va_start(args, format);
        vsprintf(result, format, args);
        va_end(args);
    }
    ...
    asprintf(result, format, array[0], array[1], array[2], array[3], array[4], array[5], array[6], array[7]);
    ... will take a maximum of eight arguments from array[], as needed according to the number of format specifiers. (array[] itself must have at least eight elements, or the call reads past its end.)

  6. #6
    Registered User
    Join Date
    Aug 2008
    Posts
    129
    nonoob: Inverted, that could be (EDIT: Well, there would have to be some safety checks):
    Code:
    void asprintf(char *result, char *format, void **array){
        sprintf(result, format, array[0], array[1], array[2], array[3], array[4], array[5], array[6], array[7]);
    }
    This is a good last resort, but I would really rather retain all the flexibility of the builtin printf's. IMO, the builtins should have started from this idea and wrapped it in the current va_list implementation, which would have been much easier than the other way around.
    Last edited by Jesdisciple; 07-15-2009 at 12:33 PM.

  7. #7
    Registered User
    Join Date
    Dec 2008
    Location
    Black River
    Posts
    128
    Maybe something like:

    Code:
    #include <stdlib.h>
    #include <stdio.h>
    #include <stdarg.h>
    #include <string.h>   /* strlen */
    
    #define VSNBUFF   32
    
    int asprintf(char **dst, const char *fmt, ...) {
       size_t length = 2 * strlen(fmt);   /* Estimate total length */
       char *tmp;
       va_list arglist;
       int chk;
       if(length < VSNBUFF)
           length = VSNBUFF;
       tmp = (char*)malloc(length + 1);
       if(!tmp)
           return(-1);
       for(;;)
           {
           va_start(arglist, fmt);
           chk = vsnprintf(tmp, length, fmt, arglist);
           va_end(arglist);
           if((size_t)chk >= length)
               length += length;
           else
               break;
           {
           char *grown = (char*)realloc(tmp, length + 1);
           if(!grown)
               return(free(tmp), -1);   /* don't leak the old block */
           tmp = grown;
           }
           }
      tmp[chk] = 0;
      *dst = tmp;
      return(chk);
    }
    
    int main()
    {
       char *s;
       asprintf(&s, "%c %d %f %s", 'x', 10, 0.5, "abc");
       puts(s);
    }
    Of course, this totally depends on vsnprintf being in your library (Don't know if it's standard, but most implementations have it).
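
    If a C99 library can be assumed, the guessing loop is not needed: vsnprintf with a NULL buffer and size 0 returns the length the output would have. A minimal sketch under that assumption (alloc_sprintf is a made-up name; older libraries that return -1 for a too-small buffer would just get the error return here):

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical name; allocates exactly the space the result needs.
   Returns the length, or -1 on error. The caller frees *dst. */
int alloc_sprintf(char **dst, const char *fmt, ...)
{
    va_list ap;
    int need;
    char *buf;

    va_start(ap, fmt);
    need = vsnprintf(NULL, 0, fmt, ap);   /* C99: returns the length needed */
    va_end(ap);
    if (need < 0)
        return -1;

    buf = malloc((size_t)need + 1);
    if (!buf)
        return -1;

    va_start(ap, fmt);                    /* restart: the first pass consumed ap */
    vsnprintf(buf, (size_t)need + 1, fmt, ap);
    va_end(ap);

    *dst = buf;
    return need;
}
```

    It is called the same way as the version above, e.g. alloc_sprintf(&s, "%c %d %s", 'x', 10, "abc"), followed by free(s) when done.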
    Last edited by Ronix; 07-15-2009 at 01:47 PM.

  8. #8
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    You'll need something like va_arg (from <stdarg.h>) if you plan on passing countless random data types to it. Then you'll want to add a switch statement, and process the character after the % sign to see what type of data it is, and append that instead of just doing a strcat. (Actually, you can still do a strcat if you want. Just sprintf to a temporary buffer, and strcat that.)


    Quzah.
    Hope is the first step on the road to disappointment.

  9. #9
    Registered User
    Join Date
    Aug 2008
    Posts
    129
    Looks like I need a new name for my function, instead of 'asprintf'...

    Ronix: That link (C99-snprintf) says (v)snprintf is in C99.

    Ronix and Quzah: What am I gaining over the standard functions if I use a va_list parameter at all? This is essentially a compatibility layer between sprintf and any arbitrary function that returns an array, so that would defeat the whole idea I started out with.

    But I really think a loop can replace the need for a va_list, if it's used right. I wish I could get a clear explanation of the GNU algorithm so I wouldn't have to wrap it...

  10. #10
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    Well, you can use an array of void pointers instead, so long as you set up dereferencing them correctly through a switch or whatnot. But va_arg was basically designed for just that sort of job.
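
    Roughly, the switch could look like the sketch below. The name format_one is made up for the example, and it assumes the caller passes one complete field at a time with no length modifier. The conversion character picks the type to read through the pointer, and the real snprintf does the formatting.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format one argument, reached through a void
   pointer, using a single complete field such as "%5d" or "%.2f".
   Fields with a length modifier (h, l, L, j, z, t) are rejected.
   Returns what snprintf returns, or -1 for an unsupported field. */
int format_one(char *out, size_t size, const char *field, const void *arg)
{
    size_t n = strlen(field);
    if (n < 2 || field[0] != '%')
        return -1;
    if (n > 2 && strchr("hlLjzt", field[n - 2]))
        return -1;

    switch (field[n - 1]) {          /* the conversion character comes last */
    case 'd': case 'i':
        return snprintf(out, size, field, *(const int *)arg);
    case 'u': case 'x': case 'X': case 'o':
        return snprintf(out, size, field, *(const unsigned *)arg);
    case 'f': case 'e': case 'g':
        return snprintf(out, size, field, *(const double *)arg);
    case 'c':
        return snprintf(out, size, field, *(const char *)arg);
    case 's':
        return snprintf(out, size, field, (const char *)arg);
    default:
        return -1;
    }
}
```

    Nothing checks that the pointer really refers to the type the field names, which is the price of void pointers.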


    Quzah.
    Hope is the first step on the road to disappointment.

  11. #11
    Registered User
    Join Date
    Dec 2008
    Location
    Black River
    Posts
    128
    I guess I'm not really sure what you intend to do at all. For example, what would be the function signature? Is it variadic, or do you plan to use an array of pointers like quzah says?

    Quote Originally Posted by Jesdisciple View Post
    I wish I could get a clear explanation of the GNU algorithm so I wouldn't have to wrap it...
    For the printf-like functions? I honestly wouldn't bother. This family of functions tends to have a rather long and complicated implementation.

  12. #12
    Registered User
    Join Date
    Aug 2008
    Posts
    129
    Ronix: Yeah, I'm going for a void**. Although the combination of void** and unlimited length may be too ambitious for anyone who's not a genius and/or billionaire.

    Quzah: I was really hoping to simply wrap the standard sprintf and not mess with the switch and whatnot, but that may be completely impossible for anything short of a preprocessor (and a preprocessor designed around its product language's library would be a little weird). As I noted in an earlier post, I think this should have been in the standard...

    Function signature, possibly with an additional argument for the length of 'args', or a requirement that it be null-terminated:
    Code:
    void _sprintf(char *result, char *format, void **args);
    (I'm not sure what should replace the underscore.)
    Last edited by Jesdisciple; 07-15-2009 at 07:47 PM.

  13. #13
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    It depends really on what, or rather how, you are getting 'args', and how you expect to pass those. Is there some reason you need an array of random stuff to pass as an argument?

    Quzah.
    Hope is the first step on the road to disappointment.

  14. #14
    Registered User
    Join Date
    Dec 2008
    Location
    Black River
    Posts
    128
    Well, I gave it a try. Although it's by no means perfect, and I still think the variadic form is far superior.

    Code:
    #include <stdio.h>
    #include <stdlib.h>   /* strtol */
    #include <string.h>
    
    int qualifier(char c) {
        static char lookup[] = { 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
            0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1,
            1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1
        };
        /* the table runs from 'E' to 'x', so 'u' and 'x' must be in range too */
        return(c >= 'E' && c <= 'x' && lookup[c - 'E'] == 1);
    }
    
    #define LOWER(C)  (((C) >= 'A' && (C) <= 'Z')? (C) - 'A' + 'a' : (C))
    
    int xsprintf(char *dst, const char *fmt, void **args) {
        char *end = dst;
        const char *fmt_begin;
        void *arg;
        if(!dst || !fmt || !args)
            return(-1);
        while(*fmt)
            {
            if(*fmt == '%')
                {
                fmt_begin = fmt++;
                if(*fmt == '%')
                    *end++ = '%';
                else
                    {
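                        char fmt2[8] = { '%' };
                        int len;
                        for(; *fmt && !qualifier(*fmt); ++fmt);
                        len = fmt - fmt_begin;
                        /* reject an unterminated field or one too long for fmt2 */
                        if(!*fmt || len + 3 > (int)sizeof fmt2)
                            return(-1);
                        memcpy(fmt2 + 1, fmt_begin + 1, len);
                        fmt2[len] = *fmt, fmt2[len + 1] = 0;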
                    arg = *args++;
                    switch(*fmt)
                        {
                        case 'c':
                            *end++ = *(char*)arg;
                            break;
                            case 'x':
                            case 'X':
                            case 'o':
                            case 'u':
                            case 'i':
                            case 'd':
                                {
                                char buffer[sizeof(long) * 8];
                                /* pass the type that the size qualifier promises */
                                if(*(fmt - 1) == 'l')
                                    len = sprintf(buffer, fmt2, *(long*)arg);
                                else if(*(fmt - 1) == 'h')
                                    len = sprintf(buffer, fmt2, (int)*(short*)arg);
                                else
                                    len = sprintf(buffer, fmt2, *(int*)arg);
                            if(len < 0)
                                return(-1);
                            memcpy(end, buffer, len);
                            end += len;
                            }
                            break;
                        case 'e':
                        case 'E':
                        case 'f':
                        case 'F':
                        case 'g':
                        case 'G':
                            {
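                                char buffer[512];
                                char q = *(fmt - 1);   /* 'h': float (not standard printf), 'L': long double */
                                long double val = LOWER(q) == 'h'? *(float*)arg :
                                    q == 'L'? *(long double*)arg : *(double*)arg;
                                if(LOWER(q) == 'h' || q == 'L')
                                    --len;   /* overwrite the size qualifier */
                                fmt2[len] = 'L', fmt2[len + 1] = *fmt, fmt2[len + 2] = 0;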
                            len = sprintf(buffer, fmt2, val);
                            if(len < 0)
                                return(-1);
                            memcpy(end, buffer, len);
                            end += len;
                            }
                            break;
                        case 's':
                            {
                                size_t n = strlen((const char*)arg);
                                if(fmt_begin[1] == '.')
                                    {
                                    /* precision: copy at most that many characters */
                                    len = (int)strtol(fmt_begin + 2, NULL, 10);
                                    if((size_t)len < n)
                                        n = (size_t)len;
                                    }
                                memcpy(end, (const char*)arg, n);
                                len = (int)n;
                            end += len;
                            }
                            break;
                        case 'n':
                            *(int*)arg = end - dst;
                            break;
                        default:
                            return(-1);
                        }
                    }
                }
            else
                *end++ = *fmt;
            ++fmt;
            }
        *end = 0;
        return(end - dst);
    }
    
    int main()
    {
        char s[100];
        int i = 10;
        double d = 0.5;
        char c = 'y';
        void *p[] = { &i, &d, &c, "Foo" };
        xsprintf(s, "%3d %.2f %c %s %%100", p);
        puts(s);
    }
    There are several problems with my function: there's no format string checking (you could achieve this via compiler builtins, like the ones included in gcc), there's no support for some of the newest additions ($ position modifiers, the 'z', 'j' and 't' qualifiers, and locale-dependent grouping), and there may be an overflow with very large floating point values (I don't know if 512 characters is enough).
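
    To illustrate the format string checking: GCC (and Clang) can check the arguments of a variadic function against its format string if the function is declared with the format attribute. This is a compiler extension, not standard C, and the wrapper name checked_snprintf and the macro PRINTF_LIKE are made up for the example.

```c
#include <stdarg.h>
#include <stdio.h>

/* GCC/Clang extension: marks which parameter is the printf-style format
   string and where the checked values start. Other compilers get nothing. */
#if defined(__GNUC__)
#define PRINTF_LIKE(fmt_idx, arg_idx) __attribute__((format(printf, fmt_idx, arg_idx)))
#else
#define PRINTF_LIKE(fmt_idx, arg_idx)
#endif

/* Hypothetical wrapper used only to show where the attribute goes:
   parameter 3 is the format, the checked arguments start at 4. */
int checked_snprintf(char *dst, size_t size, const char *fmt, ...) PRINTF_LIKE(3, 4);

int checked_snprintf(char *dst, size_t size, const char *fmt, ...)
{
    va_list ap;
    int n;
    va_start(ap, fmt);
    n = vsnprintf(dst, size, fmt, ap);
    va_end(ap);
    return n;
}
```

    With warnings enabled (-Wall), a call such as checked_snprintf(buf, sizeof buf, "%d", "oops") should then draw a format warning. As far as I know there is no equivalent for a void** argument, since the compiler cannot see what the pointers refer to.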

    Eventually, you could lose all the sprintf calls and use your own functions to convert numeric variables into strings.
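
    As a rough idea of what such a conversion function might look like, here is a sketch for signed decimal integers only. The name int_to_dec is made up, and widths, padding and other bases are left out.

```c
#include <stddef.h>

/* Hypothetical helper: write the decimal form of value into out and
   return the number of characters written (excluding the terminator).
   out needs room for every digit, a sign and the terminator. */
size_t int_to_dec(char *out, long value)
{
    char tmp[3 * sizeof(long) + 1];      /* more than enough decimal digits */
    size_t n = 0, len = 0;
    /* Work in unsigned so that LONG_MIN can be negated safely. */
    unsigned long u = value < 0 ? 0UL - (unsigned long)value
                                : (unsigned long)value;

    do {                                 /* digits come out in reverse order */
        tmp[n++] = (char)('0' + u % 10);
        u /= 10;
    } while (u);

    if (value < 0)
        out[len++] = '-';
    while (n)
        out[len++] = tmp[--n];
    out[len] = '\0';
    return len;
}
```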
    Last edited by Ronix; 07-15-2009 at 10:24 PM.

  15. #15
    Registered User
    Join Date
    Aug 2008
    Posts
    129
    Thanks for that. I feel a bit guilty that I haven't made an attempt yet, but maybe I'm better off learning from yours.

    Why is fmt_begin constant? And being so, how does it change at the beginning of each loop iteration?

    Is fmt2 going to hold the entire format field? If so, why is it only eight characters long? Based on the WP entry, I think a field could theoretically stretch beyond any hard limit we might set, mostly because of unlimited decimal numbers. Even if the best way is to assume a hard limit, is eight sufficiently absurd to allow all reasonable (and some slightly unreasonable) usages?

    What are format string checking and locale-dependent grouping?

    I'll have to ask qualifier-specific questions in other posts, mainly because I don't have time to understand the cases right now.
