Thread: writing a pack-style function, any advices?

  1. #1
    Registered User
    Join Date
    Jul 2006
    Posts
    7

    writing a pack-style function, any advices?

    Hello there,

    Great forum, I don't know how I didn't find it before.
    So my problem is, I'm trying to write a C version of perl's pack function.

    It's not a homework or something, I was trying to create an ICQ client, and found myself over and over again manually creating the packets.
    So I thought in creating a pack function, which would make my life easier.

    Anyway, I wrote one, but I feel it can be really improved.
    And here's where I need some help - improving this code.
    Any advice will really help.
    So I wont post all the code, because it's pretty useless to do so.
    I removed similar cases and some error handling, to make the code easier to be read.

    Code:
    
    #include <stdio.h>
    #include <stdlib.h>
    #include <stdarg.h>
    
    char *pack( char *format, ... )
    {
       va_list parameters;
       char *buffer = NULL;
       char *pBuffer = buffer;
       char *pFormat = format;
       long lSize = 0;
       char *tmp;
       
       va_start( parameters, format );
       while( *pFormat != '\0' )
         {
            if( *pFormat == 'C' )
              {
                 /*
                  * signed char
                  */
                 signed char c;
                 c = va_arg( parameters, int );
                 
                 ++lSize;
                 if( ( tmp = realloc( buffer, lSize ) ) == NULL )
                   {
                      perror( "realloc: " );
                      free( buffer );
                      return NULL;
                   }
                 else
                   {
                      buffer = tmp;
                      pBuffer = buffer+lSize-1;
                      *pBuffer = c;
                   }
              }
            else if( *pFormat == 'c' )
              {
                 /*
                  * unsigned char
                  */
                 unsigned char c;
                 c = va_arg( parameters, int );
                 
                 ++lSize;
                 if( ( tmp = realloc( buffer, lSize ) ) == NULL )
                   {
                      perror( "realloc: " );
                      free( buffer );
                      return NULL;
                   }
                 else
                   {
                      buffer = tmp;
                      pBuffer = buffer+lSize-1;
                      *pBuffer = c;
                   }
                 
              }
            else if( *pFormat == 'n' )
              {
                 /*
                  * unsigned short, 16bits, big endian
                  */
                 unsigned short n = va_arg( parameters, int );
                 long lComplement = 2;
                 
                 lSize += lComplement;
                 if( ( tmp = realloc( buffer, lSize ) ) == NULL )
                   {
                      perror( "realloc: " );
                      free( buffer );
                      return NULL;
                   }
                 else
                   {
                      buffer = tmp;
                      pBuffer = buffer+lSize-lComplement;
                      *pBuffer = n >> 8;
                      ++pBuffer;
                      *pBuffer = n & 0x00ff;
                   }
              }
            else if( *pFormat == 'v' )
              {
                 /*
                  * unsigned short, 16 bits, little endian
                  */
                 unsigned short n = va_arg( parameters, int );
                 long lComplement = 2;
                 
                 lSize += lComplement;
                 if( ( tmp = realloc( buffer, lSize ) ) == NULL )
                   {
                      perror( "realloc: " );
                      free( buffer );
                      return NULL;
                   }
                 else
                   {
                      buffer = tmp;
                      pBuffer = buffer+lSize-lComplement;
                      *pBuffer = n & 0x00ff;
                      ++pBuffer;
                      *pBuffer = n >> 8;
                   }
                 
              }
            else if( *pFormat == 'a' )
              {
                 /*
                  * NUL padded string
                  */
                 char *s = va_arg( parameters, char* );
                 long lComplement = strlen( s );
                 
                 lSize += lComplement;
                 if( ( tmp = realloc( buffer, lSize ) ) == NULL )
                   {
                      perror( "realloc: " );
                      free( buffer );
                      return NULL;
                   }
                 else
                   {
                      buffer = tmp;
                      pBuffer = buffer+lSize-lComplement;
                      memcpy( pBuffer, s, lComplement );
                   }
              }
            else if( *pFormat == 'A' )
              {
                 /*
                  * Space padded string
                  */
                 char *s = va_arg( parameters, char* );
                 char *pString = s;
                 long lComplement = 0;
                 
                 while( *pString != ' ' )
                   {
                      ++lComplement;
                      ++pString;
                   }
                 lSize += lComplement;
                 if( ( tmp = realloc( buffer, lSize ) ) == NULL )
                   {
                      perror( "realloc: " );
                      free( buffer );
                      return NULL;
                   }
                 else
                   {
                      buffer = tmp;
                      pBuffer = buffer+lSize-lComplement;
                      memcpy( pBuffer, s, lComplement );
                   }
              }
            else if( *pFormat == 'N' )
              {
                 /*
                  * unsigned long, 32bits, big endian
                  */
                 //0x12345678
                 unsigned long n = va_arg( parameters, unsigned long );
                 long lComplement = 4;
                 
                 lSize += lComplement;
                 if( ( tmp = realloc( buffer, lSize ) ) == NULL )
                   {
                      perror( "realloc: " );
                      free( buffer );
                      return NULL;
                   }
                 else
                   {
                      buffer = tmp;
                      pBuffer = buffer+lSize-lComplement;
                      *pBuffer = n >> 24;
                      ++pBuffer;
                      *pBuffer = ( ( n & 0xff0000) >> 16 );
                      ++pBuffer;
                      *pBuffer = ( ( n & 0xff00 ) >> 8 );
                      ++pBuffer;
                      *pBuffer = n & 0xff;
                   }
              }
            
            ++pFormat;
         }
       
       /*
        * before returning, add a NUL char at the end
        */
       ++lSize;
       tmp = realloc( buffer, lSize );
       if( tmp == NULL )
         {
            perror( "realloc: " );
            return NULL;
         }
       else
         {
            buffer = tmp;
            pBuffer = buffer+lSize-1;
            *pBuffer = '\0';
         }
       va_end( parameters );
       return buffer;
    }
    Any advice will be appreciated.
    Thanks in advance, and have a nice day.

    PS: I'm sorry the code alignment isn't that good, I'll try to do better next time.

  2. #2
    Yes, my avatar is stolen anonytmouse's Avatar
    Join Date
    Dec 2002
    Posts
    2,544
    Your code looks excellent. In fact, when you've finished, you might consider posting the entire function as it is something that would be useful for other programs. I only have a few very minor suggestions:
    • The realloc code could be put in a seperate function, so that instead of:
      Code:
                   if( ( tmp = realloc( buffer, lSize ) ) == NULL )
                     {
                        perror( "realloc: " );
                        free( buffer );
                        return NULL;
                     }
                   else
                     {
                        buffer = tmp;
                        pBuffer = buffer+lSize-1;
                        *pBuffer = c;
                     }
      you have something like:
      Code:
      if ( !resize_buffer(&buffer, lSize) )
      {
         // error exit function
      }
      else
      {
         pBuffer = buffer+lSize-1;
         *pBuffer = c;
      }
    • va_end should be called whether the function is successful or not. I would argue that it would be an appropriate use of goto to jump to cleanup code rather than repeating the cleanup code throughout the function.
    • If efficiency is an important issue, you could consider a more aggressive memory scheme. For example the buffer size could be doubled each time realloc is called, so that realloc needs to be called less often.

    PS: I'm sorry the code alignment isn't that good, I'll try to do better next time.
    ?? Your code alignment is perfect!

  3. #3
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,656
    Another suggestion

    > if( *pFormat == 'C' )
    Make each one of these a separate function, to keep the main pack() function from becoming exessively long
    Code:
    if( *pFormat == 'C' ) {
      signed char c;
      c = va_arg( parameters, int );
      addSignedChar( c, &buffer );
    }
    And echo anonytmouse's suggestions on the buffer handling.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  4. #4
    Registered User
    Join Date
    Jul 2006
    Posts
    7
    Hello,

    First of all, thanks both of you for the advices.
    I've implemented what both of you said.

    The main reason that I've asked for the advices, was that I felt that the excessives calls for realloc() were not a good thing to go with,
    but I just couldn't find a good way to handle the problem.
    So, really thanks for the memory scheme advice, I'll keep that in mind when writing similar codes.
    (In fact, I've found the most of my packets sizes are within the range 128<x<1024, with few exceptions)

    About the cleanup code and the use of goto.
    To tell you the truth, I thought about that before, but whenever I feel using it, I always recall the articles about goto being evil.
    Anyway, every rule has its exception, I guess this stands also for goto

    Thanks again for the advices and the compliments.
    I'll post the code when it's ready
    Have a nice day.

  5. #5
    Yes, my avatar is stolen anonytmouse's Avatar
    Join Date
    Dec 2002
    Posts
    2,544
    I missed the most important issue. The returned packet may not be a valid string (it may contain nul characters). Therefore, to be of use, the function must also return the size of the completed packet. To reflect that the return value is an array of bytes rather than a character array, the return type should also be unsigned char rather than plain char.

  6. #6
    Registered User
    Join Date
    Jul 2006
    Posts
    7
    anonytmouse:
    Oh, you're right.
    How could I forget that.

    But wait, should I return the size, and change the function to receive a pointer to a buffer
    or should I insert the size lets say on the first element of the buffer and then return the buffer?
    Or maybe, a better solution would be to return a structure?

    And btw, I think with some pointer arithmetic, it is possible to discover the real size of the buffer.
    I mean:
    Code:
    char *data = pack( "nva", 0x1234, 0x5678, "dude" );
    char *p = strrchr( buffer, '\0' );
    ....
    The length would be the difference between data and p
    But that's probably a work-around and not the right way to do so.
    Thanks for pointing that out!
    Last edited by isaac_s; 07-08-2006 at 04:28 PM.

  7. #7
    Yes, my avatar is stolen anonytmouse's Avatar
    Join Date
    Dec 2002
    Posts
    2,544
    But wait, should I return the size, and change the function to receive a pointer to a buffer
    or should I insert the size lets say on the first element of the buffer and then return the buffer?
    Or maybe, a better solution would be to return a structure?
    I would probably pass a pointer to a variable to receive the size as the first argument.
    Code:
    size_t sz;
    char *data = pack(&sz, ...);
    However, using a structure or passing a pointer to the buffer pointer are also fine. I would avoid putting the size at the start of the buffer because it side-steps the type system for no good reason.
    And btw, I think with some pointer arithmetic, it is possible to discover the real size of the buffer.
    I mean:
    Code:
    char *data = pack( "nva", 0x1234, 0x5678, "dude" );
    char *p = strrchr( buffer, '\0' );
    ....
    The length would be the difference between data and p
    But that's probably a work-around and not the right way to do so.
    Thanks for pointing that out!
    No, strrchr is not magic. strrchr finds the end of the string in the same manner as strlen, by looking for the first nul character. In a data packet, the real end of the data may be the first nul byte or the tenth.

  8. #8
    Registered User
    Join Date
    Jul 2006
    Posts
    7
    anonytmouse, thanks again!
    Going to work on that

  9. #9
    int x = *((int *) NULL); Cactus_Hugger's Avatar
    Join Date
    Jul 2003
    Location
    Banks of the River Styx
    Posts
    902
    The idea behind that code is amazing. I never thought to use a variable length function to create packets. All my code/libraries use indiviual functions, *Write[Type](Type t)... wow.
    How often does the network forum see code that suffers from people sending data without accounting for structure padding, endianness, etc. I would second the idea of archiving this somewhere for later referral, but I'd worry people would then repost it with questions "why this not work" ("you're using it wrong")...

    How does the other side know the length of the packet? (I usually send it first with either a 16/32 bit number - which I usually write to the packet.)
    Under type 'a', you have "NUL padded string". So far these comments have indicated what is being put into the packet - but no nul is being copied there, is there? (But this is a comment, so feel free to ignore me) Furthermore, how will the other side determine the string's length, given no nul? (I'm assuming ICQ provides for this, perhaps a 16bit number or something which could be packed before the string. Don't know ICQ though.)
    long time; /* know C? */
    Unprecedented performance: Nothing ever ran this slow before.
    Any sufficiently advanced bug is indistinguishable from a feature.
    Real Programmers confuse Halloween and Christmas, because dec 25 == oct 31.
    The best way to accelerate an IBM is at 9.8 m/s/s.
    recursion (re - cur' - zhun) n. 1. (see recursion)

  10. #10
    Registered User
    Join Date
    Jul 2006
    Posts
    7
    Cactus_Hagger, first of all thanks for the reply.

    /* How does the other side know the length of the packet? (I usually send it first with either a 16/32 bit number - which I usually write to the packet.) */

    That's not a problem that pack() should handle.
    Anyway, that's not a problem at all.
    You could build the data buffer using the pack function,
    send the server the length of the data buffer, and then send the data.

    /* So far these comments have indicated what is being put into the packet - but no nul is being copied there, is there? (But this is a comment, so feel free to ignore me) */
    Why ignore you? you got a point here, I need to be more specific.
    You're right, the NUL byte isn't copied into the data buffer.
    I still need to check some documentation wether to include the NUL or not.

    /* (I'm assuming ICQ provides for this, perhaps a 16bit number or something which could be packed before the string. Don't know ICQ though.) */
    Exactly.

    Anyway, pack( ) has one purpose only, to pack data given a certain format.
    It shouldn't be used solo to build network packets, because that's not what this function is for.
    Another thing is pack's sister function, unpack, which uses the same format convention to extract data from a buffer.
    So using this two functions together, you can solve inumerous problems.

  11. #11
    Registered User
    Join Date
    Jul 2006
    Posts
    7
    Hmm, so I've just finished now some changes.
    I've uploaded some part of the code.

    pack.h
    pack.c

    Thanks again for helping me out on this one.
    I'll probably cut down some of the code by grouping all the parts, which checks if a resize is needed, into one function.

    Anyway, I still have one little question.
    Now that I've implemented the memory scheme that anonytmouse suggested, when allocating memory to a string, I need to check wether the string length is greater than MIN_ALLOC.

    Well I've managed to solve that, but I dont know if it is the right way to do so, so I'll post here this part of the code:
    Code:
    if( *pFormat == 'a' || *pFormat == 'A' )
              {
                 /*
                  * NUL padded string or space padded
                  */
                 char *string = va_arg( parameters, char* );
                 unsigned long lStringLen = strlen( string );
     
                 if( *pFormat == 'A' )
                   {
                      /*
                       * remove the space from the string length
                       */
                      --lStringLen;
                   }
                 if( lSize + lStringLen > lAlloced )
                   {
                      /*
                       * we need to allocate more memory
                       */
                      double dRatio = 1.0;
     
                      if( lStringLen > MIN_ALLOC )
                        {
                           /*
                            * Oh well, get the ratio
                            */
                           dRatio = lStringLen / MIN_ALLOC;
                           dRatio = ceil( dRatio );
                        }
                      if( resizeBuffer( &buffer, lAlloced + ( MIN_ALLOC * (int)dRatio ) ) == NULL )
                        {
                           goto _cleanup;
                        }
                      else
                        {
                           lAlloced += MIN_ALLOC * (int)dRatio;
                        }
     
                   }
                 addString( buffer + lSize, string, lStringLen );
                 lSize += lStringLen;
              }
    Thanks in advance.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Compiling sample DarkGDK Program
    By Phyxashun in forum Game Programming
    Replies: 6
    Last Post: 01-27-2009, 03:07 AM
  2. dllimport function not allowed
    By steve1_rm in forum C++ Programming
    Replies: 5
    Last Post: 03-11-2008, 03:33 AM
  3. Screwy Linker Error - VC2005
    By Tonto in forum C++ Programming
    Replies: 5
    Last Post: 06-19-2007, 02:39 PM
  4. C++ compilation issues
    By Rupan in forum C++ Programming
    Replies: 1
    Last Post: 08-22-2005, 05:45 AM
  5. Replies: 5
    Last Post: 02-08-2003, 07:42 PM