Thread: Parsing string

  1. #16
    Registered User
    Join Date
    May 2011
    Posts
    68
    Thanks a lot for your input. I null terminated strings for debug purpose, to display it in terminal.

  2. #17
    Registered User
    Join Date
    May 2011
    Posts
    68
    actually you are right, no need for those two lines.
    com[strlen(com) + 1] = '\0';
    val[strlen(val) + 1] = '\0';

  3. #18
    Registered User
    Join Date
    Nov 2012
    Posts
    1,393
    The lines which you wrote at the beginning
    Code:
    char com[20] = {0};
         char val[20] = {0};
    will initialize all elements to 0, so this effectively pre-null-terminates them. It is guaranteed to be null-terminated as long as you never touch index 19.

    Notice if you tried to do
    Code:
    com[strlen(com) + 1] = '\0';
     val[strlen(val) + 1] = '\0';
    Now the thing will only work properly if both of your arguments are each 18 characters or fewer. Consider if you have com with 19 characters:
    Code:
    {'1', '2', '3', ..., '8', '9', '0', '1', ..., '8', '9', '\0'}
    strlen(com) returns 19 so adding 1 means you try to access element index 20 which outside the bounds of the array and subject to undefined operation.

  4. #19
    Registered User
    Join Date
    May 2011
    Posts
    68
    That's the problem to work with predetermined array like com[20] instead of char *com.
    On the other hand it' s very easy to run over the memory size especially on microcontroller. So I'd rather check it for valid length, more code but more safe.

  5. #20
    Ticked and off
    Join Date
    Oct 2011
    Location
    La-la land
    Posts
    1,728
    If you use this for changing parameters via command strings, I recommend merging the parameter name and value parsing.

    Here, perhaps an example program will illustrate the idea better:
    Code:
    #include <stdio.h>      /* Needed only by main() */
    #include <limits.h>
    
    static int imin = 0;
    static int imax = 0;
    
    static char stralteq(const char *const str1,
                         const char *const str2upper,
                         const char *const str2lower,
                         const size_t n)
    {
        size_t  i;
    
        for (i = 0; i < n; i++)
            if (str1[i] != str2upper[i] && str1[i] != str2lower[i])
                return 0;
    
        return 1;
    }
    
    
    static const char *strskipws(const char *str)
    {
        while (*str == '\t' || *str == '\n' || *str == '\v' ||
               *str == '\f' || *str == '\r' || *str == ' ')
            str++;
    
        return str;
    }
    
    
    typedef enum {
        OK = 0,
        EMPTY,          /* Nothing to parse. */
        UNKNOWN,        /* Name not recognized. */
        NOSEPARATOR,
        NOVALUE,        /* No value specified. */
        TRAILING,       /* Trailing garbage. */
        OVERFLOW        /* Value too large in magnitude. */
    } parsing_t;
    
    
    static parsing_t parse(const char *str)
    {
        int          *ptr = NULL;
        const char   *endstr;
        unsigned int  value;
        char          negative = 0;
    
        if (!str)
            return EMPTY;
    
        /* Skip whitespace. */
        str = strskipws(str);
    
        /* Nothing to parse? */
        if (*str == '\0')
            return EMPTY;
    
        /* Compare to known names. */
        if (stralteq(str, "IMIN", "imin", 4)) {
            str += 4;
            ptr = &imin;
    
        } else
        if (stralteq(str, "IMAX", "imax", 4)) {
            str += 4;
            ptr = &imax;
    
        } else
            return UNKNOWN;
    
        /* Remember where the name ended. */
        endstr = str;
    
        /* Allow whitespace in between. */
        str = strskipws(str);
    
        /* Allow = or : in between, too. */
        if (*str == '=' || *str == ':')
            str = strskipws(str + 1);
    
        /* No separator after name? */
        if (str == endstr)
            return NOSEPARATOR;
    
        /* Signed? */
        while (*str == '-' || *str == '+')
            negative ^= (*(str++) == '-');
    
        /* At least one decimal digit is required. */
        if (!(*str >= '0' && *str <= '9'))
            return NOVALUE;
    
        value = *(str++) - '0';
        while (*str >= '0' && *str <= '9') {
            const unsigned int  oldvalue = value;
    
            value = 10U * value + (*(str++) - '0');
    
            /* Overflow? */
            if (value / 10U != oldvalue)
                return OVERFLOW;
        }
    
        str = strskipws(str);
    
        /* Not end of string? */
        if (*str != '\0')
            return TRAILING;
    
        /* Assign value. */
        if (negative) {
    
            if (value > -(unsigned int)INT_MIN)
                return OVERFLOW;
    
            *ptr = -(int)value;
    
        } else {
    
            if (value > (unsigned int)INT_MAX)
                return OVERFLOW;
    
            *ptr = (int)value;
        }
    
        /* Success. */
        return OK;
    }
    
    int main(void)
    {
        char     buffer[128];
        char    *line;
    
        printf("imin = %d, imax = %d. Input strings, or empty line to exit.\n", imin, imax);
        fflush(stdout);
    
        while ((line = fgets(buffer, sizeof buffer, stdin)) != NULL) {
            switch (parse(line)) {
    
            case OK:
                printf("\tOk.\n");
                break;
    
            case EMPTY:
                printf("\tEmpty line. Exiting.\n");
                return 1;
    
            case UNKNOWN:
                printf("\tUnknown name.\n");
                break;
    
            case NOSEPARATOR:
                printf("\tNo separator between name and value.\n");
                break;
    
            case NOVALUE:
                printf("\tNo value specified.\n");
                break;
    
            case TRAILING:
                printf("\tIgnoring due to trailing garbage.\n");
                break;
    
            case OVERFLOW:
                printf("\tIgnoring due to value overflow.\n");
                break;
            }
    
            printf("\timin = %d, imax = %d\n", imin, imax);
            fflush(stdout);
        }
    
        printf("No more input.\n");
        return 0;
    }
    stralteq(str1, str2upper, str2lower, n)
    This function returns nonzero if the first n characters in str1 match either str2upper or str2lower. I use this kind of a function to allow for alternate versions of the name, for example [Ii][Mm]in . Above, I used it as a case-insensitive comparison. You could use strncmp() instead.

    strskipws(str)
    Returns a pointer to after the leading ASCII whitespace.

    parse(str)
    If str contains a known name, followed by whitespace and/or = or :, followed by a (signed) integer, this function updates the internal state (variable) accordingly. See the example code for the return values.

    Hope you find this useful.

  6. #21
    Registered User
    Join Date
    May 2011
    Posts
    68
    Thanks Nominal Animal! It looks very interesting, I'll try to implement it in my chip.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. String Parsing
    By Shaun32887 in forum C++ Programming
    Replies: 11
    Last Post: 07-02-2008, 02:15 PM
  2. parsing a string
    By Coding in forum C++ Programming
    Replies: 10
    Last Post: 02-24-2008, 06:01 AM
  3. String parsing(parsing comments out of HTML file)
    By slcjoey in forum C# Programming
    Replies: 0
    Last Post: 07-29-2006, 08:28 PM
  4. string parsing
    By misplaced in forum C++ Programming
    Replies: 6
    Last Post: 09-30-2004, 11:50 PM
  5. String parsing
    By Cbuild in forum C Programming
    Replies: 5
    Last Post: 05-03-2004, 05:15 PM