How to get all numbers with regex?


  1. #1
    Registered User
    Join Date
    Feb 2011
    Posts
    11

    How can I get all the numbers from a string with regex?

    If I input a string like "34df4354sdf234ds23", after regexec() I get only:
    submatch: 1
    match 0: 34


    How can I get something like this instead?
    submatch: 4
    match 0: 34
    match 0: 4354
    match 0: 234
    match 0: 23


    Code:
        
        regex_t p;
        regmatch_t *pmatch;
        int rcomp_err, rexec_err;
        char string[BUFSIZ+1];
        int i;
    
        rcomp_err = regcomp(&p, "([0-9])*",
                    REG_EXTENDED | REG_NEWLINE);
    
        pmatch = alloca(sizeof(regmatch_t) * (p.re_nsub+1));
        if(!pmatch) {
            perror("alloca");
        }
    
        printf("Input a string: ");
        fgets(string, sizeof(string), stdin);
    
        rexec_err = regexec(&p, string, p.re_nsub, pmatch, 0);
        printf("submatch: %i\n",p.re_nsub);
        if(rexec_err) {
            printf("errrror");
        } else {
            /* match succeeded */
            for(i = 0; i < p.re_nsub; i++) {
                /* print the matching portion(s) of the string */
                if(pmatch[i].rm_so != -1) {
                    char * submatch;
                    size_t matchlen = pmatch[i].rm_eo - pmatch[i].rm_so;
                    submatch = malloc(matchlen+1);
                    strncpy(submatch, string+pmatch[i].rm_so, matchlen);
                    submatch[matchlen] = '\0';
                    printf("match %i: %s\n", i, submatch);
                    free(submatch);
                }
            }
        }

  2. #2
    cas
    Registered User
    Join Date
    Sep 2007
    Posts
    993
    regexec() will not find multiple instances of your parenthesized expression. There are multiple elements in pmatch because you can have multiple sets of parentheses, not for any other reason.

    What it comes down to is that you'll have to keep calling regexec() until you run out of matches. Keep a pointer to the string you're searching. When you find a match, extract it, then advance the pointer past that match. Call regexec() again with the new value that is past the old match.

    For your code you don't need parentheses at all. The first element of pmatch (assuming a match was found) is the substring that matched the entire regular expression. Speaking of regular expressions, you want [0-9]+, not [0-9]*, because the latter always matches (the string "abc", after all, does start with zero or more digits).
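
    To make the difference between [0-9]* and [0-9]+ concrete, here is a minimal sketch. The helper name first_match_len is hypothetical (it is not part of regex.h); it only wraps regcomp(), regexec() and regfree() and reports the length of the first match.

    ```c
    #include <regex.h>

    /* Hypothetical helper, not a library function: returns the length of the
       first match of pattern in text, -1 if nothing matches, and -2 if the
       pattern does not compile. */
    int first_match_len(const char *pattern, const char *text)
    {
        regex_t re;
        regmatch_t m;
        int rc;

        if (regcomp(&re, pattern, REG_EXTENDED) != 0)
            return -2;

        /* nmatch is 1: only the whole-expression match (element 0) is needed */
        rc = regexec(&re, text, 1, &m, 0);
        regfree(&re);

        if (rc != 0)
            return -1;
        return (int)(m.rm_eo - m.rm_so);
    }
    ```

    With this helper, first_match_len("[0-9]*", "abc") returns 0, meaning an empty match at the start of the string, while first_match_len("[0-9]+", "abc") returns -1 because there is no digit to match.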

  3. #3
    Registered User
    Join Date
    Feb 2011
    Posts
    11


    OK, I wrote some code to get ALL the numbers from a string:

    Code:
     int error = 0;
     regex_t re;
     size_t a = 2;
     regmatch_t arrayOfMatches[10];
     int counter=0;
     int *Matches;
     char currentSearch[9]="([0-9]+)";
     char file[32] = "hello777world99yew 8 I8909do23!";
     char tagType[6] = "start";
    
     if(regcomp(&re, currentSearch, REG_EXTENDED|REG_ICASE) != 0)
     {
      printf("ERROR\n");
      exit(1);
     }
    
     if((Matches = malloc(sizeof(int))) == NULL)
     {
      fprintf(stderr, "Unable to reallocate %d bytes of memory for current search\n", sizeof(int));
      exit(1);
     }
    
     /* Finds the matches on the line */
     error = regexec (&re, file, a, arrayOfMatches, 0);
    
     /* while matches found */
     while (error == 0)
     {
      if(strcmp(tagType,"start") == 0)
      {
       Matches[counter] = arrayOfMatches[counter].rm_so;
      }
      else
      {
       Matches[counter] = arrayOfMatches[counter].rm_eo;
      }
    
      /* This call to regexec() finds the next match */
      counter++;
      void *temp = realloc(Matches, (counter+1) * sizeof *Matches);
      if (temp != NULL)
       Matches = temp;
      else
      {
       fprintf(stderr, "Unable to reallocate %d bytes of memory for file\n", sizeof(int));
       exit(1);
      }
    
      error = regexec (&re, file + arrayOfMatches[counter].rm_eo, counter, arrayOfMatches, REG_NOTBOL);
    
      printf("%i\n",arrayOfMatches[counter].rm_eo);
     }
    
     regfree(&re);

    But I can't find the error in this code. In my terminal I get this:

    Code:
    grytskiv@ZXDSL831II:~/socket/proxy$ gcc client.c -o client && ./client
    8
    0
    0
    0
    0
    0
    0
    0
    0
    140
    grytskiv@ZXDSL831II:~/socket/proxy$
    Last edited by grytskiv; 03-10-2011 at 09:18 AM.

  4. #4
    cas
    Registered User
    Join Date
    Sep 2007
    Posts
    993
    You are completely overthinking this.
    Code:
    regex_t re;
    char file[] = "hello777world99yew 8 I8909do23!";
    const char *p = file;
    regmatch_t match;
    
    if(regcomp(&re, "[0-9]+", REG_EXTENDED) != 0) exit(1);
    
    while(regexec(&re, p, 1, &match, 0) == 0)
    {
      printf("%.*s\n", (int)(match.rm_eo - match.rm_so), &p[match.rm_so]);
      p += match.rm_eo; /* or p = &p[match.rm_eo]; */
    }
    regfree(&re); /* release the compiled pattern once the loop is done */
    That's all you need. Inside the loop, rm_eo - rm_so is the length of the match, and &p[rm_so] is the start of the match. The entire match (the whole expression) is always the first element in “match”.

    As I told you before, you don't need parentheses for this. You only care about whether the entire expression matches.
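
    If you want the numbers as values rather than printed text, the same loop can be wrapped in a function. This is only a sketch under a few assumptions: the name collect_numbers, the out/max parameters, the use of strtol() for conversion, and the REG_NOTBOL flag on later calls are my additions for illustration, not something the snippet above requires.

    ```c
    #include <regex.h>
    #include <stddef.h>
    #include <stdlib.h>

    /* Hypothetical helper: stores up to max numbers found in text into out[]
       and returns how many were stored (0 if the pattern fails to compile). */
    size_t collect_numbers(const char *text, long *out, size_t max)
    {
        regex_t re;
        regmatch_t m;
        const char *p = text;   /* start of the part not yet searched */
        size_t count = 0;
        int eflags = 0;

        if (regcomp(&re, "[0-9]+", REG_EXTENDED) != 0)
            return 0;

        while (count < max && regexec(&re, p, 1, &m, eflags) == 0) {
            if (m.rm_eo == m.rm_so)
                break;          /* guard: an empty match would never advance p */

            /* strtol() stops at the first non-digit, which is the end of the match */
            out[count++] = strtol(p + m.rm_so, NULL, 10);

            p += m.rm_eo;       /* continue just past this match */
            eflags = REG_NOTBOL; /* p is no longer the real start of the line */
        }

        regfree(&re);
        return count;
    }
    ```

    Called on "hello777world99yew 8 I8909do23!" with room for at least five values, it stores 777, 99, 8, 8909 and 23. REG_NOTBOL makes no difference for [0-9]+, but it matters as soon as the pattern contains ^, since after the first match p points into the middle of the string.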

  5. #5
    Registered User
    Join Date
    Feb 2011
    Posts
    11
    THANKS for the example!!!
