Strangely, I haven't used regexps much in C, but I have kept this on file in case I have to do it again. If you want a working example:
Code:
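#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <regex.h>

/* Return a malloc'd copy of the first match of patrn in string, or NULL if
   nothing matches (or on error). On a match, *begin and *end receive the
   match offsets. The caller must free() the result. */
char *regexp (const char *string, const char *patrn, int *begin, int *end) {
    char *word = NULL;
    regex_t rgT;
    regmatch_t match;

    /* regcomp returns non-zero if the pattern does not compile */
    if (regcomp(&rgT, patrn, REG_EXTENDED) != 0)
        return NULL;
    if (regexec(&rgT, string, 1, &match, 0) == 0) {
        size_t len = (size_t)(match.rm_eo - match.rm_so);
        *begin = (int)match.rm_so;
        *end = (int)match.rm_eo;
        word = malloc(len + 1);
        if (word != NULL) {
            /* copy the matched span and terminate it */
            memcpy(word, string + match.rm_so, len);
            word[len] = '\0';
        }
    }
    regfree(&rgT);
    return word;
}

int main(void) {
    int b = 0, e = 0;
    char *match = regexp("this and7 that", "[a-z]+[0-9]", &b, &e);
    if (match == NULL) {
        puts("no match");
        return 1;
    }
    printf("->%s<-\n(b=%d e=%d)\n", match, b, e);
    free(match);
    return 0;
}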
Beware that regex.h uses POSIX-style notation (no \d or \w, etc.). Supposedly there are character classes like [:digit:], but if I substitute this
Code:
char *match=regexp("this and7 that","[:alpha:]+[:digit:]",&b,&e);
on my system, instead of the predictable:
->and7<-
(b=5 e=9)
I get the bizarre and inexplicable:
->hi<-
(b=1 e=3)
which does not look like a number of letters and a digit to me!
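A likely explanation, for what it's worth: in POSIX syntax a class name such as [:alpha:] is only recognised inside a bracket expression, so it has to be written [[:alpha:]]. On its own, [:alpha:] is read as an ordinary set containing ':', 'a', 'l', 'p' and 'h', and [:digit:] as the set ':', 'd', 'i', 'g', 't'. That would explain "hi": 'h' is in the first set and 'i' is in the second. Below is a minimal sketch with the doubled brackets. The helper name first_match is made up for illustration; the regex.h calls are the standard POSIX ones.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* first_match is a hypothetical helper, not part of regex.h.
   On a match it stores the offsets of the first match in *begin and *end
   and returns 0. It returns -1 if the pattern does not compile or if
   nothing matches. */
int first_match(const char *text, const char *pattern, int *begin, int *end) {
    regex_t re;
    regmatch_t m;
    int rc = -1;

    assert(text != NULL && pattern != NULL);

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;
    if (regexec(&re, text, 1, &m, 0) == 0) {
        *begin = (int)m.rm_so;
        *end = (int)m.rm_eo;
        rc = 0;
    }
    regfree(&re);
    return rc;
}
```

Called as first_match("this and7 that", "[[:alpha:]]+[[:digit:]]", &b, &e), this should report b=5 e=9, the same span as the [a-z]+[0-9] version. Roughly speaking, [[:digit:]] stands in for \d and [[:alnum:]_] for \w.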