Hi, does anyone know of a flag that makes regexec search a string and match
a regex exactly once, against the whole string?
For example, I have the string 123456 and the regular expression
[0-9]+, but this also matches 123456hello, since a decimal is present!
Thanks
Sorry, I actually meant an integer, but I am not sure how you would use this.
If I have [+-]?[0-9]+, where do I put the \b \b? I checked, and it is \b, thanks for that.
Put the \b on either side of your complete expression, like
\b[-+]?[0-9]+\b
(Note I changed the order of the [-+]. The range character, -, has to go first, or last, inside the brackets to be taken literally.)
I am now having the problem that it doesn't match any of the integer values.
Just to check, is this right?
Regex:
"\b[+-]?[0-9]+\b";
Strings:
100
0
49
Could this be because POSIX regexes are greedy?
Post your code.
Sorry guys, my code is too large and scattered to post. I am sure it has to do with the regex;
^[0-9]+$ didn't solve it either. I really think it's greedy and tries to match as much
as it can, and that's the difference between regex in C and the ones used in Java.
The way I worked around it is not very nice, but it works.
I got the string length of the captured group and compared it with the length of the actual string.
If they matched, it's fine; if not, it isn't.
You don't really know which part of the string the group matched unless you capture and compare.
It's been a while, but I'm pretty sure Java regex are greedy too. I can't see how ^[0-9]+$ could possibly match 123456hello, though.
But the idea behind "post your code" is that you're supposed to make something small and complete that shows the problem. (Generally you don't end up posting it, because in the process you discover that function A actually performs steps x, y, and z, while you thought it performed steps x, z, and q.)
A simple example to show:
Code:
#include <stdio.h>
#include <regex.h>

int main(void) {
    const char *string1 = "123456";
    const char *string2 = "123456hello";
    const char *pattern1 = "[0-9]+";
    const char *pattern2 = "^[0-9]+$";
    regex_t p1, p2;

    if (regcomp(&p1, pattern1, REG_EXTENDED | REG_NOSUB)) {
        printf("Pattern 1 did not compile.\n");
        return 1;
    }
    if (regcomp(&p2, pattern2, REG_EXTENDED | REG_NOSUB)) {
        printf("Pattern 2 did not compile.\n");
        return 1;
    }

    if (regexec(&p1, string1, 0, NULL, 0)) {
        printf("String 1 did not match pattern 1.\n");
    } else {
        printf("String 1 did match pattern 1.\n");
    }
    if (regexec(&p1, string2, 0, NULL, 0)) {
        printf("String 2 did not match pattern 1.\n");
    } else {
        printf("String 2 did match pattern 1.\n");
    }
    if (regexec(&p2, string1, 0, NULL, 0)) {
        printf("String 1 did not match pattern 2.\n");
    } else {
        printf("String 1 did match pattern 2.\n");
    }
    if (regexec(&p2, string2, 0, NULL, 0)) {
        printf("String 2 did not match pattern 2.\n");
    } else {
        printf("String 2 did match pattern 2.\n");
    }

    regfree(&p1);
    regfree(&p2);
    return 0;
}
Output:
Code:
$ ./numbers
String 1 did match pattern 1.
String 2 did match pattern 1.
String 1 did match pattern 2.
String 2 did not match pattern 2.
Here's a mod to tabstop's pattern that is more general purpose for pattern #2:
Code:
const char *pattern2 = "(^| )[0-9]+( |$)";
(Edit - for whatever reason, the \b word delimiters are not working at all. I even tried \\b, \<, \\<, \y, \\y, and none work.)
If you want to use word delimiters, you should use the word delimiters:
Code:
const char *pattern2 = "[[:<:]][0-9]+[[:>:]]";
Don't blame me, that's what my man regex on OS X says. (It also says that's an extension which may differ -- but I'm guessing that's what system Dino is on.)
Ah. My regex book (Mastering Regular Expressions) doesn't point that out (that I saw). It does say those are the word boundaries for MySQL. I was just reading the man pages and I saw a reference to [[:<:]], but I didn't put 2 & 2 together that they also applied to C. Duh.
Thanks. Yes, I use OS X (XP too, but mostly OS X)
(Edit - here's the code with tabstop's tip)
Code:
#include <stdio.h>
#include <regex.h>

int main(void) {
    const char *string1 = "123456";
    const char *string2 = "123456hello";
    const char *pattern1 = "[0-9]+";
    // const char *pattern2 = "(^| )[0-9]+( |$)";
    const char *pattern2 = "[[:<:]][0-9]+[[:>:]]";
    regex_t p1, p2;
    // regmatch_t mymatch;
    int rc;

    rc = regcomp(&p1, pattern1, REG_EXTENDED | REG_NOSUB);
    if (rc) {
        printf("Pattern 1 did not compile.\n");
        return 1;
    }
    rc = regcomp(&p2, pattern2, REG_EXTENDED | REG_NOSUB);
    if (rc) {
        printf("Pattern 2 did not compile.\n");
        return 1;
    }

    rc = regexec(&p1, string1, 0, NULL, 0);
    if (rc) {
        printf("String 1 did not match pattern 1, rc=%d.\n", rc);
    } else {
        printf("String 1 did match pattern 1, rc=%d.\n", rc);
    }
    rc = regexec(&p1, string2, 0, NULL, 0);
    if (rc) {
        printf("String 2 did not match pattern 1, rc=%d.\n", rc);
    } else {
        printf("String 2 did match pattern 1, rc=%d.\n", rc);
    }
    rc = regexec(&p2, string1, 0, NULL, 0);
    if (rc) {
        printf("String 1 did not match pattern 2, rc=%d.\n", rc);
    } else {
        printf("String 1 did match pattern 2, rc=%d.\n", rc);
    }
    rc = regexec(&p2, string2, 0, NULL, 0);
    if (rc) {
        printf("String 2 did not match pattern 2, rc=%d.\n", rc);
    } else {
        printf("String 2 did match pattern 2, rc=%d.\n", rc);
    }

    regfree(&p1);
    regfree(&p2);
    return 0;
}