Thread: php preg_match equivalent for c

  1. #1
    Registered User
    Join Date
    Sep 2005
    Posts
    19

    php preg_match equivalent for c

    hello,

    I have many regular expressions in a PHP framework that uses preg_match, preg_replace and preg_match_all.
    I want to port parts of the framework to C++, and therefore I also need to carry over these regular expressions.
    Is there an easy way to carry these expressions over? I tried PCRE, but with PCRE I would have to change all the expressions, and that is a lot of work. I also tried to reuse some code from the php_pcre C implementation, but the code is very weird and I did not get any further.

    Another solution would be to just execute a PHP script from C++, but one reason I switched from PHP to C++ is speed, and calling PHP again would slow things down again.

    Thanks,

  2. #2
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,412
    You could try PCRE.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  3. #3
    Registered User
    Join Date
    Sep 2005
    Posts
    19
    As I already said, I can't get my expressions working with PCRE, so I am searching for an equivalent of preg_*.

  4. #4
    Registered User
    Join Date
    Sep 2005
    Posts
    19
    ok. here is a sample that works with preg_match_all but I don't get it working with pcre:

    Code:
     (PHP)
    preg_match_all("|<A class=\'res\'[[:space:]]*href=\'(.*)\'>(.*)</A><BR>.*<SPAN[[:space:]]*class=s>(.*)</SPAN>.*<SPAN[[:space:]]*class=ngrn>(.*)</SPAN>|Uis",$inputdata,$test);
    
                $ix = 0;
                for ($iy=0; $iy < count($test[1]); $iy++) {
                   $data[$ix]['A'] = $test[2][$iy];
                   $data[$ix]['B'] = $test[4][$iy];
                   $data[$ix]['C'] = $test[1][$iy];
                   $data[$ix]['D'] = $test[3][$iy];
                   $ix++;
                }
    Last edited by pixsta; 09-02-2005 at 01:31 AM.

  5. #5
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,412
    hmm... could it be that [:space:] is not recognised in PCRE, but the PHP port of PCRE allows it to be compatible with POSIX?

    You might try:
    Code:
    |<A class=\'res\'\s*href=\'(.*)\'>(.*)</A><BR>.*<SPAN\s*class=s>(.*)</SPAN>.*<SPAN\s*class=ngrn>(.*)</SPAN>|Uis
    if that is the case.

  6. #6
    Registered User
    Join Date
    Sep 2005
    Posts
    19
    When I use "\s" instead of "[:space:]" I get unknown escape sequence `\s'.


    here is the code I use (basically the sample code from the PCRE website):

    Code:
    	pcre *re;
    	char *pattern;
    	const char *error;
    	int erroffset;
    	int ovector[300];
    	char *subject = xmli->content;
    	int subject_length = strlen(subject);
    	int rc, i;
    	unsigned char *name_table;
    	int namecount;
    	int name_entry_size;
    
    	pattern = "|<A class=\'res\'[[:space:]]*href=\'(.*)\'>(.*)</A><BR>.*<SPAN[[:space:]]*class=s>(.*)</SPAN>.*<SPAN[[:space:]]*class=ngrn>(.*)</SPAN>|Uis";
    	//pattern = "|<A class=\'res\'\s*href=\'(.*)\'>(.*)</A><BR>.*<SPAN\s*class=s>(.*)</SPAN>.*<SPAN\s*class=ngrn>(.*)</SPAN>|Uis";
    	
    
    	re = pcre_compile(
    		pattern,              /* the pattern */
    		0,                    /* default options */
    		&error,               /* for error message */
    		&erroffset,           /* for error offset */
    		NULL);                /* use default character tables */
    
    	/* Compilation failed: print the error message and exit */
    
    	if (re == NULL)
    	{
    		printf("PCRE compilation failed at offset %d: %s\n", erroffset, error);
    		return;
    	}
    
    	rc = pcre_exec(
    		re,                   /* the compiled pattern */
    		NULL,                 /* no extra data - we didn't study the pattern */
    		subject,              /* the subject string */
    		subject_length,       /* the length of the subject */
    		0,                    /* start at offset 0 in the subject */
    		0,                    /* default options */
    		ovector,              /* output vector for substring information */
    		300);           /* number of elements in the output vector */
    	
    	if (rc < 0)
    	{
    		switch(rc)
    		{
    		case PCRE_ERROR_NOMATCH: printf("No match\n"); break;
    			/*
    			Handle other special cases if you like
    			*/
    		default: printf("Matching error %d\n", rc); break;
    		}
    		pcre_free(re);     /* Release memory used for the compiled pattern */
    		return;
    	}
    
    	if (rc == 0)
    	{
    		rc = 300/3;
    		printf("ovector only has room for %d captured substrings\n", rc - 1);
    	}
    
    	
    	for (i = 0; i < rc; i++)
    	{
    		char *substring_start = subject + ovector[2*i];
    		int substring_length = ovector[2*i+1] - ovector[2*i];
    		printf("%2d: %.*s\n", i, substring_length, substring_start);
    	}

  7. #7
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,412
    Heh, I was wrong. The [:space:] syntax is also supported by PCRE. Using \s should work too, so I think the error message you get comes from your compiler, not from the regex engine. In a C string literal the backslash itself has to be escaped, so "\\s" should be what you're looking to use.

    Anyway, this still doesn't help you. I don't have PCRE installed here, but I suggest working with a simpler pattern to get clearer on what is different. My impression is that the PHP port should behave the same, since the intention is consistency.

  8. #8
    Registered User
    Join Date
    Sep 2005
    Posts
    19
    I found the error: the "Uis" modifiers are PHP-specific and must be converted/parsed into PCRE options such as PCRE_CASELESS, PCRE_UNGREEDY, and so on.
