Thread: Php regexp --> C++

  1. #1
    Registered User
    Join Date
    Feb 2005
    Posts
    2

    Php regexp --> C++

    Hi,

    For some reason, I need to find the C++ equivalent of the following PHP code:
    ereg_replace("<[^>]*>","**",$string);

    This replaces every HTML tag in a string with "**".

    Is there any way to do the same thing in C++?

    Thanks for your help,

    OlgaM.

  2. #2
    S Sang-drax's Avatar
    Join Date
    May 2002
    Location
    Göteborg, Sweden
    Posts
    2,072
    boost.org has a regex library.
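
For reference, here is a minimal sketch of the PHP call from post #1 in C++. It assumes a C++11 compiler and uses std::regex from the standard library instead of Boost.Regex (boost::regex_replace is called in much the same way). The function name replace_tags_regex is made up for illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: replaces every match of <[^>]*> with "**",
// mirroring ereg_replace("<[^>]*>", "**", $string).
std::string replace_tags_regex(const std::string& html) {
    // Compile the pattern once; building a std::regex is comparatively costly.
    static const std::regex tag("<[^>]*>");
    return std::regex_replace(html, tag, "**");
}
```

Calling replace_tags_regex("<td class='classe'>x</td>") should return "**x**".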
    Last edited by Sang-drax : Tomorrow at 02:21 AM. Reason: Time travelling

  3. #3
    Registered User
    Join Date
    Oct 2004
    Posts
    63
    Iterate through each character of the string with a for loop.
    If it's a character you don't like, change it to a *.
    Code:
    if(string[i] == "<" || string[i] == ">") {
    string[i] = "*";
    }
    -Webmaster-
    http://www.koaworld.com
    Pr0gr4m|\/|1Ng n00b

  4. #4
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    Shouldn't that be single quotes?

    Quzah.
    Hope is the first step on the road to disappointment.

  5. #5
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    Yes, and it doesn't work anyway, because it doesn't replace the whole tag.

    Would be quite easy though to build an FSM to do this job. Whoppin' 2 states...
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

  6. #6
    Registered User
    Join Date
    Feb 2005
    Posts
    2
    Quote Originally Posted by Philandrew
    Iterate through each character of the string with a for loop. If it's a character you don't like, change it to a *.
    Thanks, but this is not exactly what I need.

    1st : I need to replace ANY html tag, this means <td> as well as < td class='classe'> or <anything> for example...

    2nd : I have to do this for a complete HTML document, approx 7000 lines or more...so iteration is definitely not a good solution

    That's why I need to use regular expressions: they are fast. The problem is that I know how to code this in PHP, but I'm a noob in C++ on this particular point.

    Quote Originally Posted by Sang-drax
    boost.org has a regex library.
    I have seen that, but I would really appreciate something more specific and simpler.

    Thanks for your help !

  7. #7
    Carnivore ('-'v) Hunter2's Avatar
    Join Date
    May 2002
    Posts
    2,879
    I have to do this for a complete HTML document, approx 7000 lines or more...so iteration is definitely not a good solution
    Well, any regex library you use is going to have to do the same processing anyway, perhaps a little less efficiently since they're built for more generic use.

    Try something like:
    Keep a bool called inTag or something, initial state false. Iterate through the string, and if inTag is true, replace the character with a '*', otherwise write it as is. Whenever you hit a '<', set inTag to true, when you hit '>' set inTag to false.
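
The inTag idea described above could look roughly like this. It is only a sketch: the function name mask_tags is hypothetical, and it assumes the angle brackets themselves are masked too. Note that it writes one '*' per tag character, so the output differs from the "**"-per-tag result of the original PHP call.

```cpp
#include <string>

// Hypothetical helper: masks every character of each tag,
// including the '<' and '>' themselves, with '*'.
std::string mask_tags(const std::string& html) {
    std::string out;
    out.reserve(html.size());
    bool inTag = false;  // initial state: outside any tag
    for (char c : html) {
        if (c == '<') inTag = true;   // entering a tag
        out += inTag ? '*' : c;       // mask inside tags, copy otherwise
        if (c == '>') inTag = false;  // leaving the tag after masking '>'
    }
    return out;
}
```

For example, mask_tags("<b>x</b>") yields "***x****".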

    If you're a stickler for speed, it might improve performance to read large blocks of data at a time and search through a buffer instead of reading on a byte-by-byte basis.
    Last edited by Hunter2; 02-07-2005 at 11:37 AM.
    Just Google It. √

    (\ /)
    ( . .)
    c(")(") This is bunny. Copy and paste bunny into your signature to help him gain world domination.

  8. #8
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    Actually, the iteration solution is considerably faster than any regex. Although, with such a simple regex, the limiting factor will always be I/O speed.

    As I said, a primitive finite state machine would also do.

    Code:
    state = TEXT;
    iterate over characters {
      if character is '<' then {
        state = TAG;
        output '*';
      } else if character is '>' then {
        state = TEXT;
        output '*';
      } else if state == TEXT {
        output character;
      }
    }
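
The pseudocode above could be fleshed out in C++ roughly as follows. This is a sketch rather than CornedBee's actual code: the names State and replace_tags are hypothetical. Unlike the pseudocode as posted, a stray '>' outside a tag is copied unchanged and a '<' inside a tag is dropped, which is closer to what <[^>]*> matches. An unterminated '<' at the end of the input still produces a single '*', which the regex would leave alone.

```cpp
#include <string>

enum class State { Text, Tag };

// Hypothetical helper: emits "**" for each <...> run and copies all other text.
std::string replace_tags(const std::string& html) {
    std::string out;
    out.reserve(html.size());
    State state = State::Text;
    for (char c : html) {
        if (state == State::Text) {
            if (c == '<') {
                state = State::Tag;
                out += '*';  // first '*' when the tag opens
            } else {
                out += c;    // ordinary text is copied as is
            }
        } else if (c == '>') {
            state = State::Text;
            out += '*';      // second '*' when the tag closes
        }
        // Any other character inside a tag is dropped.
    }
    return out;
}
```

For example, replace_tags("<td class='x'>a</td>") yields "**a**".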

  9. #9
    Registered User major_small's Avatar
    Join Date
    May 2003
    Posts
    2,787
    HTML.cpp

    here's an old program I just dug up for you - it goes through an HTML document and converts everything in the tags (with the exception of string literals) into uppercase letters. for example:

    Code:
    <a href="AaBb">AaBb</a>
    will become
    Code:
    <A HREF="AaBb">AaBb</A>
    I don't know how well it works (I haven't looked at it in a while), but I've used it on all my webpages...


    looking back at the known bugs section I wrote myself, I could probably write a quick fix for them if I really wanted to...
