Thread: GCC regex broken

  1. #1
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229

    GCC regex broken

    I am seeing some weird bugs when using regex with GCC.

    Code:
    #include <iostream>
    #include <regex>
    
    int main()
    {
            std::regex pattern("[abc][123]", std::regex_constants::extended);
            std::cout << std::regex_match("a", pattern) << std::endl;
    }
    GCC produces "1". LLVM produces "0".

    Am I misunderstanding something or is GCC's regex support horribly broken?

    This is with GCC 4.8.3 on OSX.

    Thanks

  2. #2
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    O_o

    I imagine you are seeing a library error.

    The "libstdc++" shipping with "GCC 4.8.3" has incomplete and buggy "regex".

    [Edit]
    If you want to keep using "GCC" for other reasons, you should be able to link against "libc++" and easily get the library-only features, such as "regex", from it.
    [/Edit]

    [Edit]
    The version of "libstdc++" shipping with "GCC 4.9.x" has better "regex" support, so you could upgrade if that is an option.
    [/Edit]

    Soma
    “Salem Was Wrong!” -- Pedant Necromancer
    “Four isn't random!” -- Gibbering Mouther

  3. #3
    Registered User MutantJohn's Avatar
    Join Date
    Feb 2013
    Posts
    2,665
    I can confirm phantom's idea about gcc 4.9.x being better.

    I used 4.9.1 and my output was 0, matching your LLVM output.
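
    For anyone who wants to check their own toolchain, here is a minimal sketch of a runtime sanity check built on the pattern from post #1. The helper name regex_looks_sane is made up for illustration and is not part of any library. It only covers this one pattern, so passing it does not prove the rest of <regex> works.

```cpp
#include <regex>

// Hypothetical helper (not part of any library): returns true when
// std::regex_match handles the pattern from post #1 as expected.
bool regex_looks_sane()
{
    try {
        const std::regex pattern("[abc][123]", std::regex_constants::extended);
        // regex_match has to match the whole input, so the one-character
        // string "a" cannot satisfy a pattern that needs two characters.
        const bool rejects_partial = !std::regex_match("a", pattern);
        // A full two-character input from the two bracket sets must match.
        const bool accepts_full = std::regex_match("a1", pattern);
        return rejects_partial && accepts_full;
    } catch (const std::regex_error&) {
        // Treat a pattern the implementation cannot compile as a failure.
        return false;
    }
}
```

    Calling regex_looks_sane() once at startup (or in a unit test) and refusing to continue when it returns false would turn the silent wrong answer into a loud one.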

  4. #4
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    Aah I see. Thanks for checking. I will upgrade to 4.9 then.

    The reason I need to support GCC is that I want my code to run on Linux as well (though I'm developing on OSX).

    Very disappointed in GCC/libstdc++. They shipped something that is obviously broken, with no warnings or errors, and it just returns wrong values. It took me half an hour of debugging to narrow it down to regex, since I assumed GCC would be able to get something so simple right and there were no warnings or errors.

    The least they could have done is add a "#warning" whenever someone included <regex>. Or maybe throw an exception whenever any regex stuff is used.

  5. #5
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    They shipped something that is obviously broken with no warnings or errors, and just return wrong values.
    O_o

    To be fair, they did document online, with about a dozen footnotes, that "regex" was incomplete.

    They should have thrown a firewall over "regex" (requiring a "CXX11EXPERIMENTAL" macro or similar to suppress the warnings). Given the state of C++11 support in general, I'm not really surprised specific warnings weren't offered.
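
    Nothing stops a project from building that kind of firewall for itself. A rough sketch follows, assuming GCC is paired with its own "libstdc++" so that the compiler version macros can stand in for the library version. The MYPROJECT_* names and regex_is_suspect are hypothetical. __GLIBCXX__, __GNUC__ and __GNUC_MINOR__ are real predefined macros, and "#warning" is a GCC extension.

```cpp
#include <regex>  // included first so that libstdc++ defines __GLIBCXX__

// Assumption: with GCC before 4.9 and its bundled libstdc++, <regex> is
// incomplete. MYPROJECT_ALLOW_OLD_REGEX is a hypothetical opt-in macro
// that silences the warning for people who know what they are doing.
#if defined(__GLIBCXX__) && defined(__GNUC__) && !defined(__clang__) && \
    (__GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 9))
#  define MYPROJECT_REGEX_SUSPECT 1
#  ifndef MYPROJECT_ALLOW_OLD_REGEX
#    warning "<regex> in libstdc++ before GCC 4.9 is incomplete; results may be wrong"
#  endif
#else
#  define MYPROJECT_REGEX_SUSPECT 0
#endif

// Lets ordinary code query the result of the preprocessor check above.
constexpr bool regex_is_suspect() { return MYPROJECT_REGEX_SUSPECT != 0; }
```

    Putting this in a single project header that wraps <regex> keeps the check in one place. Swapping "#warning" for "#error" would make the firewall a hard stop.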

    Soma

  6. #6
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    Yeah, I'm pretty disappointed by libstdc++'s C++11 support in general. C++11 isn't even that new. It has been standardized for 3 years already, and it was around as C++0x for a few years before that.

    I independently discovered a serious bug in std::this_thread::sleep_until a while ago, too. Basically, if the end time has already passed by the time the function checks it (due to scheduling, etc.), it will sleep forever because of unsigned underflow. It has been more than a year and patches have been submitted. It's a small, clean fix, and it still hasn't been applied.
    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58038
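
    In the meantime, a caller-side guard along these lines is one way to stay clear of the problem. This is only a sketch: sleep_until_guarded is a made-up name, and it assumes that never handing a negative duration to the library is enough to avoid the underflow, which I have not verified on an affected libstdc++.

```cpp
#include <chrono>
#include <thread>

// Hypothetical wrapper (not a standard function): sleeps until `deadline`,
// but returns immediately when the deadline has already passed, so the
// library is never asked to sleep for a negative duration.
// Returns true if it slept, false if the deadline was already in the past.
template <class Clock, class Duration>
bool sleep_until_guarded(const std::chrono::time_point<Clock, Duration>& deadline)
{
    // Read the clock once so the comparison and the subtraction agree.
    const auto now = Clock::now();
    if (deadline <= now) {
        return false;
    }
    // deadline - now is strictly positive at this point.
    std::this_thread::sleep_for(deadline - now);
    return true;
}
```

    Usage is the same as the standard call, e.g. sleep_until_guarded(std::chrono::steady_clock::now() + std::chrono::milliseconds(10)).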

    I am now thinking about switching to LLVM on Linux as well. It seems to be of generally higher quality at least for x86-64 and C++11. For most projects it seems to produce slightly faster binaries than GCC as well.

  7. #7
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    It's true that they do warn about regex being broken in docs (now that I checked), but most people wouldn't have guessed "incomplete" means "not even the simplest examples work". If it's broken to that level, I would think at least a "#warning <regex> support is highly experimental and will probably not work" in <regex> is warranted. It would just be 1 line of code for them to add.

  8. #8
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    And by the way, my code does work perfectly with GCC 4.9. Thanks!
