Thread: Regex words with repeat characters

  1. #1
    Registered User
    Join Date
    Mar 2016
    Posts
    203

    Regex words with repeat characters

    I'm trying to regex match words with repeat characters using this link as one reference among others:
    regex - excluding words with consecutive repeat characters - Stack Overflow
    That question is not tagged C++, and I'm having a hard time turning it (and other material) into a pattern that works with C++ std::regex. The program below shows some of my efforts so far. Any help would be much appreciated, thanks:
    Code:
    #include <iostream>
    #include <regex>
    #include <string>
    
    
    int main()
    {
       // std::regex re ("(A-Za-z0-9){1}");
        std::regex re ("(.)(?!\1)");
        std::string input;
        getline(std::cin, input);
        std::regex_match(input, re) ? std::cout << "Non-unique" : std::cout << "Unique";
        std::cout << '\n';
    }

  2. #2
    Registered User
    Join Date
    Jun 2015
    Posts
    1,640
    Your use of ?: is abnormal. Avoid that kind of use.

    To include a backslash in a standard string literal you need to use two backslashes. Normally, raw strings are used for regexes so that you don't have to do that: R"(string contents)"

    The other problem is that you're using regex_match, which only matches an entire string. You want regex_search, which matches anywhere in the string.
    Code:
    #include <iostream>
    #include <regex>
    #include <string>
     
    int main() {
        std::regex re (R"((.)\1)");
        std::string input;
        getline(std::cin, input);
        std::cout << (std::regex_search(input, re) ?
                      "Non-unique" : "Unique");
        std::cout << '\n';
    }

  3. #3
    Registered User
    Join Date
    Mar 2016
    Posts
    203
    regex_search() indeed! Aha!! many thanks

  4. #4
    Registered User
    Join Date
    Mar 2016
    Posts
    203
    One quick follow-up: I'm using the std::regex_constants::icase flag to turn off case sensitivity, and yet std::string ("helLo") comes out as "Unique" in the program below:
    Code:
    #include <iostream>
    #include <regex>
    #include <string>
    int main()
    {
        // std::regex re ("(.)\\1");
        std::regex re ("(.)\\1", std::regex_constants::icase);
        std::string input;
        getline(std::cin, input);
        std::regex_search(input, re) ? std::cout << "Non-unique" : std::cout << "Unique";
        std::cout << '\n';
    }
    Any suggestions? Thanks

  5. #5
    Registered User
    Join Date
    Jun 2015
    Posts
    1,640
    That's a good question. Apparently, when the . matches a character, the \\1 must match exactly the same character.

    If you really need to do this, you could transform the string to all lower (or upper) case before the regex_search:
    Code:
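        // needs <algorithm> and <cctype>; taking unsigned char keeps std::tolower well-defined for non-ASCII bytes
        std::transform(input.begin(), input.end(), input.begin(), [](unsigned char c) { return std::tolower(c); });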
    BTW, you are still using ?: abnormally. There's nothing to be gained. The normal usage is like this:
    Code:
        std::cout << (std::regex_search(input, re) ? "Non-unique" : "Unique");

  6. #6
    Registered User
    Join Date
    Mar 2016
    Posts
    203
    Thanks for the explanation, I've updated the ?: usage as well. You're quite right, less typing

  7. #7
    Registered User
    Join Date
    Jun 2015
    Posts
    1,640
    Originally posted by sean_cantab:
    Thanks for the explanation, I've updated the ?: usage as well. You're quite right, less typing
    The only extra bit is the parentheses, which are needed because << has higher precedence than ?:. The conditional operator sits near the bottom of the precedence table: in C++ it shares a level with the assignment operators, and only the comma operator is lower.

    I would've thought that the icase would have worked the way you used it, actually. It could have gone either way. Maybe someone else will have a better solution.
