Thread: binary_search() advice

  1. #1
    Registered User
    Join Date
    Aug 2007
    Location
    U.K.
    Posts
    148

    binary_search() advice

    Hi,

    I have a string and I'm trying to replaces unwanted characters with a space character.

    So if have the string "A string of TEXT!" and I replace the chars "sT" I end up with "A tring of EX !".

    I came up with a method, but it was inefficient, so I've tried to give the binary_search() method a go, but am a little stuck.

    My problem is that the all the elements in the vector need to be sorted for a binary_search() to work, but if I sort the vector which holds the sentence, then all the letters will become jumbled up, and so will make no sense to the reader.

    Here is my program:

    Code:
    #include <iostream>
    #include <string>
    #include <vector>
    #include <algorithm>
    using namespace std;
    
    void main()
    {
    	/***********************************************************
    	put each char from a string into a vector
    	************************************************************/
    	
    	//the string to use
    	string str = "A string of T3XT!";
    
    	//the string will go into this vector
    	vector<char>strVec1;
    
    	//populate that vector with the string
    	for(unsigned int i = 0; i < str.length(); ++i)
    	{
    		strVec1.push_back(str[i]);
    	}
    
    	/**************************************************************
    	put each char from a second string into a second vector
    	***************************************************************/
    
    	//the second string to use
    	string charsToReplace = "sT";
    	
    	//the second string will go into this vector
    	vector<char>strVec2;
    
    	//populate the second vector with the second string
    	for(unsigned int ii = 0; ii < charsToReplace.size(); ++ii)
    	{
    		strVec2.push_back(charsToReplace[ii]);
    	}
    
    
    	/**************************************************************
    		sort both vectors so they can be binary_search'd
    	**************************************************************/
    
    	//Problem, if I sort() the first vector, then all the chars(ie: the
    	//letters in the sentence will be out of order, and so it will 
    	//make no sense
    
    	/**************************************************************
    	replace any unwanted chars with a space
    	***************************************************************/
    	//if any of the chars in the first vector match any of the chars 
    	//in the second vector, replace the char in the 
    	//first vector with a space.
    
    	/**************************************************
    		old inefficient method:
    	**************************************************/
    
    	//for(unsigned int a = 0;a<strVec1.size();++a)
    	//{       
    	//        for(unsigned int j = 0; j < strVec2.size(); ++j)
    	//        {       
    	//                if(strVec1[a] == strVec2[j])//if the two chars match
    	//                {
    	//                        strVec1[a] = ' ';//replace the char from the second with a space
    	//                        cout << "0";
    	//                }
    	//        }
    	//}
    
    
    	/**************************************************
    		new binary search method:
    	**************************************************/
    	for(unsigned int a = 0;a<strVec1.size();++a)
    	{	
    		if(binary_search(strVec1.begin(), strVec1.end(), strVec2[a]))
    		{
    			strVec1[a] = ' ';
    		}
    	}
    
    	/************************************************************
    		put all the remaining chars back into the original string
    	*************************************************************/
    	
    	//empty the old string variable
    	str = "";
    
    	//put the remaining chars back into the string
    	for(unsigned int b = 0; b < strVec1.size(); ++b)
    	{
    		str += strVec1[b];
    		cout << str << endl;
    	}
    	
    	cout << "all unwanted characters removed from the string." << endl;
    	
    	cin.get();
    }
    I'm looking to test for the 255 ASCII characters.

    How do you think I should tackle this problem?

    Your advice is really appreciated.

    Many thanks!

  2. #2
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    I suggest something like this:
    Code:
    #include <string>
    #include <algorithm>
    #include <iostream>
    
    class MatchChars
    {
    public:
        explicit MatchChars(const std::string& chars) : chars(chars) {}
    
        bool operator()(char c) const
        {
            return chars.find(c) != std::string::npos;
        }
    private:
        std::string chars;
    };
    
    int main()
    {
        std::string str = "A string of TEXT!";
        std::replace_if(str.begin(), str.end(), MatchChars("sT"), ' ');
        str.erase(std::unique(str.begin(), str.end()), str.end());
        std::cout << str << std::endl;
    }
    My idea is to first replace all those characters with spaces, and then remove duplicate spaces.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  3. #3
    Registered User
    Join Date
    Aug 2007
    Location
    U.K.
    Posts
    148
    Thanks laserlight!

    I'm going to spend some time going though it so I understand it properly.

    Thanks for helping me out

  4. #4
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    By the way, you can still use binary search in the implement of MatchChars since the string object in MatchChars can have its characters sorted, except that it probably is not worth it if you only have a handful of unwanted characters to match with.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  5. #5
    Registered User
    Join Date
    Nov 2008
    Posts
    30
    Quote Originally Posted by laserlight View Post
    I suggest something like this:
    Code:
    #include <string>
    #include <algorithm>
    #include <iostream>
    
    class MatchChars
    {
    public:
        explicit MatchChars(const std::string& chars) : chars(chars) {}
    
        bool operator()(char c) const
        {
            return chars.find(c) != std::string::npos;
        }
    private:
        std::string chars;
    };
    
    int main()
    {
        std::string str = "A string of TEXT!";
        std::replace_if(str.begin(), str.end(), MatchChars("sT"), ' ');
        str.erase(std::unique(str.begin(), str.end()), str.end());
        std::cout << str << std::endl;
    }
    My idea is to first replace all those characters with spaces, and then remove duplicate spaces.
    Hi Laserlight,

    I know you said 'something like this' about your solution, but couldn't help nitpicking.
    Your solution will remove all duplicate characters in the string (with unique and erase). If you want to remove all duplicate spaces later, why to put them in the first place? We can just use remove_if algorithm instead of replace_if, unique and erase.

  6. #6
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by salquestfl
    Your solution will remove all duplicate characters in the string (with unique and erase).
    Oh yes, that is a major bug. One can fix it by using the version of std::unique that takes a predicate.

    Quote Originally Posted by salquestfl
    If you want to remove all duplicate spaces later, why to put them in the first place? We can just use remove_if algorithm instead of replace_if, unique and erase.
    The problem is that Swerve does not want to remove them; Swerve wants to replace each such region with a space.

    Actually, there is a related problem: even with my suggested fix, consecutive spaces from the original text will be reduced to a single space. If this is not desirable, a better fix would be to first use std::unique to reduce all the unwanted character regions into a single unwanted character, and then replace each unwanted characters with a space.

    Note that erase must still be used in any case, unless you only deal with the iterators.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  7. #7
    Registered User
    Join Date
    Nov 2008
    Posts
    30
    Quote Originally Posted by laserlight View Post
    Oh yes, that is a major bug. One can fix it by using the version of std::unique that takes a predicate.
    Still, why to do replace_if, unique and erase when remove_if is what is needed?

    Quote Originally Posted by laserlight View Post
    The problem is that Swerve does not want to remove them; Swerve wants to replace each such region with a space.
    Then what is the need of unique and erase?
    You either define the problem as
    1. Replace unwanted characters by spaces OR
    2. Remove unwanted characters

    Your code and comments suggest you are trying to do problem 2.


    Quote Originally Posted by laserlight View Post
    Actually, there is a related problem: even with my suggested fix, consecutive spaces from the original text will be reduced to a single space.
    Again, this problem is because you are adding the spaces in the first place and then trying to remove the exact same spaces.
    Last edited by salquestfl; 03-11-2010 at 02:47 AM.

  8. #8
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by salquestfl
    Still, why to do replace_if, unique and erase when remove_if is what is needed?
    Because remove_if does not work. I know because I nearly posted such an example, until I noticed that Swerve's sample result was "A tring of EX !", not "A tring of EX!". The space in between the 'X' and the '!' is due to the replacement with a space.

    Quote Originally Posted by salquestfl
    Then what is the need of unique and erase?
    You either define the problem as
    1. Replace unwanted characters by spaces OR
    2. Remove unwanted characters

    Your code and comments suggest you are trying to do problem 2.
    You missed a third possibility, which is what Swerve stated in post #1: replace unwanted characters with a space character. That is, each region of consecutive unwanted spaces is to be replaced by a single space character.

    Quote Originally Posted by salquestfl
    Again, this problem is because you are adding the spaces in the first place and then trying to remove the exact same spaces.
    No, I am only trying to remove duplicate spaces that were the result of replacing each unwanted character with space. The problem is due to an oversight in the use of std::unique. Nonetheless, I have proposed a better fix if this turns out to be a problem (it might actually be desirable), i.e., remove the duplicate unwanted characters first, and only then replace each remaining unwanted character with a space.

    EDIT:
    If you want a code example, this would be it:
    Code:
    #include <string>
    #include <algorithm>
    #include <iostream>
    
    class MatchChars
    {
    public:
        explicit MatchChars(const std::string& chars) : chars(chars) {}
    
        bool operator()(char c) const
        {
            return chars.find(c) != std::string::npos;
        }
    
        bool operator()(char c1, char c2) const
        {
            return chars.find(c1) != std::string::npos
                && chars.find(c2) != std::string::npos;
        }
    private:
        std::string chars;
    };
    
    int main()
    {
        std::string str = "A string of TEXT!";
        const MatchChars match_chars("sT");
        str.erase(std::unique(str.begin(), str.end(), match_chars), str.end());
        std::replace_if(str.begin(), str.end(), match_chars, ' ');
        std::cout << str << std::endl;
    }
    But I am not sure if this is what Swerve wants because the given sample output does not match the sample output produced by this program. If I am correct, a correct solution would be:
    Code:
    #include <string>
    #include <algorithm>
    #include <iostream>
    
    class MatchChars
    {
    public:
        explicit MatchChars(const std::string& chars) : chars(chars) {}
    
        bool operator()(char c) const
        {
            return chars.find(c) != std::string::npos;
        }
    private:
        std::string chars;
    };
    
    class IsDuplicateSpace
    {
    public:
        bool operator()(char c1, char c2) const
        {
            return c1 == ' ' && c2 == ' '
        }
    };
    
    int main()
    {
        std::string str = "A string of TEXT!";
        std::replace_if(str.begin(), str.end(), MatchChars("sT"), ' ');
        str.erase(std::unique(str.begin(), str.end(), IsDuplicateSpace()), str.end());
        std::cout << str << std::endl;
    }
    Last edited by laserlight; 03-11-2010 at 06:05 AM.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  9. #9
    Registered User
    Join Date
    Nov 2008
    Posts
    30
    Quote Originally Posted by laserlight View Post
    Because remove_if does not work. I know because I nearly posted such an example, until I noticed that Swerve's sample result was "A tring of EX !", not "A tring of EX!". The space in between the 'X' and the '!' is due to the replacement with a space.


    You missed a third possibility, which is what Swerve stated in post #1: replace unwanted characters with a space character. That is, each region of consecutive unwanted spaces is to be replaced by a single space character.
    Great. I had missed that one. And brilliant solution, btw.
    boost::regex_replace is also an alternative. But it may be an overkill.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Advice on C Programming with MSVC++ 2008
    By IT_Guy in forum Windows Programming
    Replies: 1
    Last Post: 03-06-2009, 04:23 AM
  2. Soliciting Networking/System Configuration Advice I
    By Dave_Sinkula in forum Tech Board
    Replies: 11
    Last Post: 04-29-2007, 08:30 PM
  3. Resume advice
    By Zughiaq in forum A Brief History of Cprogramming.com
    Replies: 7
    Last Post: 04-15-2005, 02:16 PM
  4. girl friend advice (prob. the wrong place)
    By B0bDole in forum A Brief History of Cprogramming.com
    Replies: 15
    Last Post: 10-22-2004, 06:38 PM
  5. Advice for Beginner??
    By hawghauler in forum C++ Programming
    Replies: 6
    Last Post: 06-01-2003, 05:33 AM