Thread: simple string transformation function

  1. #1
    Registered User
    Join Date
    May 2009
    Posts
    242

    simple string transformation function

    In Prata, C++ Primer Plus, ch. 16 (STL, etc.), one of the exercises (ex. 3, p. 948) is to write a function that converts a string to upper case. This is obviously not hard, but since my solution differs from the one in his answers, I wondered which one you guys would view as superior:

    My solution:

    Code:
    void str_toupper(std::string& s)
    {
    	std::transform(s.begin(), s.end(), s.begin(), std::toupper);
    }
    My code does require the <algorithm> header, which his doesn't:
    Code:
    void ToUpper(string& str)
    {
         for (int i = 0; i < str.size(); i++)
              str[i] = toupper(str[i]);
    }
    Which version is better? Is one or the other going to be faster in handling a string of near maximum size, for example? Or is it just 6 of one, half-dozen of the other?

    In terms of lines of code, I get 1 line shorter inside the function but do have to #include <algorithm> for an extra line at the top, so I view that as a tie unless the inclusion of algorithm adds unnecessary overhead...

  2. #2
    spurious conceit MK27
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    I don't know, but there is one glaring inefficiency in the second one:
    Code:
     for (int i = 0; i < str.size(); i++)
    Now str.size() will be called every iteration (and it is probably the most expensive op in this loop, too). It should be:

    Code:
    int len = str.size();
    for (int i = 0; i < len; i++)
    Quote Originally Posted by Aisthesis
    In terms of lines of code, I get 1 line shorter inside the function but do have to #include <algorithm> for an extra line at the top, so I view that as a tie unless the inclusion of algorithm adds unnecessary overhead...
    Lines of code is not a very significant measurement of anything besides style. Being concise and tidy is good, but methinks code brevity can easily (tho not necessarily) be inverse to code efficiency* -- eg, by adding a line or two above, the efficiency is improved.

    Including the library (algorithm) is only a cost in terms of mem usage, I think. I would also guess that using a "high level", generic/abstract routine (such as algo lib functions) can never be more optimal than custom low level code, such as this variation:

    Code:
    void ToUpper(string& str)
    {
        int len = str.size();
        for (int i = 0; i < len; i++) 
                  if (str[i] > 96 && str[i] < 123) str[i] -= 32;
    }
    Still more code, still greater efficiency? But I maybe stand to be corrected...eg, the compiler may actually make up for most of this.

    *especially if shrinking the code size is over prioritized in your mind.
    Last edited by MK27; 02-28-2010 at 01:07 PM.

  3. #3
    C++ Witch laserlight
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by Aisthesis
    Which version is better?
    This depends on your judging criteria.
    Personally, I prefer your version as the named algorithm gives a better description at a glance of what is being done than an explicit loop. On the other hand, I am still rather unsure of how stuff like locales come into play when you just pass std::toupper to std::transform.
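On the locale question, one commonly suggested way to sidestep the uncertainty is to wrap the call in a small function. The sketch below is not from the thread: `upper_char` is a hypothetical helper name, and it assumes the single-argument `std::toupper` from `<cctype>` (which follows the current C locale) is the behaviour wanted. A bare `std::toupper` passed to `std::transform` can fail to compile when `<locale>` is also visible, since that header declares a two-argument template with the same name, and the `<cctype>` version has undefined behaviour when given a negative `char` value.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: forwards to the <cctype> toupper.
// The cast to unsigned char avoids undefined behaviour when plain char
// is signed and holds a negative value.
char upper_char(char c)
{
    return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
}

// Same interface as the original str_toupper, but with an unambiguous
// function passed to std::transform.
void str_toupper(std::string& s)
{
    std::transform(s.begin(), s.end(), s.begin(), upper_char);
}
```

Called as `std::string s = "abc"; str_toupper(s);`, this modifies `s` in place just like the original version.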

    Quote Originally Posted by MK27
    there is one glaring inefficiency in the second one: (...) Now str.size() will be called every iteration (and it is probably the most expensive op in this loop, too).
    It is probably not that bad though, since size() runs in constant time. But yeah, it might as well be pulled out of the loop.

    Quote Originally Posted by MK27
    I would also guess that using a "high level", generic/abstract routine (such as algo lib functions) can never be more optimal than custom low level code, such as this variation:
    That is true, if you are able to take advantage of facts about your data/problem domain that generic algorithms are unaware of. In this case you are making a reasonable assumption concerning the values of the given letters in the character set. It might be a little more readable as:
    Code:
    void ToUpper(std::string& str)
    {
        for (std::string::size_type i = 0, len = str.size(); i < len; ++i) 
            if (str[i] >= 'a' && str[i] <= 'z')
                str[i] -= 'a' - 'A';
    }
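The character-set assumption in this loop can be written down so that the compiler checks it. This is only a sketch and assumes a C++11 compiler for `static_assert`; the name `ascii_to_upper` is made up for illustration. Both checks hold for ASCII but not for EBCDIC, where the letters are not contiguous.

```cpp
#include <string>

// Assumption made explicit: 'a'..'z' spans exactly 26 code points and
// lowercase sits above uppercase (true for ASCII, false for EBCDIC).
static_assert('z' - 'a' == 25, "lowercase letters must be contiguous");
static_assert('a' - 'A' > 0, "lowercase must come after uppercase");

// Hypothetical name; same logic as the range-check loop in the thread.
void ascii_to_upper(std::string& str)
{
    for (std::string::size_type i = 0, len = str.size(); i < len; ++i)
        if (str[i] >= 'a' && str[i] <= 'z')
            str[i] -= 'a' - 'A';
}
```

On a platform where the assumption fails, the build stops with the message instead of silently producing wrong output.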

  4. #4
    Registered User
    Join Date
    May 2009
    Posts
    242
    tx for the helpful comments!

    My summary: Mine has some merit if you want to go high-level. Optimal is presumably laserlight's revision of MK27's solution, as it has the virtues of pulling size out of the loop (elegantly putting it in the initialization), going lower than toupper like MK27's, and clarifying what's going on with the ASCII.

    Perhaps even one additional wrinkle:
    Code:
    void ToUpper(std::string& str)
    {
        const char DIFF = 'a' - 'A';
        for (std::string::size_type i = 0, len = str.size(); i < len; ++i) 
            if (str[i] >= 'a' && str[i] <= 'z')
                str[i] -= DIFF;
    }
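Whether the hand-written loop really beats `std::transform` is something to measure rather than assume. The following is a rough timing sketch, not a rigorous benchmark: it assumes C++11 for `<chrono>`, and the names `upper_char`, `upper_transform`, `upper_loop` and `time_us` are hypothetical. Results will depend on the compiler, optimization flags and standard library.

```cpp
#include <algorithm>
#include <cctype>
#include <chrono>
#include <string>

// Variant 1: std::transform with a wrapper around the <cctype> toupper.
char upper_char(char c)
{
    return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
}

void upper_transform(std::string& s)
{
    std::transform(s.begin(), s.end(), s.begin(), upper_char);
}

// Variant 2: the hand-written loop with ASCII arithmetic.
void upper_loop(std::string& s)
{
    const char DIFF = 'a' - 'A';
    for (std::string::size_type i = 0, len = s.size(); i < len; ++i)
        if (s[i] >= 'a' && s[i] <= 'z')
            s[i] -= DIFF;
}

// Runs f on a private copy of input and returns elapsed microseconds.
template <typename F>
long long time_us(F f, std::string input)
{
    typedef std::chrono::steady_clock steady;
    steady::time_point start = steady::now();
    f(input);
    steady::time_point stop = steady::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

For example, `time_us(upper_transform, big)` and `time_us(upper_loop, big)` can be compared for a large `std::string big`, ideally over several runs and with optimizations enabled, since a single unoptimized run says little.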
    Last edited by Aisthesis; 02-28-2010 at 05:44 PM.
