Thread: String manipulation.

  1. #1
    Internet Superhero
    Join Date
    Sep 2006
    Location
    Denmark
    Posts
    964

    String manipulation.

    I'm having problems with a function of mine. I was just practicing with C++ strings, and most of the program works fine, but after I added a function to calculate the digit sum of the numbers in the string, I ran into some weird behaviour.

    If I input the string 1, the digit sum is 49; for 2 it is 50, for 3 it is 51, and so on. It doesn't make sense to me, and the code looks fine.

    Code:
    #include <iostream>
    #include <string>
    
    int ZeroTest(std::string str);
    int EvenTest(std::string str);
    int OddTest(std::string str);
    int StrTotal(std::string str);
    
    int main()
    {
        int ZeroCount = 0, EvenCount = 0, OddCount = 0, Total = 0;
        std::string str;
        
        std::cout << "Enter string to test: ";
        std::cin >> str;
        std::cout << std::endl;
        
        ZeroCount = ZeroTest(str);
        EvenCount = EvenTest(str);
        OddCount = OddTest(str);
        Total = StrTotal(str);
        
        if((ZeroCount + EvenCount + OddCount) == str.size())
        {
            std::cout << "Number of zeroes : " << ZeroCount << std::endl;
            std::cout << "Number of evens : " << EvenCount << std::endl;
            std::cout << "Number of odds : " << OddCount << std::endl << std::endl;
            std::cout << "String total is : " << Total << std::endl;
            std::cout << "String size is : " << str.size() << std::endl;
        }
        else
        {
            std::cerr << "String size error" << std::endl;
        }
        std::cin.ignore(2);
        return 0;
    }
    
    int ZeroTest(std::string str)
    {
        int i = 0, ZeroCount = 0;
        
        for(i; i < str.size(); i++)
        {
            if(str[i] == '0')
            {
                ZeroCount++;
            }
        }
        return(ZeroCount);
    }
    
    int EvenTest(std::string str)
    {
        int i = 0, EvenCount = 0;
        
        for(i; i < str.size(); i++)
        {
            if(str[i] % 2 == false && str[i] != '0')
            {
                EvenCount++;
            }
        }
        return(EvenCount);
    }
    
    int OddTest(std::string str)
    {
        int i = 0, OddCount = 0;
        
        for(i; i < str.size(); i++)
        {
            if(str[i] % 2 == true)
            {
                OddCount++;
            }
        }
        return(OddCount);
    }
    
    int StrTotal(std::string str)
    {
        int i = 0, Total = 0;
        
        while(i < str.size())
        {
            Total = Total + str[i];
            i++;
        }
        return(Total);
    }

  2. #2
    Registered User
    Join Date
    Jan 2005
    Posts
    7,366
    When you use a character as an integer, the character's character code is used instead. If you have a digit like '1', you have to convert it to the actual number 1 before you use it in your calculation.

    To do this, simply subtract the character '0' from the character in the string and it will give you the correct number.
    Code:
    int digit = str[i] - '0';
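    Applied to the digit sum from the first post, that subtraction might look like the sketch below. The name DigitSum and the decision to skip non-digit characters are assumptions made for illustration; the original program has no such check.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: sums the decimal digits found in str.
// Non-digit characters are skipped (an assumption, not in the original code).
int DigitSum(const std::string& str)
{
    int total = 0;
    for (std::string::size_type i = 0; i < str.size(); ++i)
    {
        // std::isdigit expects a value representable as unsigned char,
        // so convert before calling it.
        unsigned char c = static_cast<unsigned char>(str[i]);
        if (std::isdigit(c))
        {
            total += c - '0';   // maps '0'..'9' to 0..9
        }
    }
    return total;
}
```

    With this version, DigitSum("123") adds 1 + 2 + 3 rather than the three character codes.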
    By the way, you are just getting "lucky" with your even and odd tests. The same problem is happening there: you are checking whether the character code is even or odd, not the number itself. On your machine, the character code is even if the digit is even, but that is not guaranteed to happen, and a better idea would be to convert to the digit first and then do the work.
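    Converting first and testing afterwards could be sketched like this. CountEvenDigits and CountOddDigits are hypothetical names, and ignoring non-digit characters is an assumption; as in the original program, zero is not counted as even.

```cpp
#include <string>

// Hypothetical helper: counts even, non-zero digits in str.
int CountEvenDigits(const std::string& str)
{
    int count = 0;
    for (std::string::size_type i = 0; i < str.size(); ++i)
    {
        if (str[i] >= '0' && str[i] <= '9')
        {
            int digit = str[i] - '0';   // convert before testing parity
            if (digit != 0 && digit % 2 == 0)
            {
                ++count;
            }
        }
    }
    return count;
}

// Hypothetical helper: counts odd digits in str.
int CountOddDigits(const std::string& str)
{
    int count = 0;
    for (std::string::size_type i = 0; i < str.size(); ++i)
    {
        if (str[i] >= '0' && str[i] <= '9')
        {
            int digit = str[i] - '0';
            if (digit % 2 == 1)
            {
                ++count;
            }
        }
    }
    return count;
}
```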
    Last edited by Daved; 06-14-2007 at 02:34 PM.

  3. #3
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by Daved View Post
    On your machine, the character code is even if the digit is even, but that is not guaranteed to happen and a better idea would be to convert to the digit first and then do the work.
    It's not really guaranteed that the characters '0' through '9' have contiguous codes, either. So strictly, subtracting '0' isn't guaranteed to work either. The most portable way to convert from characters to numbers is sscanf() or atoi(), but those work on strings, not single characters, so you have to stick the character in a null-terminated string first.

    Of course, I can't think of an example of a real-world character set where '0' through '9' are not contiguous and in-order, so it's not the biggest deal in the world.
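    For reference, the atoi() route described above might be sketched as follows. DigitViaAtoi is a hypothetical name. Note that std::atoi returns 0 when no conversion can be performed, so a non-digit character is indistinguishable from '0' here.

```cpp
#include <cstdlib>

// Hypothetical helper: converts a single character by placing it in a
// null-terminated buffer and handing that buffer to std::atoi.
int DigitViaAtoi(char c)
{
    char buf[2] = { c, '\0' };
    return std::atoi(buf);
}
```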

  4. #4
    Registered User
    Join Date
    Jan 2005
    Posts
    7,366
    >> It's not really guaranteed that the characters '0' through '9' have contiguous codes, either.

    Actually it is. I don't have time to look it up right now, but the C++ standard guarantees exactly that.
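    A quick way to see that guarantee hold on a given compiler is a small check like the one below. DigitsAreContiguous is a hypothetical name, and the function is only a sanity test, not something a program needs to call.

```cpp
// Hypothetical check: returns true if the codes for '0'..'9' are
// consecutive and increasing, so that c - '0' yields the digit value.
bool DigitsAreContiguous()
{
    const char digits[] = "0123456789";
    for (int i = 0; i < 10; ++i)
    {
        if (digits[i] - '0' != i)
        {
            return false;
        }
    }
    return true;
}
```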

  5. #5
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    What's not guaranteed is having '0' equal to 48, so use '0' instead of the numeric constant 48.
    dwk

    Seek and ye shall find. quaere et invenies.

    "Simplicity does not precede complexity, but follows it." -- Alan Perlis
    "Testing can only prove the presence of bugs, not their absence." -- Edsger Dijkstra
    "The only real mistake is the one from which we learn nothing." -- John Powell


    Other boards: DaniWeb, TPS
    Unofficial Wiki FAQ: cpwiki.sf.net

    My website: http://dwks.theprogrammingsite.com/
    Projects: codeform, xuni, atlantis, nort, etc.

  6. #6
    Registered User
    Join Date
    Oct 2001
    Posts
    2,934
    Quote Originally Posted by brewbuck View Post
    It's not really guaranteed that the characters '0' through '9' have contiguous codes, either. So strictly, subtracting '0' isn't guaranteed to work either.
    Actually they are guaranteed to be contiguous. You may be thinking of the letters 'A' through 'Z'.

  7. #7
    Dr Dipshi++ mike_g's Avatar
    Join Date
    Oct 2006
    Location
    On me hyperplane
    Posts
    1,218
    Actually they are guaranteed to be contiguous. You may be thinking of the letters 'A' through 'Z'.
    I think brewbuck was referring to character encodings other than ASCII. To be honest, I don't see any reason why someone would want to encode digits in a non-contiguous way, but it surely is possible.

  8. #8
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    No. As Daved and I have mentioned, '0' - '9' are guaranteed to be contiguous, no matter what the character encoding.

    Check out the last post of this thread. http://www.ozzu.com/ftopic23794.html

  9. #9
    Dr Dipshi++ mike_g's Avatar
    Join Date
    Oct 2006
    Location
    On me hyperplane
    Posts
    1,218
    Yeah, I see it's part of the C++ standard, but if someone were bent on creating a character encoding that did not have contiguous digits, it should still be possible. Somehow. Not that it's ever going to happen.

  10. #10
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    Apparently it isn't. You're right, though. The standard should have included a tonumber() macro/function, usually defined like this:
    Code:
    #define tonumber(c) (isdigit(c) ? (c) - '0' : -1)
    Well, they didn't. Too late now.
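    A function version of that macro might look like the sketch below. ToNumber is a hypothetical name. Unlike the macro, it evaluates its argument only once, and it converts to unsigned char before calling std::isdigit, because passing a negative char value to that function is undefined behaviour.

```cpp
#include <cctype>

// Hypothetical function form of the tonumber macro:
// returns 0..9 for a digit character, -1 for anything else.
inline int ToNumber(char c)
{
    unsigned char uc = static_cast<unsigned char>(c);
    return std::isdigit(uc) ? uc - '0' : -1;
}
```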

    And if '0' - '9' have to be contiguous, why can 'a' - 'z' be non-contiguous? . . .

  11. #11
    Registered User
    Join Date
    Oct 2001
    Posts
    2,934
    Quote Originally Posted by mike_g View Post
    but if someone was bent on creating some kind of character encoding that did not have contiguous numbers it still should be possible.
    Then the C++ implementor (compiler writer) would not be able to use that character set.

  12. #12
    Registered User
    Join Date
    Oct 2001
    Posts
    2,934
    Quote Originally Posted by dwks View Post
    And if '0' - '9' have to be contiguous, why can 'a' - 'z' be non-contiguous? . . .
    IBM?

  13. #13
    Registered User
    Join Date
    Jan 2005
    Posts
    7,366
    >> And if '0' - '9' have to be contiguous, why can 'a' - 'z' be non-contiguous?
    Because there are many character sets that don't just use the English 'a' - 'z', but there aren't any number systems that use '0' - '9' in a different order or with extra characters inserted. I don't know if that is the real reason, but it's good enough for me.
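    Since the letters carry no such guarantee (in EBCDIC, the encoding used on IBM mainframes, 'a' - 'z' fall into separate ranges), c - 'a' is not a portable way to get a letter's position. One possible workaround is a lookup in an explicit alphabet string. LetterIndex is a hypothetical name and this is only a sketch.

```cpp
#include <string>

// Hypothetical helper: 0-based position of a lowercase English letter,
// or -1 if c is not one. It does not rely on the letters having
// contiguous character codes.
int LetterIndex(char c)
{
    static const std::string alphabet = "abcdefghijklmnopqrstuvwxyz";
    std::string::size_type pos = alphabet.find(c);
    return pos == std::string::npos ? -1 : static_cast<int>(pos);
}
```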

  14. #14
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    but there aren't any number systems that use '0' - '9' in a different order or with extra characters inserted.
    What about Roman Numerals? Okay, that's a bit far-fetched, but what about hexadecimal?

    I know what you mean, though.

  15. #15
    Internet Superhero
    Join Date
    Sep 2006
    Location
    Denmark
    Posts
    964
    Quote Originally Posted by Daved
    When you use a character as an integer, the character's character code is used instead. If you have a digit like '1', you have to convert it to the actual number 1 before you use it in your calculation.

    To do this, simply subtract the character '0' from the character in the string and it will give you the correct number.
    Thanks a lot, the digit sum function is working correctly now.

    Quote Originally Posted by brewbuck View Post
    It's not really guaranteed that the characters '0' through '9' have contiguous codes, either. So strictly, subtracting '0' isn't guaranteed to work either. The most portable way to convert from characters to numbers is sscanf() or atoi(), but those work on strings, not single characters, so you have to stick the character in a null-terminated string first.

    Of course, I can't think of an example of a real-world character set where '0' through '9' are not contiguous and in-order, so it's not the biggest deal in the world.
    Isn't there a function for converting a single char to an int? Or couldn't I just cast the char to an int? Would that help?
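    On the cast question, a small sketch may help show the difference. A cast only changes the type, so it yields the character code (49 for '1' on an ASCII system), which is the same value the original StrTotal was adding up. CharCode and DigitValue are hypothetical names used for illustration.

```cpp
// Hypothetical helper: a cast keeps the character code unchanged.
int CharCode(char c)
{
    return static_cast<int>(c);
}

// Hypothetical helper: subtracting '0' yields the digit's numeric value.
int DigitValue(char c)
{
    return c - '0';
}
```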

    Last Post: 01-26-2002, 11:41 AM