Thread: Casting a vector to a string

  1. #1
    すまん Hikaru's Avatar
    Join Date
    Aug 2006
    Posts
    46

    I have a simple string class that uses a vector to hold the characters.
    Code:
    #include <algorithm>
    #include <cctype>
    #include <cstring>
    #include <iostream>
    #include <iterator>
    #include <vector>
    
    using std::istream;
    using std::ostream;
    using std::isspace;
    using std::copy;
    
    class String
    {
    public:
        friend istream& operator>>(istream& is, String& s);
        friend ostream& operator<<(ostream& os, const String& s);
    
        // include trailing 0 for c_str()
        String(const char* s = ""): str(s, s + (std::strlen(s) + 1)) {}
    
        char& operator[](int index) { return str[index]; }
        const char& operator[](int index) const { return str[index]; }
        String& operator+=(const String& s);
    
        // exclude trailing 0 in size
        int size() const { return str.size() - 1; }
    
        const char* c_str() const { return static_cast<const char*>(&str[0]); }
    private:
        std::vector<char> str;
    };
    
    istream& operator>>(istream& is, String& s)
    {
        // empty the existing string
        s.str.clear();
    
        char c;
    
        // read and discard leading whitespace
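        // (the unsigned char cast keeps isspace defined for negative char values)
        while (is.get(c) && isspace(static_cast<unsigned char>(c)))
        {
            // nothing to do
        }
    
        if (is)
        {
            do
            {
                s.str.push_back(c);
            }
            while (is.get(c) && !isspace(static_cast<unsigned char>(c)));
    
            // if we read whitespace, put it back on the stream
            if (is)
            {
                is.unget();
            }
        }
    
        // always restore the trailing 0, even if nothing was read,
        // so that size() and c_str() stay valid
        s.str.push_back(0);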
    
        return is;
    }
    
    ostream& operator<<(ostream& os, const String& s)
    {
        // exclude trailing 0 in drawing
        copy(s.str.begin(), s.str.end() - 1, std::ostream_iterator<char>(os));
        return os;
    }
    
    String& String::operator+=(const String& s)
    {
        const std::vector<char> tail(s.str); // copy first: s may be *this
        str.pop_back(); // drop our trailing 0; tail brings its own
        str.insert(str.end(), tail.begin(), tail.end());
        return *this;
    }
    
    String operator+(const String& s, const String& t)
    {
        String r = s;
        r += t;
        return r;
    }
    It works great when I try it out, but I'm not sure if the way I do c_str() is right. Can I assume that a vector can be used like an array if I cast the first character into a pointer?
    Code:
    const char* c_str() const { return static_cast<const char*>(&str[0]); }

  2. #2
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by Hikaru
    It works great when I try it out, but I'm not sure if the way I do c_str() is right. Can I assume that a vector can be used like an array if I cast the first character into a pointer?
    If I remember correctly, that should work.

    By the way, why not exclude the null character terminator from the vector of chars? You should be able to append it for c_str() when needed.

  3. #3
    Registered User
    Join Date
    Jan 2005
    Posts
    7,366
    >> Can I assume that a vector can be used like an array if I cast the first character into a pointer?
    Yes. http://www.parashift.com/c++-faq-lit....html#faq-34.3

    BTW, is the cast even necessary?
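
    A minimal sketch of both points (not from the thread): the elements of a std::vector<char> are stored contiguously, so the address of the first element can be passed to C string functions, and no cast is needed because &v[0] on a const vector is already a const char*. The helper names as_c_string and make_terminated are hypothetical, and the sketch assumes the vector is non-empty and ends with '\0'.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Hypothetical helper, not part of the String class in the thread.
// Precondition: v is non-empty and its last element is '\0'.
const char* as_c_string(const std::vector<char>& v)
{
    assert(!v.empty() && v.back() == '\0');
    // v[0] is a const char& here, so &v[0] is already a const char*.
    return &v[0];
}

// Hypothetical helper: builds a terminated vector the same way the
// String constructor in post #1 does.
std::vector<char> make_terminated(const char* s)
{
    return std::vector<char>(s, s + std::strlen(s) + 1); // +1 keeps the '\0'
}
```

    With std::vector<char> v = make_terminated("hello"), std::strlen(as_c_string(v)) gives 5 and as_c_string(v) + 4 equals &v[4]. The pointer stops being valid as soon as the vector reallocates, for example after a push_back that grows it.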
    Last edited by Daved; 08-20-2006 at 01:14 PM.

  4. #4
    すまん Hikaru's Avatar
    Join Date
    Aug 2006
    Posts
    46
    Quote Originally Posted by laserlight
    By the way, why not exclude the null character terminator from the vector of chars? You should be able to append it for c_str() when needed.
    I couldn't figure out how to append it for c_str() without causing problems the rest of the time.
    Quote Originally Posted by Daved
    BTW, is the cast even necessary?
    It compiles without the cast, but that might just be my compiler.

  5. #5
    Registered User
    Join Date
    Jun 2005
    Posts
    6,815
    Quote Originally Posted by Hikaru
    I couldn't figure out how to append it for c_str() without causing problems the rest of the time.
    The way you do it is for c_str() to return a pointer to something else that is managed by the class.

    For example;
    Code:
    class String
    {
         // all your other stuff
    
         private:
            std::vector<char> str;
            mutable std::vector<char> cstr;
    };
    
    // and have c_str() work like this
    
    const char *String::c_str() const
    {
         //   this implementation isn't particularly efficient, but shows the idea
         cstr = str;
         cstr.push_back('\0');   // the null terminator, not the digit '0'
         return &cstr[0];
    }
    There are advantages and disadvantages to doing this. The obvious disadvantage is that your String class consumes more memory than it would otherwise, and there is a performance hit every time you call the c_str() method (although it is possible to implement c_str() more smartly than I have here to reduce that). An obvious advantage is that it reduces, although it doesn't eliminate, the chance of malicious (or ignorant) code overwriting your actual string data through your c_str() method. For example;
    Code:
         //  initialise a String x containing "ABCDEFGHI"
         char *data = (char *)x.c_str();
         strcpy(data, "Hello");
         std::cout << x;       // will still print "ABCDEFGHI"
    It also gives a feature (which is either a benefit or a hindrance depending on how you look at it): the previous version of the string data is preserved across modifying operations, until c_str() is called again. For example;
    Code:
         // assume we have a string x
         const char *data(x.c_str());
         std::cout << data << ' ' << x << '\n';    // will print contents of x twice
         x += "A";     // an operation that changes x
         // note we have not used the c_str() method again
         std::cout << data << ' ' << x << '\n';    //  will print old then new versions of x
    This can be useful if you want to implement fallback semantics if some operation fails.
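
    The fragments in this post can be put together into a small compilable sketch. This is only one possible reading of the idea, not the poster's actual class: the name SnapshotString is made up, the characters are stored without a terminator, and operator+= is written here just so there is something that modifies the string.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Sketch of the "separate buffer" idea; SnapshotString is a made-up name.
class SnapshotString
{
public:
    SnapshotString(const char* s = "") : str(s, s + std::strlen(s)) {}

    SnapshotString& operator+=(const SnapshotString& s)
    {
        // Copy first so that x += x never inserts a range of str into itself.
        const std::vector<char> tmp(s.str);
        str.insert(str.end(), tmp.begin(), tmp.end());
        return *this;
    }

    std::size_t size() const { return str.size(); }

    // Rebuilds the terminated copy on every call; simple, not efficient.
    const char* c_str() const
    {
        cstr = str;
        cstr.push_back('\0');
        return &cstr[0];
    }

private:
    std::vector<char> str;          // the characters, no terminator
    mutable std::vector<char> cstr; // terminated copy handed out by c_str()
};
```

    After SnapshotString x("abc"); const char* data = x.c_str(); x += "d"; the pointer data still reads "abc" while x.size() is 4. The old pointer should be treated as invalid once c_str() is called again, because rebuilding cstr may reallocate it.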
    Quote Originally Posted by Hikaru
    It compiles without the cast, but that might just be my compiler.
    It is not just your compiler. The cast is not required.
    Last edited by grumpy; 08-20-2006 at 07:23 PM.

  6. #6
    Registered User
    Join Date
    Jan 2005
    Posts
    7,366
    I'd consider the extra vector for a debug version, but not for a release version. As laserlight suggested you can push_back the null if and only if the user calls c_str(). You can also call reserve when the string is lengthened to make sure room for at least one extra character is always available.

  7. #7
    Registered User
    Join Date
    Jun 2005
    Posts
    6,815
    Quote Originally Posted by Daved
    I'd consider the extra vector for a debug version, but not for a release version.
    I wouldn't agree with that. Having a design that is fundamentally different for debug and release versions is highly problematical.
    Quote Originally Posted by Daved
    As laserlight suggested you can push_back the null if and only if the user calls c_str(). You can also call reserve when the string is lengthened to make sure room for at least one extra character is always available.
    Sure, you can do it. But there are a few problems to be resolved. c_str() is a const method, which means it can't change any (non-mutable non-static) members of an object. str is not mutable, and str.push_back('\0') or equivalent changes the state of str. Calling reserve() does not change that.

    And then there is the problem of when you remove the trailing zero, particularly if some of your member functions assume it is not there.

  8. #8
    Registered User
    Join Date
    Jan 2005
    Posts
    7,366
    >> Having a design that is fundamentally different for debug and release versions is highly problematical.
    I would agree with that. I guess I wouldn't consider it for either then.

    >> str is not mutable,
    Make it mutable.

    >> And then there is the problem of when you remove the trailing zero.
    Once it is there, never, at least not until the string is modified again.
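
    As a rough sketch only, the lazy approach described here might look like the following. The names LazyString, buf, terminated and drop_terminator are all hypothetical, and the flag is an assumption added to track whether the terminator is currently stored. It also shows the cost raised earlier in the thread: every modifying function has to remember to drop the terminator first.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Sketch of the lazy-terminator idea; all names here are made up.
class LazyString
{
public:
    LazyString(const char* s = "") : buf(s, s + std::strlen(s)), terminated(false) {}

    void push_back(char c)
    {
        drop_terminator();   // every modifying function must start with this
        buf.push_back(c);
    }

    // The terminator, when present, is not counted.
    std::size_t size() const { return buf.size() - (terminated ? 1 : 0); }

    const char* c_str() const
    {
        if (!terminated)
        {
            buf.push_back('\0'); // allowed in a const function because buf is mutable
            terminated = true;
        }
        return &buf[0];
    }

private:
    void drop_terminator()
    {
        if (terminated)
        {
            buf.pop_back();
            terminated = false;
        }
    }

    mutable std::vector<char> buf; // characters, plus '\0' only while terminated is true
    mutable bool terminated;
};
```

    Calling c_str() twice in a row appends the terminator only once, and size() reports the same value before and after the call, so the logical state seen by callers does not change.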

    I was under the impression that something like this is how most STL implementations make the STL string class. I'll be curious to see if that's the case.

  9. #9
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    Actually, most STL implementations (that is, the two I looked at: libstdc++ and MS's implementation) keep the string NUL-terminated all the time. It's just not worth the management effort not to do it.

  10. #10
    Registered User
    Join Date
    Jun 2005
    Posts
    6,815
    Quote Originally Posted by Daved
    >> str is not mutable,
    Make it mutable.
    A simple minded and sloppy answer.

    Having a class with a single data member, and having that member mutable, makes it pointless to use const as a qualifier on any functions. A mutable member can be modified regardless of const attributes of objects, member functions, or arguments of functions.

    And allowing str to be changed by any member function brings back the basic problem that every function (even const ones) needs to be designed to ensure it works correctly (i.e. all preconditions and postconditions for an operation hold).

    Having a function that adds a trailing zero for one purpose, and another function that assumes the trailing zero is not there is asking for trouble. That is a half-way approach between having it there all the time, or never having it there. If it is decided not to have the trailing zero internally [and that is an equally valid choice], but a user (eg of the c_str() method) requires it, then a mechanism like I suggested earlier is one way to achieve that.
    Quote Originally Posted by CornedBee
    Actually, most STL implementations (that is, the two I looked at: libstdc++ and MS's implementation) keep the string NUL-terminated all the time. It's just not worth the management effort not to do it.
    That is true, if all you're storing in a std::string is basic C-style strings.

    It changes a bit in more general cases though. The specification of std::string (or, more accurately std::basic_string, which std::string is a specialisation of) in the standard, allows a string to contain embedded zeros (in effect, with a bit of work, a std::string can be used to store multiple C-style strings end to end).
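
    To illustrate the embedded-zero point, here is a small sketch (the helper name pack_two is hypothetical) that stores two C-style strings end to end in one std::string, each with its own terminator. It relies only on the append(const char*, count) overload, which copies exactly count characters, zeros included.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical helper: packs two C-style strings end to end,
// each keeping its own '\0'.
std::string pack_two(const char* first, const char* second)
{
    std::string packed;
    packed.append(first, std::strlen(first) + 1);   // +1 copies the terminator too
    packed.append(second, std::strlen(second) + 1);
    return packed;
}
```

    For pack_two("ab", "cde") the result has size() 7, while std::strlen on its c_str() stops at the first embedded zero and reports 2. The second string starts at c_str() + 3.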

  11. #11
    Registered User
    Join Date
    Jan 2005
    Posts
    7,366
    >> A simple minded and sloppy answer.
    Really? Simple, maybe. Simple-minded? I'll assume you didn't mean that as an insult, normally I would read that as "an answer from a simple mind". Sloppy? Perhaps, although you failed to mention something I hadn't already considered, so perhaps not. I prefer to call it "to the point".

    >> Having a class with a single data member, and having that member mutable, makes it pointless to use const as a qualifier on any functions.
    The const qualifier on a member function really indicates whether that function will change the logical state of the object. So how does having only a single data member that is mutable make it pointless? The logical state of that object can still be left changed or unchanged, and the const qualifier still indicates which it will be.

    >> And allowing str to be changed by any member function brings back the basic problem that every function (even const ones) needs to be designed to ensure it works correctly (i.e. all preconditions and postconditions for an operation hold).
    True. As with any other option, there are points for and against this approach. If this particular approach provides significant performance benefits, then you should be willing to accept the design deficiencies, especially for a class that will be extended or enhanced rarely if ever.

    >> If it is decided not to have the trailing zero internally [and that is an equally valid choice], but a user (eg of the c_str() method) requires it, then a mechanism like I suggested earlier is one way to achieve that.
    And my suggestion is another way. I honestly can't say that you've done anything to convince me that your suggestion is more appropriate than mine, and I'm not trying to imply that my suggestion is the best. In fact, I find it hard to believe that you actually think that your suggestion should even be considered. Do you realize how often strings are used in most code? Don't you think the performance penalty of doubling the amount of memory used is a bit much just for the sake of a c_str() function? Do you really feel that such a performance hit is less of a concern than the added complexity of having a mutable member?

    >> That is true, if all you're storing in a std::string is basic C-style strings. It changes a bit in more general cases though.
    How much more general can you get than an implementation of the actual standard C++ string class? CornedBee was referring to actual implementations of the standard C++ string class, which include the ability to contain embedded nulls. That doesn't have much relevance to the question of how to implement c_str().

    We're throwing out ideas here on how to do that implementation, and what way is the best. Obviously, the OP's way is certainly acceptable, seeing as it has been used by more than one popular library implementor. Whether the other suggestions are acceptable as well is still up for debate.
