Thread: C++ string contents

  1. #1
    Registered User
    Join Date
    Jul 2006
    Posts
    10

    C++ string contents

    Code:
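    #include <iostream>
    #include <string>
    using namespace std;
    
    int main() {
        string str = "Hello\0World";
        const char *c_str = str.c_str();
    
        // Stop at the terminating NUL (index str.size()); reading past it is undefined behavior.
        for(string::size_type i = 0; i <= str.size(); i++) {
            cout << i << ": " << *c_str++ << endl;
        }
    
        return 0;
    }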
    I've noticed with this program that the C string returned by c_str() doesn't include anything after the null character. But does the C++ string itself include the "World" substring? Does the C++ string even include the '\0'?

  2. #2
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    I believe the answer to your question is "it depends". The internal storage of a standard class is not defined beyond the obvious - if it says that it's storing a string, it will store a string - but there's nothing saying how the string is actually stored - for all we know, "Hello, World" could be stored as a linked list of one character per node, or it could be stored backwards. c_str() may well return a pointer to a different piece of data than the "actual string".

    Of course, most of what I describe is pretty unrealistic - but the fact is that YOU CAN'T RELY on what's inside a string class unless you write it yourself (besides the defined functionality of the functions and operators defined by the external class declaration of course). It may vary depending on which compiler/library you use. It can change from one version of compiler/library to another.

    It's quite possible that the string length is stored, and _UNLESS_ you call c_str(), there is no zero at the end of the string. Note that the c_str() returns a const - you can't CHANGE the content of the string with the returned value, so it's entirely possible for the implementation to make a duplicate of the "actual string" and return the address of this duplicate as the result to c_str()....

    --
    Mats

  3. #3
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    Actually, the requirements on string storage are pretty strict, due to the semantics of the data() function. In fact, there is a motion to make that explicit in the next standard.

    As it is now, string storage must be a contiguous block of memory, and the string must be stored in normal order in there. No linked list, no reverse storing. In addition, the string must allow embedded nuls, meaning that it must store the length. There is no requirement that the string is nul-terminated, though. Only the one returned from c_str() must be.
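
    A minimal sketch of the "must store the length" point, assuming nothing beyond the standard library. The helper names (make_with_embedded_nul, stored_length, c_length) are made up for illustration. The string is built by appending, so no C-string constructor gets a chance to stop at the nul.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Builds "Hello", one NUL character, then "World" by appending,
// so nothing is cut off at the NUL.
std::string make_with_embedded_nul() {
    std::string s = "Hello";
    s += '\0';     // appends a single char; size becomes 6
    s += "World";  // size becomes 11
    return s;
}

// Length as std::string reports it (taken from the stored length).
std::size_t stored_length(const std::string& s) {
    return s.size();
}

// Length as a C function sees it: counts only up to the first NUL.
std::size_t c_length(const std::string& s) {
    return std::strlen(s.c_str());
}
```

    For the string built here, stored_length gives 11 while c_length gives 5, which is the difference between what the C++ string holds and what a C-string view of it can show.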
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

  4. #4
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    I think I sort of misread your question: There's a NUL character in the middle of your string, between Hello and World - this terminates the C-string when used to construct the C++-string. It's not ever seeing the "World" bit of that string (it's not that it couldn't, if the constructor wanted to read beyond the '\0' - but what would happen if you actually just intended it to contain "Hello\0" and that is the very last byte of a valid memory region, with the very next address in memory resulting in an invalid memory access?)

    So, the conclusion is: You can't store zeros inside a C++-string when you construct it with a string - it is possible that you can store a zero inside the string at other points (e.g. using the array operator method) - but it's actually fairly unlikely that this would be useful.
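
    A small sketch of the array-operator route mentioned above. The helper with_nul_at is a hypothetical name used only for this example, and it assumes pos is less than s.size().

```cpp
#include <string>

// Overwrites the character at index pos with NUL via operator[].
// Hypothetical helper for illustration; pos must be less than s.size().
std::string with_nul_at(std::string s, std::string::size_type pos) {
    s[pos] = '\0';  // replaces one character; the length is unchanged
    return s;
}
```

    Calling with_nul_at("Hello World", 5) gives a string whose size() is still 11, with the NUL at index 5 and "World" intact after it.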

    --
    Mats

  5. #5
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    You can store them when you construct with a string, but you have to pass the length explicitly.

  6. #6
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by CornedBee View Post
    You can store them when you construct with a string, but you have to pass the length explicitly.
    Ah, ok (and thanks for the clarification on data() - which in the implementation I looked at actually calls "c_str()"). So one could do
    Code:
    string("Hello\0world", 11);
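
    To compare the two constructors side by side, here is a rough sketch. The names kLiteral, from_c_string and from_pointer_and_length are invented for the example. It relies on sizeof of the literal array being 12 (11 characters plus the final terminator).

```cpp
#include <string>

// The literal is an array holding the embedded NUL and a final NUL:
// 12 chars in total, 11 of them before the terminator.
const char kLiteral[] = "Hello\0World";

// Constructor taking only a const char*: reads up to the first NUL.
std::string from_c_string() {
    return std::string(kLiteral);
}

// Constructor taking pointer + count: copies exactly that many chars,
// NULs included.
std::string from_pointer_and_length() {
    return std::string(kLiteral, sizeof(kLiteral) - 1);
}
```

    The first function yields a 5-character string, the second an 11-character string with the NUL at index 5.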
    --
    Mats

  7. #7
    Registered User
    Join Date
    Oct 2001
    Posts
    2,129
    so how would you get the full string back into a C-string?

  8. #8
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by robwhit View Post
    so how would you get the full string back into a C-string?
    You can't, if it contains zero bytes. By definition the zero byte terminates the C-string.

  9. #9
    Registered User
    Join Date
    Oct 2001
    Posts
    2,129
    well ok, an array.

  10. #10
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by robwhit View Post
    well ok, an array.
    The data() method returns a pointer to the data. Along with the size returned by size(), you can use this to get at the data.
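
    As a sketch of that approach: the function below (byte_values is a hypothetical name) walks the range from data() to data() + size() and records each byte as a number, so an embedded NUL shows up as 0 instead of ending the loop.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Collects the numeric value of every byte in [data(), data() + size()).
std::vector<int> byte_values(const std::string& s) {
    const char* p = s.data();
    std::vector<int> out;
    for (std::size_t i = 0; i < s.size(); ++i) {
        // Go through unsigned char so bytes above 127 are not negative.
        out.push_back(static_cast<unsigned char>(p[i]));
    }
    return out;
}
```

    The loop bound comes from size(), never from looking for a terminator, which is the whole point when the data may contain zero bytes.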

  11. #11
    Registered User
    Join Date
    Oct 2001
    Posts
    2,129
    thanks.

  12. #12
    Registered User
    Join Date
    Jan 2005
    Posts
    7,366
    And if you want your own copy, then use memcpy to copy it, not strcpy which would stop at the first embedded null. You might be able to use strncpy, I'm not sure, but I'd stick with memcpy because you aren't copying a C string, so you shouldn't be using C string functions.

    Remember that if you want to modify the data, you need your own copy. You're not really allowed to modify the return value of data().
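
    A minimal sketch of the memcpy copy, assuming a std::vector<char> is an acceptable place to hold your own copy. The function name copy_bytes is made up for the example.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Returns a private, modifiable copy of all bytes of s,
// embedded NULs included.
std::vector<char> copy_bytes(const std::string& s) {
    std::vector<char> buf(s.size());
    if (!buf.empty()) {
        // memcpy copies exactly s.size() bytes and ignores NULs.
        std::memcpy(&buf[0], s.data(), s.size());
    }
    return buf;
}
```

    The returned vector can be modified freely without touching the original string.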

  13. #13
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    strncpy wouldn't work. The count passed to it is an upper limit. It would still stop at NUL.
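
    A quick way to see this, sketched with a hypothetical helper copy_with_strncpy: the destination is pre-filled with '#' so it is visible what strncpy wrote. Once strncpy reaches the NUL in the source it stops copying and fills the rest of the count with NULs.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Copies with strncpy into a buffer of s.size() chars that starts out
// filled with '#', so the positions strncpy wrote can be inspected.
std::vector<char> copy_with_strncpy(const std::string& s) {
    std::vector<char> buf(s.size(), '#');
    if (!buf.empty()) {
        // Stops reading the source at its first NUL, then pads with NULs.
        std::strncpy(&buf[0], s.c_str(), buf.size());
    }
    return buf;
}
```

    For an 11-character string holding "Hello", a NUL, and "World", the buffer ends up as "Hello" followed by six NULs, so "World" is lost.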

  14. #14
    Registered User
    Join Date
    Jul 2006
    Posts
    10
    Thanks for the replies.
