Thread: Modifying a string literal.

  1. #31
    The Dragon Reborn
    Join Date
    Nov 2009
    Location
    Dublin, Ireland
    Posts
    629
    Quote Originally Posted by laserlight View Post
    The C++ standard states that attempting to modify a string literal results in undefined behaviour. Attempting to modify a string literal means attempting to assign to any of its characters, including the null character. What else do you not understand?
    yes ok I understand, writing new values to existent data in a string is string modification. But why should we not modify it? It should be safe as long as the string does not go out of bounds.

    even if i had a pointer to a string as such
    Code:
       char *s = "String";
    
          while(*s!='\0')
          {
    
              *s = 'H' ;
               s++ ;
          }
    that should rewrite the string contents to all 'H'....
    it didn't go out of bounds or access out of bound memory.
    The rule "You should not modify a string" seems too restrictive then. I mean, the programmer should know he/she shouldn't go out of bounds or something....
    You ended that sentence with a preposition...Bastard!

  2. #32
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by Eman
    But I don't think that is the outcome you want, because you want to prove to me that a write to a string literal will not always give me a seg fault..
    Just accept that as true.

    Quote Originally Posted by Eman
    if Borland doesn't work should I try Dev C++ then? the last of the compilers
    Given enough time and motivation, I can implement a C++ compiler to do precisely that ("a write to a string literal will not always give me a seg fault" (or more generally, crash the program)), and yet conform to the C++ standard. But that would be pointless.

    Quote Originally Posted by Eman
    So a pointer to a string is a literal. But an array of characters or a string object is not a literal.
    No. A pointer to a string is not a string literal. An array of characters might be a string literal since all string literals are arrays of characters. A std::string object is not a string literal.

    Quote Originally Posted by Eman
    But why should we not modify it? It should be safe as long as the string does not go out of bounds.
    It is not safe because an implementation may place the contents of the string literal in memory that is read-only, or do various other things that assume that string literals are immutable. In other words, any write to a string literal is out of bounds, regardless of where you are writing.
    Last edited by laserlight; 12-27-2010 at 01:17 PM.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  3. #33
    Registered User
    Join Date
    Aug 2010
    Location
    Poland
    Posts
    733
    Quote Originally Posted by Eman View Post
    the code compiled perfectly, both of them. But on runtime it crashed.
    "Unhandled exception, Access violation.."
    But I don't think that is the outcome you want, because you want to prove to me that a write to a string literal will not always give me a seg fault..

    if Borland doesn't work should I try Dev C++ then? the last of the compilers :O haha
    Borland does, but it does not matter, just leave it, it shouldn't work anyway.

    Quote Originally Posted by Eman View Post
    So a pointer to a string is a literal. But an array of characters or a string object is not a literal.
    No, a pointer to a string is not a literal. The literal is the constant character array that corresponds to the quoted text in your source and is stored somewhere in memory. You can safely do:

    Code:
    const char* str = "Hello world";
    Unless you do something stupid like:

    Code:
    strcat(const_cast<char*>(str), "!!!!");
    Code:
    char* str = some_string;
    str does not point to a literal, unless some_string does.

    But why should we not modify it? It should be safe as long as the string does not go out of bounds.
    Because the standard forbids it. Treat the C++ standard as a bible: if it says you must not do something, you must not.

    EDIT:
    My example crashed, so why do you ask?
    Last edited by kmdv; 12-27-2010 at 01:23 PM.

  4. #34
    The Dragon Reborn
    Join Date
    Nov 2009
    Location
    Dublin, Ireland
    Posts
    629
    Quote Originally Posted by laserlight View Post
    No. A pointer to a string is not a string literal. An array of characters might be a string literal since all string literals are arrays of characters.
    if all string literals are an array of characters then..
    why is Str:

    Code:
         string str = "Hello" ;
    not a string literal? It stores an array of characters after all. And yet I know str is an object of type String..

    How do I differentiate both, what a string literal is?

  5. #35
    Registered User
    Join Date
    Aug 2010
    Location
    Poland
    Posts
    733
    Ok, I think you should get back to the first reply and read everything again. It has been explained two if not more times here.

    Character Sequences
    How to char* = "string literal" appropri... - C++ Forums

  6. #36
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by Eman View Post
    if all string literals are an array of characters then..
    why is Str:

    Code:
         string str = "Hello" ;
    not a string literal? It stores an array of characters after all. And yet I know str is an object of type String..

    How do I differentiate both, what a string literal is?
    Because "a string literal is a sequence of characters surrounded by double quotes". More generally, you are committing a converse error: if x is a string literal, then x is an array of characters, but if x is an array of characters, it does not follow that x is a string literal. Actually, worse still: you are not even claiming that every array of characters is a string literal. You are claiming that every object that stores an array of characters is a string literal.

  7. #37
    The Dragon Reborn
    Join Date
    Nov 2009
    Location
    Dublin, Ireland
    Posts
    629
    Quote Originally Posted by kmdv View Post
    My example crashed, so why do you ask?
    You said an illegal write might not give a seg fault.
    Laserlight said I should just accept it, so I am just going to struggle with my partial OCD and just accept it (itching a lot ).

    still a string literal is something - an array of characters enclosed in double quotes.

    How do I tell which one can be written to.

    so
    char *s cannot be written to because it is definitely a read only memory.

    char s[] definitely not rewritable? compiler implicitly sets the length of the array. Although it is still an array but why is not modifiable.

    Oh my God. I am so confused now.

    I have done C assignments to replace a character in a string or to convert from lowercase to upper. Why was it safe to modify them..in fact, who said it was?

    which type of literals should I not modify?

    The ones where I didn't explicitly set the length?

  8. #38
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    Quote Originally Posted by Eman View Post
    It stores an array of characters after all.
    That's a pretty big assumption to make. It's probably true, in that that's the most "natural" way for a std::string object to work, but in general the people who make the library can have it do whatever they want (as long as the interface works the way it's supposed to).

    If you mean that "I have an array of characters on the right", that's true: the conversion from an array of characters to a string object is done by a std::string constructor, since this is an initialisation rather than an assignment.

    And again: a string literal is a bunch of characters inside quote marks. Anything else = not a string literal. By definition any variable is not a string literal. (As mentioned above, a pointer variable can point to a string literal, or more precisely where a string literal is being stored in memory, but it itself is not a string literal.)

  9. #39
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by Eman
    so
    char *s cannot be written to because it is definitely a read only memory.

    char s[] definitely not rewritable? compiler implicitly sets the length of the array. Although it is still an array but why is not modifiable.
    Consider this program:
    Code:
    int main()
    {
        char name[] = "Eman";
        char* str = name;
    }
    "Eman" is a string literal, and it is an array of 5 const char. name is an array of 5 char that is initialised with "Eman". You can assign to the characters of name. str is a pointer to char that points to the first character of name. Since name is mutable, you can modify the contents of name through str. Now consider:
    Code:
    int main()
    {
        char* str = "Eman";
    }
    "Eman" is a string literal, and it is an array of 5 const char. str is a pointer to char that points to the first character of "Eman". Since str points to the first character of a string literal, and string literals are immutable, you cannot modify the contents of the string literal through str, at least not without having undefined behaviour.

    Consequently, we would more correctly write:
    Code:
    int main()
    {
        const char* str = "Eman";
    }
    That the missing const is allowed at all is purely for historical reasons. Nonetheless, observe that str is a pointer to char in both of the first two programs, yet in one it can be used to write to an array of characters and in the other it cannot. It boils down to the fact that in the former you are dealing with a mutable array of characters, while in the latter you are dealing with an immutable string literal. The fact that a pointer is used does not matter.
    Last edited by laserlight; 12-27-2010 at 01:41 PM.

  10. #40
    The Dragon Reborn
    Join Date
    Nov 2009
    Location
    Dublin, Ireland
    Posts
    629
    hold on, maybe I get it.

    A string literal is one that is stored physically in memory, I don't know how to explain it.
    Yes it is an array of characters, but it isn't in a physical data structure like an array so it isn't safe to modify. It is directly in memory?

    But if a string is copied into an array data structure, since an array is mutable as you say it is safe to modify.

    aargh like in assembly..i did a little bit.

    You can store something into a register..and there are data that is in memory
    but you can't do this

    Code:
       mov  (%edi), (%esi)
    I hope you get what I mean...

    EDIT:
    maybe that doesn't make sense after all. Can't remember the code that was used, but it was similar to that

  11. #41
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by Eman
    A string literal is one that is stored physically in memory
    No. I repeat, "a string literal is a sequence of characters surrounded by double quotes". Certainly, the contents of a string literal would be stored physically in memory, but so would the contents of any other string.

    Quote Originally Posted by Eman
    Yes it is an array of characters, but it isn't in a physical data structure like an array so it isn't safe to modify.
    The problem is that you are delving into implementation details that can even depend on things like optimisation level. The point is, a string literal is of type const char[N]. Due to the const, trying to assign to the characters of a string literal results in undefined behaviour.

    Quote Originally Posted by Eman
    But if a string is copied into an array data structure, since an array is mutable as you say it is safe to modify.
    Yes, if the array is an array of non-const char.

  12. #42
    The Dragon Reborn
    Join Date
    Nov 2009
    Location
    Dublin, Ireland
    Posts
    629
    Quote Originally Posted by laserlight View Post
    No. I repeat, "a string literal is a sequence of characters surrounded by double quotes". Certainly, the contents of a string literal would be stored physically in memory, but so would the contents of any other string.


    The problem is that you are delving into implementation details that can even depend on things like optimisation level. The point is, a string literal is of type const char[N]. Due to the const, trying to assign to the characters of a string literal results in undefined behaviour.


    Yes, if the array is an array of non-const char.
    oh right in fact that kind of makes sense.

    it is funny though if the compiler implicitly says char *string is const char *string it should stop me from modifying the values as if i explicitly said it was a const.

    I just hope I remember all this. I don't know how you guys do it :P

  13. #43
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Quote Originally Posted by Eman View Post
    it is funny though if the compiler implicitly says char *string is const char *string it should stop me from modifying the values as if i explicitly said it was a const.
    It should, but unfortunately it doesn't. The reason is historical: the conversion from a string literal to char* is a special exception kept for compatibility with old code. So you can forget about it and treat every string literal as an array of const char.

    I will try to repeat what has been pointed out, in the hope of bringing you some clarity.

    A string literal is an expression that starts with a double quote, followed by a series of characters, and ends with a double quote. Thus,

    "Hello World" is a string literal.
    char* s = "Hello World", s is not a string literal, and again, "Hello World" is a string literal.
    char s[] = "Hello World", s is not a string literal, and again, "Hello World" is a string literal.
    std::string s = "Hello World", s is not a string literal, and again, "Hello World" is a string literal.
    std::string s("Hello World"), s is not a string literal, and again, "Hello World" is a string literal.

    Now, onto modifying:

    Modifying basically means writing a new value to a memory location. It doesn't matter where. You are modifying address X if you are writing to address X.
    If something happens to be in that memory location, then you are modifying that something.
    If a string literal happens to occupy a series of memory locations, including X, then you are modifying that string literal.
    So...

    Code:
    const_cast<char*>("Xello World")[0] = 'H';
    "Xello World" is a string literal. Its type is const char[N]. Thus, we cannot modify it directly; we have to remove the const first. That's why we use const_cast. However, note again that we are modifying a string literal here! So this is undefined behavior. You may get a crash. But then again, maybe not. That's the meaning of undefined.

    Code:
    char* s = "Xello World";
    s[0] = 'H';
    Again, undefined behavior because s points to a string literal. That means we will try to modify a memory location occupied by the string literal. Hence, this is undefined behavior!

    Code:
    char s[] = "Xello World";
    s[0] = 'H';
    Code:
    std::string s = "Xello World";
    s[0] = 'H';
    Code:
    std::string s("Xello World");
    s[0] = 'H';
    None of these three examples has undefined behavior. Let us examine why.
    In the first example, the compiler will copy the contents of the string literal into the buffer s. s cannot possibly be a string literal (see above). Therefore, it is safe.
    The last two examples are equivalent; only the syntax differs. The point is that a std::string object guarantees that we can modify it. We really don't need to know why, but we shall look into one possible answer to that question anyway.
    In one possible implementation, the class could simply create an internal buffer of type char and copy over the string literal. Hence, we would be trying to modify a local buffer that contains the copy of the string literal. This local buffer cannot possibly be a string literal, hence it is safe to modify.
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  14. #44
    The Dragon Reborn
    Join Date
    Nov 2009
    Location
    Dublin, Ireland
    Posts
    629
    That is awesome man I am pretty sure I get it.
    I was reading some online notes and it makes sense.
    The basic point I learnt is..if you want to modify a string literal then use a mutable data structure!
    to be honest i think in that case,
    char *s= "Hello World" is a waste of 4 bytes and also counting the length of the string.

  15. #45
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,612
    Pointing to string literals is not a crime.

    char *strncpy( char *dest, const char *source, size_t n );

    What is the difference if source is a command line argument, a string literal, or something else? It makes sense to waste four bytes here since source is a constant anyway.
