Thread: Strings - char mystring[n] vs. string mystring

  1. #1
    Registered User
    Join Date
    Mar 2005
    Posts
    27

    Question Strings - char mystring[n] vs. string mystring

    hi,

    While working through various tutorials, the method of using strings in C++ has been to use a char array, e.g.:

    char mystring[10];

    or

    char mystring[] = "the string";

    However, in the last thread I opened I found out that there is a string type available in C++, e.g.:

    string mystring;

    when including the cstring header file.

    This leads me to some confusion, though. For example, is there a particular reason why you should use one method in favour of the other? All of the tutorials I have looked through use the array method with char. Was the string type added later along the C timeline?

    I really would appreciate any input on this one, thank you.

  2. #2
    Registered User major_small's Avatar
    Join Date
    May 2003
    Posts
    2,787
    The first method is simply a character array. The string method (found in <string>, not <cstring>) is C++-only, and is easier to manage. Read this to get an idea of how powerful C++ strings really are.

    If you're coding in C++, I suggest learning and using the string class, unless you have specific needs for the array version. If you don't know what these specific needs are, then you probably don't have them.

  3. #3
    Registered User
    Join Date
    Mar 2005
    Posts
    27
    Thank you, using the array version for strings was a little inconvenient so it was quite relieving to find out there was a string type available in C++.

    Are there any common occurrences when a char array is likely to be more useful than the string type? Or does it simply come down to the focus of the programmer's requirements (such as specifically needing a user-inputted string to be separated into its component characters in an array)?

  4. #4
    Registered User
    Join Date
    Sep 2004
    Posts
    197
    The only reason I can see to use a char array for strings is when you need to work with C source code, since that's what's common there. Even then, the standard string class has a method that will convert it into a char array. I can't remember offhand what it's called, whether it's yourstring.c_string or something like that.

  5. #5
    Tropical Coder Darryl's Avatar
    Join Date
    Mar 2005
    Location
    Cayman Islands
    Posts
    503
    Quote Originally Posted by Xipher
    The only reason I can see to use a char array for strings is when you need to work with C source code, since that's what's common there. Even then, the standard string class has a method that will convert it into a char array. I can't remember offhand what it's called, whether it's yourstring.c_string or something like that.
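    It would be yourstring.c_str(). The pointer it returns is read-only, though, so to get a char array (C-string) you can write to, you have to copy the characters out. (Since C++17 the non-const data() member also gives writable access to the string's own buffer.)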

  6. #6
    Senior Member joshdick's Avatar
    Join Date
    Nov 2002
    Location
    Phildelphia, PA
    Posts
    1,146
    Quote Originally Posted by Darryl
    it would be yourstring.c_str(), however this is read-only and there is no equivalent that allows you to write to it as a char array (c-string)
    If you have a C-string that you want to make a C++-style string, that's quite easy.

    Code:
    const char *s = "foo";   // string literals are const
    std::string str(s);      // construct from a C-string
    // or
    str = s;                 // assign from a C-string

  7. #7
    Hardware Engineer
    Join Date
    Sep 2001
    Posts
    1,398
    They always teach you the "hard way" first!

    When in doubt, use C++ type strings.

    The WinAPI uses C-style strings.

    I suppose they teach C-style strings first, because it's easier to understand what's going on "under the hood". (A series of sequential memory locations with a null-termination to mark the end.)

    Learning character arrays should reinforce the point that a single char variable can't hold an entire string... So, you need a pointer to the start of the string.



    In order to understand C++ type strings "under the hood", you have to understand objects. And, you usually learn strings before learning objects.

  8. #8
    Registered User
    Join Date
    Jan 2003
    Posts
    311
    Quote Originally Posted by Diablo84
    Is there any common occurrences when a char array is likely to be more useful then using the string type? or does it simply come down to the focus of the programmers requirements (such as specifically needing a user inputted string to be separated into it's component characters in an array)?
    You can address individual characters in a std::string with the [] operator, almost exactly as you would with a C-style null-terminated array. The only disadvantages of std::string are that it typically carries about three pointers' worth of overhead rather than just one, and that its storage is managed dynamically. C-style strings avoid this overhead by crashing your program in unexpected ways, although they manage to do so in a way that gives a determined user the ability to execute arbitrary code. For example:
    Code:
    #include <iostream>
    int main() {
        char name[80];
        std::cout << "What is your name?";
        std::cin >> name;   // before C++20: no bounds check, long input overflows name
        std::cout << "Goodbye " << name << std::endl;
        return 0;
    }
    By typing 80 or more characters I begin to overwrite memory outside of name (the last slot is needed for the terminating '\0'). If I know the details of your operating system I can often overwrite the address your program jumps to when it exits main, and replace it with the system call of my choice.

    The reason you should be regularly downloading updates is most commonly to fix problems like the one above, in code written by people who are experts on computer security and have been using C and C++ for twenty years or more. They got their PhDs from MIT; where'd you get yours?
    Code:
    #include <iostream>
    #include <string>
    int main() {
        std::string name;   // grows to fit whatever is typed
        std::cout << "What is your name?";
        std::cin >> name;
        std::cout << "Goodbye " << name << std::endl;
        return 0;
    }
    With this, all I can do is consume all available memory. When that happens, string throws std::bad_alloc; since nothing catches it, the program is terminated and the operating system reclaims its resources. If the program was run with sufficient permissions, other programs will slow down for a while. If the program has a reasonable quota, you would hardly notice.

    std::vector and std::string are the go-to containers. When you are done with a program, and you want to make it faster and smaller, and you have tested it and seen that the overhead is happening within these structures, then, and only then, should you start to think about using lower-level replacements. Even then, well-placed calls to reserve() and swap(), and constructors that take a length (i.e. std::string filename("default.txt",11); vs. std::string filename = "default.txt";), can make many bottlenecks go away.

    With all of that in mind, sometimes an array of const char *'s pointing to string literals is faster and easier than the alternatives for things like short messages or tokens.
    Code:
    enum rgb {red, green, blue};
    const char *color_name[] = {"red","green","blue"};

  9. #9
    Registered User
    Join Date
    Apr 2003
    Posts
    2,663
    While working through various tutorials, the method of using strings in C++ has been to use a char array, e.g.:
    Personally, I think char arrays are the most difficult topic in beginning C++, especially since to truly understand them, you have to understand pointers pretty well. When I learned about char arrays, I don't think enough distinction was made between the two types of char arrays:

    1) an array-of-chars

    and

    2) cstrings

    An array-of-chars is initialized like this:

    char letters[] = {'a', 'b', 'c'};

    An array-of-chars uses a list of single chars to initialize the array, and each char is surrounded by single quotes.

    On the other hand, a cstring is initialized with a string literal:

    char str[] = "some text";

    A string literal is something surrounded by double quotes. The thing that makes the two arrays different is that the str array actually ends with a '\0' character. How is that possible? After all, there is no '\0' at the end of the string literal "some text". The answer is: whenever C++ sees a string literal (i.e., something between double quotes), it AUTOMATICALLY slaps a '\0' on the end of it.

    All the <cstring> functions that operate on cstrings have to traverse the char array in a loop, and the loop looks for that '\0' character to signal the end of the array:
    Code:
    int i = 0;
    while(str[i] != '\0')
    {
    	//do something with str[i]
    	i++;
    }
    If there were no '\0', the loop would run past the end of the array and go out of bounds causing all kinds of problems.

    So, to summarize the differences between the two kinds of char arrays:

    1)cstrings
    --You initialize them with string literals(double quotes).
    --They end in a '\0' character, so you can use them with the <cstring> functions.
    --You can output them with the <<operator, e.g.:

    char str[] = "some text";
    cout<<str<<endl;

    2) arrays-of-chars:
    --You initialize them with a list of single characters(single quotes)
    --They don't have a '\0' character at the end, so you can't use them with <cstring> functions
    --The <<operator will go out of bounds if you try to output an array-of-chars, e.g. this won't work
    Code:
    char letters[] = {'a', 'b', 'c'};
    cout<<letters<<endl; //abc#$%
    Instead, if you want to output an array-of-chars, you have to do this(just like with any other array):
    Code:
    for(int i=0; i<3; i++)
    {
    	cout<<letters[i];
    }
    If you are lucky, you are introduced to the string type fairly quickly, and you try to forget about char arrays because they are so confusing. However, a string typically keeps its characters in a char buffer internally, so you can't get away from them entirely. But the string class is programmed so that you can very easily do all kinds of things to that buffer, and the operations take place behind the scenes. For instance, look how simple it is to do these types of operations:
    Code:
    #include <string>
    ...
    ...
    
    string str1 = "some text";
    string str2 = "other text";
    
    string str3 = str1 + str2;
    cout<<str1[0]<<str3[2]<<endl;
    
    cout<<str3<<endl;
    
    cout<<str1.length()<<endl;
    Last edited by 7stud; 04-06-2005 at 03:34 PM.

  10. #10
    Registered User
    Join Date
    Mar 2005
    Posts
    27
    Thank you all for the input, the use of strings in C++ is much clearer to me now.

    Quote Originally Posted by 7stud
    Personally, I think char arrays are the most difficult topic in beginning C++, especially since to truly understand them, you have to understand pointers pretty well. When I learned about char arrays, I don't think enough distinction was made between the two types of char arrays
    The use of arrays wasn't a problem, as my previous experience with PHP covered the theory side of things; however, the differences between the types of char arrays were very cloudy before reading through your post. Truth be told, I wasn't aware they were classed as two different things (array of chars/cstrings); I assumed they were just different ways of declaring an array of chars.

  11. #11
    Registered User
    Join Date
    Apr 2003
    Posts
    2,663
    Truth be told, I wasn't aware they were classed as two different things (array of chars/cstrings); I assumed they were just different ways of declaring an array of chars.
    Well, I think I made up that distinction, but unless you understand it, I don't think you will ever understand char arrays. You are right, they are just two different ways of declaring char arrays, but a cstring has a '\0' at the end of the array, and an array-of-chars doesn't. In fact, you can actually create an array-of-chars that is a cstring:

    char str[] = {'a', 'b', 'c', '\0'};

    and to test it out:

    cout<<str<<endl;

    You could also use all the <cstring> functions on str.

    The use of arrays wasn't a problem as my previous experience with PHP covered the theory side of things
    Most computer languages have arrays, but with C++ you will learn about the 'real' theory behind them. In PHP, you can't really know how arrays are stored in memory because PHP doesn't have pointers. The PHP interpreter is actually just a computer program written in C that provides you with an easier syntax.

    Here is a crucial piece of information I left out of my previous description: in most expressions, a char array name is converted to a pointer to the array's first element. That's why at the beginning I said you have to understand pointers to truly understand char arrays. A pointer variable is just a variable with a strange looking syntax that stores the address of some data in memory. Remember when I said that C++ automatically slaps a '\0' character onto the end of a string literal in this statement:

    char text[] = "some text";

    Well, I left out a step. C++ slaps a '\0' character onto the end of the string literal, and then it copies:

    some text\0

    into the memory set aside for the array. After that, whenever you use the name text in an expression, C++ converts it to the address of that spot in memory. However, it is very hard to see that. With a normal pointer variable, when you display it using cout<< you will get the address. For instance,
    Code:
    int num = 10;
    int* p;   //declare a pointer variable p(with that strange looking syntax)

    //p can now store the address in memory of any int variable, like the variable num:

    p = &num;  //the & operator gets the address of num, which is then assigned to p
    Now if you use cout<< to display p:

    cout<<p;

    you will get some strange looking output, something like: 006BFDF4. That is the address in memory of the value 10. Normally, when you output a variable using cout<<, you expect to get whatever is stored in the variable displayed to the console. However, with a pointer to type char, the << operator in the instruction cout<< is defined NOT to output what is stored in the variable(which is the address)--instead it gets the address from the variable and goes to that location in memory and fetches the value stored there and displays that to the console.

    cout<<text; //some text

    For other pointer types, to get the value stored in memory at the address contained in a pointer variable, you have to do what's called "dereferencing the pointer":
    Code:
    int num = 10;
    int* p; //declare a pointer variable p(with that strange looking syntax)
    p = &num;  //the & operator gets the address of num, which is then assigned to p
    
    cout<<*p; //"dereference the pointer" which gets the address stored in p, goes
            //to that location in memory, and fetches the value there.
    That is why it is hard to tell that text behaves like a pointer: the << operator in the instruction cout<< automatically follows pointers to type char and prints the characters it finds, and the only way to stop cout<< from doing that is to cast the pointer to void* first. Since cout<< doesn't display the address by default, the name doesn't appear to be a pointer. However, you can see a glimmer that a char array name is treated as a pointer when you try to do something like this:
    Code:
    char text[] = "some text"; //cstring
    if(text == "some text")
    {
    	cout<<"they are equal\n";
    }
    else 
    {
    	cout<<"NOT equal\n";
    }
    The reason the variable text and the string literal "some text" aren't equal is that, in this comparison, text is converted to the address of its first element, which looks something like 006BFDF4, so the if statement is really this:

    if(006BFDF4 == "some text")

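    Just because the << operator in the instruction cout<< automatically converts text, which is an address, to the value stored at the address doesn't mean all operators do that, and in fact the == operator does not make that conversion. It compares two addresses (the string literal is converted to the address of its own storage as well), and those are never the same. To compare the characters themselves, use strcmp from <cstring>.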

    Confusing, eh?
    Last edited by 7stud; 04-06-2005 at 09:18 PM.

  12. #12
    VA National Guard The Brain's Avatar
    Join Date
    May 2004
    Location
    Manassas, VA USA
    Posts
    903
    Just out of curiosity, while we are on this thread: is there any known performance gain with using either cstrings or strings?

  13. #13
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    A few. C strings are the raw method. They have whatever performance you code into them.

    std::strings are up to the library implementor. There are quite a few ways to implement a std::string, and each has its advantages and disadvantages. This can make them slower or faster than typical hand-written C string code, and it's very hard to tell ahead of time.
    One example is copy-on-write strings. It's practically impossible to use copy-on-write strings in plain C without an absurd amount of additional coding. With C++ and a std::string that implements copy-on-write, it's hidden from the client code. Yet copy-on-write can be orders of magnitude faster in the right situation, and crash in an unfriendly multi-threaded environment.
