Thread: Extract integer value from STL string

  1. #1
    Chad Johnson
    Join Date
    May 2004
    Posts
    154

    I have a file cached in memory in a variable of type std::string. The first two bytes in hex are 12 00, which give the length of a substring within the buffer following these two bytes.

    How can I copy the value of these two bytes to an integer variable directly from the buffer (NOT from the original file), giving it the value 18? I don't want to use vectors or any kind of special data structure besides a standard STL string. I just need a simple one- or two-liner to do it.

    I'm tempted to just say
    Code:
    memcpy(&len, buffer.c_str(), 2);
    Last edited by ChadJohnson; 03-14-2005 at 12:59 AM.

  2. #2
    anonytmouse
    Join Date
    Dec 2002
    Posts
    2,544
    The string class includes a member function called substr. Given a starting position and a character count, this function returns a sub string. This will allow us to extract the first two characters of the string.

    The next problem is how to convert a string containing hexadecimal digits into an integer. Our first option is the C function strtol. This function, short for string to long, takes a string and a base and returns the converted integer.

    If we wanted to avoid the C function and use a C++ style technique, we could instead use the istringstream class. This class behaves similarly to cin, but instead of extracting values from console input, it extracts values from a string.

    Once you have extracted a value, you should check to make sure it is valid and that you don't read past the end of your buffer.

    Three methods are demonstrated in the following sample:
    Code:
    #include <iostream>  // For cout
    #include <sstream>   // For istringstream
    #include <string>    // For string
    #include <iomanip>   // For setbase
    #include <cstdlib>   // For strtol
    using namespace std;
    
    int main(void)
    {
    	int    n;
    	string s = "12 00";
    
    	n = strtol(s.substr(0, 2).c_str(), NULL, 16);
    	cout << "Using strtol: " << n << endl;
    
    	istringstream instream(s.substr(0, 2));
    	instream >> setbase(16) >> n;
    	cout << "Using istringstream: " << n << endl;
    
    	istringstream(s.substr(0, 2)) >> setbase(16) >> n;
    	cout << "Using istringstream with inline construction: " << n << endl;
    
    	cin.get();
    }
    [edit]As pointed out by Sang-drax, this won't work if the data is in binary format. In my opinion, std::vector<unsigned char> is a more appropriate container for binary data than a std::string.[/edit]
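
    For the binary case, here is a minimal sketch that decodes the two raw bytes by hand, with the bounds check mentioned above. The name read_u16_le is made up for illustration, and little-endian order is assumed because the bytes 12 00 are supposed to give 18.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from the thread): decodes an unsigned 16-bit
// little-endian value starting at byte `pos` of `buf`.
unsigned int read_u16_le(const std::string& buf, std::size_t pos)
{
    // Bounds check so we never read past the end of the buffer.
    if (buf.size() < 2 || pos > buf.size() - 2)
        throw std::out_of_range("read_u16_le: not enough bytes");

    // Go through unsigned char so bytes >= 0x80 are not sign-extended.
    const unsigned int lo = static_cast<unsigned char>(buf[pos]);
    const unsigned int hi = static_cast<unsigned char>(buf[pos + 1]);
    return lo | (hi << 8);
}
```

    Because the bytes are combined with shifts, the result should not depend on the byte order of the machine running the code, only on the byte order of the file.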
    Last edited by anonytmouse; 03-14-2005 at 06:33 AM.

  3. #3
    Sang-drax
    Join Date
    May 2002
    Location
    Göteborg, Sweden
    Posts
    2,072
    anonytmouse, he was going to read binary data.

    Create an istringstream and use the .read() member function to read binary data.

    But if memcpy() works, why not use it?
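
    A rough sketch of the .read() approach, in case it helps. The function name read_len_with_stream is hypothetical, and the two bytes are again assumed to be little-endian.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: pulls two raw bytes out of `buf` with
// istringstream::read() and combines them as little-endian.
// Returns false if fewer than two bytes are available.
bool read_len_with_stream(const std::string& buf, unsigned int& len)
{
    std::istringstream in(buf, std::ios::in | std::ios::binary);
    char raw[2];
    if (!in.read(raw, 2))
        return false;  // stream went into a fail state: buffer too short

    const unsigned int lo = static_cast<unsigned char>(raw[0]);
    const unsigned int hi = static_cast<unsigned char>(raw[1]);
    len = lo | (hi << 8);
    return true;
}
```

    Unlike operator>>, read() copies raw bytes and does no text parsing, so a 0x00 byte is treated like any other byte.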
    Last edited by Sang-drax; 03-14-2005 at 06:14 AM.
    Last edited by Sang-drax : Tomorrow at 02:21 AM. Reason: Time travelling

  4. #4
    Chad Johnson
    Join Date
    May 2004
    Posts
    154
    I dunno, does it really matter whether I use C or C++? They both work...are there any advantages/disadvantages of using either one at all? Could a buffer overflow be exploited?

    How exactly would I use a vector? What would be stored in each element? Assuming you mean each character of the buffer is an element, what if a value spanned across more than one element?
    Last edited by ChadJohnson; 03-14-2005 at 12:28 PM.

  5. #5
    Hunter2
    Join Date
    May 2002
    Posts
    2,879
    In case you didn't see in the other thread:
    Quote Originally Posted by Hunter2
    Well, first of all (as far as I can tell) your string will only contain these first two bytes - since the second byte is 0x00, i.e. 0 or NULL, the string is terminated after the first character. You should use a vector of char or unsigned char if you're planning on handling binary data, to avoid this sort of conflict.

    Anyway, assuming these 2 bytes represent a short (or unsigned short):
    Code:

    //Assuming the data's stored in a vector<char>:
    unsigned short substrLen = *(unsigned short*)(&(data[0]));

    This basically reinterprets the address of the first byte as a pointer to unsigned short, then dereferences it to retrieve the value of the first (sizeof(unsigned short)) bytes, interpreted as an unsigned short.

    You could also use reinterpret_cast<unsigned short*> or whatever instead of just (unsigned short*) for the typecast, but I'm not too familiar with exactly which cast does what, so I usually just stick with good ol' C-style typecasting.
    >>does it really matter whether I use C or C++?
    No, except that C++ provides a lot of things that make life a little easier.

    >>Could a buffer overflow be exploited?
    For this? Probably not, if you do things properly. And using one language or the other won't make a difference.

    >>How exactly would I use a vector?
    See above. Also, you could use memcpy() instead of the *(unsigned short*) thing.

    >>What would be stored in each element?
    If it's a vector of char or unsigned char, then each element will hold one byte. Exactly the same as if you used a char* or unsigned char*, but safer.

    >>what if a value spanned across more than one element?
    If a value spans across more than one element, i.e. it's 4 bytes long, then it spans across however many elements it needs to. What more is there to say?
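
    For reference, a minimal sketch of the memcpy() variant on a vector of unsigned char. The helper name read_ushort is hypothetical. It assumes unsigned short is 2 bytes wide and that the file's byte order matches the host's (little-endian on x86). Copying the bytes sidesteps the alignment and aliasing trouble that dereferencing a cast pointer can cause on some platforms and compilers.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: copies the first sizeof(unsigned short) bytes of
// `data` into `out`. Assumes host byte order equals the file's byte order.
bool read_ushort(const std::vector<unsigned char>& data, unsigned short& out)
{
    if (data.size() < sizeof out)
        return false;  // not enough bytes in the buffer

    // Copy exactly sizeof(out) bytes, so no stray upper bytes are left
    // uninitialized, which could happen when copying 2 bytes into an int.
    std::memcpy(&out, &data[0], sizeof out);
    return true;
}
```

    On a little-endian machine a buffer starting with 12 00 would come out as 18 here; on a big-endian machine the same bytes would read as 0x1200, which is why the byte-order assumption matters.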

    **EDIT**
    >>Create a istringstream and use the .read() member function to read binary data.
    How does that work? Won't it still be truncated at the first 0?
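
    On the truncation question, a small sketch that may help: std::string keeps track of its own length, so a 0 byte only cuts the data short when the string is built from a bare char* without a length. The helper name make_buffer is made up for illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: builds a std::string from a raw byte range using the
// (pointer, length) constructor, which copies exactly `n` bytes and does not
// stop at '\0'.
std::string make_buffer(const char* bytes, std::size_t n)
{
    return std::string(bytes, n);
}
```

    With the four bytes 12 00 'h' 'i', make_buffer(raw, 4) keeps all four, while std::string(raw) stops at the 0 byte and keeps only the first one.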
    Just Google It. √

    (\ /)
    ( . .)
    c(")(") This is bunny. Copy and paste bunny into your signature to help him gain world domination.

  6. #6
    Chad Johnson
    Join Date
    May 2004
    Posts
    154
    cool. so how do I extract a string of variable length from the vector and put it into a std::string?

  7. #7
    Hunter2
    Join Date
    May 2002
    Posts
    2,879
    >>and put it into a std::string?
    That's the tricky part. Well, off the top of my head, you could do it like this:
    Code:
    std::vector<char> strBytes(size + 1, '\0');
    std::copy(data.begin() + offset, data.begin() + offset + size, strBytes.begin());
    
    std::string str(&(strBytes[0]));
    offset is the position in the data buffer that the string is located at, and size is the length of the string.

    **EDIT**
    If you prefer the C way:
    Code:
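    std::vector<char> strBytes(size + 1, '\0');  // extra element keeps the copy null-terminated
    memcpy(&(strBytes[0]), &(data[offset]), size);  // needs <cstring>; copy from offset, not from the start
    std::string str(&(strBytes[0]));  // a vector<char> cannot be assigned to a std::string directly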
    Last edited by Hunter2; 03-14-2005 at 07:41 PM.

  8. #8
    Hunter2
    Join Date
    May 2002
    Posts
    2,879
    Actually, I want to be copying size, since the string in the buffer isn't null-terminated. So, it would seem I don't need the + 1.

    Also, it wasn't copying (size + 2) elements, because the 2nd parameter is supposed to point to 1 element past the last element you want copied (emulated .end() sort of thing) - so I was copying only 1 too much.

    I'll edit my original post.

    **EDIT**
    What the heck.. someone's post just vanished above this one...

  9. #9
    Chad Johnson
    Join Date
    May 2004
    Posts
    154

    thanks. After taking a long nap (it's spring break), I thought of another way myself.

    myStr = string(buffer.begin() + offset, buffer.begin() + offset + size);

    Well, I think I have all the info I need. I just have to make a choice of how I want to program and stick with it. These are my options as far as I can tell:

    use istringstream (I did get .read to work even with the 0 byte)

    use memcpy

    use vectors
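
    Putting the pieces together, here is a minimal sketch that stays with plain std::string: read the 2-byte length, then pull out the substring with the iterator-range constructor from the post above. The function name read_prefixed is hypothetical, and little-endian order is assumed as before.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: reads a 2-byte little-endian length at `pos`, then
// returns the `size` bytes that follow it.
std::string read_prefixed(const std::string& buffer, std::size_t pos)
{
    if (buffer.size() < 2 || pos > buffer.size() - 2)
        throw std::out_of_range("read_prefixed: missing length prefix");

    const std::size_t lo = static_cast<unsigned char>(buffer[pos]);
    const std::size_t hi = static_cast<unsigned char>(buffer[pos + 1]);
    const std::size_t size = lo | (hi << 8);

    const std::size_t offset = pos + 2;
    if (size > buffer.size() - offset)
        throw std::out_of_range("read_prefixed: length exceeds buffer");

    // Iterator-range constructor: copies [offset, offset + size).
    return std::string(buffer.begin() + offset, buffer.begin() + offset + size);
}
```

    The second check is there because the length comes from the file itself, so a corrupt or hostile file could otherwise make the copy run past the end of the buffer.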

  10. #10
    Hunter2
    Join Date
    May 2002
    Posts
    2,879
    >>myStr = string(buffer.begin() + offset, buffer.begin() + offset + size);
    Serves me right for not knowing my string constructors.

    Glad you figured it out.

  11. #11
    Registered User
    Join Date
    Aug 2003
    Posts
    470
    Yes, Hunter, I realized my mistake before you posted. Forgot STL iterators want the end element.
