Thread: Modify string in function

  1. #46
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,613
    >> First off, it is nonsense to argue that because some part of our program is C, all code around it should also follow C conventions.

    I don't understand. In a post I wrote about iterators, you claim erroneously that I'm following C conventions. At best, the iterator concept is one used namelessly in C and made a more type strict and formal concept in C++. But if you won't begrudgingly accept that, I don't think it will make a difference in the discussion.

    >> Does it make sense not to use them? Of course not. If we didn't use them, we might as well use C to begin with.

    I consider it a personal failure that I am communicating so poorly. I am not trying to tell you to use anything C like. I'm trying to teach you about forms of the conventions you are already using.

    >> We should instead abstract the C code with strongly typed C++ interfaces

    You said specifically that I could show you how to make a function interface work well with arrays and vector. This is abstracting C code and making type safe C++ code. Like I said, a pointer is a bidirectional iterator. By far, the easiest thing to do is to accept a range bound in the type of bidirectional_iterator<T>. This works seamlessly with vector (in fact I am almost certain begin() and end() resolve to this at some point) and arrays (as long as T is the element type).
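
    A minimal sketch of the kind of range interface described here. It is written as a template over the iterator type, which is an assumption on my part since the post names no concrete signature; countZeros and countZerosInBoth are hypothetical names used only for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: counts zero bytes in the half-open range [first, last).
// Anything that can be dereferenced and incremented works, raw pointers included.
template <typename Iterator>
std::size_t countZeros(Iterator first, Iterator last)
{
    std::size_t count = 0;
    for (; first != last; ++first)
    {
        if (*first == 0)
            ++count;
    }
    return count;
}

// The same function called with a built-in array and with a vector.
inline std::size_t countZerosInBoth()
{
    char raw[4] = { 'a', 0, 'b', 0 };
    std::vector<char> vec(3, 0);

    return countZeros(raw, raw + 4)              // pointers used as iterators
         + countZeros(vec.begin(), vec.end());   // container iterators
}
```

    The function body never needs to know whether it was handed pointers or container iterators, which is the flexibility being argued for.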

    >> that perform a lot of static (and dynamic) checks to make sure we don't feed invalid input to the unsafe C code.

    If you use bidirectional_iterator with a STL container you can do all sorts of things. Honestly I'm not as familiar with the exceptions that they throw, but out_of_range() is in the problem domain. With C code, it is harder, but as I told you earlier:

    Quote Originally Posted by whiteflags
    If you new'd something, then you already have the size: Treat it as const where possible.
    I thought I told you about the sizeof trick too in that post, but I ended up not doing so. If you divide the size of a fixed array by the size of one of its elements, you get the number of elements from the compiler. The array has to be in scope, so where you do that division matters, but that is easy if you follow the convention of declaring variables where you need them.
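
    For reference, the sizeof division might look like the sketch below. The function name and the array size are made up for the example.

```cpp
#include <cstddef>

// Element count of a fixed-size array via the sizeof division.
// This only works here because `buffer` is still an array in this scope;
// after it is passed to a function as a pointer the division measures
// the pointer instead.
inline std::size_t elementCountInScope()
{
    int buffer[16] = {};
    return sizeof(buffer) / sizeof(buffer[0]);
}
```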

    >> Some STL algorithms don't work with raw pointers (can't recall any on the top of my head, though)!

    Don't suppose things that you can't support. The only algorithms that could not work with a pointer as an iterator are the same algorithms that don't accept a sequence. I know of precisely none in the problem domain. Pointers being an example of bidirectional iterators is a fact. They have the same functional behavior. So this is how you make an interface for arrays and STL containers. I'm sorry you don't like it, maybe you and you alone can do something else, but it's conventional, offers a lot considering how little you say C has, and the flexibility can't be beat.
    Last edited by whiteflags; 06-19-2012 at 11:53 AM. Reason: I made a ton of grammatical errors

  2. #47
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    pointers are an example of a bidirectional iterator
    Pointers are classified as random access iterators.

    Some STL algorithms don't work with raw pointers
    Every algorithm that works directly with iterators that can accept an iterator of random access category works with raw pointers by definition.
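
    A short sketch that checks both statements with the standard library itself: std::iterator_traits reports the category of a raw pointer, and std::sort (which requires random access iterators) accepts a pointer range. sortRawArray is a hypothetical name.

```cpp
#include <algorithm>
#include <iterator>
#include <type_traits>

// iterator_traits is specialized for raw pointers and reports the
// random access category.
static_assert(
    std::is_same<std::iterator_traits<int*>::iterator_category,
                 std::random_access_iterator_tag>::value,
    "int* is a random access iterator");

// std::sort requires random access iterators; a pointer range satisfies that.
inline bool sortRawArray()
{
    int values[5] = { 4, 1, 5, 2, 3 };
    std::sort(values, values + 5);
    return std::is_sorted(values, values + 5);
}
```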

    We should instead abstract the C code with strongly typed C++ interfaces that perform a lot of static (and dynamic) checks to make sure we don't feed invalid input to the unsafe C code.
    I really don't understand what you are trying to argue. (If anything, you may just be agreeing in a "roundabout" fashion.) That's exactly what both whiteflags and laserlight are saying: write the code using a baseline of a pointer range or a pointer and a size where you can wrap it to behave "automagically" with some standard conforming containers and arrays.

    We just can't make sure the input arguments are sane.
    You can help clients debug, but you really shouldn't be trying to babysit your clients when it comes to C++.

    Once again I'm compelled to argue that you would find C# or Java more suitable to your nature.

    Thus, it would be much better and much safer to use C++ iterators (and disallow raw pointers) in the interface.
    O_o

    As above, pointers are iterators so that makes no sense.

    In any event, using the iterators from a standard conforming container doesn't buy you as much as you appear to be arguing.

    You can invalidate an iterator to a container in a bazillion ways where it would be impossible to check for validity.

    Soma

  3. #48
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,613
    >> Pointers are classified as random access iterators.

    Ouch. That's right. So that means you could use it as a bidirectional iterator too. Well thanks for correcting me.

  4. #49
    Registered User antred's Avatar
    Join Date
    Apr 2012
    Location
    Germany
    Posts
    257
    Quote Originally Posted by whiteflags View Post
    I thought I told you about the sizeof trick too in that post, but I ended up not doing so. If you divide the size of a fixed array by the size of one of its elements, you get the number of elements from the compiler. The array has to be in scope, so where you do that division matters, but that is easy if you follow the convention of declaring variables where you need them.
    That trick is dangerous though, because as you've already alluded to, it'll silently compile even if the array has already decayed to a pointer, giving you a bogus result. A safer alternative is

    Code:
    template < typename ElementType, std::size_t N >
    inline std::size_t arraySize( const ElementType ( & )[ N ] ) { return N; }
    which refuses to even compile if fed with anything other than an actual array.
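
    A usage sketch, with the function repeated so the example stands alone. The array name and size are made up, and the commented-out call marks the case that is rejected at compile time.

```cpp
#include <cstddef>

// The function from the post, repeated so this example is self-contained.
template <typename ElementType, std::size_t N>
inline std::size_t arraySize(const ElementType (&)[N]) { return N; }

inline std::size_t headerSize()
{
    char header[32] = {};
    const char* decayed = header;   // the array has decayed to a pointer here
    (void)decayed;

    // arraySize(decayed);          // does not compile: a pointer is not an array
    return arraySize(header);       // deduces N from the array type
}
```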

  5. #50
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,613
    Quote Originally Posted by antred
    Code:
    template < typename ElementType, std::size_t N >
    inline std::size_t arraySize( const ElementType ( & )[ N ] ) { return N; }
    That is very nice.

    But that is also never what Elysia actually does.

  6. #51
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Quote Originally Posted by whiteflags View Post
    >> First off, it is nonsense to argue that because some part of our program is C, all code around it should also follow C conventions.

    I don't understand. In a post I wrote about iterators, you claim erroneously that I'm following C conventions. At best, the iterator concept is one used namelessly in C and made a more type strict and formal concept in C++. But if you won't begrudgingly accept that, I don't think it will make a difference in the discussion.

    >> Does it make sense not to use them? Of course not. If we didn't use them, we might as well use C to begin with.

    I consider it a personal failure that I am communicating so poorly. I am not trying to tell you to use anything C like. I'm trying to teach you about forms of the conventions you are already using.
    Specifically, this part:
    Quote Originally Posted by whiteflags View Post
    Perhaps the best thing about the fact that this works is that you are able to support legacy code rather seamlessly. In my opinion, using pointers would be cleaner since we are using UNIX networking code, but either of the above options are preferable to the reference to an array.
    I interpret as you suggesting using pointers in the interface because the OP is using UNIX networking code (which is C).

    >> We should instead abstract the C code with strongly typed C++ interfaces

    You said specifically that I could show you how to make a function interface work well with arrays and pointers. This is abstracting C code and making type safe C code. Like I said, a pointer is a bidirectional iterator. By far, the easiest thing to do is to accept a range bound in the type of bidirectional_iterator<T>. This works seamlessly with vector (in fact I am almost certain begin() and end() resolve to this at some point) and arrays (as long as T is the element type).

    >> that perform a lot of static (and dynamic) checks to make sure we don't feed invalid input to the unsafe C code.

    If you use bidirectional_iterator with a STL container you can do all sorts of things. Honestly I'm not as familiar with the exceptions that they throw, but out_of_range() is in the problem domain. With C code, it is harder, but as I told you earlier:
    Well, I can't be quite certain on how you intend to do this, so if you don't mind, how would you create these bidirectional iterators with

    a) arrays
    b) dynamic allocation
    c) containers

    and how would you use them safely, including verifying (at compile time if possible) that they do not go out of range?

    I thought I told you about the sizeof trick too in that post, but I ended up not doing so. If you divide the size of a fixed array by the size of one of its elements, you get the number of elements from the compiler. The array has to be in scope, so where you do that division matters, but that is easy if you follow the convention of declaring variables where you need them.
    I know of the trick, and I use it whenever possible. It's an excellent trick, but that's not the problem.
    The problem is that you can't catch the parts where you don't use that trick to assume the size (or where you can't, like say, dynamic allocation).

    >> Some STL algorithms don't work with raw pointers (can't recall any on the top of my head, though)!

    Don't suppose things that you can't support. The only algorithms that could not work with a pointer as an iterator are the same algorithms that don't accept a sequence. I know of precisely none in the problem domain. Pointers being an example of bidirectional iterators is a fact. They have the same functional behavior. So this is how you make an interface for arrays and STL containers. I'm sorry you don't like it, maybe you and you alone can do something else, but it's conventional, offers a lot considering how little you say C has, and the flexibility can't be beat.
    Its flexibility cannot be disputed. That's great. But what about security? Security is extremely important, too, especially in desktop applications.

    I'm not sure we're on the same wavelength here.
    All I'm looking for is a way, as fool-proof as possible, of avoiding buffer overruns with

    a) arrays
    b) dynamic allocation
    c) containers

    that catches these errors at compile time, or if not possible, at run-time.
    Take my example with the array as a baseline for how "fool-proof" a method I'm looking for.

    Quote Originally Posted by phantomotap View Post
    You can help clients debug, but you really shouldn't be trying to babysit your clients when it comes to C++.

    Once again I'm compelled to argue that you would find C# or Java more suitable to your nature.
    I am compelled to disagree slightly. You should babysit everyone when it comes to security. You can't trust programmers. You can't trust anyone. It will make your code more fool-proof. It can never be perfect, but it's better than nothing.
    I've used C#. It's pretty wonderful actually. There are lots of lovely things, and lots of annoying things. But still C++ is my favorite.
    And Java sucks IMO.

    O_o

    As above, pointers are iterators so that makes no sense.

    In any event, using the iterators from a standard conforming container doesn't buy you as much as you appear to be arguing.

    You can invalidate an iterator to a container in a bazillion ways where it would be impossible to check for validity.

    Soma
    I know they don't give much. In fact, they don't seem to give a lot at all. The fact that pointers can be iterators is dumb from a security standpoint.
    Anyway, most implementations from what I understand do sanity checks on iterators in debug mode, so you know if you do something stupid. I know no implementation that does that on raw pointers.
    Last edited by Elysia; 06-19-2012 at 12:13 PM.
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  7. #52
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,613
    >> I interpret as you suggesting using pointers in the interface because the OP is using UNIX networking code (which is C).

    Well you never liked my opinions, but I'm irked that it caused you to utterly miss the point.

    >> But what about security?

    The code antred posted is what I would use if I were worried about mistakes related to the sizeof trick. I think I wrote a similar function in another thread, in a similar argument with you, one time. I don't remember ever screwing up the trick that badly, but that function can assuage your worry. I'd use it if I were you. It doesn't intrude on the rest of the program.

  8. #53
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Quote Originally Posted by whiteflags View Post
    The code antred posted is what I would use if I were worried about mistakes related to the sizeof trick. I think I wrote a similar function in another thread, in a similar argument, to you, one time. I don't remember when I screwed up the trick so badly ever, but that can assuage your worry. I'd use that if I were you. It doesn't intrude on the rest of the program.
    The problem is where the trick fails to work.

  9. #54
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,613
    The problem is where the trick fails to work
    Let me put it this way: if the trick fails to work you either:

    a) did not use it in the right place
    b) could not use it at all

  10. #55
    Registered User antred's Avatar
    Join Date
    Apr 2012
    Location
    Germany
    Posts
    257
    Are you guys talking about the template function? If there are any cases where that approach gives a wrong result, I'd love to hear about it!

  11. #56
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    It does not ever give wrong results. It is fool-proof, and that's what's so nice about it.
    However, it doesn't work for non-C-arrays.

  12. #57
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    The problem is where the trick fails to work.
    This is precisely why trying to babysit a programmer is doomed to failure.

    You can't protect programmers from themselves.

    The same programmers who will abuse the calculated sizes of an array are the ones who will insert casts over an invalidated iterator to "shut the compiler up".

    Here is a bit of real advice for you: don't code for the fool.

    Write your code to be correct when the inputs are correct. You can insert aids to help clients debug. I'm all for that. Just realize that such code will only ever help good programmers develop better code by understanding where they went wrong. (Which means such code can be eliminated in release builds if desired.) You aren't protecting anyone.

    The fact that pointers can be iterators is dumb from a security standpoint.
    The fact that you don't understand the core design principles of the C++ standard library after all this time is really disconcerting.

    You want a real good design for this situation?

    Write the actual implementation of the code assuming that the inputs are correct, with an interface taking a contiguous buffer in the form of random access iterators.

    Why? So that you may code functions that do such validation as you desire as is appropriate to their nature.
    Why? So that a good programmer can insert the call to such a validation point at the best location possible.
    Why? So that a maintenance programmer can benefit from the validation function varying independently from the implementation.
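
    One way to picture that split, as a rough sketch only: the core routine trusts its range, and a separate check is called wherever the caller sees fit. All three names (sumBytes, fitsIn, sumPrefix) and the -1 error convention are hypothetical.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Core implementation: assumes [first, last) is a valid range and performs
// no checks of its own.
template <typename RandomIt>
long sumBytes(RandomIt first, RandomIt last)
{
    return std::accumulate(first, last, 0L);
}

// Separate validation point that can change independently of sumBytes.
template <typename Container>
bool fitsIn(const Container& container, std::size_t wanted)
{
    return wanted <= container.size();
}

// The caller decides where validation happens and how failure is reported.
inline long sumPrefix(const std::vector<unsigned char>& data, std::size_t wanted)
{
    if (!fitsIn(data, wanted))
        return -1;   // hypothetical error convention for this sketch
    return sumBytes(data.begin(),
                    data.begin() + static_cast<std::ptrdiff_t>(wanted));
}
```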

    If there are any cases where that approach gives a wrong result, I'd love to hear about it!
    ^_^

    I can do it. It would make you and your compiler cry.

    Seriously though, this is a case of "A malicious programmer can break anything." so isn't all that interesting.

    That said, they are talking about calculating the "count" of an array via division of the size of the array and the size of an element. Your template code is as close to a "perfect" solution to the array "count" as is possible to express in C++.

    It is fool-proof, and that's what's so nice about it.
    It also binds your implementation, which only abstracts a few bits of underlying functionality, to template interfaces.

    It will not work without a true reference to an array so the implementation also needs a true reference to an array.

    That sort of function is exactly what makes people think of templates as "bloating binaries".

    If we use the "dim grey" a bit to separate functionality we don't pay so high a price and still get everything all of you seem to want without sacrificing the sanctity of the core implementation.

    Code:
    template
    <
        typename FElementType
      , std::size_t FSize
    >
    inline bool ValidateSize
    (
        const FElementType (&)[FSize]
      , const std::size_t fSize
    )
    {
        return(FSize == fSize);
    }
    
    template
    <
        typename FContainer
    >
    inline bool ValidateSize
    (
        const FContainer & fContainer
      , const std::size_t fSize
    )
    {
        return(fContainer.size() == fSize);
    }
    Code:
    if(ValidateSize(sData, sSize))
    {
        DoSomething(sData, sSize);
    }
    Soma

  13. #58
    Registered User
    Join Date
    Dec 2007
    Posts
    930
    @whiteflags
    You got me, I rewrote it with std::string functions.

    @Elysia
    You are right about the size issue, and it's been bothering me how to solve this. I went with your proposition.

    @laserlight
    "In my opinion, if you want to design ReceiveDataWithHeader to have a flexible interface to cater for buffers of different sizes..."

    I hope you were talking about the solution that I post here; otherwise I don't follow what you are saying. Obviously I'm not on your level of knowledge.

    Code:
    int main()
    {
      vector< char > big( 5000, '\0' );
      ReceiveDataWithHeader(AcceptSocket, &big.at( 0 ), big.size());
    }
    
    // size of data in header
    int ReceiveDataWithHeader(SOCKET Socket, char* Buf, size_t len)
    {
        bool iGetSize = 1;
        short iReceived = 0;
        unsigned long DataSize=0, TotalSize=1;
        unsigned long Total=0, Digits=0;
        string str;
        vector<string>Tokens;
    
        while(TotalSize)
        {
            iReceived = recv(Socket, Buf, len, 0);
       
            if(iReceived > 0)
            {
                str += Buf;
                if(iGetSize)
                    Total += iReceived;
                else
                    TotalSize -= iReceived;
                if(iGetSize)
                {
                    size_t found=str.find_first_of("$");
                    if(found!=string::npos)
                    {
                        split(str, '$', Tokens);
    
                        // got DataSize
                        DataSize=atoi(Tokens[0].c_str());
    
                        Digits = GetNumberOfDigits(DataSize);
                        cout << "Digits " <<  Digits  << endl;
                        // NumberOfDigits + $ + DataSize
                        TotalSize = Digits + 1 + DataSize;
    
                        // subtract all that we got until now
                        TotalSize -= Total;
    
                        // don't come here anymore
                        iGetSize = 0;
                    }
                }
                cout << "TotalSize " <<  TotalSize  << endl;
            }
            else if ( iReceived == 0 )
            {
                printf("ReceiveDataWithHeader Connection closed\n");
                return 1;
            }
            else
            {
                ReportError("ReceiveDataWithHeader");
                return 1;
            }
        }
    
        split(str, '$', Tokens);
        strcpy(Buf, Tokens[1].c_str());
        return 0;
    }
    
    vector<string> &split(const string &str,
                          char delim,
                          vector<string> &elems)
    {
        // first token
        string item = str.substr(0, str.find(delim));
        elems.push_back(item);
    
        // second token
        size_t found=str.find_first_of(delim);
        elems.push_back(&str[found+1]);
    
        return elems;
    }
    Using Windows 10 with Code Blocks and MingW.

  14. #59
    Registered User antred's Avatar
    Join Date
    Apr 2012
    Location
    Germany
    Posts
    257
    Quote Originally Posted by phantomotap View Post
    Here is a bit of real advice for you: don't code for the fool.
    I just want to throw in that, generally, I like to code for the fool because often it turns out that I AM that fool myself.
    I tend to plaster my code with tons of assertions (usually a sort of HARD_ASSERT macro that just uses the normal assert in a debug build and raises an exception in a release build) and verify pretty much anything that can be verified. You would not believe the amount of stupid mistakes I've been able to catch early on this way.
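
    A rough sketch of what such a macro could look like. This is a guess at the idea, not the actual macro from anyone's code base; it assumes NDEBUG marks a release build, and byteAt is a hypothetical example function.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical HARD_ASSERT: plain assert in debug builds, an exception when
// NDEBUG is defined (the usual marker for a release build).
#ifdef NDEBUG
#define HARD_ASSERT(condition) \
    ((condition) ? (void)0 : throw std::logic_error("HARD_ASSERT failed: " #condition))
#else
#define HARD_ASSERT(condition) assert(condition)
#endif

// Example use: verify the arguments before touching the buffer.
inline char byteAt(const char* buffer, std::size_t size, std::size_t index)
{
    HARD_ASSERT(buffer != nullptr);
    HARD_ASSERT(index < size);
    return buffer[index];
}
```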

  15. #60
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Quote Originally Posted by antred View Post
    I just want to throw in that, generally, I like to code for the fool because often it turns out that I AM that fool myself.
    I tend to plaster my code with tons of assertions (usually a sort of HARD_ASSERT macro that just uses the normal assert in a debug build and raises an exception in a release build) and verify pretty much anything that can be verified. You would not believe the amount of stupid mistakes I've been able to catch early on this way.
    This is what I like. Protect from mistakes.

    Anyway,
    @Ducky:
    There are just too many things wrong in the code. I can't be bothered to point them all out at the moment, but let me point out a few.

    strcpy(Buf, Tokens[1].c_str());
    I chafe at this line. What happens if Tokens[1].c_str() is longer than the buffer? Same problem as before...
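
    One possible guard, sketched under the assumption that the caller knows the buffer capacity (as the len parameter suggests). copyIfFits is a hypothetical name.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical replacement for the bare strcpy: copies `source` plus its
// terminating null into `buffer` only when it fits in `capacity` bytes.
inline bool copyIfFits(const std::string& source, char* buffer, std::size_t capacity)
{
    if (buffer == nullptr || source.size() + 1 > capacity)
        return false;   // the caller decides how to report the error

    std::memcpy(buffer, source.c_str(), source.size() + 1);
    return true;
}
```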

    str += Buf;
    I also chafe at this. What if the data you receive cannot fit in the buffer (i.e. it got truncated)? What if the sent data wasn't null terminated (due to some error or mischievous input)? Disaster!
    Then you work with the string as if everything was fine. But what if it isn't?
    Do not mix strings with buffers! They do not match!
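
    A sketch of one way around the null terminator problem: append exactly the number of bytes that recv reported instead of treating the buffer as a C string. appendReceived is a hypothetical helper, and the socket call itself is left out so the example stays self-contained.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: appends exactly `received` bytes from `buffer` to `out`
// and never looks for a terminating null. `received` is the value returned by
// the receive call; `capacity` is the size of `buffer`.
inline bool appendReceived(std::string& out, const char* buffer,
                           std::size_t capacity, int received)
{
    if (buffer == nullptr || received <= 0 ||
        static_cast<std::size_t>(received) > capacity)
        return false;

    out.append(buffer, static_cast<std::size_t>(received));
    return true;
}
```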

    A good piece of advice for you: all input is Evil™. It is unsafe. It must be properly sanitized and converted into safe input before you work with it.

    In your code, I also see other things such as

    Code:
    if(iGetSize)
    	Total += iReceived;
    else
    	TotalSize -= iReceived;
    I chafe at this too. If iGetSize is false and iReceived is larger than TotalSize, you get an underflow because TotalSize is an unsigned integer. The code seems to reuse variables in some way I would have to figure out.
    Separate them and avoid unsigned integers. If they wrap around, you can be in serious trouble. Check for negative values with assertions instead.
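
    To make the wraparound concrete, here is a small sketch. The second function is a different technique from the signed-plus-assertions approach suggested above: it keeps the unsigned type and checks before subtracting. Both names are hypothetical.

```cpp
#include <limits>

// Subtracting more than is left from an unsigned value wraps around
// instead of going negative.
inline unsigned long wrappedRemaining()
{
    unsigned long remaining = 10;
    unsigned long received  = 12;
    return remaining - received;   // wraps to a very large value
}

// Hypothetical checked variant: refuses to subtract past zero and tells the
// caller that more data arrived than was expected.
inline bool consume(unsigned long& remaining, unsigned long received)
{
    if (received > remaining)
    {
        remaining = 0;
        return false;
    }
    remaining -= received;
    return true;
}
```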
