Thread: More questions (relating to global doubts)

  1. #1
    Registered User
    Join Date
    Mar 2008
    Location
    Coimbra, Portugal
    Posts
    85

    More questions (relating to global doubts)

    Hi again.

    Quite a long time since I last wrote here.

    Once again, and as my self-tutoring life progresses, I find myself with some questions which I will ask you.

    I've been trying to develop basic Standard C++ (and also C) library functions. About two seconds ago (woohoo) I finished string.h (or cstring, or whatever you wish to call it).

    While developing its functions I ran into some doubts.
    For example, in the following piece of code:

    Code:
    if(*--ptr_str2==*ptr_str1++) ptr_str2=(char*)str2;
    else return --ptr_str1-str1;
    Alright, it's out of context, but consider just that. Is that piece of code faster than:

    Code:
    if(*--ptr_str2==*ptr_str1) { ptr_str1++; ptr_str2=(char*)str2; }
    else return ptr_str1-str1;
    Answering myself, I think it depends on the case. If the condition is true, then I _think_ I save time compared with the second sample, but when it evaluates to false I have to decrement the pointer again (or maybe just subtract one from it in an expression...), which lowers performance.

    I would like to post the whole code of all my functions and get an opinion from the 'experts', just to check whether I'm going in the right direction. If you allow it, I will paste the code here in my next post. For now, if it isn't too much bother, I'd like you to comment on my implementation of strspn:

    Code:
    size_t strspn ( const char * str1, const char * str2 ) {
        char* ptr_str1=(char*)str1,* ptr_str2=(char*)str2;
        while(*ptr_str1) {
            while(*ptr_str2++!=*ptr_str1 && (*ptr_str2)) ;
            if(*--ptr_str2==*ptr_str1++) ptr_str2=(char*)str2;
            else return --ptr_str1-str1;
        }
        return ptr_str1-str1;
    }
    Do you think it is ok? How could it be improved? What are the major performance hits?
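
    For comparison, here is a minimal sketch of the same idea written without the casts and without the increment-then-decrement dance. The name my_strspn is hypothetical (chosen only to avoid clashing with the real strspn), and this is one possible layout, not the way any particular library does it. As far as I can tell, the posted version reads one character past the terminator of str2 when str2 is empty, because it tests *ptr_str2 after the increment; the sketch tests for the terminator before comparing.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical name, used so the sketch does not clash with std::strspn.
// Returns the length of the initial part of str1 that consists only of
// characters found in str2.
std::size_t my_strspn(const char* str1, const char* str2)
{
    assert(str1 != nullptr && str2 != nullptr);

    const char* p = str1;
    for (; *p != '\0'; ++p) {
        // Scan the accept set for the current character of str1.
        const char* a = str2;
        while (*a != '\0' && *a != *p)
            ++a;
        if (*a == '\0')   // ran off the end of str2: *p is not in the set
            break;
    }
    return static_cast<std::size_t>(p - str1);
}
```

    Called as my_strspn("129th", "1234567890"), this counts the three leading digits. Whether it is faster than the original is something only a measurement with your compiler and optimization flags can tell.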

    Also (once again, not posting the WHOLE code), could you also comment on the following, please?

    Code:
    /* strrchr - Return a pointer to the last occurrence of character in str. */
    const char * strrchr ( const char * str, int character ) {
        char* ptr_str=(char*)str;
        while (*ptr_str++) ; /* Go to the end of the string */
        while (*--ptr_str!=(char)character && (*ptr_str)) ;
        if(*ptr_str!=(char)character) return NULL;
        return ptr_str;
    }
    Thanks in advance, sorry if I sound too 'crazy' or 'rhetorical'.

  2. #2
    Banned master5001's Avatar
    Join Date
    Aug 2001
    Location
    Visalia, CA, USA
    Posts
    3,685
    Your optimizer makes it hard to give a fair, sure-fire answer. All things being equal, however, the code that needs the fewest opcodes, and the least expensive combination of opcodes, will surely be "faster".

  3. #3
    The larch
    Join Date
    May 2006
    Posts
    3,573
    Have you verified that your strrchr function works correctly? It seems to me that it returns NULL always.

    Code:
    while (*--ptr_str!=(char)character && (*ptr_str)) ;
    Not sure if referencing ptr_str twice here while decrementing it is undefined behaviour. In any case I don't see anything here that would stop it going beyond the start of the buffer if the character is not found anywhere in the string. C-style strings are not null-terminated at both ends...

    When corrected it would look perhaps something like:
    Code:
    for (--ptr_str; *ptr_str != character && ptr_str != str; --ptr_str);
    In which case it appears that it might be slightly slower than the following (though this might depend heavily on the type of input)
    Code:
    const char* StrrChr (const char* str, int character)
    {
        const char* result = 0;
        while (*str) {
            if (*str == character)
                result = str;
            ++str;
        }
        return result;
    }
    The worst cases of the two are reversed. When the string doesn't contain the character, this is about twice as fast, since it traverses the string only once. If the string contains only the searched-for character, this is a bit slower, because it does more checks and assignments (but apparently not twice as slow). In the average case, with the character appearing occasionally, this might be slightly faster.
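
    One detail the single-pass version above leaves out: the standard strrchr treats the terminating null as part of the string, so searching for '\0' should return a pointer to the terminator, and the character argument is converted to char before comparing. A minimal sketch that keeps the single pass but covers both points could look like this (my_strrchr is a hypothetical name, not a library function):

```cpp
#include <cassert>

// Hypothetical name, used so the sketch does not clash with std::strrchr.
// Single forward pass that remembers the last match, terminator included.
const char* my_strrchr(const char* str, int character)
{
    assert(str != nullptr);

    const char ch = static_cast<char>(character);
    const char* result = nullptr;
    for (;; ++str) {
        if (*str == ch)
            result = str;   // remember the most recent match
        if (*str == '\0')
            break;          // the terminator was examined above, then we stop
    }
    return result;
}
```

    With s pointing at "hello", my_strrchr(s, 'l') yields s + 3, my_strrchr(s, 'z') yields a null pointer, and my_strrchr(s, '\0') yields s + 5.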
    Last edited by anon; 12-02-2008 at 06:13 PM.
    I might be wrong.

    Thank you, anon. You sure know how to recognize different types of trees from quite a long way away.
    Quoted more than 1000 times (I hope).

  4. #4
    Registered User
    Join Date
    Mar 2008
    Location
    Coimbra, Portugal
    Posts
    85
    Quote Originally Posted by anon View Post
    Have you verified that your strrchr function works correctly? It seems to me that it returns NULL always.

    Code:
    while (*--ptr_str!=(char)character && (*ptr_str)) ;
    Not sure if referencing ptr_str twice here while decrementing it is undefined behaviour. In any case I don't see anything here that would stop it going beyond the start of the buffer if the character is not found anywhere in the string. C-style strings are not null-terminated at both ends...

    When corrected it would look perhaps something like:
    Code:
    for (--ptr_str; *ptr_str != character && ptr_str != str; --ptr_str);
    In which case it appears that it might be slightly slower than the following (though this might depend heavily on the type of input)
    Code:
    const char* StrrChr (const char* str, int character)
    {
        const char* result = 0;
        while (*str) {
            if (*str == character)
                result = str;
            ++str;
        }
        return result;
    }
    The worst cases of the two are reversed. When the string doesn't contain the character, this is about twice as fast, since it traverses the string only once. If the string contains only the searched-for character, this is a bit slower, because it does more checks and assignments (but apparently not twice as slow). In the average case, with the character appearing occasionally, this might be slightly faster.
    Hi, thanks for your answer.

    I did check the function and it worked. I've been implementing all of them based on the c++ reference at cplusplus.com. I've tested all of them with the given examples and corrected some bugs when I altered them. I also compared their behavior to the standard ones (talking about GCC and its libs here).

    You are right saying it returns NULL. I'm trying to think about what could have happened, I am totally sure it worked. Maybe I just changed the code somewhere, awkward...


    Looking at the code I get it. During some of my copy+pasting I added:
    Code:
     && (*ptr_str)
    Now since we were at 0 it couldn't ever iterate so it just stopped (well, you got it.)

    Regarding the 'going past the buffer' issue, here's my workaround, which I think fixes everything that was mentioned:

    Code:
    const char * strrchr ( const char * str, int character ) {
        char* ptr_str=(char*)str;
        while (*ptr_str++) ; /* Go to the end of the string */
        while (*--ptr_str!=(char)character && ((char*)str!=ptr_str)) ;
        if(*ptr_str==(char)character)
            return ptr_str;
        else return NULL;
    }
    Do you think it is ok like this?

    Also, I do not think it is undefined behaviour when I think of it as assembly code: I've come to the conclusion that both TenDRA and GCC evaluate the expression from left to right. I think x && y is not the same as y && x (but that's just me).

    Thanks a lot.

  5. #5
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    C++ guarantees that the left side of && is evaluated before the right, so the behavior is defined.
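
    As a small illustration of that guarantee, here is a sketch of a backwards search in the style of the loop under discussion. The helper name find_backwards and its begin/end parameters are hypothetical. The point is that the decrement on the left of && has fully happened before the right-hand side reads the pointer, and that the right-hand side is skipped entirely when the left-hand side is false.

```cpp
#include <cassert>

// Hypothetical helper: search [begin, end) backwards for ch.
// Requires begin < end so that the first decrement stays inside the range.
const char* find_backwards(const char* begin, const char* end, char ch)
{
    assert(begin != nullptr && begin < end);

    const char* p = end;
    // Left operand: decrement p, then compare the character it now points at.
    // Right operand: evaluated only if the left is true, and it sees the
    // already-decremented p.
    while (*--p != ch && p != begin)
        ;
    return (*p == ch) ? p : nullptr;
}
```

    Swapping the operands would change the meaning: p != begin would then be tested against the old value of p, which is why x && y is not interchangeable with y && x once side effects are involved.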

  6. #6
    The larch
    Join Date
    May 2006
    Posts
    3,573
    It appears that the problem was that your first loop sets ptr_str to point beyond the null terminator (one beyond the array should be legal as long as you don't dereference it). So when you are coming back, the first character you inspect is the terminator.

    My doubt was not about whether && is evaluated from left to right, but rather whether it is guaranteed that the side effects are applied (ptr_str decremented) before the right-hand operand is evaluated. It seems that it would be.

    ----

    A style issue is with the casts. Why don't you declare your stuff with the correct type to begin with, e.g.

    Code:
    //char* ptr_str=(char*)str;
    
    const char* ptr_str = str;

  7. #7
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by anon
    My doubt was not about whether && is evaluated from left to right, but rather whether it is guaranteed that the side effects are applied (ptr_str decremented) before the right-hand operand is evaluated. It seems that it would be.
    Yes, that is guaranteed.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  8. #8
    chococoder
    Join Date
    Nov 2004
    Posts
    515
    And always remember the main mantra of good development, which reads: never use globals.

  9. #9
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by jwenting
    And always remember the main mantra of good development, which reads: never use globals.
    I think that is a little too specific to be a "main mantra"

    Maybe "strive for loose coupling and high cohesion" would be more like it.

  10. #10
    Registered User
    Join Date
    Mar 2008
    Location
    Coimbra, Portugal
    Posts
    85
    Quote Originally Posted by anon View Post
    It appears that the problem was that your first loop sets ptr_str to point beyond the null terminator (one beyond the array should be legal as long as you don't dereference it). So when you are coming back, the first character you inspect is the terminator.

    My doubt was not about whether && is evaluated from left to right, but rather whether it is guaranteed that the side effects are applied (ptr_str decremented) before the right-hand operand is evaluated. It seems that it would be.

    ----

    A style issue is with the casts. Why don't you declare your stuff with the correct type to begin with, e.g.

    Code:
    //char* ptr_str=(char*)str;
    
    const char* ptr_str = str;
    Well, since nobody ever taught me programming and all I've learned came from the web, I lack some of the basics...

    I thought that by declaring it const we'd actually be saying "No changes may be made." So I figured that declaring it const would just make the compiler throw errors at me.

    One more thing that I did not know was that while passing a pointer as const, incrementing it does not increment the 'real' pointer. Why does that happen, could you please tell me?

    I think it happens because what we pass is a copy of our pointer instead of it. So, we end up with a pointer that points to the same thing, except that it is not the previous one.

    This means that in my implementation of strncpy (for example):

    Code:
    char * strncpy ( char * destination, const char * source, size_t num ) {
        char* ptr_src=(char*)source,* ptr_dest=destination;
        while(num-- && (*ptr_dest++=*ptr_src++) && (*ptr_src) && (num)) ;
        while(num--) (*ptr_dest++)=0; /* pad remaining destination with 0's. */
        return destination;
    }
    I would be able to do something like:
    Code:
    char * strncpy ( char * destination, const char * source, size_t num ) {
        while(num-- && (*destination++=*source++) && (*source) && (num)) ;
        while(num--) (*destination++)=0; /* pad remaining destination with 0's. */
        return destination;
    }
    Right? Or am I wrong?
    Thanks a lot.

  11. #11
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Parameters passed in are copies from the calling code, so you can do your second version of strncpy().

    Also, const char * means that you are not going to change whatever the pointer points at, not that you can't change the pointer itself. That would be "char * const".
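
    One caveat about the second version, as far as I can tell: because it increments destination itself, the final "return destination" hands back the advanced pointer rather than the start of the buffer, and when num is 0 the "num--" test wraps the unsigned counter around before the padding loop runs. A minimal sketch that still increments the by-value source parameter, but keeps the original destination for the return, might look like this (my_strncpy and the local name out are hypothetical):

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical name, used so the sketch does not clash with std::strncpy.
char* my_strncpy(char* destination, const char* source, std::size_t num)
{
    assert(destination != nullptr && source != nullptr);

    char* out = destination;   // write cursor; destination keeps the start
    // source is a copy of the caller's pointer, so advancing it here does
    // not move the caller's pointer.
    for (; num != 0 && *source != '\0'; --num)
        *out++ = *source++;
    for (; num != 0; --num)
        *out++ = '\0';         // pad the remaining space with zeros
    return destination;
}
```

    After char buf[6]; my_strncpy(buf, src, 5); the caller's src still points at the start of its string, and the return value equals buf.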

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  12. #12
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by Jorl17
    One more thing that I did not know was that while passing a pointer as const, incrementing it does not increment the 'real' pointer. Why does that happen, could you please tell me?
    What do you mean?

  13. #13
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by laserlight View Post
    What do you mean?
    I think Jorl doesn't understand "pass by value" of pointers. Of course, const has nothing to do with this concept at all.


  14. #14
    The larch
    Join Date
    May 2006
    Posts
    3,573
    I thought that by declaring it const we'd actually be saying "No changes may be made." So I figured that declaring it const would just make the compiler throw errors at me.

    One more thing that I did not know was that while passing a pointer as const, incrementing it does not increment the 'real' pointer. Why does that happen, could you please tell me?

    I think it happens because what we pass is a copy of our pointer instead of it. So, we end up with a pointer that points to the same thing, except that it is not the previous one.
    With pointers there are two things, the pointer itself (address) and what is at the pointed address. When you pass a pointer to the function, it indeed receives a copy, so if you make the pointer itself point elsewhere, this will not affect the pointer in the main program.

    With const's there are several variants:

    Code:
    //what is pointed at cannot be modified
    const Type* p;
    Type const* p;
    
    //the pointer itself cannot be made to point elsewhere, the referred-to value can
    Type * const p;
    
    //neither can be modified
    const Type* const p;
    Type const* const p;
    Note that your C-style casts (this is a C++ board) cast away constness. If you were to take advantage of the lack of const and actually attempt to modify the pointed-at string contents, your program might crash, e.g. if the argument is a pointer to a string literal. All the cast achieves is to shut the compiler up, so that it allows you to make that mistake.
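
    To make the first two variants concrete, here is a small sketch with two hypothetical helpers, count_chars and replace_first. The commented-out lines are the ones a compiler should reject; everything else compiles without any cast.

```cpp
#include <cassert>

// Hypothetical helper: pointer to const char. The pointer may move,
// the characters it points at may not be changed through it.
int count_chars(const char* p)
{
    assert(p != nullptr);
    int n = 0;
    while (*p != '\0') {
        ++p;            // fine: only the local pointer changes
        ++n;
    }
    // *p = 'x';        // error: assignment to a const char
    return n;
}

// Hypothetical helper: const pointer to char. The character may be
// changed, the pointer may not be made to point elsewhere.
void replace_first(char* const p, char ch)
{
    assert(p != nullptr);
    *p = ch;            // fine: the pointee is not const
    // ++p;             // error: p itself is const
}
```

    count_chars can be handed a string literal safely, while replace_first needs a writable array such as char buf[] = "abc";.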

  15. #15
    Registered User
    Join Date
    Mar 2008
    Location
    Coimbra, Portugal
    Posts
    85
    Quote Originally Posted by matsp View Post
    I think Jorl doesn't understand "pass by value" of pointers. Of course, const has nothing to do with this concept at all.

    --
    Mats
    Yes indeed, it has nothing at all to do with const; it was a totally unrelated question.
    Quote Originally Posted by anon View Post
    With pointers there are two things, the pointer itself (address) and what is at the pointed address. When you pass a pointer to the function, it indeed receives a copy, so if you make the pointer itself point elsewhere, this will not affect the pointer in the main program.

    With const's there are several variants:

    Code:
    //what is pointed at cannot be modified
    const Type* p;
    Type const* p;
    
    //the pointer itself cannot be made to point elsewhere, the referred-to value can
    Type * const p;
    
    //neither can be modified
    const Type* const p;
    Type const* const p;
    Note that your C-style casts (this is a C++ board) cast away constness. If you were to take advantage of the lack of const and actually attempt to modify the pointed-at string contents, your program might crash, e.g. if the argument is a pointer to a string literal. All the cast achieves is to shut the compiler up, so that it allows you to make that mistake.
    Thanks for your explanation.

    I'm posting C-style code on a C++ board because what I am trying to reimplement is the whole standard library (both C++ and C).

    As for the constness, that really helped me; I don't think I have ever seen it explained as simply and clearly anywhere.

    Since I'm in a terrible wave of asking questions, in my following piece of code:

    Code:
    return ptr_str1-str1;
    This will return the difference between pointers. I got that, used it, ok. But why does this happen? I think it happens because what we are returning is the difference between the addresses of the pointers and, since they both started from the same place, we get the difference in a decimal number. Am I wrong?

    Also, if I am right, then what would happen if I hadn't used a char (1 byte, right?) but a dword for example. Would it still return the correct value? Or is it that the difference in memory would be four times as large?
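
    For what it's worth, subtracting two pointers into the same array gives the distance in elements, not in bytes, and the result has type std::ptrdiff_t. So with a 4-byte element type the subtraction still returns the element count; the byte distance only shows up if both pointers are first converted to char pointers. A small sketch with two hypothetical helpers, elements_between and bytes_between:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: distance in elements of type T.
// This sketch only measures forward distances within one array.
template <typename T>
std::ptrdiff_t elements_between(const T* first, const T* last)
{
    assert(first <= last);
    return last - first;   // the compiler divides by sizeof(T) for us
}

// Hypothetical helper: the same distance measured in bytes.
template <typename T>
std::ptrdiff_t bytes_between(const T* first, const T* last)
{
    assert(first <= last);
    return reinterpret_cast<const char*>(last)
         - reinterpret_cast<const char*>(first);
}
```

    For int a[8], elements_between(a, a + 3) is 3, while bytes_between(a, a + 3) is 3 * sizeof(int). For a char array both numbers agree, which is why the strspn return value works as a character count.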

    Really, thanks again,

    Jorl
