Thread: review this code

  1. #1
    Registered User
    Join Date
    Jan 2007
    Posts
    330

    review this code

    I came across the following code which is basically just a strcmp implementation.

    Code:
    int compare_name(const char *p,const char *q)
    {
    	int d;
    
    	for (;(d=tolower(*p)-tolower(*q))==0;++p,++q) {
    		if (!*p) break;
    	}
    	return d;
    }
    It never stops to amaze me in how many different ways algorithms are implemented in C

    What do you think of this code? it has UB? it works nice?

  2. #2
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    > It never stops to amaze me in how many different ways algorithms are implemented in C
    There's certainly a large number of broken ways, especially those which attempt to cram as much as possible into a single line for no good reason.

    What happens if p is longer than q, what memory gets accessed?
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  3. #3
    Anti-Poster
    Join Date
    Feb 2002
    Posts
    1,401
    If p is longer than q, the tolower comparison will catch it at q's null termination. The break inside the for loop only appears to matter if p and q are actually the same string.
    If I did your homework for you, then you might pass your class without learning how to write a program like this. Then you might graduate and get your degree without learning how to write a program like this. You might become a professional programmer without knowing how to write a program like this. Someday you might work on a project with me without knowing how to write a program like this. Then I would have to do you serious bodily harm. - Jack Klein

  4. #4
    Registered User
    Join Date
    Jul 2006
    Posts
    162
    personally i don't like it when people use for loops that way, it has... an awkward aesthetic and doesn't seem right, but i suppose technically it works. is it the most elegant way to achieve this exact effect? as in is there anything particularly demanding about this which requires this exact method, because i'm sure you can do the exact same thing with more appropriate syntax. :-)

    i'm just being picky, ignore what i said if you think it's weird, i just wanted to 'throw it out there' to maybe provoke discussion, more insight in to programming habits of others is always nice.


    What happens if p is longer than q, what memory gets accessed?
    you can see that the condition under which the loop continues is that the difference in value between the characters via tolower() is 0, the end of a sequence of characters (in a -string-) will end with NULL being 0, and 0 - any char value != 0 therefore the loop ends and d is returned as a value not 0 implying the strings don't match.

    that's what i see anyway.

    Code:
    int compare_name(const char *p,const char *q)
    {
    	int d=0;
    	while ( (d = tolower(*p) - tolower(*q) ) == 0 )
    	{
    		if (!*p) break;
    		++p, ++q;
    	}
    	return d;
    }
    same thing right? i think this is more readable, who cares if it takes up an extra line length.
    Last edited by simpleid; 08-13-2007 at 09:44 AM.

  5. #5
    Registered User
    Join Date
    Jan 2007
    Posts
    330
    Quote Originally Posted by Salem View Post
    > It never stops to amaze me in how many different ways algorithms are implemented in C
    There's certainly a large number of broken ways, especially those which attempt to cram as much as possible into a single line for no good reason.

    What happens if p is longer than q, what memory gets accessed?
    it seems to work ok there as has been pointed out already. That was the only reason you thought it was broken?

  6. #6
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    Conceded - others have got it right.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  7. #7
    Registered User
    Join Date
    Jan 2007
    Posts
    330
    Quote Originally Posted by simpleid View Post
    more insight in to programming habits of others is always nice.
    Yes, thats why I posted it here. I was also wondering if you guys&girls could see a reason why its implemented in this weird way. the code runs on m68k coldfire

  8. #8
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    I heard long ago of some compilers that more agressively optimise for loops, over while loops. Perhaps the author of this heard the same thing.
    Not sure I even believed it though.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  9. #9
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by iMalc View Post
    I heard long ago of some compilers that more agressively optimise for loops, over while loops. Perhaps the author of this heard the same thing.
    Not sure I even believed it though.

    Not to mention that there are ways to do string comparisons for strings much better by using longer items than one char at a time.

    And the "for()" keyword can actually be produced as a macro sort of like this (aside from putting the code inside the loop in the macro of course):
    Code:
    #define FOR(init, condition, update) init; while(condition)  { /* code in loop */; update; }
    Since most of this is similar to "while", it's most likely that the optimizing part of the compiler does both for-loops and while-loops.

    I find it more likely that the code was more complex to begin with, and got stepwise refined until this was what was left.

    --
    Mats

  10. #10
    Registered User
    Join Date
    Sep 2001
    Posts
    752
    Those are almost, but not quite, the same due to their behavior in the case of 'continue;'.

    I remember reading (long ago!) about how some compilers were found to execute
    Code:
    for(;;) {
       /* ... */
    }
    faster than
    Code:
    while (1) {
       /* ... */
    }
    or variants thereof because the compiler would actually perform an evaluation of testing 1 for truth.

    Basically... you just can't assume much anything about how optimization works. Just have faith that it does.
    Callou collei we'll code the way
    Of prime numbers and pings!

  11. #11
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by QuestionC View Post
    Those are almost, but not quite, the same due to their behavior in the case of 'continue;'.

    I remember reading (long ago!) about how some compilers were found to execute
    Code:
    for(;;) {
       /* ... */
    }
    faster than
    Code:
    while (1) {
       /* ... */
    }
    or variants thereof because the compiler would actually perform an evaluation of testing 1 for truth.

    Basically... you just can't assume much anything about how optimization works. Just have faith that it does.
    Um, yeah, a compiler that generates code to "evaluate" 1 and not just say "it's a constant that is always true" would be pretty darn poor - but I can believe it. I used some pretty basic compilers on 68K (on Atari/Amiga type machines) many years ago [which may well have something to do with the construct above as it's for a "ColdFire" 68K versiion].

    Any modern, decent compiler (such as gcc to name one that is easily available for many architectures) will definitely make no difference between those two cases.

    And I don't trust the optimization to do magic - if I want to know that something is optimized the way I expect it to be, I check the compiler output - it's very easy to "fool" a compiler into generating bad code. Most of the time it makes no difference, as most fo what one writes isn't "on the critical path", but if I think it matters, I check it. There are usually ways to fool the compiler back into generating good code if the current code is "bad" - for example use some more variables, using indices instead of updating pointers (or other way around) or simplify the whole piece of code somehow. Using one variable for each thing is a good thing to, don't re-use the same variable for different things - the compiler may not realize that you are not relying on the old value later on, and such, so it will "not do the right thing".

    --
    Mats

  12. #12
    Registered User
    Join Date
    Sep 2001
    Posts
    752
    Quote Originally Posted by matsp View Post
    And I don't trust the optimization to do magic - if I want to know that something is optimized the way I expect it to be, I check the compiler output - it's very easy to "fool" a compiler into generating bad code. Most of the time it makes no difference, as most fo what one writes isn't "on the critical path", but if I think it matters, I check it. There are usually ways to fool the compiler back into generating good code if the current code is "bad" - for example use some more variables, using indices instead of updating pointers (or other way around) or simplify the whole piece of code somehow. Using one variable for each thing is a good thing to, don't re-use the same variable for different things - the compiler may not realize that you are not relying on the old value later on, and such, so it will "not do the right thing".
    This is awful behavior. If you've seriously gone to the point of disassembling the program and tweaking the C code to change optimizer behavior... why aren't you just writing the functions in assembly in the first place?
    Callou collei we'll code the way
    Of prime numbers and pings!

  13. #13
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by QuestionC View Post
    This is awful behavior. If you've seriously gone to the point of disassembling the program and tweaking the C code to change optimizer behavior... why aren't you just writing the functions in assembly in the first place?
    It depends on what you want to achieve. If for example introducing a new variable gives the expected optimization for a large chunk of code, then it's much easier to do that than to use some disassembled C-code to write optimized assembler. And it's much easier to maintain a 15 line piece of C-code than it is to maintain 45 lines of assembler (averaging about 3 lines of assembly for one line of C is probably about right, depending on the complexity of the code of course - lots of indexing or complex expressions can easily push that way higher).

    Naturally, when I do this sort of "tweaks to make the compiler do the right thing", I would also add a comment to say so - otherwise it's just obscuring things.

    By the way, both MS and GCC have switches to output the assembly (with C-code attached in more recent versions), so you don't need to DISASSEMBLE anything.

    Most of the time, the optimizer does a good job, so it's rare that I find myself doing this - but it has happened. For one thing, most of the code isn't performance critical anyways, right? So unless it's either KNOWN for one reason or another to be part of the performance critical code, or you find it during profiling, no point to touch it.

    --
    Mats

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Code Review for upcoming contest
    By Darryl in forum Contests Board
    Replies: 2
    Last Post: 02-28-2006, 02:39 PM
  2. Problem : Threads WILL NOT DIE!!
    By hanhao in forum C++ Programming
    Replies: 2
    Last Post: 04-16-2004, 01:37 PM
  3. True ASM vs. Fake ASM ????
    By DavidP in forum A Brief History of Cprogramming.com
    Replies: 7
    Last Post: 04-02-2003, 04:28 AM
  4. Interface Question
    By smog890 in forum C Programming
    Replies: 11
    Last Post: 06-03-2002, 05:06 PM
  5. Replies: 0
    Last Post: 02-21-2002, 06:05 PM