Thread: callee cleanup and variable number of parameters

  1. #1
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229

    callee cleanup and variable number of parameters

    I am reading an ebook on x86 assembly, and in the chapter describing calling conventions, the author compares caller-cleanup conventions to callee-cleanup conventions.

    For callee-cleanup conventions, the author asserts, as a disadvantage, that functions following callee-cleanup conventions must have a fixed number of parameters.

    What I don't get is, why?

    If the callee can determine the number of actual parameters passed (for example, printf uses the format string to determine that), what is stopping the callee from cleaning up the stack?

    Thanks

  2. #2
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Take your run-of-the-mill void foo(...) function and tell me how foo is supposed to know how many bytes to pop off the stack.
    While it might work okay for printf, I don't think it will do so for other functions, hence the typical __cdecl, where the caller cleans the stack, since it knows how many parameters were passed.
    There may be other reasons too.
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  3. #3
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    Hmm, I was under the assumption that functions that take a variable number of parameters can always figure out the actual number of parameters passed (for example, like printf and scanf). I can see how if that's not the case, there's a valid point.

    But if the callee doesn't know how many parameters are passed, how does it know how many parameters it can access?

  4. #4
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    > If the callee can determine the number of actual parameters passed (for example, printf uses the format string to determine that),
    Are you sure?
    Code:
    printf( "This is my int %d\n", 1, 2, 3, 4, 5 );
    Whilst being dumb, it isn't fatal. So long as the n conversions in the format string are matched by at least n compatible values, all is well.

    > how does it know how many parameters it can access?
    It doesn't.
    Code:
    printf( "This is my int %d\n" );
    will try to read an int anyway from where the next parameter is expected to be.

    Without a strong type-check to enforce a bunch of rules for variadic functions, it's quite hard work for the callee to work out how many real parameters there are.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  5. #5
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Quote Originally Posted by cyberfish View Post
    Hmm, I was under the assumption that functions that take a variable number of parameters can always figure out the actual number of parameters passed (for example, like printf and scanf). I can see how if that's not the case, there's a valid point.
    They just try to access parameters based on the information they can extract from the format string. Whether the parameters are actually there or not is another question, as we're well aware.

  6. #6
    Registered User
    Join Date
    Sep 2008
    Location
    Toronto, Canada
    Posts
    1,834
    I agree with cyberfish that there is confusion.

    I don't see why either the caller or the callee can't clean up. I would favour callee cleaning because that cleanup code need only be there once instead of with each caller invocation, of which there are presumably many. Plus x86 has RET n, which pops n bytes of arguments off the stack as it returns, which is quite nice.

    Certainly the caller knows how many parameters were just pushed, so it can do cleanup very easily. On the other hand, the callee would need additional smarts to determine how many parameters were given: the format string is at the top of the stack (assuming the right-to-left parameter convention), so it has to scan through it and work out how many parameters correspond to it.

    Personally, I think the issues with variable argument passing could all have been avoided if the calling convention dictated that the caller push the number of arguments onto the stack. The callee could then tell right away.

  7. #7
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by nonoob View Post
    I agree with cyberfish that there is confusion.

    I don't see why either the caller or the callee can't clean up. I would favour callee cleaning because that cleanup code need only be there once instead of with each caller invocation
    Actually, caller cleanup is more efficient for this reason. You do NOT have to clean the stack after every function call. You can accumulate many arguments and then pop them all at once. Consider:

    Code:
    push arg1
    push arg2
    call func1
    push arg3
    push arg4
    call func2
    add esp, 16
    The single add at the end pops the arguments for two function calls (four 4-byte arguments, 16 bytes) in a single instruction. If it were callee-clean, there would have to be at least two cleanups, one per call, but here we have collapsed them into one.
    Code:
    //try
    //{
    	if (a) do { f( b); } while(1);
    	else   do { f(!b); } while(1);
    //}

  8. #8
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Quote Originally Posted by nonoob View Post
    I agree with cyberfish that there is confusion.

    I don't see why either the caller or the callee can't clean up. I would favour callee cleaning because that cleanup code need only be there once instead of with each caller invocation, of which there are presumably many. Plus x86 has RET n, which pops n bytes of arguments off the stack as it returns, which is quite nice.

    Certainly the caller knows how many parameters were just pushed, so it can do cleanup very easily. On the other hand, the callee would need additional smarts to determine how many parameters were given: the format string is at the top of the stack (assuming the right-to-left parameter convention), so it has to scan through it and work out how many parameters correspond to it.

    Personally, I think the issues with variable argument passing could all have been avoided if the calling convention dictated that the caller push the number of arguments onto the stack. The callee could then tell right away.
    If you're using a compiler, it doesn't matter.
    If you're writing asm, then you can do your own calling conventions.

  9. #9
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    But caller cleanup allows this optimisation
    Quote Originally Posted by gcc manual
    -fno-defer-pop
    Always pop the arguments to each function call as soon as that function returns. For machines which must pop arguments after a function call, the compiler normally lets arguments accumulate on the stack for several function calls and pops them all at once.
    (actually, this option disables such behaviour)

    edit: brewbuck beat me to it

  10. #10
    Registered User
    Join Date
    Sep 2008
    Location
    Toronto, Canada
    Posts
    1,834
    Interesting. I never heard of the caller accumulating parameters for multiple calls. It doesn't seem to add much in the way of performance. One clock maybe? If that, and only if the instruction is not pipelined with something else. Whooptee-dooptee!

  11. #11
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by nonoob View Post
    Interesting. I never heard of the caller accumulating parameters for multiple calls. It doesn't seem to add much in the way of performance. One clock maybe? If that, and only if the instruction is not pipelined with something else. Whooptee-dooptee!
    The x86 core has a feature called "esp folding" which allows the processor to delay making visible changes to esp when it is altered implicitly by push and pop instructions. An explicit argument cleanup, by adding a constant value to esp, interferes with this deferral and causes pipeline stalls.

    The fewer times you modify the value of esp explicitly, the better. There is more going on in these instructions than meets the eye.

  12. #12
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    Quote Originally Posted by Salem View Post
    > If the callee can determine the number of actual parameters passed (for example, printf uses the format string to determine that),
    Are you sure?
    Code:
    printf( "This is my int %d\n", 1, 2, 3, 4, 5 );
    Whilst being dumb, it isn't fatal. So long as the n conversions in the format string are matched by at least n compatible values, all is well.

    > how does it know how many parameters it can access?
    It doesn't.
    Code:
    printf( "This is my int %d\n" );
    will try to read an int anyway from where the next parameter is expected to be.

    Without a strong type-check to enforce a bunch of rules for variadic functions, it's quite hard work for the callee to work out how many real parameters there are.
    Ah I see now. Thanks.

    If you're using a compiler, it doesn't matter.
    But it does matter if you are trying to interface asm to compiler-generated code.

    The x86 core has a feature called "esp folding" which allows the processor to delay making visible changes to esp when it is altered implicitly by push and pop instructions. An explicit argument cleanup, by adding a constant value to esp, interferes with this deferral and causes pipeline stalls.

    The fewer times you modify the value of esp explicitly, the better. There is more going on in these instructions than meets the eye.
    Ah, so does that mean if there are only a few parameters, doing a few pops (to an unused register) may be faster than an add to esp?

  13. #13
    Registered User
    Join Date
    Sep 2008
    Location
    Toronto, Canada
    Posts
    1,834
    Thanks, brewbuck. I forgot about stalls. It's getting so that predicting processor instruction timings "by eye" is impossible. Wowee processors make for sloppy, inefficient coding and encourage the "throw more memory and hardware speed at it" syndrome. Efficient coders at the source-code level are becoming scarce.

  14. #14
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by cyberfish View Post
    Ah I see now. Thanks.


    But it does matter if you are trying to interface asm to compiler-generated code.


    Ah, so does that mean if there are only a few parameters, doing a few pops (to an unused register) may be faster than an add to esp?
    Maybe. I think you'd have to try it and see. The use of the register (even if it's just to discard the popped value) will affect the register renaming and possibly have its own pipeline effects. And to make it more fun, these are the kinds of things that vary quite a bit even between revisions of the same basic core.

  15. #15
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    Thanks, brewbuck. I forgot about stalls. It's getting so that predicting processor instruction timings "by eye" is impossible. Wowee processors make for sloppy, inefficient coding and encourage the "throw more memory and hardware speed at it" syndrome. Efficient coders at the source-code level are becoming scarce.
    That's why we have optimizing compilers, and people who DO know the ins and outs of hardware to write them.

    Maybe. I think you'd have to try it and see. The use of the register (even if it's just to discard the popped value) will affect the register renaming and possibly have its own pipeline effects. And to make it more fun, these are the kinds of things that vary quite a bit even between revisions of the same basic core.
    I see, thanks.
