Stack behaviour


  1. #1
    pfs
    Registered User
    Join Date
    Sep 2008
    Posts
    14

    Stack behaviour

    Hello all.

    I've been developing an application and, unfortunately, I found a bug while it was in production.

    The bug was a stack corruption bug.

    Anyway, the code never crashed on my development box, but it did crash in production.

    So why did this happen?

    I even changed the stack size on my development box (thinking it could be related) to match the production values, using ulimit (via PAM's limits.conf). Even that wasn't enough to trigger the bug.

    The bug is something like:

    Code:
    int my_function2(int pc, char **plist)
    {
        return 0; /* It doesn't really matter. */
    }
    
    int my_function1(int pc, char **plist)
    {
        char *this_plist[3];
    
        this_plist[0] = plist[0];
        this_plist[1] = "I_changed_this";
        this_plist[2] = NULL; 
    
        my_function2(3, this_plist);
        return 0;
    }
    I know that I can't pass local variables to other functions.

    So, a) why does this sort of bug crash on one machine but not on another? And b) do you have any comments or suggestions on how to avoid it? (I know I can use static variables and dynamic allocation.)

    Thanks in advance.

  2. #2
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    The pieces of code you have posted are correct - there is no misuse of stack variables or anything like that.

    Of course, it depends a bit on what you are actually doing in my_function2, whose contents you omitted. But what you have shown is not incorrect in any direct way.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  3. #3
    pfs
    Registered User
    Join Date
    Sep 2008
    Posts
    14

    Thanks for the quick response.

    But I thought I couldn't pass the local variables of one function to another, different one.

    What am I thinking wrong here? :-P

  4. #4
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    You should not pass back any address of a local variable as a return value [1]. You can certainly pass local variables to functions that you CALL.

    [1] The TYPICAL thing done is something like this:
    Code:
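    #include <string.h>

    char *func(void)
    {
        char arr[10];
        strcpy(arr, "some text");   /* fill in arr (9 characters plus the terminator) */
        return arr;                 /* BUG: arr stops existing when func returns */
    }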
    That will not work, because the storage for "arr" is released when the function returns, so the caller receives a dangling pointer.

    The same applies to any other form of returning address of local variables, but the above is the most commonly seen. And what makes it harder to debug is that it may appear to work until the particular data gets overwritten by something else. This can be particularly difficult to diagnose if the array is quite large and calls to other functions only use a small amount of stack - so the data doesn't get overwritten until there is a lot of stack usage.
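
    One common way to avoid returning the address of a local is to let the caller own the storage. The following is only a minimal sketch of that idea; fill_name and the text it writes are hypothetical names made up for illustration, not anything from the original application.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: writes into storage owned by the caller instead of
 * returning the address of a local array.
 * Returns 0 on success, -1 if the buffer is missing or too small. */
int fill_name(char *out, size_t out_size)
{
    int n;

    if (out == NULL || out_size == 0)
        return -1;

    /* snprintf never writes more than out_size bytes, terminator included. */
    n = snprintf(out, out_size, "%s", "example");
    if (n < 0 || (size_t)n >= out_size)
        return -1;   /* encoding error, or the text did not fit */

    return 0;
}
```

    A caller would declare its own array, e.g. char buf[10], and pass it as fill_name(buf, sizeof buf). The data then lives as long as the caller's variable does.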

    A trick would be to add a function that has a local 10000 element array and fill the array with something that you can recognise (and that is likely to cause a segmentation fault if you use it as a pointer, e.g. above 0xC0000000 on a 32-bit machine). Call this function all over the place - then at least you should be able to see that you are using a "dead" variable.
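
    The trick described above could look roughly like the sketch below. The name poison_stack, the array size and the 0xDEADBEEF pattern are assumptions chosen for illustration; whether the pattern is actually an unmapped address depends on the platform.

```c
#include <stddef.h>

#define POISON_WORDS 10000
/* Assumption: a value above 0xC0000000, which user code on a typical
 * 32-bit Linux box cannot dereference. Not guaranteed elsewhere. */
#define POISON_VALUE 0xDEADBEEFUL

/* Hypothetical debugging helper: overwrites a large area of stack below the
 * caller with a recognisable pattern, so that a stale pointer into a dead
 * stack frame is likely to read POISON_VALUE instead of old data.
 * Returns the number of words it wrote and read back. */
size_t poison_stack(void)
{
    /* volatile keeps the compiler from optimising the stores away. */
    volatile unsigned long junk[POISON_WORDS];
    size_t i, written = 0;

    for (i = 0; i < POISON_WORDS; i++)
        junk[i] = POISON_VALUE;

    for (i = 0; i < POISON_WORDS; i++)
        if (junk[i] == POISON_VALUE)
            written++;

    return written;
}
```

    Calling poison_stack() between the suspect function and the place where its result is used makes a "dead" variable much more likely to show the pattern, or to fault, on every machine rather than only on some.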


  5. #5
    pfs
    Registered User
    Join Date
    Sep 2008
    Posts
    14
    My assumption is obviously wrong then.

    That assumption came from the fact that I had a bug which I fixed by declaring the variable static. If I didn't declare it that way, it would crash and gdb would give me a "Cannot access memory at address 0x8" error.

    So, isn't that kind of error related to stack corruption bugs? How do I know when a stack corruption happens then?

    I guess I should be more careful with my google searches AND assumptions then. :-P

    Thanks again.

    EDIT: I just removed that static declaration and tried to dynamically allocate the pointer and it also crashed. So the problem is somewhere else. :-P
    Last edited by pfs; 07-04-2009 at 06:01 PM.

  6. #6
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Sounds like you are dereferencing a NULL pointer, e.g.
    Code:
    struct something
    {
        int x, y, z;
    };
    
    ...
    
    struct something *p = somefunction();
    /* Suppose somefunction() returned NULL. */
    p->z = 42;   /* crash: writes through a NULL pointer */
    Of course, you can get this from "returning the address of a local variable", but there are literally hundreds of other possible scenarios too. It's usually possible to track down which function (or even which line) is responsible by looking at the address where the error occurs.

    Any chance you are not checking a file that you opened, or not checking the return value of malloc perhaps?


  7. #7
    pfs
    Registered User
    Join Date
    Sep 2008
    Posts
    14

    Hi again.

    I've been doing some research about stack corruption and I found a tool called splint. It seems pretty cool (although I haven't used it yet), and something I found in its manual led me to fix the issue.

    It seems that I can't declare a NULL.

    The fix is this:

    Code:
    int my_function2(int pc, char **plist)
    {
        return 0; /* It doesn't really matter. */
    }
    
    int my_function1(int pc, char **plist)
    {
        char *this_plist[3];
    
        this_plist[0] = plist[0];
        this_plist[1] = "I_changed_this";
    /* this_plist[2] = NULL;  Can't do this, otherwise bad things will happen. */
    
        my_function2(3, this_plist);
        return 0;
    }
    After doing this, it stopped crashing and the stack / buffers don't seem corrupted anymore.

    Do you have any idea why it is behaving like this? And why can't I declare it NULL?

  8. #8
    tabstop
    and the Hat of Guessing
    Join Date
    Nov 2007
    Posts
    14,185
    That most probably means that whatever function you are calling is not expecting you to pass a NULL pointer in, and is therefore dereferencing the pointer without checking. (In other words, the function you are calling is the problem, unless it is documented as not taking a NULL pointer argument.)
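
    For illustration, a callee that tolerates a NULL entry might look like the sketch below. count_params is a hypothetical stand-in for whatever my_function2 really does; it only shows the pattern of checking each pointer before dereferencing it.

```c
#include <stddef.h>

/* Hypothetical callee: counts the non-empty strings in a list that may be
 * NULL-terminated. It never reads more than pc slots and never dereferences
 * a NULL entry. */
int count_params(int pc, char **plist)
{
    int i, n = 0;

    if (plist == NULL)
        return 0;

    for (i = 0; i < pc; i++) {
        if (plist[i] == NULL)
            break;               /* terminator reached: stop, do not dereference */
        if (plist[i][0] != '\0')
            n++;
    }

    return n;
}
```

    With the list from the first post (two strings followed by NULL) and pc set to 3, this returns 2 and never touches the NULL slot.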

  9. #9
    pfs
    Registered User
    Join Date
    Sep 2008
    Posts
    14
    Quote Originally Posted by tabstop View Post
    That most probably means that whatever function you are calling is not expecting you to pass a NULL pointer in, and is therefore dereferencing the pointer without checking. (In other words, the function you are calling is the problem, unless it is documented as not taking a NULL pointer argument.)
    The thing is that this_plist[2] isn't even being used inside of int my_function2(int pc, char **plist).

    That's one of the reasons I can't understand why it is crashing.
    Last edited by pfs; 07-05-2009 at 03:50 PM.

  10. #10
    tabstop
    and the Hat of Guessing
    Join Date
    Nov 2007
    Posts
    14,185
    Quote Originally Posted by pfs View Post
    The thing is that this_plist[2] isn't even being used inside of int my_function2(int pc, char **plist).
    Then it isn't crashing. I mean, you can run this code all day long:
    Code:
    int my_function2(int pc, char **plist) {
        return 0;
    }
    
    int my_function1(int pc, char **plist) {
        char *this_plist[3];
    
        this_plist[0] = plist[0];
        this_plist[1] = "I_changed_this";
        this_plist[2] = NULL;
    
        my_function2(3, this_plist);
        return 0;
    }
    
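    int main(void) {
        /* Initialised so that my_function1 does not read an indeterminate plist[0]. */
        char *list[3] = { "first", "second", NULL };
        my_function1(3, list);
        return 0;
    }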
    and it won't crash. You don't have to take my word for it, you can run it yourself. Now if you try to dereference plist[2] inside my_function2, for example by printing it with %s or by doing something like plist[2][0], then you're in trouble. But if you don't touch it, it won't crash.

  11. #11
    Sebastiani
    Guest
    Join Date
    Aug 2001
    Location
    Waterloo, Texas
    Posts
    5,708
    >> and why can't I declare it NULL?

    You should *always* declare a pointer NULL rather than leave it uninitialized. Otherwise, it's going to be really difficult to locate invalid address bugs, as it will sometimes run fine and other times not. At least it crashes in a predictable manner when it's NULL!
    Code:
    #include <cmath>
    #include <complex>
    bool euler_flip(bool value)
    {
        return std::pow
        (
            std::complex<float>(std::exp(1.0)), 
            std::complex<float>(0, 1) 
            * std::complex<float>(std::atan(1.0)
            *(1 << (value + 2)))
        ).real() < 0;
    }

  12. #12
    Zlatko
    pwning noobs
    Join Date
    Jun 2009
    Location
    The Great White North
    Posts
    132
    /* this_plist[2] = NULL; Can't do this, otherwise bad things will happen. */
    If you don't set it to NULL, it has an indeterminate value which may happen to point to valid memory, so your bug is hidden, not removed. You need to keep setting it to NULL or point it at real memory.

    I think you should post more of your code.

  13. #13
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Posts
    23,031
    Stack corruption often occurs when you write outside the boundaries of your allocated storage.
    Can you reproduce the bug on your dev machine using the same steps as in production? Remember to check for buffer overruns, using whatever debugging tools you have.
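
    As a small sketch of guarding against writes outside allocated storage, the helper below refuses to store past the end of a fixed-size pointer list. plist_append and its parameters are hypothetical, made up for this example; the real code may be organised quite differently.

```c
#include <stddef.h>

/* Hypothetical helper: appends item to a NULL-terminated list that has
 * `slots` elements in total, keeping one slot free for the terminator.
 * *count is the number of items stored so far.
 * Returns 0 on success, -1 if the list is full or an argument is invalid. */
int plist_append(char **plist, size_t slots, size_t *count, char *item)
{
    if (plist == NULL || count == NULL || slots == 0 || *count >= slots - 1)
        return -1;          /* no room: refuse instead of writing past the end */

    plist[*count] = item;
    (*count)++;
    plist[*count] = NULL;   /* keep the list NULL-terminated */

    return 0;
}
```

    For a list declared as char *this_plist[3], the third append fails with -1 instead of overwriting whatever sits next to the array on the stack. Tools such as AddressSanitizer (the -fsanitize=address option in GCC and Clang) can also report stack buffer overruns at run time.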
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.
