C Board  

07-04-2009, 03:10 PM   #1
pfs
Registered User
 
Join Date: Sep 2008
Location: Portugal
Posts: 14
Stack behaviour

Hello all.

I've been working on an application and, unfortunately, found a bug after it went into production.

The bug was a stack corruption bug.

Anyway, the code didn't crash on my development box, although it did in production.

So why did this happen?

I even changed the stack size on my development box to match the production one (feeling that it could be related), using ulimit (via PAM's limits.conf). Still, it wasn't enough to trigger the bug.

The bug is something like:

Code:
int my_function2(int pc, char **plist)
{
    return 0; /* It doesn't really matter. */
}

int my_function1(int pc, char **plist)
{
    char *this_plist[3];

    this_plist[0] = plist[0];
    this_plist[1] = "I_changed_this";
    this_plist[2] = NULL; 

    my_function2(3, this_plist);
    return 0;
}
I know that I can't pass local variables to other functions.

So, a) why does this sort of bug crash on one machine but not on another? and b) do you have any comments / suggestions on how to avoid this? (I know I can use static variables and dynamic allocations.)

Thanks in advance.
07-04-2009, 03:23 PM   #2
matsp
Kernel hacker
 
Join Date: Jul 2007
Location: Farncombe, Surrey, England
Posts: 15,686
The pieces of code you have posted are correct - there is no misuse of stack variables or anything of the sort.

Of course, it depends a bit on what you are actually doing in my_function2, whose content you omitted. But what you have shown is not incorrect in any direct way.

--
Mats
__________________
Compilers can produce warnings - make the compiler programmers happy: Use them!
Please don't PM me for help - and no, I don't do help over instant messengers.
07-04-2009, 03:31 PM   #3
pfs
Registered User
 
Join Date: Sep 2008
Location: Portugal
Posts: 14

Thanks for the quick response.

But I thought I couldn't pass local variables of one function to a different one.

What am I getting wrong here? :-P
07-04-2009, 03:49 PM   #4
matsp
Kernel hacker
 
Join Date: Jul 2007
Location: Farncombe, Surrey, England
Posts: 15,686
You should not pass back any address of a local variable as a return value [1]. You can certainly pass local variables to functions that you CALL.

[1] The TYPICAL thing done is something like this:
Code:
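char *func(void)
{
    char arr[10];
    /* ... fill in arr ... */
    return arr; /* BUG: returns the address of a local array */
}
That will not work, because the storage for "arr" is released when the function returns, so the caller is left with a dangling pointer.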

The same applies to any other form of returning address of local variables, but the above is the most commonly seen. And what makes it harder to debug is that it may appear to work until the particular data gets overwritten by something else. This can be particularly difficult to diagnose if the array is quite large and calls to other functions only use a small amount of stack - so the data doesn't get overwritten until there is a lot of stack usage.
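For illustration, here is one common way around the problem: let the caller own the storage and pass it in. This is only a sketch; the name fill_name and the "example" text are made up for the illustration and are not from the code discussed in this thread.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical replacement for func(): the caller supplies the buffer,
 * so nothing that lives on this function's stack frame is returned.
 * Returns out on success, or NULL if the buffer is missing or too small. */
char *fill_name(char *out, size_t out_size)
{
    static const char text[] = "example"; /* placeholder content */

    if (out == NULL || out_size < sizeof text)
        return NULL;

    memcpy(out, text, sizeof text); /* copies the terminating '\0' too */
    return out;
}
```

A caller would declare something like char buf[10]; and call fill_name(buf, sizeof buf). The buffer then lives as long as the caller's own frame, which is exactly the "passing a local to a function you CALL" case that is fine.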

A trick would be to add a function that has a local 10000 element array and fill the array with something that you can recognise (and that is likely to cause a segmentation fault if you use it as a pointer, e.g. above 0xC0000000 on a 32-bit machine). Call this function all over the place - then at least you should be able to see that you are using a "dead" variable.
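A rough sketch of such a function is below. The name poison_stack and the 0xDEADBEEF pattern are arbitrary choices for this example (the pattern is above 0xC0000000, as suggested; on a 64-bit machine you would want to pick a different value that is unlikely to be mapped). The array is volatile only so that the compiler does not optimise the stores away.

```c
#include <stddef.h>
#include <stdint.h>

#define POISON_WORDS   10000
#define POISON_PATTERN ((uintptr_t)0xDEADBEEFu)

/* Overwrites a large region of stack below the caller's frame with a
 * recognisable pattern. Returns the last value written, which is only
 * useful for checking that the function actually ran. */
uintptr_t poison_stack(void)
{
    volatile uintptr_t junk[POISON_WORDS];
    size_t i;

    for (i = 0; i < POISON_WORDS; i++)
        junk[i] = POISON_PATTERN;

    return junk[POISON_WORDS - 1];
}
```

If a pointer in a crash dump or in gdb then shows up as 0xdeadbeef, that is a strong hint that it was read from a stack slot that had already gone out of scope.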

--
Mats
07-04-2009, 04:41 PM   #5
pfs
Registered User
 
Join Date: Sep 2008
Location: Portugal
Posts: 14
My assumption is obviously wrong then.

That assumption came from the fact that I had a bug which I fixed by declaring the variable static. If I didn't declare it that way, it would crash and gdb would give me a "Cannot access memory at address 0x8" error.

So, isn't that kind of error related to stack corruption bugs? How do I know when a stack corruption happens then?

I guess I should be more careful with my google searches AND assumptions then. :-P

Thanks again.

EDIT: I just removed that static declaration and tried to dynamically allocate the pointer and it also crashed. So the problem is somewhere else. :-P

Last edited by pfs; 07-04-2009 at 05:01 PM.
07-04-2009, 05:03 PM   #6
matsp
Kernel hacker
 
Join Date: Jul 2007
Location: Farncombe, Surrey, England
Posts: 15,686
Sounds like you are dereferencing a NULL pointer, e.g.
Code:
struct something
{
    int x, y, z;
};

...

struct something *p = somefunction();
/* Suppose somefunction() returned NULL. */
p->z = 42; /* with 4-byte ints, z is at offset 8, hence address 0x8 */
Of course, you can get this from "returning the address of a local variable", but there are literally hundreds of other possible scenarios too. It's usually possible to track down which function (or even which line) is responsible by looking at the address where the error is.

Any chance you are not checking a file that you opened, or not checking the return value of malloc perhaps?
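To make that concrete, here is a small sketch of the kind of checking meant here. The function read_first_line and its 256-byte line limit are invented for this example; the point is only that every fopen, fgets and malloc result is tested before it is used.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'd copy of the first line of the
 * file at path, or NULL if anything fails. The caller must free() it. */
char *read_first_line(const char *path)
{
    char buf[256];
    char *copy;
    FILE *fp = fopen(path, "r");

    if (fp == NULL)                          /* file could not be opened */
        return NULL;

    if (fgets(buf, sizeof buf, fp) == NULL) {/* empty file or read error */
        fclose(fp);
        return NULL;
    }
    fclose(fp);

    copy = malloc(strlen(buf) + 1);
    if (copy == NULL)                        /* out of memory */
        return NULL;

    strcpy(copy, buf);                       /* safe: copy is large enough */
    return copy;
}
```

The caller still has to check the result for NULL before dereferencing it, otherwise the crash just moves one level up.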

--
Mats
07-05-2009, 02:02 PM   #7
pfs
Registered User
 
Join Date: Sep 2008
Location: Portugal
Posts: 14

Hi again.

I've been doing some research about stack corruption and I found a tool called splint. It seems pretty cool (although I haven't used it yet), and I found something in its manual which led me to fix the issue.

It seems that I can't declare a NULL.

The fix is this:

Code:
int my_function2(int pc, char **plist)
{
    return 0; /* It doesn't really matter. */
}

int my_function1(int pc, char **plist)
{
    char *this_plist[3];

    this_plist[0] = plist[0];
    this_plist[1] = "I_changed_this";
/* this_plist[2] = NULL;  Can't do this, otherwise bad things will happen. */

    my_function2(3, this_plist);
    return 0;
}
After doing this, it stopped crashing and the stack / buffers don't seem corrupted anymore.

Do you have any idea why it is behaving like this? and why can't I declare it NULL?
07-05-2009, 02:23 PM   #8
tabstop
and the Hat of Guessing
 
Join Date: Nov 2007
Posts: 8,740
That most probably means that whatever function you are calling is not expecting you to pass a NULL pointer in, and is therefore dereferencing the pointer without checking. (In other words, the function you are calling is the problem, unless it is documented as not taking a NULL pointer argument.)
07-05-2009, 02:43 PM   #9
pfs
Registered User
 
Join Date: Sep 2008
Location: Portugal
Posts: 14
Quote:
Originally Posted by tabstop View Post
That most probably means that whatever function you are calling is not expecting you to pass a NULL pointer in, and is therefore dereferencing the pointer without checking. (In other words, the function you are calling is the problem, unless it is documented as not taking a NULL pointer argument.)
The thing is that this_plist[2] isn't even being used inside of int my_function2(int pc, char **plist).

That's one of the reasons I can't understand why it is crashing.

Last edited by pfs; 07-05-2009 at 02:50 PM.
07-05-2009, 10:45 PM   #10
tabstop
and the Hat of Guessing
 
Join Date: Nov 2007
Posts: 8,740
Quote:
Originally Posted by pfs View Post
The thing is that this_plist[2] isn't even being used inside of int my_function2(int pc, char **plist).
Then it isn't crashing. I mean, you can run this code all day long:
Code:
int my_function2(int pc, char **plist) {
    return 0;
}

int my_function1(int pc, char **plist) {
    char *this_plist[3];

    this_plist[0] = plist[0];
    this_plist[1] = "I_changed_this";
    this_plist[2] = NULL;

    my_function2(3, this_plist);
    return 0;
}

int main(void) {
    /* Initialised so that my_function1 never reads an indeterminate pointer. */
    char *list[3] = { "first", "second", NULL };
    my_function1(3, list);
    return 0;
}
and it won't crash. You don't have to take my word for it, you can run it yourself. Now if you try to dereference this_plist[2] inside my_function2 (where it is plist[2]), even just to print it as a string or do something like plist[2][0], then you're in trouble. But if you don't touch it, it won't crash.
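For comparison, this is roughly what a callee looks like when it does expect a NULL-terminated list and treats the NULL entry as the end marker instead of dereferencing it. The function total_length is hypothetical and is not the poster's my_function2.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callee: sums the lengths of the strings in a
 * NULL-terminated list. The NULL entry is only compared against,
 * never dereferenced. */
size_t total_length(char **plist)
{
    size_t total = 0;
    size_t i;

    if (plist == NULL)          /* no list at all */
        return 0;

    for (i = 0; plist[i] != NULL; i++)
        total += strlen(plist[i]);

    return total;
}
```

With a list like { "ab", "cde", NULL } the loop stops at the third entry without touching it. If the NULL were left out, as in the "fix" above, the loop would keep reading whatever happens to sit on the stack after the array.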
07-06-2009, 01:13 AM   #11
Sebastiani
Guest
 
Join Date: Aug 2001
Posts: 4,922
>> and why can't I declare it NULL?

You should *always* initialize a pointer to NULL rather than leave it uninitialized. Otherwise, it's going to be really difficult to locate invalid address bugs, as the program will sometimes run fine and other times not. At least it crashes in a predictable manner when it's NULL!
07-06-2009, 05:10 AM   #12
Zlatko
pwning noobs
 
Join Date: Jun 2009
Location: The Great White North
Posts: 125
Quote:
/* this_plist[2] = NULL; Can't do this, otherwise bad things will happen. */
If you don't set it to NULL, it has an indeterminate value which may happen to point to valid memory, so your bug is hidden, not removed. You need to keep setting it to NULL or to a real section of memory.

I think you should post more of your code.
__________________
Sun Certified Java Programmer / Developer
IEEE CSDP
07-06-2009, 05:34 AM   #13
Elysia
Mysterious C++ User
 
Join Date: Oct 2007
Posts: 14,099
Stack corruption often occurs when you write outside the boundaries of your allocated storage.
Can you reproduce the bug on your dev machine using the same steps as in production? Remember to check for buffer overruns, using whatever debugging tools you have.
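As an example of the kind of overrun to look for, an unbounded strcpy into a small local array writes past its end and can trash neighbouring stack data. A bounded copy along these lines avoids that; copy_bounded is a made-up helper name and this is just one possible approach.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: copies src into dst without writing more than
 * dst_size bytes. Returns 0 if everything fitted, -1 if the input was
 * invalid or had to be truncated. dst is always NUL-terminated when
 * dst_size > 0 and the pointers are valid. */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    int n;

    if (dst == NULL || src == NULL || dst_size == 0)
        return -1;

    n = snprintf(dst, dst_size, "%s", src);
    if (n < 0 || (size_t)n >= dst_size)
        return -1;              /* encoding error or truncated */

    return 0;
}
```

On the tools side, GCC's -fstack-protector-all option, or -fsanitize=address in newer GCC and Clang releases, may help catch this class of bug on the dev machine, assuming your toolchain supports them.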
__________________
Using: Microsoft Windows™ 7 Professional (x64), Microsoft Visual Studio™ 2008 Team System
I dedicated my life to helping others. This is only a small sample of what they said:
"Thanks Elysia. You're a programming master! How the hell do you know every thing?"
Quoted... at least once.
Quote:
Originally Posted by cpjust
If C++ is 2 steps forward from C, then I'd say Java is 1 step forward and 2 steps back.