Thread: Segmentation Fault Address Out Of Bounds appearing randomly

  1. #1
    Registered User
    Join Date
    Oct 2011
    Posts
    2

    Segmentation Fault Address Out Of Bounds appearing randomly

    Hi,

    I have been tracking down a segmentation fault for two weeks now and managed to "control" it today, though I still cannot explain why it happens. Here is what I am working on:

    I am building a shared library under Ubuntu. The library is made up of several modules, spread across several .c and .h files.

    The main part of the library is a web server implemented with libmicrohttpd. It has a callback function that answers incoming connections, and the idea is for this callback to send back a value as XML. To do so, it calls a function in the xml module (xml.c) that builds the XML from a char*.

    Namely :
    Code:
    void answer_to_connection()
    {
        if (received a get)
        {
            get_xml_value()
            send_xml_value()
        }
    }
    With this setup the program behaves strangely: everything works fine until a segmentation fault appears at random. It can appear after the 5th request or the 100th!
    I investigated where exactly the segfault occurred and saw that get_xml_value() was returning a bad pointer (Address Out Of Bounds). I suspected memory corruption, but tests with valgrind and malloc_debug showed nothing problematic. I then moved the function into the file containing answer_to_connection(), using the EXACT same tools and functions. The weird thing is that in this configuration the segmentation fault no longer appears. The XML library is miniXML (nothing appeared to be wrong there).

    Do any of you have any idea why and how this could happen? And why did moving the function to another file resolve the problem? If you need any more information, I'll be more than happy to give it!

    Thanks in advance

  2. #2
    Registered User hk_mp5kpdw's Avatar
    Join Date
    Jan 2002
    Location
    Northern Virginia/Washington DC Metropolitan Area
    Posts
    3,817
    I would suggest that moving the function to another file has not resolved the problem but rather hidden it. The problem likely still exists but you don't notice and that's even more dangerous than before. By changing where the code is located you've changed something in the overall memory picture in regards to how the code looks when compiled/linked/executed such that the problem is not being caught anymore.

  3. #3
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,909
    This will be very difficult without much or all of the code. At the very least, we would need to see get_xml_value, and preferably all the functions that are called directly or indirectly by get_xml_value. Until then, these are just guesses:

    A common cause of such issues is some sort of array overrun. If get_xml_value calls some other function of yours with a buffer overrun, you would write right up the stack and into the local vars of get_xml_value, and could trash a local var that stores the pointer to be returned. It's experimental, but valgrind has an option to check for array overruns. I've never used it, but it can't hurt to try.

    Another possibility is that you store the pointer to return in a local variable that is not initialized, and get_xml_value may find its way to a return statement that returns that variable without ever setting it. Check all possible execution paths and make sure that you always set any variable before using or returning it.

    Also, see if you can identify something about the conditions that cause it to seg fault. Is it one particular type of request? Is it only requests of a certain length? If you can isolate the conditions that cause a seg fault, then you can force it to happen and easily track down the problem. Otherwise, you may just have to hang around for 5 or 100 requests and step through get_xml_value by hand for each one.

    A last option is to put copious amounts of fprintf statements everywhere for debugging. Print out every variable and every function return value at every critical step. Make sure you print to stderr since it's unbuffered, otherwise a seg fault may crash the program before stdout is flushed and thus you think your problem is in one place when actually it's elsewhere.

    Oh, and make sure you're compiling with warnings turned all the way up and you've resolved all of them. Many of them catch questionable behavior or undefined behavior that can result in problems like yours.

  4. #4
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    38,658
    Well if you spent 2 weeks looking at the whole code, I doubt there is a lot anyone here can do with only 5 lines of code which isn't even syntactically correct!

    My first suggestion is put it back to how it was when it was crashing. Because as hk_mp5kpdw says, it is extremely unlikely that you fixed anything, only hidden it.

    Another tool to try and use is Electric Fence, which is another malloc debug tool.

    Make sure you compile it all with -g enabled (debug) and then run it in gdb.
    At the crash, start with 'bt' and post the results here.

    Three common mistakes to look out for are
    1. Mistakenly believing that malloc returns memory full of \0. It often does initially, but as time goes on, it's likely to contain junk from previous use. So anything which starts with strcat rather than strcpy starts off working, then blows up.
    Use these options with valgrind.
    --malloc-fill=<hexnumber> fill malloc'd areas with given value
    --free-fill=<hexnumber> fill free'd areas with given value

    2. Forgetting to add 1 to any strlen() result when allocating memory.

    3. Forgetting to copy a \0 when necessary (if not using strcpy / strcat )

    Another good thing to check is if the segfault address looks like a printable string fragment (eg. 0x41423132 aka "AB12")

    valgrind and malloc_debug don't trap every problem.
    Eg.
    Code:
    #include <stdio.h>
    #include <string.h>
    
    void foo ( void ) {
      char a[10], b[10], c[10];
      strcpy( b, "hello world" );   // overrun
      printf("%s\n",b);
    }
    int main(int argc, char *argv[])
    {
      foo();
      return 0;
    }
    
    $ gcc -g bar.c
    $ ./a.out 
    hello world
    $ valgrind ./a.out
    ==4951== Memcheck, a memory error detector
    ==4951== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al.
    ==4951== Using Valgrind-3.6.0.SVN-Debian and LibVEX; rerun with -h for copyright info
    ==4951== Command: ./a.out
    ==4951== 
    hello world
    ==4951== 
    ==4951== HEAP SUMMARY:
    ==4951==     in use at exit: 0 bytes in 0 blocks
    ==4951==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
    ==4951== 
    ==4951== All heap blocks were freed -- no leaks are possible
    ==4951== 
    ==4951== For counts of detected and suppressed errors, rerun with: -v
    ==4951== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 4 from 4)
    If you have code like this which overwrites a pointer, then it can go wrong very quickly.

  5. #5
    Registered User
    Join Date
    Oct 2011
    Posts
    2
    Thank you for your answers. I think I resolved the problem; I will keep you updated.
    Last edited by Rej; 10-27-2011 at 06:03 AM.
