Thread: Checking for "out of bounds" address?

  1. #1
    and the hat of sweating
    Join Date
    Aug 2007
    Location
    Toronto, ON
    Posts
    3,545

    Checking for "out of bounds" address?

    The sadistic bastard who wrote this test harness decided to load Keys & Values from a config file, but he defined the Values array as a union of unsigned long and char*, so either numbers or strings could be stored.

    I'm trying to print out all the keys & values after they are parsed from the file, but I have no idea if the value is a number or string, so if I print it as a string when it's just a number I get a Segmentation fault.

    In C is there any way to check if a pointer is valid before using it?
    I'm guessing not. What about trying to print a char* pointer and not crashing if it's invalid (kind of like catching an exception)?

  2. #2
    Officially An Architect brewbuck
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by cpjust
    In C is there any way to check if a pointer is valid before using it?
    Nothing portable, but it's possible to trap the segmentation fault which would occur if you tried to dereference an invalid pointer. But that's platform-specific.

    Anyway, just because the pointer is valid doesn't mean you should treat it as a pointer. This is awful design.

  3. #3
    and the hat of sweating
    Join Date
    Aug 2007
    Location
    Toronto, ON
    Posts
    3,545
    Quote Originally Posted by brewbuck
    Nothing portable, but it's possible to trap the segmentation fault which would occur if you tried to dereference an invalid pointer. But that's platform-specific.

    Anyway, just because the pointer is valid doesn't mean you should treat it as a pointer. This is awful design.
    Well I think I can honestly say - this is the worst code I have ever seen in my life!
    I just want to get this program working as quick as possible so I never have to look at it again.

  4. #4
    Officially An Architect brewbuck
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by cpjust
    Well I think I can honestly say - this is the worst code I have ever seen in my life!
    I just want to get this program working as quick as possible so I never have to look at it again.
    Probably, whether the value is treated as an integer or a string is based on what Key it is associated with. The original programmer probably just memorized or hard-coded which Key has what type of value in it.

  5. #5
    and the hat of int overfl Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    Are the numbers within a given range (typically)?

    What kind of system are you running on?
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  6. #6
    Officially An Architect brewbuck
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by Salem
    Are the numbers within a given range (typically)?
    I'm envisioning something like: "If the key is "NumberOfClients," then the value is an integer. If the key is "Hostname" then the value is a string" etc etc... In other words, the knowledge is embedded throughout the code.

    At least that's my best guess how it was designed to work.

  7. #7
    and the hat of sweating
    Join Date
    Aug 2007
    Location
    Toronto, ON
    Posts
    3,545
    Quote Originally Posted by Salem
    Are the numbers within a given range (typically)?

    What kind of system are you running on?
    The numbers are whatever value you set in the config file, but most are single digits.
    It's on RedHat Enterprise Linux 3.

  8. #8
    and the hat of int overfl Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    A simple numeric test then should suffice to tell a number from a pointer.
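    A minimal sketch of such a numeric test. The threshold is an assumption, not something taken from the harness: it relies on nothing being mapped below one 4096-byte page, so any smaller value cannot be a usable pointer. The names LOWEST_PLAUSIBLE_ADDRESS and is_plain_number are made up for illustration.

```c
/* Assumption: no memory is mapped below this address, so smaller
   values cannot be valid pointers. 4096 is one page on typical
   x86 Linux systems; adjust for the actual platform. */
#define LOWEST_PLAUSIBLE_ADDRESS 4096UL

/* Returns 1 if the value is too small to be a pointer, 0 otherwise. */
int is_plain_number(unsigned long value)
{
    return value < LOWEST_PLAUSIBLE_ADDRESS;
}
```

    This only helps when every number in the config file is small. A value above the threshold is not proven to be a pointer, it is merely not ruled out.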

  9. #9
    and the hat of sweating
    Join Date
    Aug 2007
    Location
    Toronto, ON
    Posts
    3,545
    Actually, that probably won't work. When I looked through the config file again I saw that most numbers were 0 or 1, but a few were as high as 300,000, and some are big hex numbers that are supposed to represent some kind of address.
    I thought something like this would 'catch' the segmentation fault, but it didn't.
    Code:
    signal( SIGSEGV, SIG_IGN );
    Is that what signal() is supposed to do?

  10. #10
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Ignoring a segv is probably not the correct thing to do. You could probably write a signal handler to catch it, but the next problem is to "step back and try another way", which may not be so easy. Something like this would work:
    Code:
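    #include <setjmp.h>
    #include <signal.h>
    jmp_buf catchjmpbuffer;
    #define TRY  if (!setjmp(catchjmpbuffer))
    #define CATCH else
    ....
    void sighandler_segv(int sig)   /* signal handlers take the signal number and return void */
    {
    ...
        longjmp(catchjmpbuffer, 1); /* setjmp() in TRY now returns 1, so CATCH runs */
    }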
    You can use it like this:
    Code:
    ...
       TRY {
           *somepointer ...
        }
        CATCH {
            /// somepointer is not a valid pointer. 
        }
    Edit: I should clarify that I haven't tested the above code, so it may not work correctly - it's more a sketch than a fully functional example. But hopefully it gives you the right sort of idea(s).
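    A fuller version of the sketch above, under some assumptions. It swaps in sigaction() and sigsetjmp()/siglongjmp() (all POSIX) for signal() and setjmp()/longjmp(), so that the handler stays installed and the signal mask is restored after the jump. The names can_read_byte, on_fault and probe_env are made up for illustration. The handler has to jump away rather than return, because returning normally from a SIGSEGV handler re-runs the faulting instruction.

```c
#define _POSIX_C_SOURCE 200809L   /* for sigaction() and sigsetjmp() */
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>

static sigjmp_buf probe_env;      /* where to resume after a fault */

static void on_fault(int sig)
{
    (void)sig;
    siglongjmp(probe_env, 1);     /* unwind back to the sigsetjmp() call */
}

/* Returns 1 if one byte at p could be read, 0 if the read faulted.
   Not thread-safe: it uses one global jump buffer and changes the
   process-wide signal handlers while it runs. */
int can_read_byte(const char *p)
{
    struct sigaction sa, old_segv, old_bus;
    volatile int ok = 0;          /* volatile: must survive siglongjmp() */

    sa.sa_handler = on_fault;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGSEGV, &sa, &old_segv);
    sigaction(SIGBUS, &sa, &old_bus);

    if (sigsetjmp(probe_env, 1) == 0) {   /* 1 = also save the signal mask */
        volatile char c = *(const volatile char *)p;  /* the risky read */
        (void)c;
        ok = 1;
    }

    /* Put the previous handlers back before returning. */
    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    return ok;
}
```

    A successful read only shows that the address is mapped. It does not show that the value was meant to be a string, so an integer that happens to look like a mapped address would still be misread.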

    The other option is of course to change the union into a struct, or some other way to add a data member so that you can check the type of the data value.
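    A rough sketch of that second option, assuming the parser can record the type when it stores each value. The names value_kind, config_value and format_value are hypothetical and not taken from the harness.

```c
#include <stdio.h>

enum value_kind { VALUE_NUMBER, VALUE_STRING };

/* The original union, wrapped in a struct with a tag saying which
   member is currently valid. */
struct config_value {
    enum value_kind kind;     /* set by the parser when the value is stored */
    union {
        unsigned long number;
        char *string;
    } as;
};

/* Formats one key/value pair into buf using the member the tag names.
   Returns whatever snprintf() returns. */
int format_value(char *buf, size_t size, const char *key,
                 const struct config_value *v)
{
    if (v->kind == VALUE_STRING)
        return snprintf(buf, size, "%s = \"%s\"", key, v->as.string);
    return snprintf(buf, size, "%s = %lu", key, v->as.number);
}
```

    With the tag in place, the printing loop never has to guess, and no pointer is dereferenced unless the parser stored one.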

    --
    Mats
    Last edited by matsp; 10-31-2007 at 08:47 AM.
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  11. #11
    Officially An Architect brewbuck
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by cpjust
    Actually, that probably won't work. When I looked through the config file again I saw that most numbers were 0 or 1, but a few were as high as 300,000, and some are big hex numbers that are supposed to represent some kind of address.
    I thought something like this would 'catch' the segmentation fault, but it didn't.
    Code:
    signal( SIGSEGV, SIG_IGN );
    Is that what signal() is supposed to do?
    That causes the signal to be ignored. The reason it isn't working for you is that signal handler behavior resets once a signal is delivered -- well, on SOME kinds of UNIX, at least. So you probably do ignore the first one, then the next one kills you. You have to write a real signal handler for SIGSEGV, and reset the signal handler inside itself:

    Code:
    void handle_sigsegv(int x)
    {
        signal(SIGSEGV, handle_sigsegv);
    }
    You also have to have a way of marking that the signal occurred, so you would have a volatile global variable which you set in the signal handler, and check after your test dereference.

    Anyway, this all sucks bad. I think the best thing to do would be just bite the bullet and make a table of all the different Key names and what the correct type for that key is, so you can handle them correctly. If you have integers that "look like" pointers the bad-address test is probably not good enough. It's hideous, anyway.
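    A minimal sketch of such a table. Only the two key names come from this thread (as guesses in an earlier post); the real keys would have to be collected from the config file and the code that reads it. The names key_kind, key_types and kind_for_key are made up for illustration.

```c
#include <string.h>

enum key_kind { KEY_NUMBER, KEY_STRING, KEY_UNKNOWN };

struct key_type {
    const char *key;
    enum key_kind kind;
};

/* Hypothetical entries: fill in the real key names and their types. */
static const struct key_type key_types[] = {
    { "NumberOfClients", KEY_NUMBER },
    { "Hostname",        KEY_STRING },
};

/* Looks up the value type for a key; KEY_UNKNOWN if it is not listed. */
enum key_kind kind_for_key(const char *key)
{
    size_t i;
    for (i = 0; i < sizeof key_types / sizeof key_types[0]; i++) {
        if (strcmp(key_types[i].key, key) == 0)
            return key_types[i].kind;
    }
    return KEY_UNKNOWN;
}
```

    The printing code can then switch on the result and skip, or print in hex, any value whose key is unknown, instead of dereferencing it.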

    EDIT: Interesting note from the Linux signal() man page:

    According to POSIX, the behaviour of a process is undefined after it
    ignores a SIGFPE, SIGILL, or SIGSEGV signal that was not generated by
    the kill(2) or the raise(3) functions. Integer division by zero has
    undefined result. On some architectures it will generate a SIGFPE
    signal. (Also dividing the most negative integer by -1 may generate
    SIGFPE.) Ignoring this signal might lead to an endless loop.

    According to POSIX (3.3.1.3) it is unspecified what happens when
    SIGCHLD is set to SIG_IGN. Here the BSD and SYSV behaviours differ,
    causing BSD software that sets the action for SIGCHLD to SIG_IGN to
    fail on Linux.

    The use of sighandler_t is a GNU extension. Various versions of libc
    predefine this type; libc4 and libc5 define SignalHandler, glibc
    defines sig_t and, when _GNU_SOURCE is defined, also sighandler_t.
    Last edited by brewbuck; 10-31-2007 at 10:21 AM.

  12. #12
    and the hat of int overfl Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    Perhaps create an 'isValidPointer' based on the following information.
    Code:
    #include <stdio.h>
    #include <string.h>
    #include <stdlib.h>
    #include <unistd.h>
    
    /* The linker generated symbols for the boundaries of .data */
    /* Similar names exist for the .text and .bss segments */
    /* Use 'nm prog' to find out the actual names on your system */
    extern char _data_start__[];
    extern char _data_end__[];
    
    int someData[10] = { 1 };
    
    int main()
    {
        char *p1 = malloc(10);  /* Hope this is close to the low end of the pool */
        char *p2 = malloc(10);
        void *b = sbrk(0);      /* find out end of allocation pool */
    
        printf( "Data start  =%p\n", (void*)_data_start__ );
        printf( "Data end    =%p\n", (void*)_data_end__ );
        printf( "someData at =%p\n", (void*)someData );
        printf( "1st malloc  =%p\n", (void*)p1 );
        printf( "2nd malloc  =%p\n", (void*)p2 );
        printf( "Break       =%p\n", b );
        /* In future, all allocations should be between p1 and the current */
        /* value returned by sbrk() */
    
        free(p1);
        free(p2);
    
        return 0;
    }
    If the value falls outside the range of .data, .bss or the allocation pool, then trying to dereference it is likely to be a bad idea (guess integer).
    If it's in range, and dereferencing it (as a pointer) shows a printable character, then guess string.
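    One way the two rules above might be turned into an 'isValidPointer'-style check. This is a sketch under assumptions: the caller fills in the address ranges itself (for example from the linker symbols and sbrk(0) printed by the program above), and the names addr_range and guess_is_string are hypothetical. It also assumes unsigned long is wide enough to hold a pointer, as the harness's union already does.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* One known-good region of memory, half-open: lo <= address < hi. */
struct addr_range {
    uintptr_t lo;
    uintptr_t hi;
};

/* Returns 1 for "guess string", 0 for "guess integer".
   The value is only dereferenced if it lies inside one of the ranges. */
int guess_is_string(unsigned long value,
                    const struct addr_range *ranges, size_t count)
{
    size_t i;
    for (i = 0; i < count; i++) {
        if (value >= ranges[i].lo && value < ranges[i].hi) {
            const unsigned char *p = (const unsigned char *)(uintptr_t)value;
            return isprint(*p) != 0;   /* first byte printable? */
        }
    }
    return 0;   /* outside every known range: do not dereference */
}
```

    It remains a guess: an integer that happens to fall inside a range and point at a printable byte is reported as a string, and an empty string is reported as an integer.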
