Thread: Seg Faults at Random

  1. #1
    Registered User
    Join Date
    Jan 2003
    Posts
    21

    Seg Faults at Random

    hello,

    I know this is a long shot and unfortunately I can't provide much information, but the people of this board have helped me out in the past, so I'm hoping someone can give me some advice.

    I managed to write my MUD, which I had posted about before. It's not the most efficient MUD in the world, but it works, which is the main thing for me at the moment. I'll improve it later, but right now I'm hitting a bug that is driving me nuts and I can't figure out where it's going wrong.

    I'm getting a lot of these segfaults (running under gdb):

    Program received signal SIGSEGV, Segmentation fault.
    chunk_alloc (ar_ptr=0x4018e300, nb=1065) at malloc.c:2990
    2990 malloc.c: No such file or directory.
    in malloc.c

    When I backtrace them, the crash is always in a different function. The timing is random as well: it can run fine for a while, then crash with a chunk_alloc error when some unrelated event occurs.

    I've searched Google for this, but all I can find is pages of people dumping gdb output, so I'm not 100% sure what this error means. My guess is that while trying to allocate memory, malloc is either hitting something that has been overwritten or running out of memory. I'm fairly sure it isn't running out of memory, though, as ps -aux never shows it rising above 0.9% memory usage.

    I'm running a fair few MySQL queries when people do things like /look, and I think that may be where my problems lie. I'm giving my query strings huge chunks of memory, e.g. query=malloc(3096);, because for a while I thought I wasn't allocating enough memory for them. That doesn't seem to have fixed the problem, though. Could allocating too much memory for queries be the cause?

    My MUD loops every second, and I've built a linked list of events which the program walks to see if anything needs to occur. I've double-checked these functions and they all seem fine. What I'm asking is: can a badly structured program cause memory corruption, i.e. the chunk_alloc errors? Or would it just run slower but get there in the end?

    So I'm asking whether there is any way to track down where my program is going wrong, maybe some kind of memory monitor (I'm on Linux, by the way) that can tell me if pointers are being overwritten. I print as much debug information as possible: as soon as an event goes into or comes out of the linked list, I display everything in the list so I can see the next pointer, previous pointer, etc., and it doesn't look like it's being corrupted anywhere.

    Well, there you have it. I know I haven't provided much information, but I'm really at a loss here. Just when I think I've found a problem, I fix it, it runs great for two or three hours, and then bam, the chunk_alloc message appears in gdb.

    thanks in advance to anyone who can provide even the smallest clue

    mrpickle.

  2. #2
    Registered User
    Join Date
    Apr 2002
    Posts
    1,571
    Without seeing any code it's always tough, though. It sounds to me like you are overrunning an array's bounds somewhere.
    "...the results are undefined, and we all know what "undefined" means: it means it works during development, it works during testing, and it blows up in your most important customers' faces." --Scott Meyers

  3. #3
    Registered User Codeplug's Avatar
    Join Date
    Mar 2003
    Posts
    4,981
    My kid's favorite game is "stompy-stomp".... on daddy.

    It sounds like you already know it, but someone is playing stompy-stomp in someone else's memory - in malloc.c's memory most likely.
    Because of that, I would first look for buffer overruns in memory returned by malloc(), and in general. Continued use of a free'd pointer would be bad too.

    Do you have any other data structures more complex than linked lists?

    gg
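
    To make the overrun idea concrete, here is a minimal sketch, assuming the query strings are built with sprintf-style formatting into a malloc'd buffer. The MUD's code was not posted, so build_look_query, the rooms table and its columns are all hypothetical. The point is only that snprintf is given the buffer size and that truncation is treated as an error, so the write can never run past the end of the allocation and into malloc's bookkeeping.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not from the original MUD): formats a query into a
 * heap buffer of `cap` bytes. Returns the buffer, which the caller must
 * free, or NULL if allocation fails or the query would not fit. */
char *build_look_query(const char *room_name, size_t cap)
{
    assert(room_name != NULL && cap > 0);

    char *query = malloc(cap);
    if (query == NULL)
        return NULL;

    /* snprintf writes at most `cap` bytes, including the terminating '\0',
     * and returns the length the full string would have needed. */
    int needed = snprintf(query, cap,
                          "SELECT description FROM rooms WHERE name='%s'",
                          room_name);
    if (needed < 0 || (size_t)needed >= cap) {
        /* sprintf would have overrun the buffer here; refuse instead. */
        free(query);
        return NULL;
    }
    return query;
}
```

    With this shape the buffer size stops being a guess: a call such as build_look_query(name, 3096) either returns a complete string or NULL. A real query would also need the name escaped before it is formatted in, which is a separate problem from the overrun.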

  4. #4
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    I suggest you try
    Code:
    gcc prog.c -lefence
    Electric Fence is a malloc memory checker which will trap as soon as you access past the end of any allocated memory, or access any allocated memory which has been freed already.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  5. #5
    Registered User
    Join Date
    Jan 2003
    Posts
    21
    hello,

    Firstly, sorry for the double post yesterday; I had problems accessing the site. I couldn't use -lefence, as the linker complained: /usr/bin/ld: cannot find -lefence. Instead I downloaded something called Valgrind and ran my program through that. I must admit the results were a little grim, and some of the problems I have no idea how to fix, such as this:

    Code:
    ==5492== Conditional jump or move depends on uninitialised value(s)
    ==5492==    at 0x401868C1: vgAllRoadsLeadToRome_select (vg_intercept.c:702)
    ==5492==    by 0x4018696E: __select (vg_intercept.c:722)
    It's referring to my select call, which is simply: select(CurrentFDMax+1, &read_fds, NULL, NULL, &tv); so I haven't a clue how to fix that one. This is another error:

    Code:
    ==5492== Syscall param write(buf) contains uninitialised or unaddressable byte(s)
    ==5492==    Address 0x4167C9BC is 72 bytes inside a block of size 8192 alloc'd
    All the related files belong to my MySQL .so libraries, though, so I'm not sure whether it's a bug in them or in my code.

    Anyway, it did show me where I was going wrong with my linked list and event-checking function.

    Basically I was doing this (cut down):

    Code:
    void CheckEvents(struct node* head, int mud_counter) {
      struct node* current = head;
      while(current != NULL) {
        if(current->ToOccur == mud_counter) {
          switch(current->EventType)
          {
            case 1:
            DoSomething();
            break;
            case 2:
            DoSomethingElse();
            DeleteTheEvent(current);
            break;
            case 3:
            DoSomethingElse();
            break;
          }
        }
        current = current->next;
      }
    }
    Where I was going wrong was that if I found an event which needed deletion, I sent it to my delete function to be removed, but afterwards still did current = current->next; even though current had just been deleted. Valgrind pointed out this was wrong. Do you think that could have been the cause of the random crashes as well? Anyhow, I was trying to figure out how to get around it and came up with this:

    Code:
    void CheckEvents(struct node* head, int mud_counter) {
      struct node* current = head;
      struct node* next; // < using this as my next reference
      while(current != NULL) {
        if(current->ToOccur == mud_counter) {
          switch(current->EventType)
          {
            case 1:
            next = current->next;
            DoSomething();
            break;
            case 2:
            DoSomethingElse();
            next = current->next;
            DeleteTheEvent(current);
            break;
            case 3:
            next = current->next;
            DoSomethingElse();
            break;
          }
        }
        current = next;
      }
    }
    I thought this would work, as I'll always have my pointer to the next node, and it does work in that Valgrind no longer complains. But it seems to be doing an enormous amount of work: ps -aux shows it using 20 MB of RSS when it's running. It's also very, very slow, so no doubt I'm either running a partially endless loop or continuously growing some memory somewhere. Could someone point out where I'm going wrong, or what I need to read up on to fix it? I think it's this part: current = next; I've tried fiddling with it but can't get it to run normally.

    *EDIT* OK, I've figured out that adding a default to the switch statement stops it from going to a really high RSS, but now the main combat loop is running about a second slower than it did before I changed to using the next variable...

    Anyway any help you may have will be greatly appreciated.

    Thanks,

    mrpickle.
    Last edited by mrpickle; 01-15-2004 at 11:19 AM.
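
    One likely culprit in the second version above: next is only assigned inside the case labels, so whenever current->ToOccur != mud_counter (or EventType matches no case) it keeps its old value, or is uninitialised on the first pass. current = next can then revisit the same node over and over. A minimal sketch of the usual fix follows, reading next once at the top of every iteration before anything can delete current. The names delete_event and add_event are hypothetical stand-ins, and passing the head by address is an assumption made here so that deleting the first node works; the real DeleteTheEvent may look different.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
    int ToOccur;
    int EventType;
    struct node *next;
};

/* Hypothetical helper: pushes a new event on the front of the list.
 * Returns 0 on success, -1 if malloc fails. */
int add_event(struct node **head, int when, int type)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->ToOccur = when;
    n->EventType = type;
    n->next = *head;
    *head = n;
    return 0;
}

/* Hypothetical stand-in for DeleteTheEvent: unlinks `victim` from the
 * singly linked list and frees it. */
static void delete_event(struct node **head, struct node *victim)
{
    struct node **link = head;
    while (*link != NULL && *link != victim)
        link = &(*link)->next;
    if (*link != NULL) {
        *link = victim->next;
        free(victim);
    }
}

/* Walks the list and handles every event due at mud_counter.
 * Returns how many events were due. */
int CheckEvents(struct node **head, int mud_counter)
{
    assert(head != NULL);

    int fired = 0;
    struct node *current = *head;
    while (current != NULL) {
        /* Save the successor on EVERY iteration, before current can be
         * freed and regardless of whether the event is due. */
        struct node *next = current->next;

        if (current->ToOccur == mud_counter) {
            fired++;
            switch (current->EventType) {
            case 2:
                delete_event(head, current); /* current is invalid now */
                break;
            default:
                break;
            }
        }
        current = next;
    }
    return fired;
}
```

    Because next is set unconditionally, the loop always advances exactly one node per pass, which would also explain why adding a default case changed the behaviour in the edited version.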

  6. #6
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    You should switch to doubly linked lists. This way, to remove any given node, all you need is a pointer to the node itself. Thus:
    Code:
    struct node *n;
    
    n = findnodetodelete( somevalue );
    if( n )
    {
        if( n->prev )
            n->prev->next = n->next;
        if( n->next )
            n->next->prev = n->prev;
        if( n == firstnode )
            firstnode = n->next;
    
        freenode( n );
    }
    else
        printf("value not found, node not deleted");
    Basically, you make the previous node point to the next node, and vice versa. Then just delete the current node. The syntax may take a little while to get used to, but doubly linked lists are way, way easier to work with for insertion and deletion.

    Quzah.
    Hope is the first step on the road to disappointment.
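
    For anyone who wants to try the doubly linked removal in isolation, here is a self-contained sketch of the same idea. The struct layout and the names push_front and remove_node are assumptions made for illustration; the snippet in the post relies on findnodetodelete, freenode and a firstnode variable that were not shown, so the list head is passed by address here instead.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
    int value;
    struct node *prev;
    struct node *next;
};

/* Hypothetical helper: inserts a new node at the front of the list.
 * Returns the new node, or NULL if malloc fails. */
struct node *push_front(struct node **first, int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->value = value;
    n->prev = NULL;
    n->next = *first;
    if (*first != NULL)
        (*first)->prev = n;
    *first = n;
    return n;
}

/* Unlinks n from the list and frees it. Only a pointer to the node is
 * needed; no search for its predecessor is required. */
void remove_node(struct node **first, struct node *n)
{
    assert(first != NULL && n != NULL);

    if (n->prev != NULL)
        n->prev->next = n->next;   /* predecessor skips over n */
    if (n->next != NULL)
        n->next->prev = n->prev;   /* successor points back past n */
    if (n == *first)
        *first = n->next;          /* n was the head of the list */

    free(n);
}
```

    When removing nodes while walking such a list, the successor still has to be read before calling remove_node, since n is freed inside it.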
