Help/ideas on how to catch a difficult bug that writes at random memory locations

**Kempelen** · 01-29-2014

Hello,

I have a complex program, around 25,000 lines of code. It is quite stable, but I have a bug that only happens from time to time, and only in online mode, so it is quite difficult for me to reproduce. I have put a lot of effort into finding where the problem is, and I have seen where the code crashes, but that line is not the culprit: it only reads a memory location that is supposed to hold good data, and it does not. The real bug happens earlier in execution. That data is read-only, I mean, I only compute it at program start-up and never write to it again, but my bug writes to it during the run, maybe because of a bad pointer somewhere, and it looks random (different runs overwrite different memory locations).

My question is: is there a way to know when my code writes into that 'protected to me' memory? I am wondering whether there is a method to make a hash of all the variables and arrays in that memory section, or some way to 'assert' when that big memory section is written, taking into account that the memory section could be any of the many variables and arrays I precompute at start-up.

Any help on how to catch this difficult bug would be welcome.

thanks.

**grumpy** · 01-29-2014

If the code is compiled without optimisation and with the compiler emitting debug information, most debuggers will be able to identify what code writes to the affected memory (for example, by setting a data breakpoint, also called a watchpoint, on the affected address).

The catch is that, unless your code is compiled that way, it will need to be recompiled. And the changes will affect layout of memory in your program, so may cause the offending code to write to a different location in memory.

You could also add output statements (e.g. write the affected values to stdout or stderr in selected places) and use those output values to localise the code which causes the problem. The catch is that even an additional I/O statement can potentially change the memory layout of the program (that is less likely than with recompiling under different settings, but it is possible).
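A variation on the output-statement idea, which also matches the hash the original question asks about, is to checksum the precomputed data once after start-up and re-check it at selected points. The sketch below is only an illustration: `region_hash`, `struct guard`, `guard_init`, `guard_check` and `GUARD_CHECK` are hypothetical names, and it assumes each precomputed table is a contiguous block whose address and size you know.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* FNV-1a hash over a raw byte range. */
static uint32_t region_hash(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* Hypothetical bookkeeping for one block that must not change. */
struct guard {
    const void *addr;
    size_t len;
    uint32_t expected;
};

/* Record the hash once the start-up computation is finished. */
static void guard_init(struct guard *g, const void *addr, size_t len)
{
    g->addr = addr;
    g->len = len;
    g->expected = region_hash(addr, len);
}

/* Returns 1 if the block is unchanged, 0 (after reporting) otherwise. */
static int guard_check(const struct guard *g, const char *file, int line)
{
    if (region_hash(g->addr, g->len) == g->expected)
        return 1;
    fprintf(stderr, "%s:%d: guarded block at %p was modified\n",
            file, line, (void *)g->addr);
    return 0;
}

/* Reports the source location of the check that noticed the damage. */
#define GUARD_CHECK(g) guard_check((g), __FILE__, __LINE__)
```

Placing `GUARD_CHECK(&g)` before and after suspect calls brackets the corruption: the offending write happened somewhere between the last check that passed and the first one that failed. It does not name the guilty instruction, and like any added code it can shift the memory layout.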

If you don't want to do the above (or have tried it and found that the symptom changed), I suggest eliminating sections of your program systematically. If the symptom vanishes on removing some section of code, that's not definitive, but it does give hints. The possibilities are:

1) there is a flaw in the code you just removed.
2) there is a flaw in some code you previously removed, which interacts somehow with the code you just removed.
3) there is a flaw in some code you haven't removed yet, which interacts with some code you have just removed.
4) combinations of the above

There could still be a flaw in code you haven't removed, but a systematic approach keeping the above points in mind can narrow down the odds.

Notwithstanding your claim that the code is complex, I suggest using code inspection. 25,000 lines may seem like a lot to you, but it's really not that much, unless the code is really badly written. Look for code that uses computed pointers (e.g. a pointer used to select something based on user input) or computed array indices (look for off-by-one errors in loops, falling off the end of an array, indices computed from data in files, etc). Also look for code that is excessively "clever" in playing with pointers - usually programmers who are too clever playing with pointers get it wrong.

Welcome to the land of debugging. It is VERY difficult to write code that is bug free. It is often VERY VERY difficult to find a bug once it exists.

**Salem** · 01-29-2014

Which OS / Compiler are you using? This helps us identify suitable tools.

For example, if you set up a large block of memory at initialisation time which is supposed to be read-only but is later corrupted, then one thing to look into is how your OS could protect areas of memory by marking them read-only.

Then at least the program would crash on the errant write instruction (giving you useful information), rather than having to wait until something stumbles into the minefield.
