Thread: self modifying code

  1. #1
    Registered User
    Join Date
    Jul 2008
    Posts
    29

    self modifying code

    So, I have an application that is VERY processor intensive. Its core is general enough to work for all intended purposes. However, because of this generality, in certain situations it makes unnecessary, expensive function calls. I could wrap these calls in if statements that test global variables to switch those functions off, but I would rather not have the check at all (this code is called billions of times). I don't want to argue this point. Therefore, I'm trying to figure out how to copy the code into memory and then replace the call with a NOP. I've figured out how to copy the code into memory and how to find the memory addresses right before and after the call. The problem comes when I try to execute the copied code: I get a SEGFAULT. Any ideas?

  2. #2
    Registered User
    Join Date
    Jul 2008
    Posts
    29
    Here's what I have so far:

    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    
    void p(){
    	printf("Called.\n");
    }
    
    void *start;
    void *end;
    
    void mecallp(){
    	//Why do I do this address stuff here?
    	//	The answer is that labels only have scope in the function
    	end = &&mecallp_end;
    	start = &&mecallp_start;
    mecallp_start:
    	printf("About to call p.\n");
    mecallp_before:
    	p();
    mecallp_after:
    	printf("Came back from p.\n");
    mecallp_end: 
    	;
    }
    
    int main(int argc, char *argv[]){
    	mecallp();
    	start = &mecallp;
    	printf("Start = %p\nEnd   = %p.\n", start, end);
    	int size = end-start;
    	void *programdata = malloc(size);
    	memcpy(programdata, start, size);
    	printf("Data copied.\n");
    	void (*FN)() = programdata;
    	(*FN)();
    }
    Last edited by chacham15; 09-05-2008 at 05:55 PM.

  3. #3
    Registered User C_ntua's Avatar
    Join Date
    Jun 2008
    Posts
    1,853
    Just a quick note (I don't have time at the moment): you could find where the segfault occurs with a debugger, or simply by putting printf("OK\n") calls throughout the code and seeing exactly which line triggers it.

  4. #4
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    I'm pretty sure you won't have much luck doing that . . . .

    You don't have to have a test if you want to branch based on the value of a variable. You could use an array of function pointers instead, for example.
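
    A minimal sketch of the function-pointer idea, assuming the optional work can be isolated in one function. The names `step_fn`, `expensive_step`, `noop_step`, `step_table` and `run_core` are invented for illustration and are not from the original application:

```c
/* Hypothetical signature for one optional step of the core loop. */
typedef void (*step_fn)(long *acc);

/* Stand-in for the costly call that is only needed sometimes. */
static void expensive_step(long *acc) { *acc += 3; }

/* Used when the costly call is switched off. */
static void noop_step(long *acc) { (void)acc; }

/* Index 0 = feature off, index 1 = feature on. */
static const step_fn step_table[2] = { noop_step, expensive_step };

long run_core(int feature_on, long iterations)
{
    /* The choice is made once, outside the hot loop. */
    step_fn step = step_table[feature_on ? 1 : 0];
    long acc = 0;
    long i;

    for (i = 0; i < iterations; ++i) {
        acc += 1;      /* work that is always needed */
        step(&acc);    /* indirect call, no per-iteration test */
    }
    return acc;
}
```

    Note that this trades the test for an indirect call, which the compiler usually cannot inline, so whether it is actually faster has to be measured.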

  5. #5
    Registered User slingerland3g's Avatar
    Join Date
    Jan 2008
    Location
    Seattle
    Posts
    603
    I kind of see what you are doing with the 32 address markers with start - end, but I believe the code issue is

    int size = end-start;
    void *programdata = malloc(size); <---------What do you believe size is?

  6. #6
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    I didn't even know you could do && (and of course you can't, except in gcc), so I had to look it up. The manual says this:
    Quote Originally Posted by GCC Manual 3.4.6
    You may not use this mechanism to jump to code in a different function. If you do that, totally unpredictable things will happen. The best way to avoid this is to store the label address only in automatic variables and never pass it as an argument.
    (Yes, old manual, but it says the same in 4.3.2, so there.) Apparently in this circumstance "totally unpredictable" = "segfault".
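
    For contrast, here is a small sketch of the use the manual does sanction: label addresses kept in an automatic array and jumped to only from inside the same function. It needs GCC or a compatible compiler, and the function `run_ops` and its tiny "instruction set" ('i' increments, 'd' doubles) are made up for this example:

```c
/* Requires GCC or a compatible compiler: &&label and goto * are extensions. */
int run_ops(const char *ops, int x)
{
    /* Label addresses live in an automatic array and never leave this function. */
    void *dispatch[3] = { &&op_end, &&op_inc, &&op_dbl };

/* Pick the next label: 'i' -> op_inc, 'd' -> op_dbl, anything else -> op_end. */
#define NEXT() goto *dispatch[*ops == 'i' ? 1 : (*ops == 'd' ? 2 : 0)]

    NEXT();
op_inc:
    x += 1;
    ops++;
    NEXT();
op_dbl:
    x *= 2;
    ops++;
    NEXT();
op_end:
    return x;
#undef NEXT
}
```

    The label values are only used for jumps within `run_ops`, which is quite different from using them to measure or copy the machine code of a function.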

  7. #7
    Registered User
    Join Date
    Jul 2008
    Posts
    29
    slingerland3g, you are right. I changed the size to 8000, which is way too large, and now it at least prints "About to call p". What should size be set to? I tried to use the labels to measure the size of the code, but I guess that doesn't work. Stepping through the instructions, I've determined that the problem is that the call is a relative call: since the copied code is at a different location, the relative offset leads to invalid memory, hence the segfault. I don't know much about the bit representation of these instructions, so does anyone know of a way to fix these relative jumps?

    Thanks,
    chacham15

    Code:
    (gdb) disassemble mecallp
    Dump of assembler code for function mecallp:
    0x004010e2 <mecallp+0>: push   %ebp
    0x004010e3 <mecallp+1>: mov    %esp,%ebp
    0x004010e5 <mecallp+3>: sub    $0x8,%esp
    0x004010e8 <mecallp+6>: movl   $0x40111b,0x403050
    0x004010f2 <mecallp+16>:        movl   $0x4010fc,0x403030
    0x004010fc <mecallp+26>:        movl   $0x402009,(%esp)
    0x00401103 <mecallp+33>:        call   0x40128c <printf>
    0x00401108 <mecallp+38>:        call   0x4010ce <p>
    0x0040110d <mecallp+43>:        movl   $0x40201b,(%esp)
    0x00401114 <mecallp+50>:        call   0x40128c <printf>
    0x00401119 <mecallp+55>:        leave
    0x0040111a <mecallp+56>:        ret
    End of assembler dump.
    The thing that I don't get, though, is this: if call is a relative jump, why did the printf work?

  8. #8
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Calls in x86 are relative to the current location, so the call to printf and p would call a "random" location [the same distance from the malloc'd memory as your mecallp is from the respective functions]. Although it may be that printf is called in a different way - check the exact binary code of each call instruction.
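
    To make the arithmetic concrete, here is a sketch that assumes the 5-byte `call rel32` encoding (opcode E8 followed by a 32-bit displacement measured from the end of the instruction). The helper names are hypothetical; the addresses in the usage note come from the gdb dump in post #7, except the buffer address, which is invented:

```c
#include <stdint.h>

/* Length of an x86 "call rel32" instruction: 1 opcode byte + 4 displacement bytes. */
#define CALL_REL32_LEN 5u

/* Displacement that must be encoded so a call at call_addr reaches target. */
int32_t call_rel32_disp(uint32_t call_addr, uint32_t target)
{
    return (int32_t)(target - (call_addr + CALL_REL32_LEN));
}

/* Address actually reached by a call at call_addr with displacement disp. */
uint32_t call_rel32_target(uint32_t call_addr, int32_t disp)
{
    return call_addr + CALL_REL32_LEN + (uint32_t)disp;
}

/* New displacement that keeps the same target after the call is moved. */
int32_t relocate_rel32_disp(int32_t disp, uint32_t old_addr, uint32_t new_addr)
{
    return (int32_t)((uint32_t)disp + old_addr - new_addr);
}
```

    With the call to p at 0x00401108 and p at 0x004010ce, the encoded displacement is -0x3f. If the function were copied so that the same call sat at, say, 0x00a00026, the unmodified displacement would land at 0x009fffec, which is not p. Rewriting the four displacement bytes with the value from `relocate_rel32_disp` is one way the copy could be made to reach the original target again.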

    Also, if your system supports NX (No Execute) memory protection, memory returned by malloc() will not be executable. You need some sort of system-specific memory allocation for that (one where you can specify that you want executable memory).
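
    As one example of such a system-specific allocation, this sketch assumes a POSIX system with `mmap` and `mprotect` (on Windows, VirtualAlloc plays a similar role). The helper names `alloc_exec_copy` and `free_exec` are invented here:

```c
#define _DEFAULT_SOURCE   /* asks glibc to expose MAP_ANONYMOUS in strict modes */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Copy len bytes of machine code into fresh memory and make it executable.
 * Returns NULL on failure. The mapping is writable only while copying. */
void *alloc_exec_copy(const void *code, size_t len)
{
    void *mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return NULL;

    memcpy(mem, code, len);

    /* Switch from read/write to read/execute once the bytes are in place. */
    if (mprotect(mem, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, len);
        return NULL;
    }
    return mem;
}

/* Release a mapping obtained from alloc_exec_copy. */
void free_exec(void *mem, size_t len)
{
    if (mem != NULL)
        munmap(mem, len);
}
```

    This only addresses the NX problem. The copied bytes still have to be valid at their new address, so the relative-call issue described above remains.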

    As to the right solution for your problem, I don't think you are on the right path at all: you will get better performance if you write several functions that do almost the same thing but skip a few things when they are not necessary [I presume you KNOW under which circumstances you need which functionality, otherwise you would not be able to patch out those function calls in your modified code either].


    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  9. #9
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    This "works" with cygwin, but YMMV for the reasons matsp states.

    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    
    #define USE_PRINTF  pic_printf
    int (*pic_printf)(const char *p, ... );
    
    void adder ( void **start, void **end ) {
        if ( start ) {
            *start = &&labstart;
            *end = &&labend;
        } else {
    labstart:
            USE_PRINTF("In adder\n");
    labend:;
        }
    }
    void subber ( void **start, void **end ) {
        if ( start ) {
            *start = &&labstart;
            *end = &&labend;
        } else {
    labstart:
            USE_PRINTF("In subber\n");
    labend:;
        }
    }
    
    size_t  calcSize ( void *start, void *end ) {
        unsigned char *s = start;
        unsigned char *e = end;
        return e - s;
    }
    
    void copyCode ( char *buff, size_t *dest_offset, size_t dest_size, void *from, size_t from_size ) {
        if ( *dest_offset + from_size <= dest_size ) {
            memcpy( buff + *dest_offset, from, from_size );
            *dest_offset += from_size;
        } else {
            fprintf( stderr, "oops: Offset=%lu, dest_size=%lu, copy_size=%lu\n",
                (unsigned long)*dest_offset, (unsigned long)dest_size, (unsigned long)from_size );
        }
    }
    
    void dummy ( void ) {
    }
    #define DUMMY_SIZE      5
    #define PROLOGUE_OFFSET 0
    #define PROLOGUE_LENGTH 3
    #define EPILOGUE_OFFSET 3
    #define EPILOGUE_LENGTH 2
    #define DUMMY_PTR(x)    (void*)(((unsigned char *)dummy)+(x))
    
    int main ( ) {
        struct {
            void    *start;
            void    *end;
            size_t  size;
        } info[2];
        char    *buff;
        size_t  buff_offset = 0;
        size_t  buff_size;
        void    (*fnp)(void);
        
        pic_printf = printf;
        
        adder( &info[0].start, &info[0].end );
        info[0].size = calcSize( info[0].start, info[0].end );
        subber( &info[1].start, &info[1].end );
        info[1].size = calcSize( info[1].start, info[1].end );
    
        buff_size = info[0].size * 2 + info[1].size + DUMMY_SIZE;    /* add, sub, add */
        buff = malloc( buff_size );
    
        copyCode ( buff, &buff_offset, buff_size, DUMMY_PTR(PROLOGUE_OFFSET), PROLOGUE_LENGTH );
        copyCode ( buff, &buff_offset, buff_size, info[0].start, info[0].size );
        copyCode ( buff, &buff_offset, buff_size, info[1].start, info[1].size );
        copyCode ( buff, &buff_offset, buff_size, info[0].start, info[0].size );
        copyCode ( buff, &buff_offset, buff_size, DUMMY_PTR(EPILOGUE_OFFSET), EPILOGUE_LENGTH );
    
        fnp = (void(*)(void))buff;
        fnp();
        free( buff );
    
        return 0;
    }
    In order to call the thing from C, you need to have the standard prolog/epilog code copied as well.

    Further, you could probably have local variables, but ALL your patched in code would need the same set. Plus any locals at all change prolog/epilog as well.
    One way would be to use a 'struct' for all locals in each of the stub functions, then in dummy have a union of all those structs to get the worst-case size out of it.

    Also demonstrated is pic_printf, which helps fix the "relative addressing" problem.
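
    The struct/union idea for locals mentioned above could look roughly like this. The struct names and members are placeholders, since the thread never says what locals the real stubs would need:

```c
#include <stddef.h>

/* Hypothetical locals for each stub, gathered into one struct per stub. */
struct adder_locals  { int lhs; int rhs; };
struct subber_locals { double scale; char tag[24]; };

/* The union is as large as its largest member, so a single object of this
 * type in dummy() would reserve worst-case stack space for any stub. */
union all_locals {
    struct adder_locals  adder;
    struct subber_locals subber;
};

size_t worst_case_locals_size(void)
{
    return sizeof(union all_locals);
}
```

    Whether the compiler then emits the same prologue and epilogue for dummy and for every stub is not guaranteed, so the offsets and lengths would still have to be checked against the generated code.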
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  10. #10
    Registered User
    Join Date
    Jul 2008
    Posts
    26
    Using labels to find the start and end address of a function is very bad practice, and is not compatible with all compilers and OSs. For example, what happens if another compiler decides that part of another function can be executed by just jumping to a specific place in the function you have just modified? This is a usual compiler size optimisation, and in your case, it could cause strange bugs.

  11. #11
    Registered User
    Join Date
    Jul 2008
    Posts
    29
    Thanks Salem. matsp, the reason that I don't want to write 8 different functions (although you're right that I could) is maintainability. For each new switch, the number of necessary functions doubles. Then if a change needs to be made... I pity the person who has to go through thousands of nearly identical functions. Salem, is there any way to simply replace the relative jumps with absolute jumps? Also, if I were interested in creating a patch for an executable, are there any guides about this?

    Thanks!
    chacham15

  12. #12
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    I think this attempt at low-level optimisation is just plain wrong.
    There are ways of writing code that allows for all the genericity that you need without the speed hits of doing redundant stuff.
    I strongly suggest googling for "Policy-based design".

    The last thing you should be doing is resorting to the extremes you're embarking on at the moment. Do you not understand how incredibly risky what you're doing is? One false move and bam, your app is dead. Not just now, but for every tiny little change in that area that is made later. It's a maintenance nightmare!
    Not to mention, it won't run at all on Vista unless you do what's required to satisfy DEP.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  13. #13
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Aside from the above suggestions, which are all good....

    OK, so if having multiple versions of the function is a nightmare, how is it made easier by having code that needs to know exactly what the function looks like and then patches it in various ways to remove calls? To me, that IS a maintenance nightmare.

    It wouldn't be difficult to use multiple-compilation of the same source to achieve variants of a function:
    Code:
    int funcVar1()
    {
    #define OPTION1
    #include "myfunc.h"
    #undef OPTION1
    }
    
    int funcVar2()
    {
    #define OPTION2
    #include "myfunc.h"
    #undef OPTION2
    }
    
    // myfunc.h:
    ... bits of code you always need.
    #ifdef OPTION1
       some code you only want sometimes. 
    #endif
    .... more code that is always used.
    #ifdef OPTION2 
       some other code that only is needed occasionally. 
    #endif
    
    #if defined(OPTION1) || defined(OPTION2)
      ... Some code that is needed in option1 or 2. 
    #else
       ... Some other code when not Opt1 or Opt2. 
    #endif
    ...
    Of course, your option variability could be named differently, and you can use integer values if you prefer, or even bitwise flags, e.g.:
    Code:
    #if FLAGS & 3 
     ... some code
    #else
      .. code you want if both of flag bits 0 and 1 are not set. 
    #endif
    Now you can just choose which variant you want from your main code, and you get more optimal code than by patching out calls [even though NOPs pretty much "do nothing", they still take up space in the pipeline up to the point where the processor decides that it's a NOP and can be discarded], and it's at least somewhat maintainable even if you move to another compiler, OS, or processor architecture.
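
    A single-file variation on the same idea, offered only as a sketch: instead of including a header several times, a macro stamps out one function per flag combination, and the variant is picked once at run time. `OPT_A`, `OPT_B`, `DEFINE_CORE`, `select_core` and the loop body are all invented stand-ins for the real core:

```c
#define OPT_A 1u
#define OPT_B 2u

/* One source body, compiled once per flag combination. FLAGS is a constant
 * in each expansion, so the compiler can drop the untaken branches. */
#define DEFINE_CORE(name, FLAGS)                                   \
    static long name(long n)                                       \
    {                                                              \
        long acc = 0;                                              \
        long i;                                                    \
        for (i = 0; i < n; ++i) {                                  \
            acc += i;                 /* always needed */          \
            if ((FLAGS) & OPT_A)      /* compile-time constant */  \
                acc += 1;                                          \
            if ((FLAGS) & OPT_B)                                   \
                acc += 2 * i;                                      \
        }                                                          \
        return acc;                                                \
    }

DEFINE_CORE(core_none, 0u)
DEFINE_CORE(core_a,    OPT_A)
DEFINE_CORE(core_b,    OPT_B)
DEFINE_CORE(core_ab,   OPT_A | OPT_B)

typedef long (*core_fn)(long);

/* Indexed by the flag bits, so adding a switch doubles the table, not the source. */
static const core_fn core_variants[4] = { core_none, core_a, core_b, core_ab };

/* Choose a variant once; it can be re-chosen whenever the switches change. */
core_fn select_core(unsigned flags)
{
    return core_variants[flags & 3u];
}
```

    The source is written once, so a change to the loop body is made in one place even though the number of compiled variants doubles with each new switch.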

    --
    Mats

  14. #14
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    Another possibility is that in many cases you perform a test inside a loop and you know in advance that this test will always give a certain result. In that case you should probably look into some loop unswitching.
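
    A small illustration of loop unswitching done by hand, assuming the test depends only on a value that does not change inside the loop. The functions and the `add_bias` flag are made up for the example:

```c
#include <stddef.h>

/* Before: the loop-invariant test is evaluated on every iteration. */
long sum_test_in_loop(const int *v, size_t n, int add_bias)
{
    long sum = 0;
    size_t i;
    for (i = 0; i < n; ++i) {
        sum += v[i];
        if (add_bias)
            sum += 1;
    }
    return sum;
}

/* After: the test is hoisted out, leaving two specialised loops. */
long sum_unswitched(const int *v, size_t n, int add_bias)
{
    long sum = 0;
    size_t i;
    if (add_bias) {
        for (i = 0; i < n; ++i)
            sum += v[i] + 1;
    } else {
        for (i = 0; i < n; ++i)
            sum += v[i];
    }
    return sum;
}
```

    Optimising compilers can often perform this transformation themselves at higher optimisation levels, so it is worth checking the generated code before restructuring by hand.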

    By all means there is some tiny chance that what you're doing is the best approach. I highly doubt it, but you're in a better position to know than we are. However if you don't know about other techniques mentioned by posters here, then even you cannot be sure that you're going down the right path. You owe it to yourself to seek out and find the best solution.
    Last edited by iMalc; 09-07-2008 at 02:45 AM.

  15. #15
    Registered User
    Join Date
    Jul 2008
    Posts
    29
    Quote Originally Posted by iMalc
    I think this attempt at low-level optimisation is just plain wrong.
    There are ways of writing code that allows for all the genericity that you need without the speed hits of doing redundant stuff.
    I strongly suggest googling for "Policy-based design".

    The last thing you should be doing is resorting to the extremes you're embarking on at the moment. Do you not understand how incredibly risky what you're doing is? One false move and bam, your app is dead. Not just now, but for every tiny little change in that area that is made later. It's a maintenance nightmare!
    Not to mention, it won't run at all on Vista unless you do what's required to satisfy DEP.
    Oddly enough, it works on Vista (maybe that's because UAC is off). All that I'm doing is overwriting the function call with NOPs. Those other methods don't work for me because they A. require templating, and B. do not allow me to turn a function on or off dynamically.
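
    For reference, the byte-level operation being described might be sketched like this. It assumes 32-bit x86, where a near relative call is the opcode E8 plus four displacement bytes and a NOP is the single byte 90. The helper `nop_out_call` is hypothetical and works on an ordinary byte buffer; it does not execute anything:

```c
#include <stddef.h>
#include <string.h>

#define X86_CALL_REL32     0xE8  /* opcode of "call rel32" */
#define X86_NOP            0x90  /* single-byte NOP */
#define X86_CALL_REL32_LEN 5     /* opcode + 32-bit displacement */

/* Overwrite the call instruction at code[offset] with NOPs.
 * Returns 0 on success, -1 if the range is too short or no call is there. */
int nop_out_call(unsigned char *code, size_t len, size_t offset)
{
    if (offset > len || len - offset < X86_CALL_REL32_LEN)
        return -1;
    if (code[offset] != X86_CALL_REL32)
        return -1;
    memset(code + offset, X86_NOP, X86_CALL_REL32_LEN);
    return 0;
}
```

    Turning the call back on would mean saving the five original bytes before overwriting them and restoring them later, and the memory holding the code has to be writable at the moment of patching.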
