Thread: a library or API providing direct access to a running PE address space

  1. #1
    Banned
    Join Date
    May 2009
    Posts
    37

    a library or API providing direct access to a running PE address space

    hi, i'm trying to come up with a method (either through another executable or a dll) by which the contents of an already running windows executable -- the stack, heaps, handlers, import pages, raw data, references to virtual address space, basically the whole contents of a PE image -- can be copied.

    after this, i would like to write this data into another instance of the same executable, effectively emulating a fork().

    please give me any leads you can, even if they don't provide a complete way of doing this. small bits and clues will be much appreciated.

  2. #2
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Accessing the loaded executable is the least of your problems - if your code is part of the executable itself, there should be little problem doing that.

    The real problem is with DLLs and the OS itself.

    The first problem is that you need to create a second process, and you can (probably) do this by NtCreateProcess, but you then have to do a lot of the work that CreateProcess normally does.

    The next problem is that you either duplicate ALL the information from one process to the other, or you need to handle "copy-on-write" behaviour.

    Are you planning to do this just for fun, or do you have code that needs fork? It is likely better to rewrite the code that does fork, so that it uses either spawn or threads.

    Here's a note from the Cygwin implementation: Cygwin API Questions
    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  3. #3
    Banned
    Join Date
    May 2009
    Posts
    37
    yes, i am quite aware of the fork emulation in msys and cygwin.

    msys simply calls a dummy fork which tags the underlying function to use spawn when exec is finally called. the caveat is that this assumes exec will be called sometime after fork. what if, for some silly reason, the programmer calls fork just to split the process in two, creating semi-thread behavior, and exec is never called?
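    To make that scenario concrete, here is a minimal POSIX sketch (it will not build against the plain Windows API) of a fork that is never followed by exec. The function name fork_without_exec and the values 10 and 15 are made up for illustration. The child works on its own copy of the parent's data, which is exactly the behavior a spawn-on-exec emulation cannot reproduce.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: fork and never call exec. The child changes its own
 * copy of `counter` and reports it through its exit status; the parent's
 * copy must stay untouched.
 * Returns 1 if both hold, 0 if not, -1 if fork or waitpid failed. */
int fork_without_exec(void)
{
    int counter = 10;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        counter += 5;       /* modifies only the child's copy */
        _exit(counter);     /* _exit skips atexit handlers and stdio flushing */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    /* Child saw 10 and made it 15; the parent still sees 10. */
    return WIFEXITED(status) && WEXITSTATUS(status) == 15 && counter == 10;
}
```

    On a POSIX system the function should return 1. Any Windows emulation has to give the child that same private snapshot of the parent's stack and heap at the moment of the call.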

    the cygwin emulation, on the other hand, is more accurate, though far from perfect. since every cygwin executable sits on top of the posix api dll, all memory allocations, heaps and stacks are accounted for and easily copied from one instance to another. to get the child to the proper point of execution, the parent saves its context with setjmp() and the child longjmp()s to it. but i still believe this is a very crude way of doing it. it is no secret that fork() on cygwin can sometimes cause unexpected and unexplained crashes. i believe that's because, although cygwin copies the higher-level data, it doesn't necessarily copy the lower-level data that underlies the PE image -- the proper references to virtual addresses, the debug information, the loaded/unloaded import/export tables, and so on.

    so i was wondering if i can do this at a lower level. i've been doing some research, but more and more i lose hope of finding a clean solution -- the windows api simply doesn't provide this sort of access. i thought i'd found a way with the "ImageHlp" family of functions, but that was a dead end too. if all else fails i'll probably resort to writing a driver (if that can actually yield a solution).

    edit:
    though i must admit, i'd be content to omit the "copy-on-write" behavior and settle for a version of fork as it was in the earlier *nixes.
    Last edited by renzokuken01; 05-22-2009 at 05:15 PM.

  4. #4
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Before I wrote the answer, I did some searching, and it seems like there is no "good" solution for this.

    The guys at Cygwin did not choose to solve the problem the way they did because they were lazy or stupid. They chose that solution because it is the nearest you can get without making A LOT of undocumented system calls to do, more or less, what CreateProcess does, but without loading an executable.

    The simple fact of the matter is that Windows wasn't really designed to do this... If the guys at MS wanted us to do fork, they would have implemented a DuplicateProcess system call that does the relevant work.

    --
    Mats

  5. #5
    Banned
    Join Date
    May 2009
    Posts
    37
    yes, i did realize that. they are quite capable implementors. my belief that there is a "cleaner" solution rests on the fact that microsoft is a BIG corporation. what i mean is that big corporations can be very, very clumsy. they can't really cover all bases in a practical enough manner. sooner or later, i thought, they would ship some mundane api, intended for some other purpose, that can be hacked to do what i need. this was the main premise of my logic, so i've been scouring different places to find it.

    it seems this is one of those times when my intuition has completely failed me... i must admit. and it probably isn't very wise of me to be posting this, as there is a danger that somebody in redmond might find it and warn the top designers.

  6. #6
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by renzokuken01
        yes, i did realize that. they are quite capable implementors. my belief that there is a "cleaner" solution rests on the fact that microsoft is a BIG corporation. what i mean is that big corporations can be very, very clumsy. they can't really cover all bases in a practical enough manner. sooner or later, i thought, they would ship some mundane api, intended for some other purpose, that can be hacked to do what i need. this was the main premise of my logic, so i've been scouring different places to find it.
    MS may be a big organization, but they also have some pretty darn clever people working there.

    Quote Originally Posted by renzokuken01
        it seems this is one of those times when my intuition has completely failed me... i must admit. and it probably isn't very wise of me to be posting this, as there is a danger that somebody in redmond might find it and warn the top designers.
    I don't see why they would bother to worry about it. It's not like fork() is a competitive advantage in itself. It was a clever way to duplicate some of the in-process information from one process to another when Unix was first invented, but for most intents and purposes, spawn() and CreateProcess() do a more appropriate job.

    --
    Mats

  7. #7
    Banned
    Join Date
    May 2009
    Posts
    37
    mats, can you help me out here..? i was just about to give up on this route when i checked msdn once again, and it seems i overlooked 2 nifty functions under DbgHelp: ImageDirectoryEntryToDataEx and ImageRvaToVa. they seem to provide the access i needed all along.

    can you check it out for me? whenever i compile a test run in mingw or visual studio '08, the build always spits out "undefined reference" or "unresolved external symbol" errors. can you tell me why that is? am i missing an sdk package?

    edit:

    at this point, it seems to me that retrieving the PE address space information can be done quite sufficiently (although awkwardly) through the "DbgHelp" API. but i still have the problem of writing the data into the new process. i could crudely use something like memcpy() or WriteProcessMemory(), but that would of course be a hack, and subject to changes in the PE specification. i need an actual library for writing back into the PE address space, specifically the header space. if anybody has a systematic way of doing this, pls pm or reply to me...
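    For reference, the header walk behind a data directory lookup can be sketched in portable C without DbgHelp. This is an illustration under stated assumptions, not the DbgHelp implementation: it takes a raw image in a byte buffer, the helper names rd16, rd32 and pe_data_directory are hypothetical, and the offsets are the ones laid out in the PE/COFF format (e_lfanew at 0x3C, a 20-byte COFF header, the directory table at offset 96 of a PE32 optional header or 112 of a PE32+ one).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Little-endian readers; PE header fields are stored little-endian. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: locate data directory `index` (1 = import table)
 * in the PE image held in buf[0..len). On success stores the directory's
 * RVA and size and returns 0; returns -1 if the image is malformed or
 * too short. */
int pe_data_directory(const uint8_t *buf, size_t len, unsigned index,
                      uint32_t *rva, uint32_t *size)
{
    if (len < 0x40 || buf[0] != 'M' || buf[1] != 'Z')
        return -1;                                /* no DOS header */

    size_t pe = rd32(buf + 0x3C);                 /* e_lfanew */
    if (pe > len || len - pe < 4 + 20 + 2 ||
        memcmp(buf + pe, "PE\0\0", 4) != 0)
        return -1;                                /* no PE signature */

    size_t opt = pe + 4 + 20;                     /* signature + COFF header */
    uint16_t magic = rd16(buf + opt);
    size_t count_off, dirs_off;
    if (magic == 0x10B) {                         /* PE32 */
        count_off = 92;
        dirs_off = 96;
    } else if (magic == 0x20B) {                  /* PE32+ */
        count_off = 108;
        dirs_off = 112;
    } else {
        return -1;
    }
    if (len - opt < dirs_off)
        return -1;                                /* optional header truncated */

    uint32_t count = rd32(buf + opt + count_off); /* NumberOfRvaAndSizes */
    if (index >= count || (len - opt - dirs_off) / 8 <= index)
        return -1;                                /* entry missing or cut off */

    const uint8_t *entry = buf + opt + dirs_off + 8 * (size_t)index;
    *rva = rd32(entry);                           /* VirtualAddress */
    *size = rd32(entry + 4);                      /* Size */
    return 0;
}
```

    Calling pe_data_directory(image, image_len, 1, &rva, &size) would yield the import table's RVA and size. Turning that RVA into a pointer still needs the section table for a file on disk, or the module base address for a loaded image.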
    Last edited by renzokuken01; 05-23-2009 at 04:56 PM.

  8. #8
    'Allo, 'Allo, Allo
    Join Date
    Apr 2008
    Posts
    639
    You didn't add the dbghelp library to the linker inputs: add dbghelp.lib to the project's additional dependencies in Visual Studio, or pass -ldbghelp to MinGW. I'm not entirely sure what use poking around the PE headers would be for a fork operation though.

    This is about as close as you can get to native forking on Windows, though as it stands it neither compiles nor fully works on anything other than NT/2000. It'll require a bit of googling, or flicking through the pages of the book, to find the function prototypes and structure declarations if you pursue it. Plus, as Mats said, these are mostly undocumented functions, so you're at the mercy of MS's internal tinkering, but it may be a useful alternate starting point for you.

  9. #9
    Banned
    Join Date
    May 2009
    Posts
    37
    at this point, it seems to me that retrieving the PE address space information can be done quite sufficiently (although awkwardly) through the "DbgHelp" API. but i still have the problem of writing the data into the new process. i could crudely use something like memcpy() or WriteProcessMemory(), but that would of course be a hack, and subject to changes in the PE specification. i need an actual library for writing back into the PE address space, specifically the header space. if anybody has a systematic way of doing this, pls pm or reply to me...

    edit:

    i am the type that basically has to be knocked to his senses... so if i am approaching the problem wrong, feel free to grab the nearest wooden bat and smack me in the head (figuratively speaking, of course).
    Last edited by renzokuken01; 05-23-2009 at 08:44 PM.

  10. #10
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Assuming you are NOT using OS functions to duplicate the process, you will need to use the debug functions, and use VirtualAlloc in the child process to allocate memory in the right places.

    You also need to know which DLLs are loaded, both those that are loaded automatically by the system loader, and those loaded by the process itself.

    Finding from the .exe file which DLLs it is hard-linked against isn't too difficult. Finding which DLLs are loaded isn't too difficult either.

    Note that you need to make sure the DLLs are OPENED/INITIALIZED in the child, and then that any data within the DLLs that was set up by the parent process is set to the same values in the child process. Beware, however, of process IDs and thread IDs stored in that data - you do want those to reflect the child process, not the parent process.

    And that is just a few things that get really messy really quickly.
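    The process ID pitfall can be shown with a short POSIX sketch (again, not Windows code). The variable cached_pid stands in for any library that records the process ID at start-up, and the function name is made up for illustration. After a byte-for-byte copy of the parent's data, the child still holds the parent's ID.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stands in for library state that caches the process ID at start-up. */
static pid_t cached_pid;

/* Hypothetical demo: returns 1 if the child found the copied PID stale
 * (it names the parent, not the child), 0 if not, -1 on error. */
int cached_pid_goes_stale(void)
{
    cached_pid = getpid();          /* recorded in the parent */

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* The child inherited the parent's value unchanged. */
        int stale = cached_pid != getpid() && cached_pid == getppid();
        _exit(stale ? 0 : 1);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

    A hand-rolled duplication on Windows would have to find and patch every such cached value, in the executable and in each loaded DLL, which is part of why the copy gets messy so fast.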

    --
    Mats
