Thread: load user-mode dll into kernel-mode "process"

  1. #1
    Unregistered User Yarin's Avatar
    Join Date
    Jul 2007
    Posts
    2,158

    load user-mode dll into kernel-mode "process"

    Is there a way that I can load a user mode DLL (such as kernel32) into the address space of my device driver running in kernel mode?

    Unfortunately, there is no ZwLoadLibrary().

  2. #2
    Registered User
    Join Date
    Mar 2005
    Location
    Mountaintop, Pa
    Posts
    1,058
    I'm not sure if it's possible to call a userland DLL from the kernel as indicated by this MS link.

  3. #3
    'Allo, 'Allo, Allo
    Join Date
    Apr 2008
    Posts
    639
    No, not in any practical sense. Besides, the kernel's GetProcAddress equivalent (MmGetSystemRoutineAddress) only covers HAL and NTOSKRNL exports. The donkey work required is a project unto itself, although it would probably be an excellent way to get accustomed to !analyze -v.

    Just create a buddy program which does all the user work and talks to the driver via IOCTLs. It's probably not the only way but it's surely the easiest. Oh, and I'm not talking about calling GetProcAddress and sending the result to the driver, that's almost definitely a one-way ticket to BSOD city.
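    To make the buddy-program idea a bit more concrete, here is a rough sketch of the contract the helper and the driver would share: one control code and one message struct. The bit layout follows the CTL_CODE macro from the Windows headers; the function number 0x800, the struct, and every sketch_ / SKETCH_ name are made up for illustration, and no real driver call is made.

    ```c
    #include <assert.h>
    #include <stdint.h>
    #include <string.h>

    /* Same bit layout as the CTL_CODE macro in the Windows headers:
       device type in bits 31..16, access in 15..14,
       function in 13..2, transfer method in 1..0. */
    #define SKETCH_CTL_CODE(dev, fn, method, access) \
        (((uint32_t)(dev) << 16) | ((uint32_t)(access) << 14) | \
         ((uint32_t)(fn) << 2) | (uint32_t)(method))

    #define SKETCH_FILE_DEVICE_UNKNOWN 0x22u /* numeric value of FILE_DEVICE_UNKNOWN */
    #define SKETCH_METHOD_BUFFERED     0u    /* numeric value of METHOD_BUFFERED */
    #define SKETCH_FILE_ANY_ACCESS     0u    /* numeric value of FILE_ANY_ACCESS */

    /* Hypothetical control code; custom function numbers start at 0x800. */
    #define IOCTL_SKETCH_DO_USER_WORK \
        SKETCH_CTL_CODE(SKETCH_FILE_DEVICE_UNKNOWN, 0x800u, \
                        SKETCH_METHOD_BUFFERED, SKETCH_FILE_ANY_ACCESS)

    /* Hypothetical message layout that both sides agree on. */
    typedef struct {
        uint32_t request_id;
        char     path[64];
    } sketch_request;

    /* Fill in a request the way the helper would before handing it over. */
    static sketch_request sketch_make_request(uint32_t id, const char *path)
    {
        sketch_request r;
        assert(path != NULL);
        memset(&r, 0, sizeof r);                  /* also guarantees NUL termination */
        r.request_id = id;
        strncpy(r.path, path, sizeof r.path - 1); /* truncate over-long paths */
        return r;
    }
    ```

    On Windows the helper would open the device with CreateFile and pass the code and struct to DeviceIoControl; the driver would see the request in its IRP_MJ_DEVICE_CONTROL dispatch routine.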

  4. #4
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    You cannot call user-mode code from kernel-mode drivers. You need a service process that acts as the conduit from user mode to kernel mode and vice versa.

    It is technically possible, on x86, to write code that can run in both user mode and kernel mode, but there are LOTS of obstacles in the way if you haven't got full control over the whole OS (such as memory mapping, segment register setup, etc). For example, kernel-mode code may expect the memory it touches to be locked into physical memory - there is no way in Windows to say "lock the user-mode code so that it doesn't get swapped out". For all you know, the user-mode process may not even be active any longer when you try to call it, if your kernel driver is in any way asynchronous.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  5. #5
    Unregistered User Yarin's Avatar
    Join Date
    Jul 2007
    Posts
    2,158
    I'm not surprised, device drivers are a pain to write, why not add to it?

    Thanks.

  6. #6
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by Yarin View Post
    I'm not surprised, device drivers are a pain to write, why not add to it?

    Thanks.
    Well, really there is a boundary between user and kernel mode, and that is there for a reason. Allowing kernel code to call back into user-code would make a hole big enough to run several buses side-by-side through into the kernel. What if your callback into the DLL leads to your code being allowed to modify the kernel content - then there is absolutely no protection against that sort of thing AT ALL.

    --
    Mats

  7. #7
    Unregistered User Yarin's Avatar
    Join Date
    Jul 2007
    Posts
    2,158
    Quote Originally Posted by matsp View Post
    Well, really there is a boundary between user and kernel mode, and that is there for a reason. Allowing kernel code to call back into user-code would make a hole big enough to run several buses side-by-side through into the kernel. What if your callback into the DLL leads to your code being allowed to modify the kernel content - then there is absolutely no protection against that sort of thing AT ALL.
    Makes sense.
    But...
    Recently I made a custom PE loader, and when testing it I noticed that kernel32 imports from ntdll. So, how does a user-mode DLL make kernel-mode function calls?

  8. #8
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by Yarin View Post
    Makes sense.
    But...
    Recently I made a custom PE loader, and when testing it I noticed that kernel32 imports from ntdll. So, how does a user-mode DLL make kernel-mode function calls?
    Through any one of the standard mechanisms for doing that. Software interrupts, call gates, task gates, etc.
    Code:
    //try
    //{
    	if (a) do { f( b); } while(1);
    	else   do { f(!b); } while(1);
    //}

  9. #9
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Windows uses INT 2Eh for its system calls in the standard version of NTDLL.DLL. There are, as brewbuck says, several ways to solve this problem on x86:
    - Call gates - a descriptor in the GDT (or possibly the LDT) holds the address the system calls go to.
    - Interrupt gates - an entry in the IDT holds the entry point for the system calls.
    - Task gates - similar to call gates/interrupt gates, but the CPU will save the task state. I'm not aware of any OS that uses task gates except for very special things [stack/double fault handling in Linux and Windows uses a task gate to get a new stack pointer so that the OS can handle running out of stack or some other "catastrophic" error like that].
    - SYSCALL/SYSENTER - special model-specific registers (MSRs) hold the system call entry points.
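
    Whichever of these mechanisms gets you across the boundary, the kernel side generally ends up doing the same thing: take a service number from the caller, check it, and index a table of handlers. A minimal sketch of that step in plain C follows. It is only a simulation (no privilege transition happens), and all the names and the two toy services are invented for the example.

    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical service signature: one integer in, one integer out. */
    typedef int32_t (*sketch_service_fn)(int32_t arg);

    static int32_t sketch_svc_double(int32_t arg) { return arg * 2; }
    static int32_t sketch_svc_negate(int32_t arg) { return -arg; }

    /* The table the "kernel" indexes with the caller-supplied service number. */
    static const sketch_service_fn sketch_service_table[] = {
        sketch_svc_double, /* service 0 */
        sketch_svc_negate, /* service 1 */
    };

    #define SKETCH_SERVICE_COUNT \
        (sizeof sketch_service_table / sizeof sketch_service_table[0])
    #define SKETCH_BAD_SERVICE INT32_MIN

    /* Stands in for the entry point that the gate descriptor or MSR points at.
       The service number comes from untrusted code, so it is checked first. */
    static int32_t sketch_kernel_entry(uint32_t service_number, int32_t arg)
    {
        if (service_number >= SKETCH_SERVICE_COUNT)
            return SKETCH_BAD_SERVICE;
        return sketch_service_table[service_number](arg);
    }
    ```

    The bounds check is the important part: the caller only ever chooses an index, never an address to jump to.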

    --
    Mats

  10. #10
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by matsp View Post
    Windows uses INT 2Eh for its system calls in the standard version of NTDLL.DLL. There are, as brewbuck says, several ways to solve this problem on x86:
    - Call gates - a descriptor in the GDT (or possibly the LDT) holds the address the system calls go to.
    - Interrupt gates - an entry in the IDT holds the entry point for the system calls.
    - Task gates - similar to call gates/interrupt gates, but the CPU will save the task state. I'm not aware of any OS that uses task gates except for very special things [stack/double fault handling in Linux and Windows uses a task gate to get a new stack pointer so that the OS can handle running out of stack or some other "catastrophic" error like that].
    - SYSCALL/SYSENTER - special model-specific registers (MSRs) hold the system call entry points.

    --
    Mats
    One method which I haven't seen used (probably because it's inefficient) is to simply have usermode call directly to a kernel address -- the resulting fault triggers a lookup of the faulting address in a table. If the address corresponds to the entry point of a kernel API, the kernel twiddles the CPL in the TSS and then restarts the call, also replacing the return address on the stack with the address of a trampoline which simply does the inverse in order to return control back to userspace.

    Sounds sick, but from userspace it looks like you're simply making a call directly to kernelspace.

  11. #11
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by brewbuck View Post
    One method which I haven't seen used (probably because it's inefficient) is to simply have usermode call directly to a kernel address -- the resulting fault triggers a lookup of the faulting address in a table. If the address corresponds to the entry point of a kernel API, the kernel twiddles the CPL in the TSS and then restarts the call, also replacing the return address on the stack with the address of a trampoline which simply does the inverse in order to return control back to userspace.

    Sounds sick, but from userspace it looks like you're simply making a call directly to kernelspace.
    In theory you could do that, but it is, as you say, inefficient. The lookup is not the expensive part [there's probably some sort of lookup anyway, and if you do it right, you can just do an initial range check (the value must be between x and y) and then a shift/mask operation to get an index into a table]. The real cost is that fault handling is not so nice to (modern) processors:

    • The processor will continue executing several more instructions in the meantime, and then has to "roll back" to the faulting call instruction.
    • Modifying the content of the stack frame is no good for the "return stack optimisation" - this leads to "branch mispredict" handling, and the resulting pipeline flush and refill.
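
    The range check plus shift/mask lookup mentioned in the brackets above could look roughly like this. The base address, the 16-byte slot size and the slot count are assumptions picked purely for the example, as are all the names; this is not how any real kernel lays out its entry points.

    ```c
    #include <assert.h>
    #include <stdint.h>

    #define SKETCH_API_BASE   0x80001000u /* hypothetical start of the entry-point region */
    #define SKETCH_SLOT_SHIFT 4u          /* hypothetical: one entry point every 16 bytes */
    #define SKETCH_SLOT_COUNT 256u        /* hypothetical number of entry points */
    #define SKETCH_API_END \
        (SKETCH_API_BASE + (SKETCH_SLOT_COUNT << SKETCH_SLOT_SHIFT))
    #define SKETCH_SLOT_MASK  ((1u << SKETCH_SLOT_SHIFT) - 1u)

    /* Map a faulting address to a table index, or return -1 if the address
       is not the start of a known entry point. */
    static int sketch_fault_to_index(uint32_t fault_addr)
    {
        uint32_t offset;

        /* Initial range check: the value must be between x and y. */
        if (fault_addr < SKETCH_API_BASE || fault_addr >= SKETCH_API_END)
            return -1;

        offset = fault_addr - SKETCH_API_BASE;

        /* Mask: reject addresses that land in the middle of a slot. */
        if ((offset & SKETCH_SLOT_MASK) != 0u)
            return -1;

        /* Shift: turn the byte offset into a slot number. */
        return (int)(offset >> SKETCH_SLOT_SHIFT);
    }
    ```

    So the lookup itself is two compares, a mask and a shift; the fault that gets you there is what hurts.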


    --
    Mats

    Last Post: 06-24-2002, 10:14 PM