Thread: DLL rebase

  1. #1
    Registered User
    Join Date
    May 2006
    Posts
    1,579

    DLL rebase

    Hello everyone,


    I want to learn what DLL rebasing is. I am currently reading http://msdn2.microsoft.com/en-us/library/ms810432.aspx, but the theory part is too brief for me to understand:

    1. Why do we need to rebase?
    2. When do we need to rebase?
    3. How do we rebase?
    4. What is the performance impact?

    Are there any other documents I can refer to? This is the only one I found on MSDN.


    thanks in advance,
    George

  2. #2
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    By default, every DLL is linked with the same preferred base address for its code and data (0x10000000 for a 32-bit DLL built with the Microsoft linker). So if an application loads two such DLLs, the loader has to move at least one of them to a different location.

    If you give each DLL in your project its own base address, the OS doesn't have to translate the memory addresses in the DLL to a different address at load time. You do this with the linker's /BASE option: http://msdn2.microsoft.com/en-us/lib...8s(VS.71).aspx

    Most, if not all, "standard" (system) DLLs already have distinct base addresses assigned, so they do not collide with one another.

    The performance impact applies only when the DLL is loaded, which normally means at program start. As the original link says, it's often best to reduce the number of DLLs, because there is a fixed overhead for loading each DLL, in addition to a variable, but smaller, overhead per byte of DLL content.
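
    A rough sketch of the "give every DLL its own base address" idea, in Python rather than as a real build script. The DLL names, sizes and the starting address are all made up for illustration; the only real-world facts assumed are that Windows hands out address space in 64 KB units and that the linker takes the result as /BASE:address.

```python
# Illustrative only: DLL names, sizes and the start address are invented.
ALLOC_GRANULARITY = 0x10000  # Windows reserves address space in 64 KB units


def round_up(value, granularity=ALLOC_GRANULARITY):
    """Round value up to the next multiple of granularity."""
    return (value + granularity - 1) // granularity * granularity


def assign_bases(image_sizes, start=0x60000000):
    """Give each DLL its own non-overlapping preferred base address."""
    bases = {}
    next_base = start
    for name, size in image_sizes.items():
        bases[name] = next_base
        next_base += round_up(size)
    return bases


def overlaps(base_a, size_a, base_b, size_b):
    """True if the address ranges [base, base + size) intersect."""
    return base_a < base_b + size_b and base_b < base_a + size_a


sizes = {"engine.dll": 0x4A000, "ui.dll": 0x12345, "net.dll": 0x8000}
bases = assign_bases(sizes)
for name, base in bases.items():
    # Each line is the option you would pass when linking that DLL.
    print(f"{name}: /BASE:0x{base:08X}")
```

    With every DLL left at the same default base, overlaps() is true for any pair; with the assigned bases it is false for every pair, so none of them has to be moved at load time.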

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  3. #3
    Registered User
    Join Date
    May 2006
    Posts
    1,579
    Thanks Mats,


    Two more questions after studying your reply and the MSDN link:

    1.

    The DLL rebasing described in this article only affects DLL loading, and its purpose is to make loading faster. Right?

    2.

    When you say "by default ALL DLL's have the same base-addresses", the address you mean is a virtual address within the process, not a global physical address?

    regards,
    George

  4. #4
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Quote Originally Posted by George2 View Post
    2. "default ALL DLL's have the same base-addresses ", and address you mentioned is process virtual address, not the global physical address?
    Virtual memory. As far as I know, nothing in user mode ever deals with physical memory directly; everything is based on virtual addresses.
    I mean, if all DLLs were loaded at the same place in physical memory, imagine how many DLLs would already be sitting there.
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  5. #5
    Registered User
    Join Date
    May 2006
    Posts
    1,579
    Hi Elysia,


    I think we are only covering a small part of what rebasing is.

    Another point we missed is "fixups"; see the "Fixups" section of this link:

    http://msdn2.microsoft.com/en-us/library/ms810432.aspx

    I do not fully understand it. From my current understanding, rebasing is not just about finding another address at which to load the DLL; the loader has to do something else as well? (The article mentions relocating the DLL while the application is running, and copy-on-write.)

    regards,
    George

  6. #6
    Cat without Hat CornedBee
    Join Date
    Apr 2003
    Posts
    8,895
    When a DLL has to be loaded at an address other than the one it was built for, some of the addresses stored in the DLL's code and data must be modified to match the new location. These modifications are called fixups. Applying them is slow.

    Rebasing rewrites the DLL so that the values at the fixup locations already assume the new base address. Because the DLL probably won't need to be relocated from that new address, the stored values are correct as they are and the fixup pass can be skipped.
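
    To make that concrete, here is a toy model in Python. The "image" layout (a preferred base, some bytes, and a list of offsets that hold 32-bit absolute addresses) is invented for illustration and is not the real PE format; rebase() and fixups_required() are hypothetical helpers, not an actual tool or API.

```python
import struct


def make_image(preferred_base, size, targets):
    """Build a toy image: each slot offset holds the absolute address
    (preferred_base + target offset) as a 32-bit little-endian value."""
    data = bytearray(size)
    for slot, target_offset in targets.items():
        struct.pack_into("<I", data, slot, preferred_base + target_offset)
    return {"base": preferred_base, "data": data, "relocs": sorted(targets)}


def rebase(image, new_base):
    """Rewrite the file image so its stored addresses assume new_base."""
    delta = new_base - image["base"]
    for slot in image["relocs"]:
        (value,) = struct.unpack_from("<I", image["data"], slot)
        struct.pack_into("<I", image["data"], slot, (value + delta) & 0xFFFFFFFF)
    image["base"] = new_base


def fixups_required(image, actual_base):
    """How many slots a loader would have to patch when mapping at actual_base."""
    return 0 if actual_base == image["base"] else len(image["relocs"])


dll = make_image(0x10000000, 0x200, {0x10: 0x100, 0x20: 0x180})
print("fixups before rebase:", fixups_required(dll, 0x62000000))
rebase(dll, 0x62000000)
print("fixups after rebase: ", fixups_required(dll, 0x62000000))
```

    The work is the same either way; rebasing just does it once, ahead of time, instead of on every load.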
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

  7. #7
    Registered User
    Join Date
    May 2006
    Posts
    1,579
    Hi CornedBee,


    1.

    Quote Originally Posted by CornedBee View Post
    When a DLL is moved in memory, some of the DLL code needs to be modified to work with the new address. These modifications are called fixups. This is slow.
    I think you mean this: when something in the DLL (e.g. a function) ends up at a new address, we need to change the code that refers to it so that it uses the new address. So a "fixup" applies to the code that refers to the function, rather than to moving the function itself from the old address to the new one?

    2.

    Quote Originally Posted by CornedBee View Post
    Because the DLL probably won't need to be relocated from the new place, the defaults are correct and the fixup pass can be skipped.
    Does this mean the "fixup" in (1) is applied only to a temporary copy, and the real DLL content, which is built for the default base of 0x10000000, does not change?


    regards,
    George

  8. #8
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Yes. Fixups are where the loader (the component that loads the application and its DLLs, or that services LoadLibrary calls) walks through all the absolute virtual addresses in the DLL (vtables, addresses of variables in code or data initializers, jump tables for switch statements, function pointer assignments, and so on) and replaces them with addresses based on where the DLL was actually loaded. Of course, if you picked a reasonably unique base address for the DLL in the first place, no fixups are needed, because the addresses in the binary file are already correct. That is the whole point of this discussion: no fixups means faster loading, which is what we want!
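
    A minimal sketch of that walk, written in Python for readability. It assumes a made-up image where the fixup locations are simply a list of byte offsets holding 32-bit absolute addresses; apply_fixups() is a hypothetical stand-in for what the loader does, not the real loader code or the real relocation-table format.

```python
import struct


def apply_fixups(file_bytes, reloc_offsets, preferred_base, actual_base):
    """Return the in-memory copy of an image mapped at actual_base.

    reloc_offsets lists where 32-bit absolute addresses are stored.
    """
    memory = bytearray(file_bytes)      # private copy; the file itself is untouched
    delta = actual_base - preferred_base
    if delta == 0:
        return memory                   # addresses already correct: skip the walk
    for offset in reloc_offsets:
        (address,) = struct.unpack_from("<I", memory, offset)
        struct.pack_into("<I", memory, offset, (address + delta) & 0xFFFFFFFF)
    return memory


PREFERRED = 0x10000000
on_disk = bytearray(0x40)
struct.pack_into("<I", on_disk, 0x08, PREFERRED + 0x100)  # e.g. a vtable entry
struct.pack_into("<I", on_disk, 0x0C, PREFERRED + 0x140)  # e.g. a function pointer
relocs = [0x08, 0x0C]

moved = apply_fixups(on_disk, relocs, PREFERRED, 0x10050000)
in_place = apply_fixups(on_disk, relocs, PREFERRED, PREFERRED)
print(hex(struct.unpack_from("<I", moved, 0x08)[0]))
print("moved copy differs from file:", moved != on_disk)
print("in-place copy matches file: ", in_place == on_disk)
```

    Note that only the copy in memory is patched. When the DLL lands at its preferred base, the in-memory bytes stay identical to the file.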

    --
    Mats

  9. #9
    Registered User
    Join Date
    May 2006
    Posts
    1,579
    Thanks Mats,


    Two more questions:

    1. If we use the default load address for our DLLs (0x10000000) and the application depends on more than one DLL, then at load time (application start-up) fixups have to be applied from the second DLL onwards, right? (Only one DLL can occupy 0x10000000.) That means the content of the DLL in memory is changed (addresses are replaced with fixed-up addresses) and differs from the DLL image on disk, right?

    2. When a DLL is built, all the addresses needed in the binary image are hard-coded as absolute addresses, based on the default or assigned base address, right?

    For example, a function at offset 0x100 from the beginning of the DLL would be built with address 0x10000100.

    Previously I had it wrong: I thought that when a DLL is built, all the addresses in the binary are stored as relative addresses, and the exact addresses are determined at load time once a base address has been chosen.

    For example, I thought a function at offset 0x100 from the beginning of the DLL would be built with address 0x100 rather than 0x10000100.

    Any comments about (1) and (2)?

    regards,
    George
    Last edited by George2; 02-03-2008 at 12:45 AM.

  10. #10
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Quote Originally Posted by George2 View Post
    2. When a DLL is built, all the address needed in the binary DLL image is hard-coded based on the default or assigned base address -- as absolute address, right?
    Yes. It wouldn't work otherwise, of course.

    Previously, I think I am wrong, because I think when a DLL is built, all the address needed in the binary DLL is built as relative address, not absolute address and the exact address is determined at load time when choosing an Address.
    Think about it: if the DLL contained relative addresses (offsets from the start of the DLL), a fixup would be necessary whatever address the DLL is loaded at, because the processor works with absolute addresses, not relative ones. So that would negate any advantage.

  11. #11
    Registered User
    Join Date
    May 2006
    Posts
    1,579
    Thanks Elysia,


    Now I understand why absolute addresses are used. But what do you mean by "a fixup is necessary whatever address the dll is loaded into"? And by "negate any advantage at all"?

    regards,
    George

  12. #12
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    If the DLL used relative references, then regardless of the address it was mapped at, the loader would still need to go through the code and fix up every address. The point of the whole discussion is that every DLL is built for a specific absolute base address, which the compiled code refers to. So the DLL, by default, relies on being mapped at address 0x10000000, for example. If the DLL is mapped at that address, no fixup is required. If it's mapped at a different address, the loader has to run a fixup pass.
    Similarly, if relative addresses were used, the loader would have to run a fixup pass too, to turn those relative addresses into absolute ones. And unlike with absolute addresses, that pass would have to run whatever address the DLL is mapped at. So relative addresses would offer no advantage, whereas absolute addresses let the loader skip the fixup pass whenever the DLL lands at its preferred address.
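
    The argument can be put as a small counting model. This is only a sketch of the reasoning above, not of how any real loader is written; fixups_needed() and the slot count of 500 are made up for illustration.

```python
def absolute_address(base, offset):
    """Absolute virtual address of something at `offset` inside the DLL."""
    return base + offset


def fixups_needed(stores_absolute, slot_count, preferred_base, actual_base):
    """Count the slots the loader must rewrite under each storage scheme."""
    if stores_absolute:
        # Stored values are already preferred_base + offset, so they are
        # only wrong if the DLL ends up somewhere else.
        return 0 if actual_base == preferred_base else slot_count
    # Stored values are bare offsets; every one needs the base added at load time.
    return slot_count


BASE = 0x10000000  # example preferred base
print(hex(absolute_address(BASE, 0x100)))
for actual in (BASE, 0x10050000):
    print(hex(actual),
          "absolute:", fixups_needed(True, 500, BASE, actual),
          "relative:", fixups_needed(False, 500, BASE, actual))
```

    Only the absolute scheme ever reaches zero, and only when the DLL is mapped at its preferred base, which is exactly the case rebasing tries to arrange.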

  13. #13
    Registered User
    Join Date
    May 2006
    Posts
    1,579
    Thanks Elysia,


    I have understood your point. My question is answered. You are so knowledgeable and patient.

    regards,
    George

  14. #14
    Cat without Hat CornedBee
    Join Date
    Apr 2003
    Posts
    8,895
    On the other hand, the x86-64 architecture adds RIP-relative addressing, which makes it practical to generate position-independent code (PIC). Such code uses relative addressing and thus works no matter where the DLL is loaded, making rebasing largely unnecessary (absolute pointers stored in data, such as vtable entries, still need fixups).
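
    A small arithmetic sketch of why relative operands survive a move. The offsets and base addresses are invented, and the three helper functions are only a model of what ends up encoded in an instruction and what the CPU computes from it.

```python
def absolute_operand(base, target_offset):
    """Operand encoded in a non-PIC instruction: a full virtual address."""
    return base + target_offset


def relative_operand(instr_end_offset, target_offset):
    """Operand encoded in a PIC instruction: distance from the next instruction."""
    return target_offset - instr_end_offset


def resolve_relative(base, instr_end_offset, displacement):
    """What the CPU computes at run time: next-instruction address + displacement."""
    return base + instr_end_offset + displacement


INSTR_END, TARGET = 0x1007, 0x3000   # made-up offsets inside the DLL
disp = relative_operand(INSTR_END, TARGET)

for base in (0x180000000, 0x7FF612340000):
    print(hex(base),
          "absolute operand:", hex(absolute_operand(base, TARGET)),
          "relative operand:", hex(disp),
          "resolves to:", hex(resolve_relative(base, INSTR_END, disp)))
```

    The absolute operand changes with the base, so it would need a fixup; the displacement is the same wherever the DLL lands and still resolves to the right target.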

  15. #15
    Registered User
    Join Date
    May 2006
    Posts
    1,579
    Thanks for sharing your knowledge, CornedBee!


    regards,
    George
