Thread: Loading files into memory

  1. #16
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Quote Originally Posted by C_ntua View Post
    To see if I understand this correctly. Say you have:
    Code:
    struct s {
       int x;
       int y;
       short z;
       double d;
    };
    Then the compiler will want everything to be a multiple of 4 bytes, and so pad the memory with 2 bytes after the short?
    Typically the compiler will strive to align on DWORD (4-byte) boundaries, or on boundaries that match the largest member in the struct (in this case, double).
    However, all padding is both compiler and platform dependent.

    Can I assume that padding makes access faster because the struct can then be treated more like an array, so that, given a pointer to the struct, the address of each member can be found more quickly?
    Not really. Since the offsets are known at compile time, there's no runtime overhead for locating each member, whether they're padded or not.
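    A minimal sketch of how to check this on a given compiler, using the struct from the question. The helper names padding_after_z and print_layout are made up for illustration. offsetof is evaluated at compile time, which is why member access costs nothing extra at run time. The printed numbers depend on compiler and platform: a typical x86-64 build puts d at offset 16 (6 bytes of padding after z, not 2), while 32-bit x86 Linux commonly aligns double to 4 bytes inside structs, which gives the 2 bytes guessed in the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct s {
    int x;
    int y;
    short z;
    double d;
};

/* The C standard guarantees the first member sits at offset 0. */
static_assert(offsetof(struct s, x) == 0, "first member is at offset 0");

/* Bytes of padding the compiler inserted between z and d. */
size_t padding_after_z(void)
{
    return offsetof(struct s, d) - (offsetof(struct s, z) + sizeof(short));
}

/* Print every member offset; all of these are compile-time constants. */
void print_layout(void)
{
    printf("x at %zu, y at %zu, z at %zu, d at %zu\n",
           offsetof(struct s, x), offsetof(struct s, y),
           offsetof(struct s, z), offsetof(struct s, d));
    printf("padding after z: %zu bytes, sizeof(struct s): %zu\n",
           padding_after_z(), sizeof(struct s));
}
```

    Calling print_layout() from main shows the layout your own compiler chose.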
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  2. #17
    Registered User C_ntua's Avatar
    Join Date
    Jun 2008
    Posts
    1,853
    So why would padding be faster?

  3. #18
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Quote Originally Posted by C_ntua View Post
    So why would padding be faster?
    Because:
    Quote Originally Posted by CornedBee View Post
    On some architectures, trying to read unaligned memory can even trigger a hardware exception.
    Quote Originally Posted by Elysia View Post
    Even on the case of x86 and x86-64, it can cause processor penalties (speed hits!) to read unaligned memory...
    Quote Originally Posted by matsp View Post
    CornedBee is correct that, on some machines, reading unaligned memory can cause crashes. Even if it doesn't cause a crash, it is slower to read memory from an unaligned address (because the processor has to collect data from two separate reads and put it together into one item).

  4. #19
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    Oh, and you typically can't perform atomic operations (test_and_set, compare_and_swap, etc.) on unaligned memory locations.
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

  5. #20
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by Elysia View Post
    Typically the compiler will strive to align on DWORD (4-byte) boundaries, or on boundaries that match the largest member in the struct (in this case, double).
    I would rephrase that and say that it strives to align all items to their natural alignment. That is 4 bytes for DWORD and int, 2 bytes for WORD or short, 1 byte for char, 2 or 4 bytes for a wchar_t (2 bytes in Windows, 4 bytes in Linux), 8 bytes for a long long (or long in Linux on a 64-bit OS), 8 bytes for double.
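    A small sketch for checking these numbers on your own toolchain, assuming a C11 compiler (for _Alignof). The function name print_alignments is made up for illustration, and the values it prints are whatever your compiler and platform use, not fixed facts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* char is the one alignment the language itself pins down. */
static_assert(_Alignof(char) == 1, "char always has alignment 1");

/* Print the natural alignment of each type mentioned in the post. */
void print_alignments(void)
{
    printf("char      %zu\n", _Alignof(char));
    printf("short     %zu\n", _Alignof(short));
    printf("int       %zu\n", _Alignof(int));
    printf("wchar_t   %zu\n", _Alignof(wchar_t));
    printf("long      %zu\n", _Alignof(long));
    printf("long long %zu\n", _Alignof(long long));
    printf("double    %zu\n", _Alignof(double));
}
```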

    This is because the data lines going into the processor are arranged so that the 32 bits of a 32-bit integer come in on specific lines. If the integer is unaligned, the processor has to "double step" to get the data in: first read one portion, then move on to the next.
    However, all padding is both compiler and platform dependent.
    Indeed. And the penalty for getting the padding "wrong" varies from one extra clock cycle per access, through a few dozen or a few hundred extra cycles for an "unaligned access trap", all the way to "program crashes". I know of OSes that only allow unaligned access in user mode, so an unaligned access in kernel mode will lead to a kernel oops, a BSOD, or whatever the equivalent is on the relevant OS.

    And CornedBee makes another good point that some instructions are REQUIRED to be aligned even if generally data CAN be unaligned at some performance penalty.
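    Since this thread is about loading files into memory, here is a hedged sketch of one common way to pull an integer out of a byte buffer at an arbitrary offset without ever doing an unaligned access. The name load_u32 is hypothetical. It copies the bytes with memcpy instead of casting the pointer to uint32_t *, and it returns the value in the host's byte order (no endianness conversion is attempted).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from p, which may have any alignment.
   memcpy has no alignment requirement, so this is safe everywhere;
   compilers usually turn it into a single load where that is legal. */
uint32_t load_u32(const unsigned char *p)
{
    uint32_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}
```

    By contrast, *(const uint32_t *)(buf + 1) is the kind of access that can trap or run slowly, as described above.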

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  6. #21
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    Another example of instructions that require alignment: SSE. SSE loads (except for special "unaligned load" instructions) must be aligned to 16-byte boundaries.

  7. #22
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Quote Originally Posted by matsp View Post
    I would rephrase that and say that it strives to align all items to their natural alignment. That is 4 bytes for DWORD and int, 2 bytes for WORD or short, 1 byte for char, 2 or 4 bytes for a wchar_t (2 bytes in Windows, 4 bytes in Linux), 8 bytes for a long long (or long in Linux on a 64-bit OS), 8 bytes for double.
    You're probably right, now that I think about it :)

  8. #23
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by CornedBee View Post
    Another example of instructions that require alignment: SSE. SSE loads (except for special "unaligned load" instructions) must be aligned to 16-byte boundaries.
    Yes, very true. And with the above-mentioned exception in mind, there are only two options: make sure the data IS aligned, or read the data into a register using the special "unaligned" instructions (and they ARE slower on most machines, even if the data IS aligned). For such operations, alignment to the boundary of the basic data type is not enough; you need a bigger alignment, because the data is made up of multiple elements and the WHOLE block of, for example, 4 floats needs to be 16-byte aligned.
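    A minimal sketch of the "make sure the data IS aligned" option, assuming a C11 compiler and a 4-byte float. The type vec4 and the helper is_aligned are made-up names, and no actual SSE instructions are used here; the point is only that the whole 4-float block gets 16-byte alignment, not just the 4 bytes each float would get on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Four floats that always start on a 16-byte boundary. */
typedef struct {
    _Alignas(16) float lane[4];
} vec4;

static_assert(_Alignof(vec4) == 16, "vec4 must be 16-byte aligned");
static_assert(sizeof(vec4) == 16, "assumes a 4-byte float");

/* True if p is a multiple of alignment (alignment must be non-zero). */
bool is_aligned(const void *p, size_t alignment)
{
    return (uintptr_t)p % alignment == 0;
}
```

    Every element of a vec4 array is then 16-byte aligned as well, because sizeof(vec4) is a multiple of its alignment. Memory from plain malloc is only guaranteed to be aligned for the standard types, so it may or may not satisfy a 16-byte requirement.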

