Thread: Structure padding to 16 bytes for use with atomic instructions - like, WTH?

  1. #1
    Registered User
    Join Date
    Dec 2009
    Posts
    83

    Structure padding to 16 bytes for use with atomic instructions - like, WTH?

    Gents,

    I am using atomic instructions on x64 and variables so used must be 16 byte aligned.

    I use a number of structures where their members are so operated upon.

    The structures must accordingly be 16 byte aligned and padded: their internal members must be on 16 byte boundaries and, crucially, there must be tail padding to a 16 byte boundary, so I can allocate arrays of these structures and use pointer math to iterate.

    (I am naturally using aligned malloc).

    The problem I am finding is that it is not apparent to me how to achieve this end.

    Here below we have a test structure (currently I'm working with the latest Amazon Linux GCC, 4.6.3, on x64);

    Code:
    #define LFDS700_ALIGN_DOUBLE_POINTER 16
    #define LFDS700_ALIGN(alignment)  __attribute__( (aligned(alignment)) )
    
    LFDS700_ALIGN(LFDS700_ALIGN_DOUBLE_POINTER) struct test_element
    {
      struct lfds700_freelist_element
        fe;
    
      lfds700_atom_t
        thread_number;
    
      unsigned int
        datum;
    };
    This in turn contains as you have seen a struct lfds700_freelist_element, thus (PAC_SIZE is 2);

    Code:
    LFDS700_ALIGN(LFDS700_ALIGN_DOUBLE_POINTER) struct lfds700_freelist_element
    {
      struct lfds700_freelist_element
        *next[PAC_SIZE];
    
      void const
        *user_data;
    };
    I allocate an array of test elements, thus;

    Code:
    te_array = abstraction_aligned_malloc( sizeof(struct test_element) * 100000, LFDS700_ALIGN_DOUBLE_POINTER );
    The problem manifest is that sizeof(struct test_element) is 40 bytes!

    So the second element does not begin on a 16 byte boundary and we all fall down.
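
    To make the arithmetic concrete, here is a minimal sketch (the helper name misalignment is hypothetical, not from the library): element i of an array starts at base + i * sizeof(element), so with a 16 byte aligned base and a 40 byte element, the second element sits 8 bytes past a 16 byte boundary.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: how far element `index` starts past an `alignment`
   boundary, assuming the array base itself is aligned. */
static size_t misalignment(size_t elem_size, size_t index, size_t alignment)
{
  assert(alignment != 0);
  return (elem_size * index) % alignment;
}
```

    With a 40 byte element, misalignment(40, 1, 16) is 8; with the element padded to 48 bytes, misalignment(48, 1, 16) is 0.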

    Printing the addresses of the first element in the test element array, I see the following;

    Code:
    (gdb) print *ts->te_array
    $2 = {fe = {next = {0x7fffec0008d0, 0x2}, user_data = 0x7fffdc0008d0}, thread_number = 3, datum = 0}
    (gdb) print sizeof(struct test_element)
    $3 = 40
    (gdb) print &ts->te_array->fe.next
    $4 = (struct lfds700_freelist_element *(*)[2]) 0x7fffdc0008d0 (16 bytes long and aligned on 16 bytes)
    (gdb) print &ts->te_array->fe.user_data
    $5 = (const void **) 0x7fffdc0008e0 (8 bytes long and aligned on 16 bytes)
    (gdb) print &ts->te_array->thread_number
    $6 = (lfds700_atom_t *) 0x7fffdc0008e8 (8 bytes long and aligned on 8 bytes)
    (gdb) print &ts->te_array->datum
    $7 = (unsigned int *) 0x7fffdc0008f0 (8 bytes long and aligned on 16 bytes)
    So we see fe.next is the first member and so is correctly aligned courtesy of aligned malloc. fe.next is 16 bytes and fe.user_data is correctly aligned, but thread_number is then misaligned (on an 8 byte boundary rather than 16), and datum appears to be given eight bytes rather than four, leaving us in the end without correct tail padding to a 16 byte boundary.

    So, what gives? How *am* I supposed to indicate to the compiler that it must pad structures to 16 byte boundaries?
    Last edited by Toby Douglass; 06-30-2013 at 11:09 AM.

  2. #2
    Hurry Slowly vart
    Join Date
    Oct 2006
    Location
    Rishon LeZion, Israel
    Posts
    6,788
    Add a padding member to the struct?
    All problems in computer science can be solved by another level of indirection,
    except for the problem of too many layers of indirection.
    – David J. Wheeler

  3. #3
    Registered User
    Join Date
    Dec 2009
    Posts
    83
    Quote Originally Posted by vart
    Add a padding member to the struct?
    The code is portable. The platform will vary, e.g. the compiler in particular. Structure alignment is compiler dependent. Hard coding padding is awkward enough when word lengths vary, but when the compiler also varies, you can no longer expect your padding to work.
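
    One way to avoid hard-coded padding bytes is to hide the compiler-specific alignment spelling behind a macro and let each compiler compute the padding itself. A minimal sketch, assuming only three toolchains need covering; PORTABLE_ALIGN, padded_pair and check_layout are hypothetical names, not part of the library:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical wrapper: pick whichever alignment spelling this compiler offers. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
  #define PORTABLE_ALIGN(n) _Alignas(n)                    /* C11 and later */
#elif defined(__GNUC__)
  #define PORTABLE_ALIGN(n) __attribute__((aligned(n)))    /* GCC, Clang */
#elif defined(_MSC_VER)
  #define PORTABLE_ALIGN(n) __declspec(align(n))           /* MSVC */
#else
  #error "no known alignment specifier for this compiler"
#endif

/* Each member asks for a 16 byte boundary; the compiler inserts the gaps
   and the tail padding, so no padding members are written by hand. */
struct padded_pair
{
  PORTABLE_ALIGN(16) void *ptr;
  PORTABLE_ALIGN(16) unsigned int datum;
};

/* Runtime check of the layout the compiler actually chose. */
static void check_layout(void)
{
  assert(sizeof(struct padded_pair) % 16 == 0);
  assert(offsetof(struct padded_pair, datum) % 16 == 0);
}
```

    The padding is then whatever the target compiler decides is needed, rather than a byte count worked out for one word length.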

  4. #4
    Registered User
    Join Date
    Apr 2013
    Posts
    1,658
    If you know the number of structures needed before you allocate them, you can do a single call to malloc() with a padded count, then set pointers to structures as rounded up (padded) offsets from the pointer returned by malloc() (to round up, use something like (count * ((sizeof(...)+15) & ~15)) ).
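
    A minimal sketch of the rounding step, assuming the alignment is a power of two; round_up and padded_array_bytes are hypothetical helper names. Note that rounding up to a multiple uses a mask (or a divide and multiply), not the remainder operator:

```c
#include <assert.h>
#include <stddef.h>

/* Round size up to the next multiple of alignment (must be a power of two). */
static size_t round_up(size_t size, size_t alignment)
{
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (size + alignment - 1) & ~(alignment - 1);
}

/* Bytes needed for count elements when each one occupies a padded stride.
   Overflow of the multiplication is not checked in this sketch. */
static size_t padded_array_bytes(size_t count, size_t elem_size, size_t alignment)
{
  return count * round_up(elem_size, alignment);
}
```

    For the 40 byte structure in this thread, round_up(40, 16) gives a stride of 48 bytes, so 100000 elements need 4800000 bytes.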

  5. #5
    Master Apprentice phantomotap
    Join Date
    Jan 2008
    Posts
    5,108
    The problem manifest is that sizeof(struct test_element) is 40 bytes!
    O_o

    The alignment would not honor the size of structure members the way you've written the code.

    The alignment would only apply to the structure proper as allocated by the compiler.

    You need to set the alignment on each member within the structure. That changes the size of the structure so that, with an aligned `malloc', every element of an allocated array, and each member within it, lands on the required boundary.

    (In other words, apply the alignment required directly to the members of the structure.)

    Soma
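
    A minimal sketch of the per-member approach, using GCC's attribute syntax placed after each declarator. The _sketch type names and atom_t are stand-ins; the real definition of lfds700_atom_t is not shown in the thread, so a 64 bit unsigned integer is assumed here.

```c
#include <assert.h>
#include <stddef.h>

#define ALIGN16 __attribute__((aligned(16)))   /* GCC/Clang only */

typedef unsigned long long atom_t;   /* assumption: stand-in for lfds700_atom_t */

struct freelist_element_sketch
{
  struct freelist_element_sketch *next[2] ALIGN16;
  void const *user_data ALIGN16;
};

struct test_element_sketch
{
  struct freelist_element_sketch fe ALIGN16;
  atom_t thread_number ALIGN16;
  unsigned int datum ALIGN16;
};

/* C11 static_assert: compilation fails if the layout is not what we need. */
static_assert(offsetof(struct test_element_sketch, thread_number) % 16 == 0,
              "thread_number must start on a 16 byte boundary");
static_assert(offsetof(struct test_element_sketch, datum) % 16 == 0,
              "datum must start on a 16 byte boundary");
static_assert(sizeof(struct test_element_sketch) % 16 == 0,
              "array elements must stay 16 byte aligned");
```

    Because the strictest member alignment becomes the alignment of the structure, the compiler also adds the tail padding, so ordinary pointer arithmetic over an aligned array keeps every member on a 16 byte boundary.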
    “Salem Was Wrong!” -- Pedant Necromancer
    “Four isn't random!” -- Gibbering Mouther

  6. #6
    Registered User
    Join Date
    Dec 2009
    Posts
    83
    Quote Originally Posted by rcgldr
    If you know the number of structures needed before you allocate them, you can do a single call to malloc() with a padded count, then set pointers to structures as rounded up (padded) offsets from the pointer returned by malloc() (to round up, use something like (count * ((sizeof(...)+15) & ~15)) ).
    The padded count is hard to know since CPU and compiler vary.

    I would greatly prefer (and expect exists) a solution where I can use ordinary pointer math to iterate over an array, rather than manually computing offsets.

  7. #7
    Registered User
    Join Date
    Dec 2009
    Posts
    83
    Quote Originally Posted by phantomotap
    O_o

    The alignment would not honor the size of structure members the way you've written the code.

    The alignment would only apply to the structure proper as allocated by the compiler.
    Mmm. I see what you're saying, but in fact even the latter is not occurring, as the size of the struct is 40, rather than an integer multiple of 16.

    Alignment without tail padding is half-baked as it cannot be used in arrays.

  8. #8
    Master Apprentice phantomotap
    Join Date
    Jan 2008
    Posts
    5,108
    I see what you're saying, but in fact even the latter is not occurring, as the size of the struct is 40, rather than an integer multiple of 16.
    O_o

    You have misunderstood.

    You are not using the feature correctly.

    Alignment without tail padding is half-baked as it cannot be used in arrays.
    Indeed.

    That is why the compiler will pad the structure to the appropriate multiple when used correctly.

    [Edit]
    I would have just told you to search for examples, but I wanted to make sure the hint behind my words had a specific example.
    [/Edit]

    Soma

    Code:
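    /* GCC/Clang-specific: request 16 byte alignment. */
    #define align __attribute__( (aligned(16)) )
    
    /* Attribute on the type: the structure as a whole is aligned to 16 bytes. */
    struct align STest1
    {
        char a1;
        int b1;
        char c1;
        double d1;
        char e1;
        double d2;
        char c2;
        int b2;
        char a2;
    };
    
    /* Attribute on every member: each member starts on a 16 byte boundary. */
    struct STest2
    {
        align char a1;
        align int b1;
        align char c1;
        align double d1;
        align char e1;
        align double d2;
        align char c2;
        align int b2;
        align char a2;
    };
    
    /* Plain structure, no alignment requested. */
    struct STest3_imp
    {
        char a1;
        int b1;
        char c1;
        double d1;
        char e1;
        double d2;
        char c2;
        int b2;
        char a2;
    };
    
    struct STest3
    {
        align struct STest3_imp data; // This is what I was hinting.
    };
    
    #include <stdio.h>
    
    int main(void)
    {
        /* C needs the struct keyword here; sizeof(STest1) alone only compiles as C++. */
        printf("%d\n", (int)sizeof(struct STest1));
        printf("%d\n", (int)sizeof(struct STest2));
        printf("%d\n", (int)sizeof(struct STest3));
        return(0);
    }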
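
    On "used correctly": as far as GCC's documented attribute syntax goes, a type attribute belongs immediately after the struct keyword (or just past the closing brace). Written in front of struct, on a definition that declares no variable, it does not attach to the type, which would be consistent with the 40 byte size reported earlier. A minimal sketch of the two placements; the type names are hypothetical and the member types are stand-ins for those in the thread:

```c
#include <assert.h>
#include <stddef.h>

#define ALIGN16 __attribute__((aligned(16)))   /* GCC/Clang only */

/* Attribute directly after the struct keyword: applies to the type. */
struct ALIGN16 after_keyword
{
  void *next[2];
  void const *user_data;
  unsigned long long thread_number;   /* assumption: 64 bit atom type */
  unsigned int datum;
};

/* Attribute just past the closing brace: also applies to the type. */
struct after_brace
{
  void *next[2];
  void const *user_data;
  unsigned long long thread_number;
  unsigned int datum;
} ALIGN16;

/* Either way the type is 16 byte aligned and tail-padded to a multiple of 16.
   The members inside are still packed at their natural alignment. */
static_assert(sizeof(struct after_keyword) % 16 == 0, "tail padding missing");
static_assert(sizeof(struct after_brace) % 16 == 0, "tail padding missing");
```

    This only aligns and pads the structure as a whole; members that must individually sit on 16 byte boundaries still need their own alignment, as in STest2.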
    Last edited by phantomotap; 06-30-2013 at 06:00 PM. Reason: *derp*
    “Salem Was Wrong!” -- Pedant Necromancer
    “Four isn't random!” -- Gibbering Mouther

  9. #9
    Registered User
    Join Date
    Apr 2013
    Posts
    1,658
    Quote Originally Posted by Toby Douglass
    The padded count is hard to know since CPU and compiler vary. I would greatly prefer (and expect exists) a solution where I can use ordinary pointer math to iterate over an array, rather than manually computing offsets.
    What I was suggesting was to use an array of "aligned" pointers to access the members of an array of structures. Normal indexing could be used with the array of pointers. phantomotap may have a better method though.
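
    A minimal sketch of that idea, assuming a 16 byte boundary and plain malloc(); make_aligned_pointers is a hypothetical helper, and overflow of the size computation is not checked:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate one raw block and build a table of pointers,
   each 16 byte aligned and spaced by the element size rounded up to 16.
   On success returns the table and stores the raw block in *raw_out; the
   caller frees both. Returns NULL on allocation failure. */
static void **make_aligned_pointers(size_t count, size_t elem_size, void **raw_out)
{
  assert(count > 0 && raw_out != NULL);

  size_t stride = (elem_size + 15) & ~(size_t)15;
  unsigned char *raw = malloc(count * stride + 15);   /* +15 leaves room to align the base */
  void **ptrs = malloc(count * sizeof *ptrs);

  if (raw == NULL || ptrs == NULL)
  {
    free(raw);
    free(ptrs);
    return NULL;
  }

  /* First 16 byte boundary at or after the start of the raw block. */
  unsigned char *base = (unsigned char *)(((uintptr_t)raw + 15) & ~(uintptr_t)15);

  for (size_t i = 0; i < count; i++)
    ptrs[i] = base + i * stride;

  *raw_out = raw;
  return ptrs;
}
```

    Element i is then reached as ptrs[i] cast to the structure type, at the cost of one extra pointer per element and one extra indirection per access.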
