fast checking for alignment


  1. #1
    3735928559
    Join Date
    Mar 2008
    Location
    RTP
    Posts
    839

    fast checking for alignment

    i need to pre-increment a pointer to be 16B aligned for use with sse2 functions, but i am somewhat concerned about the repeated use of the modulus. i have heard it is possible to check for alignment with the & operator, but i can't seem to get that to work.

    Code:
    //...
    unsigned char* index1 = memberDataBlock;
    unsigned char* index2 = someOtherDataBlock;
    
    while(index1%16||index2%16)
    {
        *index1++ |= *index2++;
    }
    
    //...
    is there a faster way to achieve the condition above other than using the modulus? i don't need the actual remainder, i just need to know if there is one or not.


    please feel free to move this to the C forum if it is deemed more appropriate

  2. #2
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,185
    "while (index1&0xF)" perhaps?

  3. #3
    3735928559
    Join Date
    Mar 2008
    Location
    RTP
    Posts
    839
    aaaah yes. your use of hex has made my careless error immediately obvious.

    index&0xF , not index&16

    edit: thanks!
    Last edited by m37h0d; 08-10-2009 at 02:24 PM.
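    For reference, a minimal sketch of the mask test discussed above. The helper name is_aligned16 is hypothetical, and the sketch assumes that converting a pointer to std::uintptr_t preserves the low address bits (implementation-defined, but true on mainstream flat-memory platforms). The bitwise & cannot be applied to a pointer directly, hence the cast.

```cpp
#include <cstdint>

// Hypothetical helper: true when p sits on a 16-byte boundary.
// The mask is 16 - 1 = 0xF, which keeps the low four address bits.
// Masking with 16 itself would test only bit 4, which is the mistake
// corrected in this post.
inline bool is_aligned16(const void* p) {
    return (reinterpret_cast<std::uintptr_t>(p) & 0xF) == 0;
}
```

    The same pattern works for any power-of-two alignment n by masking with n - 1.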

  4. #4
    Captain Crash brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,263
    Quote Originally Posted by m37h0d View Post
    i need to pre-increment a pointer to be 16B aligned for use with sse2 functions, but i am somewhat concerned about the repeated use of the modulus. i have heard it is possible to check for alignment with the & operator, but i can't seem to get that to work.

    Code:
    //...
    unsigned char* index1 = memberDataBlock;
    unsigned char* index2 = someOtherDataBlock;
    
    while(index1%16||index2%16)
    {
        *index1++ |= *index2++;
    }
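    Nice infinite loop you have there. (The only way it isn't infinite is if (index1 & 0xF) == (index2 & 0xF), because both pointers advance in lockstep and their offsets from a 16-byte boundary never change relative to each other.)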
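    The termination condition can be checked up front. A small sketch, assuming the same pointer-to-integer conversion as usual on flat-memory platforms; co_aligned16 is a hypothetical name, not something from the original code.

```cpp
#include <cstdint>

// Hypothetical helper: true when a and b have the same offset from a
// 16-byte boundary. Only then can incrementing both pointers together
// bring them to an aligned address at the same time.
inline bool co_aligned16(const void* a, const void* b) {
    const auto ua = reinterpret_cast<std::uintptr_t>(a);
    const auto ub = reinterpret_cast<std::uintptr_t>(b);
    // XOR leaves a 1 wherever the low four bits differ.
    return ((ua ^ ub) & 0xF) == 0;
}
```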
    Code:
    //try
    //{
    	if (a) do { f( b); } while(1);
    	else   do { f(!b); } while(1);
    //}

  5. #5
    3735928559
    Join Date
    Mar 2008
    Location
    RTP
    Posts
    839
    you sure?
    Code:
    while(index1&0xF||index2&0xF)
    "seems to work" /famous last words


    The only way it isn't infinite is if (index1 & 0xF) == (index2 & 0xF)
    and yes, if i want them both to be 16 byte aligned, isn't that exactly the condition i'm looking for? O_o
    Last edited by m37h0d; 08-10-2009 at 02:41 PM.

  6. #6
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,185
    Quote Originally Posted by m37h0d View Post
    you sure?
    Code:
    while(index1&0xF||index2&0xF)
    "seems to work"
    I'm with you. (This assumes that you only care about one of them being aligned. If you need them both to be aligned, then things could get rough.)

  7. #7
    3735928559
    Join Date
    Mar 2008
    Location
    RTP
    Posts
    839
    yes, i need them both aligned. and brewbuck's right. the addresses in my test just happened to be 16B aligned already.
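    One way to keep the byte-wise lead-in loop from running off the end of the data is to bound it by the buffer length and align only the destination. This is a sketch under assumptions: or_prologue and its length parameter n are hypothetical and do not appear in the original code.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: ORs src into dst one byte at a time until dst
// reaches a 16-byte boundary or n bytes have been consumed, whichever
// comes first. Returns the number of bytes handled.
inline std::size_t or_prologue(unsigned char* dst,
                               const unsigned char* src,
                               std::size_t n) {
    std::size_t done = 0;
    while (done < n &&
           (reinterpret_cast<std::uintptr_t>(dst + done) & 0xF) != 0) {
        dst[done] |= src[done];
        ++done;
    }
    return done;
}
```

    After the call, dst + done is aligned (or the data is exhausted); src + done is aligned only if the two pointers started out co-aligned.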

  8. #8
    Captain Crash brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,263
    Quote Originally Posted by m37h0d View Post
    yes, i need them both aligned. and brewbuck's right. the addresses in my test just happened to be 16B aligned already.
    If you need them both aligned, and they are not already co-aligned to 16 bytes, then you cannot do it. You will either need to copy one data set to a location where it is 16-byte aligned, or use SSE2 shuffle instructions combined with an extremely hard-to-understand inner loop to handle the misalignment. Lastly, you could use the unaligned store/load which, while slower, might still be faster than the other two alternatives.
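    A minimal sketch of the last option, the unaligned load/store route, using the SSE2 intrinsics from <emmintrin.h>. It assumes an x86 target with SSE2 enabled; the function name or_bytes_unaligned and the length parameter n are hypothetical. Whether it beats the copy or shuffle approaches would have to be measured.

```cpp
#include <cstddef>
#include <emmintrin.h>  // SSE2 intrinsics

// Hypothetical helper: dst[i] |= src[i] for i in [0, n), with no
// alignment requirement on either pointer.
inline void or_bytes_unaligned(unsigned char* dst,
                               const unsigned char* src,
                               std::size_t n) {
    std::size_t i = 0;
    // Process 16 bytes per iteration with unaligned loads and stores.
    for (; i + 16 <= n; i += 16) {
        const __m128i a =
            _mm_loadu_si128(reinterpret_cast<const __m128i*>(dst + i));
        const __m128i b =
            _mm_loadu_si128(reinterpret_cast<const __m128i*>(src + i));
        _mm_storeu_si128(reinterpret_cast<__m128i*>(dst + i),
                         _mm_or_si128(a, b));
    }
    // Scalar tail for the remaining 0..15 bytes.
    for (; i < n; ++i) {
        dst[i] |= src[i];
    }
}
```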

    It's extremely difficult to write efficient SIMD algorithms when you aren't in control of the original alignment.

  9. #9
    3735928559
    Join Date
    Mar 2008
    Location
    RTP
    Posts
    839
    in this case, i can be, but it may not be the case forever. the buffers in this case are data buffers for a piece of hardware. the driver for this particular piece of equipment has an option to use a user-defined buffer. future HW selections may not, which would necessitate nasty superfluous memcpys. i doubt that's really much of an issue though, because this is honestly the first piece of hw i've seen whose driver attempts to manage its own memory on the client side.
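    When the buffer really is user-defined, its alignment can be fixed at allocation time. A minimal sketch using std::align from <memory> over an over-allocated std::vector; the AlignedBuffer16 type is hypothetical and says nothing about what any particular driver API expects.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical wrapper: owns n + 15 bytes and exposes a 16-byte-aligned
// window of n bytes inside that storage.
struct AlignedBuffer16 {
    std::vector<unsigned char> storage;
    unsigned char* data = nullptr;

    explicit AlignedBuffer16(std::size_t n) : storage(n + 15) {
        void* p = storage.data();
        std::size_t space = storage.size();
        // std::align rounds p up to the next 16-byte boundary; with 15
        // spare bytes there is always room for n aligned bytes.
        data = static_cast<unsigned char*>(std::align(16, n, p, space));
    }

    // Copying would leave data pointing into the other object's storage.
    AlignedBuffer16(const AlignedBuffer16&) = delete;
    AlignedBuffer16& operator=(const AlignedBuffer16&) = delete;
};
```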

  10. #10
    Captain Crash brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,263
    Quote Originally Posted by m37h0d View Post
    in this case, i can be, but it may not be the case forever. the buffers in this case are data buffers for a piece of hardware. the driver for this particular piece of equipment has an option to use a user-defined buffer. future HW selections may not, which would necessitate nasty superfluous memcpys. i doubt that's really much of an issue though, because this is honestly the first piece of hw i've seen whose driver attempts to manage its own memory on the client side.
    Most hardware buffers I've ever seen are at least 16-byte aligned, and possibly quite a bit more. In particular, hardware buffers intended for direct memory mapping are usually page-aligned. Hardware designers like alignment just as much as you do.

  11. #11
    3735928559
    Join Date
    Mar 2008
    Location
    RTP
    Posts
    839
    yes, curious that it invariably was 16+B aligned, but the documentation doesn't guarantee it.

    sadly the user-buffer option is not working according to spec :grumble:
