Thread: fast XORing chars

  1. #1
    Registered User
    Join Date
    May 2006
    Posts
    57

    fast XORing chars

    Im working on a project where speed is a major concern and where I'm also working on alot of 4x4 arrays of chars.

    I would like to XOR these chars by grouping them together 4 at a time(by casting), hopefully as a optimization(so the proc is working with its normal 32 bit amount).

    Sounds simple but I cant seem to appease the compiler gods.

    My code:
    Code:
    #include <stdio.h>
    
    int main() {
    	unsigned char x[4];
    	unsigned char y[4];
    
    	x = (unsigned char[4])((unsigned int)x ^ (unsigned int)y);
    	
    	return 0;
    }
    gives:
    xor.c: In function 'main':
    xor.c:7: error: cast specifies array type
    Any advice?
    Thanks

  2. #2
    ---
    Join Date
    May 2004
    Posts
    1,379
    So you want to
    Code:
    for(i=0;i++;i<4){
      x[i] ^= y[i]
    }
    but you want it done all in one clear sweep?
    I'm not sure that is possible
    I wouldn't worry about speed with something so small. I'm sure you wouldn't even notice it.

  3. #3
    Woof, woof! zacs7's Avatar
    Join Date
    Mar 2007
    Location
    Australia
    Posts
    3,459
    > (unsigned int)x ^ (unsigned int)y
    That doesn't do what you think it does.

    It casts x & y (actually pointers in this case) to unsigned integers -- and XORs the address together.

    Basically you want to XOR 2 full sized words? (Assuming they're 32bits each -- and a char is 8bits), rather than XORing 4 1/4 words?
    Well the compiler *might* optimize in this case.
    Last edited by zacs7; 05-29-2008 at 10:33 PM.

  4. #4
    Registered User
    Join Date
    May 2006
    Posts
    57
    hmmm, never would of caught that. But now my dereferencing gives me errors now. Not sure how I should attack this now

  5. #5
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    unsigned long int *px = (unsigned long int *)x;

    Then
    *px = *px ^ *py;

    But this is a cast-alignment problem. You may not even be able to access the arrays through a long pointer, if the alignment is all wrong.

    Some machines will trap with a memory error, and others will have much worse performance than if you'd just done it one char at a time.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  6. #6
    Registered User
    Join Date
    May 2006
    Posts
    57
    Hmm, not sure I understand how this could possibly be slower.

    I don't know much about it, but in assembly you don't have 'types', you just have different operations and these operations have different versions for different sizes. Correct? So that leads to a possible compiler problem?

    I hate how I would have to waste space with making the extra pointers, I'm also in a space crunch. Hmm...


    Thanks for all the quick replies guys.

  7. #7
    Woof, woof! zacs7's Avatar
    Join Date
    Mar 2007
    Location
    Australia
    Posts
    3,459
    > you just have different operations and these operations have different versions for different sizes.
    > Correct?
    Not really. OPs are done on words, for a 32bit system a word would most likely be 4 bytes long (2^32).

    > I'm also in a space crunch. Hmm...
    What exactly are you doing?
    Last edited by zacs7; 05-29-2008 at 11:09 PM.

  8. #8
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    > I hate how I would have to waste space with making the extra pointers, I'm also in a space crunch. Hmm...
    Perhaps try it and see what the generated code looks like.

    Most decent compilers will optimise away such things to get to where you really want to be.

    It's not like you're returning the pointers, incrementing them, passing to other functions or anything. It's a simple assignment to clarify the cast, then dereference - that's all.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  9. #9
    Registered User
    Join Date
    Oct 2001
    Posts
    2,129
    Maybe you could make a union?

  10. #10
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by zxcv View Post
    Hmm, not sure I understand how this could possibly be slower.
    If you have an unaligned access (that is, an int or long that is not aligned to an even 4 byte address), then the processor will do one of the following, depending on a combination of the processor model and the OS behaviour:
    1. Crash the application with a "unaligned access trap error".
    2. Perform the individual reads of the unaligned access inside a trap handler, taking 10-100x longer to access than the a single byte access.
    3. Perform in hardware, multiple reads. This is only 2x slower than the aligned access, so acceptable.

    I don't know much about it, but in assembly you don't have 'types', you just have different operations and these operations have different versions for different sizes. Correct? So that leads to a possible compiler problem?
    Correct, the assembler code will be using "sized" operations to do byte, word or long accesses (byte=8 bit, word=16 bit, long = 32 bit). But the rules about alignments vary from one type of processor to another (and sometimes also varies depending on the settings or models withing a processor family).
    I hate how I would have to waste space with making the extra pointers, I'm also in a space crunch. Hmm...


    Thanks for all the quick replies guys.
    Are you sure that this is a performance limitation in the first place? I'd expect a modern processor being able to do a gazillion XOR operations in a second, so you'd either be using some pretty limited hardware (e.g. embedded system), or you are doing XOR on huge arrays - in which case, you may want to consider the fact that SSE has an XOR instruction that does 16 bytes at a time - but you'd better have the data 16-byte aligned.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. while condition question..
    By transgalactic2 in forum C Programming
    Replies: 3
    Last Post: 04-04-2009, 04:49 PM
  2. Counting how many chars in a string
    By tigs in forum C Programming
    Replies: 4
    Last Post: 08-05-2002, 12:25 AM
  3. really got stuck with unsigned chars
    By Abdi in forum C Programming
    Replies: 7
    Last Post: 06-11-2002, 12:47 PM
  4. move chars position
    By SpuRky in forum C Programming
    Replies: 3
    Last Post: 06-09-2002, 02:59 AM
  5. fancy strcpy
    By heat511 in forum C++ Programming
    Replies: 34
    Last Post: 05-01-2002, 04:29 PM