Thread: Questions about GCC's vector extensions

  1. #1

    I am using GCC 4.6.2 on Mac OS X and am experimenting with vector extensions. The idea is to perform basic adjustments to still images for the purpose of evaluating the speed advantage of using vectors. However, I'm running into some unexpected behavior.

    I have two functions. One processes a buffer of still image data without vectors (simple for loop). The second treats four RGBA pixels as a single 128-bit vector, using a for loop with 1/4 of the iterations of the first loop. The idea is to process four pixels at a time using the XMM SSE registers.

    Here are the loops:
    Code:
    int i,j;
    
    /*adjustment struct used to adjust color*/
    struct {
    	uint8_t b;
    	uint8_t g;
    	uint8_t r;
    	uint8_t a;
    } adjust;
    
    /*image dimensions*/
    int width,height;
    
    /*calculate size of image data*/
    long imgdata_size = (width*height)*sizeof(uint32_t);
    
    
    /***scalar version***/
    uint32_t *buf = (uint32_t*)malloc(imgdata_size);
    
    /*
    read image from disk, etc...
    */
    
    for(i=0;i<height;i++){
    	for(j=0;j<width;j++){
    		buf[(i*width)+j] += *((uint32_t*)&adjust);
    	}
    }
    free(buf);
    
    
    /***vector version***/
    typedef uint32_t v4si __attribute__ ((vector_size (16)));
    
    /*put four copies of adjustment struct into 128 bit vector*/
    v4si adjust_vec = {*(uint32_t*)&adjust,*(uint32_t*)&adjust,*(uint32_t*)&adjust,*(uint32_t*)&adjust};
    
    v4si *vecbuf = (v4si*)malloc(imgdata_size);
    
    /*
    read image from disk, etc...
    */
    
    for(i=0;i<imgdata_size/sizeof(v4si);i++){
    	vecbuf[i] += adjust_vec;
    }
    free(vecbuf);
    Now, both of these loops run OK. However, they both run at the same speed! How's that? One is using scalar values, one is using vectors. Shouldn't the vector version blow the scalar version away?
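
    For anyone who wants to reproduce the comparison, here is a minimal, self-contained sketch of the two loops as functions. The names v4u32, pack_adjust, adjust_scalar and adjust_vector are made up for this illustration and are not from the code above, and it assumes a compiler that supports GCC's vector_size attribute. It uses memcpy rather than pointer casts to move data in and out of the vector, so it makes no assumption about buffer alignment.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Four packed 32-bit pixels in one 128-bit vector (GCC/Clang extension). */
typedef uint32_t v4u32 __attribute__ ((vector_size (16)));

/* Hypothetical helper: pack the four channel bytes into one 32-bit word
   without casting a struct pointer to uint32_t*. */
static uint32_t pack_adjust(uint8_t b, uint8_t g, uint8_t r, uint8_t a)
{
	uint8_t bytes[4] = { b, g, r, a };
	uint32_t packed;
	memcpy(&packed, bytes, sizeof packed);
	return packed;
}

/* Plain scalar loop. The compiler is free to auto-vectorize this. */
static void adjust_scalar(uint32_t *buf, size_t npixels, uint32_t adj)
{
	size_t i;
	for (i = 0; i < npixels; i++)
		buf[i] += adj;
}

/* Explicit vector loop: four pixels per iteration, scalar loop for the rest. */
static void adjust_vector(uint32_t *buf, size_t npixels, uint32_t adj)
{
	const v4u32 adj_vec = { adj, adj, adj, adj };
	size_t i;
	for (i = 0; i + 4 <= npixels; i += 4) {
		v4u32 px;
		memcpy(&px, buf + i, sizeof px);   /* load 4 pixels */
		px += adj_vec;                     /* element-wise 32-bit add */
		memcpy(buf + i, &px, sizeof px);   /* store 4 pixels */
	}
	for (; i < npixels; i++)                   /* leftover pixels */
		buf[i] += adj;
}
```

    Both functions should leave identical bytes in the buffer for any pixel count, which makes them a fair pair to time against each other. Whether one comes out faster depends on what the compiler does with the scalar loop at the chosen optimization level.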

    I checked the asm generated by GCC, and in fact GCC is vectorizing both versions: there is heavy use of XMM registers in each. The only way I can force GCC not to vectorize is to compile as 32-bit. The 64-bit build is far faster than the 32-bit build even when both use -O3. Are 128-bit registers not available in 32-bit mode? If I turn off optimization in 64-bit mode, GCC still uses XMM registers. I can't keep it from using XMM no matter what I do, even if I explicitly set -O0 and -mno-sse (GCC complains "SSE register return with SSE disabled").
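
    One way to get a non-vectorized baseline in a 64-bit build, sketched under the assumption that GCC's auto-vectorizer (enabled by -O3) is what turns the scalar loop into SSE code: switch that pass off, either for the whole file with -fno-tree-vectorize or for a single function with the optimize attribute. The macro NO_AUTOVEC and the function name below are hypothetical. GCC's documentation describes the optimize attribute as a debugging aid, so the command-line flag is probably the safer choice for real measurements. As for -mno-sse: SSE2 is part of the x86-64 baseline and the 64-bit calling convention returns floating-point values in XMM registers, which is consistent with the error quoted above. XMM registers also exist in 32-bit mode (eight of them rather than sixteen), but GCC only uses them there if the target options allow it, e.g. -msse2.

```c
#include <stddef.h>
#include <stdint.h>

/* GCC-specific: compile one function as if -fno-tree-vectorize were given.
   Other compilers just get a plain function. */
#if defined(__GNUC__) && !defined(__clang__)
#define NO_AUTOVEC __attribute__ ((optimize ("no-tree-vectorize")))
#else
#define NO_AUTOVEC
#endif

/* Scalar baseline that should stay scalar even at -O3 (on GCC). */
NO_AUTOVEC
static void adjust_scalar_novec(uint32_t *buf, size_t npixels, uint32_t adj)
{
	size_t i;
	for (i = 0; i < npixels; i++)
		buf[i] += adj;
}
```

    Checking the generated asm (gcc -O3 -S) is still the only way to be sure which instructions the loop ended up with.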

    I guess the issue is that I don't see a lot of point in using vector extensions when GCC is vectorizing the code anyway. If this is the case, then what's the point of vector extensions? Is GCC just that good that it doesn't need vectors explicitly defined?
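
    One thing the explicit vector types do let you state directly is the lane width. The loops above add whole 32-bit words, so an overflow in one channel carries into a neighboring channel of the same pixel. A sketch with sixteen 8-bit lanes, where each channel wraps on its own, might look like this. The names v16u8 and adjust_channels are invented for the example, and it assumes the buffer starts on a pixel boundary with four bytes per pixel.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sixteen 8-bit lanes in one 128-bit vector: four pixels, one lane per channel. */
typedef uint8_t v16u8 __attribute__ ((vector_size (16)));

/* Add a 4-byte per-channel adjustment to every pixel; each byte wraps
   modulo 256 independently of its neighbors. */
static void adjust_channels(uint8_t *bytes, size_t nbytes, const uint8_t adj[4])
{
	uint8_t pattern[16];
	v16u8 adj_vec;
	size_t i;

	/* Repeat the 4-byte adjustment four times to fill the vector. */
	for (i = 0; i < 16; i++)
		pattern[i] = adj[i % 4];
	memcpy(&adj_vec, pattern, sizeof adj_vec);

	for (i = 0; i + 16 <= nbytes; i += 16) {
		v16u8 px;
		memcpy(&px, bytes + i, sizeof px);
		px += adj_vec;                     /* sixteen independent 8-bit adds */
		memcpy(bytes + i, &px, sizeof px);
	}
	for (; i < nbytes; i++)                    /* leftover bytes */
		bytes[i] = (uint8_t)(bytes[i] + adj[i % 4]);
}
```

    This still wraps rather than clamps, so it is not a proper brightness adjustment; it only shows that the element type of the vector decides where carries stop.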
    Last edited by synthetix; 03-24-2012 at 12:05 AM.

  2. #2
    whiteflags

    If you compile with stricter rules (-std=c99 instead of the default gnu89, for example), the results of the optimization switches will change because the compiler cannot take advantage of extensions without violating the rules. So you should keep that in mind.
