Thank you for the 2nd example, rafe.
But I still have a problem with vectors:
I've just tried to use the SSE instructions of my Celeron Tualatin with this little program:
Code:
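#include <stdio.h>   /* printf */
#include <stdlib.h>

/* Four packed single-precision floats (one SSE register) */
typedef float v4sf __attribute__ ((mode(V4SF)));

/* Union so the vector can also be accessed element by element */
typedef union
{
    v4sf v;
    float p[4];
} TEST;

int main()
{
    int i;
    TEST test, test2;

    test.p[0] = 11111.12349;
    test.p[1] = 12378.4357;
    test.p[2] = 12343.2387;
    test.p[3] = 23498.23489;
    test2.p[0] = 2.5;
    test2.p[1] = 3.2;
    test2.p[2] = 4.9;
    test2.p[3] = 7.3;

    /* Packed divide: test.p[i] / test2.p[i] for all four elements */
    test.v = __builtin_ia32_divps(test.v, test2.v);

    for(i = 0; i < 4; i++)
        printf("%f ", test.p[i]);
    printf("\n");
    return 0;
}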
I compiled it with the "-msse" option.
But the program crashes with "Illegal instruction (core dumped)":
Code:
Exception: STATUS_PRIVILEGED_INSTRUCTION at eip=004010EC
eax=00000000 ebx=00000000 ecx=610C819C edx=00000002 esi=00000000 edi=00402910
ebp=0022FEF0 esp=0022FE90 program=C:\test\vector.exe
cs=001B ds=0023 es=0023 fs=0038 gs=0000 ss=0023
Stack trace:
Frame Function Args
0022FEF0 004010EC (00000001, 615F0740, 0A040330, 0022FF24)
0022FF40 610072E8 (610CBAA8, FFFFFFFE, 0000002C, 610CB9CC)
0022FF90 610075CD (00000000, 00000000, 80430F47, 00000000)
0022FFB0 00402702 (00401096, 037F0009, 0022FFF0, 77E8CA90)
0022FFC0 0040103C (0022E7A4, 0022F844, 7FFDF000, 0022E8CC)
0022FFF0 77E8CA90 (00401000, 00000000, 000000C8, 00000100)
End of stack trace
I am using Cygwin with gcc 3.2 on W2k.
I hope someone can help me, because I don't know what I have done wrong in the source code.