This is my first try with GCC inline assembly. I followed an example from this paper (page 3) and wrote a small test program:
Code:
#include <stdio.h>

int main()
{
    int i;
    float a[4] = {2.0}, b[4] = {3.0}, c[4] = {-1.0};

    __asm__("movaps %1, %%xmm0 \n\t"    /* copy vector a[] to SSE register xmm0 */
            "movaps %2, %%xmm1 \n\t"    /* copy vector b[] to SSE register xmm1 */
            "divps %%xmm0, %%xmm1 \n\t" /* divide xmm0 by xmm1 and write result to xmm1 */
            "movaps %%xmm1, %0"         /* copy xmm1 to vector c[] */
            : "=m" (c[0])               /* output %0 */
            : "m" (a[0]),               /* input %1 */
              "m" (b[0]));              /* input %2 */

    for(i = 0; i < 4; ++i)
        printf("%f", c[i]);
    printf("\n");

    return 0;
}
It compiles well with GCC 3.3.1 (cygwin):
gcc -pedantic -W -Wall -masm=intel -o test.exe divtest.c
But the test program just crashes with "illegal instruction".
I guess there is an obvious mistake in my source code but, as I already wrote, this is my first try.
Yes, my CPU supports SSE (Intel Celeron Tualatin (PIII core)).
Thank you for your help.
The stack trace:
Code:
Exception: STATUS_PRIVILEGED_INSTRUCTION at eip=004010DF
eax=BF800000 ebx=00000004 ecx=610CB16C edx=00000002 esi=00000000 edi=00000000
ebp=0022FEF0 esp=0022FE80 program=C:\test.exe
cs=001B ds=0023 es=0023 fs=0038 gs=0000 ss=0023
Stack trace:
Frame     Function  Args
0022FEF0  004010DF  (00000001, 616020CC, 0A040330, 0022FF24)
0022FF40  61005018  (610CFEE0, FFFFFFFE, 000003D4, 610CFE04)
0022FF90  610052ED  (00000000, 00000000, 8043138F, 00000000)
0022FFB0  004014C1  (00401055, 037F0009, 0022FFF0, 77E787F5)
0022FFC0  0040103C  (00000000, 00000000, 7FFDF000, 00000000)
0022FFF0  77E787F5  (00401000, 00000000, 000000C8, 00000100)
End of stack trace
Here is the example from the paper I mentioned above:
Code:
for (i=0;i<100;i+=4)
    __asm__ __volatile__ (
        "movaps %1, %%xmm0 \n\t"
        "movaps %2, %%xmm1 \n\t"
        "addps %%xmm0, %%xmm1 \n\t"
        "movaps %%xmm1, %0"
        : "=m" (a[i])
        : "m" (b[i]), "m" (c[i]));
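For comparison, here is a plain C sketch of what the paper's loop appears to compute: each pass loads four floats from b and four from c, adds them element-wise with addps, and stores the four sums into a. The function name `add_vectors_scalar` and its parameters are made up for illustration. The paper excerpt does not show how a, b and c are declared, so float arrays whose length is a multiple of 4 are assumed here.

```c
#include <stddef.h>

/* Hypothetical scalar equivalent of the SSE loop: a[i] = b[i] + c[i].
   The outer loop advances by 4, like the inline-assembly version;
   the inner loop spells out the four additions that a single
   addps performs at once. Assumes n is a multiple of 4; any
   trailing elements beyond the last full group are left untouched. */
void add_vectors_scalar(float *a, const float *b, const float *c, size_t n)
{
    size_t i, k;

    for (i = 0; i + 4 <= n; i += 4)          /* one packed step per 4 floats */
        for (k = 0; k < 4; ++k)
            a[i + k] = b[i + k] + c[i + k];  /* one lane of the packed add */
}
```

Calling `add_vectors_scalar(a, b, c, 100)` on three 100-element float arrays would mirror the loop bounds used in the paper's example, and can serve as a reference result when checking the inline-assembly version.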







