Why isn't this C program optimized?
Code (declarations and main added so it compiles as shown):

    #include <stdio.h>

    int main(void)
    {
        int i, j, k;
        double T = 0.0;

        for (j = 1; j <= 3120; j++) {
            for (i = 1; i <= j - 1; i++) {
                T = 0.0;
                for (k = 1; k <= i - 1; k++)
                    T += 0.0;
            }
        }
        printf("T=%f\n", T);
        return 0;
    }
I compile and run this with gcc 3.4.1 (at -O3) on a dual-processor, dual-core AMD Opteron machine running Solaris 10. It takes about 30 seconds. But if I replace

    T += 0.0;

with any of

    T *= 1.0;
    T += 12.345 * 54.321;
    T = 0.0;

it takes about 6 seconds.
1. What prevents gcc from optimizing the T += 0.0 version so that it also runs in 6 seconds?
2. What prevents gcc from optimizing the code so that it takes much less than 6 seconds? For example, with T = 0.0 in the k loop, the compiler should be able to tell that T is always zero without doing any computation. Why can't gcc detect that?
3. If I remove the final printf(), the timings do not change. I thought that without printing T, the computation is irrelevant, because T's value is never used, so gcc should have optimized away the entire computation!