Thread: Why isn't this C program optimized?

  1. #16
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    Quote Originally Posted by CornedBee
    Oh yeah, they just might be volatile

    Why, though? Well, because any good compiler should have good enough flow analysis to do it, of course!
    There's an example in the C99 draft that shows where a floating point number example, quite similar actually to this one, isn't optimized out because it could raise a floating point exception. It has to do with nested loops and adding and the fact that the loop may or may not execute, or some such.

    Anyway, just because it should, doesn't mean it will. The standard doesn't say it must, so if it isn't, that's why. That was kinda my whole point.

    Quzah.
    Hope is the first step on the road to disappointment.

  2. #17
    Registered User
    Join Date
    Jul 2005
    Posts
    98
    Quote Originally Posted by quzah
    There's an example in the C99 draft that shows where a floating point number example, quite similar actually to this one, isn't optimized out because it could raise a floating point exception. It has to do with nested loops and adding and the fact that the loop may or may not execute, or some such.
    It indeed does appear to have something to do with whether the variable in question is an int or a float.
    Code:
    #include <stdlib.h>
    int main(int argc, char *argv[])
    {
      int i,j,k;
      int T;
      int size = atoi(argv[1]);
    
      for (j=1; j<=size; j++) { 
        for (i=1; i<=j-1; i++) { 
          T = 0; 
          for (k=1; k<=i-1; k++) 
            T += 2 * 3; 
        } 
    } 
    printf("T=%d\n", T);
    can be optimzed by lifting the "T += 2.0 * 3.0" out of the innermost loop and it would become
    Code:
    #include <stdlib.h>
    int main(int argc, char *argv[])
    {
      int i,j,k;
      int T;
      int size = atoi(argv[1]);
    
      for (j=1; j<=size; j++) { 
        for (i=1; i<=j-1; i++) { 
          T = 0; 
          T += 2 * 3 * (i-1); 
          for (k=1; k<=i-1; k++) {}
        } 
    } 
    printf("T=%d\n", T);
    Actually both the above programs takes about 6 seconds to run on my computer, when I compile them using gcc -O3 and set size=3120.
    However, if I declare T as a float, then the "unoptimized" program
    Code:
    #include <stdlib.h>
    int main(int argc, char *argv[])
    {
      int i,j,k;
      float T;
      int size = atoi(argv[1]);
    
      for (j=1; j<=size; j++) { 
        for (i=1; i<=j-1; i++) { 
          T = 0.0;   
          for (k=1; k<=i-1; k++) 
              T += 2.0 * 3.0; 
        } 
    } 
    printf("T=%f\n", T);
    takes about 29 seconds, whereas the "optimized" program
    Code:
    #include <stdlib.h>
    int main(int argc, char *argv[])
    {
      int i,j,k;
      float T;
      int size = atoi(argv[1]);
    
      for (j=1; j<=size; j++) { 
        for (i=1; i<=j-1; i++) { 
          T = 0.0; 
          T += 2.0 * 3.0 * (i-1); 
          for (k=1; k<=i-1; k++) {}
        } 
    } 
    printf("T=%f\n", T);
    takes about 6 seconds, with size=3120.
    I read the gcc man page and there is this "-ffast-math" optimization flag that seems to optimize floating point operations at the expense of correctness. However, even with this flag, the "unoptimized" program still takes 29 seconds to run, with size=3120.
    When I remove the printf() statement at the end, the unoptimized program can now run for 6 seconds. But with the removal of printf(), the optimized program still runs for 6 seconds.
    The conclusion thus seems to be: For some reason, gcc is not able to optimize if the variable in question is a float. But if it is an int or the invariant computation is lifted outside the innermost loop by hand, then the running time does not even depend on the "visibility" of the variable (i.e. whether it is printed at the end). This can be illustrated by the following program with
    empty loop body:
    Code:
    #include <stdlib.h>
    int main(int argc, char *argv[])
    {
      int i,j,k;
      int size = atoi(argv[1]);
    
      for (j=1; j<=size; j++) { 
        for (i=1; i<=j-1; i++) { 
           for (k=1; k<=i-1; k++) {}
        } 
    }
    it still takes 6 seconds.
    And interestingly, gcc does not optimize away the entire loop nest. (The running time increases as 'size' increases.) I think that's kind of dumb. Some might have used such a dumb loop to delay the running of a program, as one poster suggests; but with a -O3 flag, I think gcc should optimize it away.
    Unfortunately I am not able to deciper the assembly code. I cannot even understand what the assembly code of a simple program works.
    Code:
    int main(int argc, char *argv[])
    {
      float T;
      T = 2.0 * 3.0;
      return 0;
    }
    gcc -O0 -S 2x3.c -o 2x3.s generates an assembly program (on an IA-32 machine):
    Code:
            .file   "2x3.c"
            .version        "01.01"
            .text
    .globl main
            .type   main, @function
    main:
            pushl   %ebp
            movl    %esp, %ebp
            subl    $8, %esp
            andl    $-16, %esp
            movl    $0, %eax
            addl    $15, %eax
            addl    $15, %eax
            shrl    $4, %eax
            sall    $4, %eax
            subl    %eax, %esp
            movl    $0x40c00000, %eax
            movl    %eax, -4(%ebp)
            movl    $0, %eax
            leave
            ret
            .size   main, .-main
            .ident  "GCC: (GNU) 3.4.1"
    It really beats me why 15 is added to the register eax twice and then right shift once and left shift once. I am guessing these 4 arithmetic ops are what T = 2.0*3.0 is translated into. But when I write a T = 2.0*5.0 program, everything is unchanged except that $0x40c00000 becomes $0x41200000. So it seems "movl $0x40c00000, %eax" is really the computation. But how come $0x40c00000 is 6.0?! I am totally confused.
    Quote Originally Posted by quzah
    Anyway, just because it should, doesn't mean it will. The standard doesn't say it must, so if it isn't, that's why. That was kinda my whole point.
    I think it does not have much to do with the C Standard. Most likely a language standard would not say whether certain things should be or must be optimized. It is up to the compiler. Therefore if gcc says something should be optimized in a certain way when the -O3 flag, say for example, or a specific optimization flag, say "-ffast-math", is used, then it should. If not, it violates the gcc spec (and not the C Standard). Unfortunately, there is not much detailed documentation about gcc optimization. For example, gcc manual does not say what kind of optimzation flag, if any, should be specified when the lifting of an invariant computation out of a loop is desired. Nor does it say what flag should be specified in order to optimize away the computation of an "invisible" variable. I think these optimization techniques are quite basic and any reasonable compiler should have them; but again, very unfortunately, gcc simply fails to document them and include them in the -O3 option.

  3. #18
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    Quote Originally Posted by hzmonte
    I think it does not have much to do with the C Standard. Most likely a language standard would not say whether certain things should be or must be optimized.
    I had a rather lengthy post typed up here, regarding the mention of optimization in the standard, until I remembered I just don't give a ........ if you believe me or not.


    Quzah.
    Hope is the first step on the road to disappointment.

  4. #19
    Registered User Rennor's Avatar
    Join Date
    Aug 2006
    Location
    Finland
    Posts
    45
    Quote Originally Posted by quzah
    I had a rather lengthy post typed up here, regarding the mention of optimization in the standard, until I remembered I just don't give a ........ if you believe me or not.
    You made me burst (in laughter) and spill coffee (luckily I am experienced coffee-keyboard user and no damage occured) all over. Luckily our secretary gave me clean t-shirt from our storage room so I dont have to walk around the office in white shirt with spilled coffee on it.

    My colleagues looked at me weirdly (with secret understanding).

    Sorry for OT.

  5. #20
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    Quote Originally Posted by Rennor
    You made me burst (in laughter) and spill coffee (luckily I am experienced coffee-keyboard user and no damage occured) all over. Luckily our secretary gave me clean t-shirt from our storage room so I dont have to walk around the office in white shirt with spilled coffee on it.

    My colleagues looked at me weirdly (with secret understanding).

    Sorry for OT.
    Glad to be of service.


    Quzah.
    Hope is the first step on the road to disappointment.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Issue with program that's calling a function and has a loop
    By tigerfansince84 in forum C++ Programming
    Replies: 9
    Last Post: 11-12-2008, 01:38 PM
  2. Need help with a program, theres something in it for you
    By engstudent363 in forum C Programming
    Replies: 1
    Last Post: 02-29-2008, 01:41 PM
  3. Replies: 4
    Last Post: 02-21-2008, 10:39 AM
  4. My program, anyhelp
    By @licomb in forum C Programming
    Replies: 14
    Last Post: 08-14-2001, 10:04 PM