Function calls and performance

**koplersky** · 12-10-2012

Hello there,

So I was reading about inline functions and I came up with a doubt.

If function calls tends to be slower than loops in C, what are the advantages of using recursive functions? I mean, if I have the option to choose between using a recursive function and loops, shouldn't I always prefer loops?

Also, what is your criteria when deciding to split a function? Is it just a question of keeping different things in different functions or are there some "rules"? Or is it just "this function is too big, let me split it"?

I really appreciate any advice.

Regards,
koplersky

**Adak** · 12-10-2012

Inline functions are not slower than loops - they become part and parcel of the calling function. The advantages of recursive functions is the stack that keeps track of just where you are at any time - going deeper with another call, or popping back out with a return.

Quicksort is a great example - it can be done iteratively of course, but it's a mess to code it up, and not so pretty, if I do say so. Recursively, it's VERY clean - and some recursive versions of it are even faster than the iterative version. Which is better, depends on exactly what you are doing. I'm not a big fan of recursive functions, but for Quicksort, it's a winner.

I like to keep the functions doing one thing - say, getting input. Normally, that would include parsing it - but if the input is complex, and the parsing is too, then making another function just for parsing, seems right. Size is one factor, complexity is another. I want to finish whatever the function was supposed to do, (encapsulating it in that one function), but if I can't easily grasp all the logic in the function readily, then it's time to think about splitting it up.

**koplersky** · 12-10-2012

Thanks for the answer, Adak.

Originally Posted by Adak

Inline functions are not slower than loops - they become part and parcel of the calling function.

Yes, what I read was this:

The point of making a function inline is to hint to the compiler that it is worth making some form of extra effort to call the function faster than it would otherwise - generally by substituting the code of the function into its caller. As well as eliminating the need for a call and return sequence, it might allow the compiler to perform certain optimizations between the bodies of both functions.

That's what got me confused. I mean, inline functions eliminate the need of this call and return sequence, so keep calling functions might be in some case slower than using loops, right? So why would one prefer to use recursion? Except that it might improve code readability, isn't there a case where this way of thinking (always improve code readability and thus using recursive function) might make the program run slower?

Originally Posted by Adak

and some recursive versions of it are even faster than the iterative version.

Could you please provide me an example?

Originally Posted by Adak

I like to keep the functions doing one thing - say, getting input. Normally, that would include parsing it - but if the input is complex, and the parsing is too, then making another function just for parsing, seems right. Size is one factor, complexity is another. I want to finish whatever the function was supposed to do, (encapsulating it in that one function), but if I can't easily grasp all the logic in the function readily, then it's time to think about splitting it up.

So there is no rule to when split a function? We could say that it's a matter of taste, always aiming at improve code readability and letting each function doing one job?

**laserlight** · 12-10-2012

Originally Posted by koplersky

So why would one prefer to use recursion? Except that it might improve code readability, isn't there a case where this way of thinking (always improve code readability and thus using recursive function) might make the program run slower?

If the difference to the efficiency is negligible and the other pitfalls of recursion do not apply (e.g., limited stack space), whereas the improvement to readability and/or the reduced time needed to code and debug is significant, then why not use a recursive solution?

On the other hand, if the problem lends itself well to iteration, why not use an iterative solution?

Originally Posted by koplersky

So there is no rule to when split a function? We could say that it's a matter of taste, always aiming at improve code readability and letting each function doing one job?

Yes, there is always some kind of judgment call to be made.

**koplersky** · 12-10-2012

Thank you laserlight, straight to the point as always.

**iMalc** · 12-11-2012

In general a lot of people will suggest to perfer the recursive version because it is cleaner, and many will tell you to prefer the iterative version because it may be faster and cant suffer from stack overflow issues, and really they are both right. My take on it is that you should probably start by writing the recursive version, but also try to write the iterative version if for no reason other than to learn how to translate from one to the other. Being able to convert between them is a good skill to have!

The best, and perhaps most surprising example of the performance of iterative vs recursive that I ever came across, was for Bitonic Sort.
Basically it can cleverly be done entirely iteratively with no need to maintain an explicit stack, which amounts to nested loops performing the sort in a breadth-first manner,
Or it can be done recursively, with a doubly-recursive function that calls another function which may also be doubly-recrursive. This then causes the sort to be performed in a totally depth-first manner.
Or finally, one can create various hybrid approaches with differing amounts of recursion and iteration, where the recursive parts can either be swapped out for the iterative counterparts that perform the sort in a different order, or they can be translated into an iterative form directly, possibly using an explicit stack.
It turned out in my case that one of the hybrids I created consistently beat both the fully iterative and fully recursive methods. It's not strictly about iteration vs recursion because the order it progresses for each part is different, but it was very interesting nevertheless.

**Click_here** · 12-11-2012

It really depends on what you are doing.

A good example of where recursion may look easier, but not be ideal (given in "C Unleashed") is the Fibonacci Sequence.

Code:

int fib(int x)
{
  if((x == 1) || (x == 2))
  {
    return 1;
  }
  
  return fib(x-2) + fib(x-1);

}

For this to work on on fib(9), it first calls fib(7), which calls fib(6), and then fib(5)... After that it then calls fib(8), which goes through each calculation again. This can cost a lot of time with larger numbers.

Another way of approaching this problem would be to not use recursion -

Code:

int fib(int x)
{
  int i, a[3];

  if(x<3) return 1;

  a[1] = a[0] = 1;
  for(i=2; i<x; ++i)
  {
    a[2] = a[0] + a[1];
    a[0] = a[1];
    a[1] = a[2];
  }

  return a[2];
}

The performance changed from geometric to linear.

There are other things, like a binary search - Using recursion to create a VERY efficient search algorithm.

I suppose that the message that I got from that section of the book was to not use recursion for recursion's sake.

[edit]
PS - Thanks for recommending that book to me Salem, it's been great. It covers advanced techniques in C programming, which I couldn't recommend more.

For others
Heathfield, R., Kirby, L., et al. (2000), "C Unleashed", Sams Publishing, Dallas. TX
http://www.amazon.com/C-Unleashed-Ri.../dp/0672318962
[/edit]

**koplersky** · 12-11-2012

The way I understood it was that calling functions would make the program run slower, that's why recursion came to my mind.

Click_here this example is indeed interesting, because when I first learned recursion teachers used a recursive version of the Fibonacci Sequence to explain why it's good. So it's just easier and cleaner.

Thank you all for the answers.

**laserlight** · 12-11-2012

Originally Posted by koplersky

Click_here this example is indeed interesting, because when I first learned recursion teachers used a recursive version of the Fibonacci Sequence to explain why it's good. So it's just easier and cleaner.

It also happens to be the first problem to which I saw memoization applied

**koplersky** · 12-11-2012

Originally Posted by laserlight

It also happens to be the first problem to which I saw memoization applied

And from this casual comment I learned a new technique, or at least the concept of something I had already done.

**Adak** · 12-11-2012

The sorting times for the iterative Quicksort version are due to it's silly choice of the first element of the array/sub-array, as the pivot, and should be ignored.

The difficultly of handling the stack explicitly is clear, however.

Code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define INSERT 64
#define SIZE   100000

void binaryInsertion(void);
void bubbleSort(void);
int checkIt(void);
void insertionSort(int lo, int hi); 
void middle1(int lo, int hi);
void quicksortOpt(int lo, int hi);
void quicksortIter(void);
void quicksortRecursive(int lo, int hi);
void fillArray(void);
void shellSort(void);
void selectionSort(void);
void swap(int i, int j);   

int A[SIZE]={0};
unsigned long long int SWAPS=0;
int TYPE=0; 

int main(void) {
   int i,j;
   char type[3][11]={{"random"},{"ascending"},{"descending"}};
   clock_t start,stop;

   srand(time(NULL));
   for(j=0;j<3;j++) {
      TYPE=j;
      fillArray();
      printf("\nSorting %d integers, initially in %s order. The first 10 numbers:\n\n",SIZE,type[TYPE]);
      for(i=0;i<10;i++)  printf("%6d  ",A[i]); 
      printf("\n");
/*
      SWAPS=0;
      start=clock();
      binaryInsertion();
      stop=clock();
      printf("    Binary Insertion: %15llu swaps in %10.4f seconds\n",SWAPS,(double)(stop-start)/CLOCKS_PER_SEC);
      if(checkIt());
*/   
/*
      SWAPS=0;
      start=clock();
      bubbleSort();
      stop=clock();
      printf("         Bubble sort: %15llu swaps in %10.4f seconds\n",SWAPS,(double)(stop-start)/CLOCKS_PER_SEC);
      if(checkIt());

      SWAPS=0;
      fillArray();
      start=clock();
      middle1(0,SIZE);
      stop=clock();
      printf("        Middle1 sort: %15llu swaps in %10.4f seconds\n",SWAPS,(double)(stop-start)/CLOCKS_PER_SEC);
      if(checkIt());

      SWAPS=0;
      fillArray();
      start=clock();
      insertionSort(0,SIZE);
      stop=clock();
      printf("      Insertion sort: %15llu swaps in %10.4f seconds\n",SWAPS,(double)(stop-start)/CLOCKS_PER_SEC);
      if(checkIt());
*/
      SWAPS=0;
      fillArray();
      start=clock();
      quicksortIter();
      stop=clock();
      printf(" Iterative Quicksort: %15llu swaps in %10.4f seconds\n",SWAPS,(double)(stop-start)/CLOCKS_PER_SEC);
      if(checkIt());


      SWAPS=0;
      fillArray();
      start=clock();
      quicksortOpt(0,SIZE-1);
      stop=clock();
      printf(" Optimized Quicksort: %15llu swaps in %10.4f seconds\n",SWAPS,(double)(stop-start)/CLOCKS_PER_SEC);
      if(checkIt());

      SWAPS=0;
      fillArray();
      start=clock();
      quicksortRecursive(0,SIZE-1);
      stop=clock();
      printf(" Recursive Quicksort: %15llu swaps in %10.4f seconds\n",SWAPS,(double)(stop-start)/CLOCKS_PER_SEC);
      if(checkIt());

/*
      SWAPS=0;
      fillArray();
      start=clock(); 
      selectionSort();
      stop=clock();
      printf("      Selection sort: %15llu swaps in %10.4f seconds\n",SWAPS,(double)(stop-start)/CLOCKS_PER_SEC);
      if(checkIt()); 
*/      
      
      SWAPS=0;
      fillArray();
      start=clock(); 
      shellSort();
      stop=clock();
      printf("          Shell sort: %15llu swaps in %10.4f seconds\n\n",SWAPS,(double)(stop-start)/CLOCKS_PER_SEC);
      if(checkIt()); 
   } 

   for(i=0;i<10;i++)
      printf("%6d  ",A[i]);
   
   
   printf("\n");
   return 0;
}
//Binary Insertion Sort  Has a subtle error in it - sorts correctly
//but overwrites SWAPS++ for Bubble sort. 
void binaryInsertion(void)
{
    register int i, m;
    int n=SIZE,hi, lo, tmp;

    for (i = 1; i < n; i++) {
        lo = 0, hi = i;
        m = i / 2;

        do {
            if (A[i] > A[m]) {
                lo = m + 1;
            } else if (A[i] < A[m]) {
                hi = m;
            } else
                break;

            m = lo + ((hi - lo) / 2);
        } while (lo < hi);
 
        if (m < i) {
            tmp = A[i];
            memmove (A + m + 1, A + m, sizeof (int) * (i - m));
            A[m] = tmp;
            SWAPS++;
        }
    }
}
void bubbleSort() {   
   int i, n, swapped;
   n = SIZE-1;
   do {
      swapped = 0;
      for(i =0; i < n; i++) {  
         if(A[i] > A[i + 1 ]) {
            swap(i, i + 1);
            ++SWAPS; 
            swapped = 1;
         }
      }
      --n;
   }while(swapped);
}
void middle1(int lo, int hi) {
   int i,j,temp;
   for(i=hi-1;i>-1;i--) {      
      for(j=0;j<i;j++) {
         if(A[j] > A[i]) {
            temp=A[j];
            A[j]=A[i];
            A[i]=temp;
            ++SWAPS;
         }
      }
   }
}
void insertionSort(int lo, int hi) {
   int i, j, val; 
   for(i=lo+1;i<hi;i++) {  
      val = A[i];
      j = i-1;
      while(A[j]>val) {
         A[j + 1] = A[j];
         ++SWAPS;
         --j;
         if(j<0) 
            break;
      }   
      A[j+1] = val;
   }
}
  //Quicksort, Iterative
void quicksortIter(void) {
  int I, J, left, right, stack, pivot,maxStack=0;  
  int ST[64] = { 0 }; //uses ~20 with N=25,000 and range of 0-999
  left = stack = 0; right = SIZE-1; 
                                                
  while(1) {
    while(left >= right) {         //better with the >=
      if(stack != 0) {             
          right = ST[stack];         
          left = ST[stack - 1]; 
          stack -= 2;
        }
        else { //if T0 == 0, we're done sorting
          printf("maxstack: %d \n",maxStack); 
          return;    //not break 
        }
      }
      pivot = A[left]; 
      I = left+1;    //forward loop counter 
      J = right+1;   //backward loop counter 
     
    
      /* these iterators are touchy */                 
      do {
        while(A[--J] > pivot);      
        while(A[I] < pivot) ++I;  

        if(J > I) {
          swap(I, J);
          ++SWAPS;
        }
      }while(J > I);

      A[left] = A[J];
      A[J] = pivot;
                                    
      if((J - left) < (right - J))  //take the smallest sub array first
        ST[stack + 1] = J + 1;
                                          
      if(ST[stack + 1] == J + 1) { 
        ST[stack + 2] = right; 
        right = J - 1; 
      }
      else {
        ST[stack + 1] = left; 
        ST[stack + 2] = J - 1; 
        left = J + 1;
      }
      stack = stack + 2;                      //push onto ST[]
      if(stack > maxStack) maxStack = stack;  //debug aid 
                                          
    }
    
}
//Optimized Quicksort
void quicksortOpt(int lo, int hi) {
   int i, j, pivot;
   if(lo == hi) return; 

  if(hi - lo < INSERT) {  //The optimizer. Value of
    insertionSort(lo, hi+1);           //INSERT will vary 
    return;
  }

   i=lo; 
   j=hi;
   pivot= A[(lo+hi)/2]; 

  /* Split the array into two parts */
   do {    
      while (A[i] < pivot) i++; 
      while (A[j] > pivot) j--;
      if (i<=j) {  
         swap(i, j);   //this line works fine.
         SWAPS++;
         i++;
         j--;
      }
   } while(i<=j);
  
   if (lo < j) quicksortOpt(lo, j);
   if (i < hi) quicksortOpt(i, hi);
}

//Good but unoptimized, Quicksort
void quicksortRecursive(int lo, int hi) {
   int i, j, pivot;
   if(lo == hi) return; 
   i=lo; 
   j=hi;
   pivot= A[(lo+hi)/2]; 

  /* Split the array into two parts */
   do {    
      while (A[i] < pivot) i++; 
      while (A[j] > pivot) j--;
      if (i<=j) {  
         swap(i, j);   //this line works fine.
         SWAPS++;
         i++;
         j--;
      }
   } while(i<=j);
  
   if (lo < j) quicksortOpt(lo, j);
   if (i < hi) quicksortOpt(i, hi);
}
void selectionSort()
{
   int i,j,n=SIZE,min;
    
   for(i = 0; i < n-1; i++) {
      min = i;                 //set min to the firt index  
      for(j = i+1; j < n; j++) {
         if (A[j] < A[min]) {
            min = j;
         }
      }
 
      if( min != i ) {
         swap(i, min);
         SWAPS++;
      }
   }
}

//Shell sort
void shellSort(void) {
  int i, j, gap, temp;
  for(gap = SIZE/2; gap > 0; gap /= 2) {
    for(i = gap; i < SIZE; i++) {
      for(j = i - gap; j >= 0 && A[j] > A[j + gap]; j -= gap) {
	      ++SWAPS;
         temp = A[j];
	      A[j] = A[j + gap];
        A[j + gap] = temp;
      }
    }
  }
}
void swap(int i, int j) {    
   int temp;
   temp = A[i];
   A[i] = A[j];
   A[j] = temp;
}
void fillArray(void) {
   int i,j;
   if(TYPE==0) {         //random order
      for(i = 0; i < SIZE; i++)
         A[i] = (rand() % 10000)+1;
   }else if(TYPE==1) {   //ascending order
      for(i=0;i<SIZE;i++)
          A[i]=i;
   }else if(TYPE==2) {   //descending order
      for(j=0,i=SIZE-1;i>=0;i--,j++)
         A[j]=i;
   }
}
int checkIt(void) {
   int i;
   for(i=0;i<SIZE-1;i++) {
      if(A[i]>A[i+1]) {
         printf("Error! Not sorted in ascending order.\n");
         return 1;
      }
   }
   return 0;
}

**c99tutorial** · 12-11-2012

Personally, I prefer the iterative version of the fibonacci function, because it exactly matches the mathematical definition.

If you're really concerned with speed, though, you should focus on algorithms rather than "optimizations". Even with the best optimizations, for example even if writing something with a for loop as opposed to recursive calls turns out to be 10 times faster, all this ever does is give you (slightly) scaled-up performance. What does this mean in terms of the algorithm? It means if your algorithm takes n operations to complete, then even the fastest, heavily-optimized version will still take c * n operations, where c < 1. Notice how this is true for both versions of fibonacci, above. It doesn't matter if you do it recursively or with for loops. And it still doesn't matter even if you apply memoization.

If you find a better algorithm to solve the problem at hand this is much more meaningful for performance. For example, there is a closed form expression for the fibonacci numbers. What does this mean? It means you can calculate any fibo(n) in only c operations, where c is a constant number of operations to apply the specific formula involved, some of which are exponentiations (which one might think are more expensive than a bunch of additions). Given this approach, the fibo function will now take only c steps to complete, no matter if you do something like fibo(100000) or fibo(7). And then if you're STILL worried that fibo(n) is doing wasteful exponentiations for small n, then just use memoization for the smallest or most commonly requested fibonacci numbers - i.e. just return a pre-computed fibo(n) for, say, the first 100 fibonacci numbers.

**koplersky** · 12-11-2012

I see, so the difference in performance is not that much.

Thank you for the answers, and specially Adak for the fine sample code.

**cas** · 12-11-2012

I see, so the difference in performance is not that much.

If you mean the difference between iterative and recursive, the performance difference probably isn't an issue. However, the fact that the stack on many systems is rather limited, combined with the fact that C does not mandate tail-call elimination, means that recursion is usually a bad idea unless you know the function will recurse a limited number of times.

Functional languages (e.g. Erlang, Haskell, etc.), which do not have looping constructs, require recursion, but do tend to require tail-call elimination where possible, so recursion can be safe.

**laserlight** · 12-11-2012

Originally Posted by c99tutorial

Personally, I prefer the iterative version of the fibonacci function, because it exactly matches the mathematical definition.

Eh, the recursive version exactly matches the mathematical definition since expressing it as a recurrence relation is err... mathematical.

Originally Posted by c99tutorial

Notice how this is true for both versions of fibonacci, above. It doesn't matter if you do it recursively or with for loops. And it still doesn't matter even if you apply memoization.

It is not true for both versions of Fibonacci presented above: the iterative version and the recursive version with memoization both run in linear time, but Click_here's point was that that is not the case for the recursive version without memoization. The thing is, applying memoization changes the algorithm: you're doing table lookups instead of continually re-computing.

Originally Posted by c99tutorial

For example, there is a closed form expression for the fibonacci numbers. What does this mean? It means you can calculate any fibo(n) in only c operations, where c is a constant number of operations to apply the specific formula involved, some of which are exponentiations (which one might think are more expensive than a bunch of additions). Given this approach, the fibo function will now take only c steps to complete, no matter if you do something like fibo(100000) or fibo(7).

Heh, I once asserted the same thing to a prof of mine, who then pointed out that the exponentation/square root calculations run in logarithmic time (if I remember correctly). So, it is not really c steps to complete (i.e., constant time), but it certainly is asymptotically faster than linear time.

Thread: Function calls and performance

Thread Tools

Search Thread

Display

Function calls and performance

Similar Threads

pointers and function calls inside a function

Memory allocated to variables of a function on different function calls

Performance and footprint of virtual function

Performance Timing Function

Function calls