# Thread: For Loops: Which Method Is Better?

1. ## For Loops: Which Method Is Better?

Hello! While writing some code recently, I came up with two different ways of writing for loops/statements.

Method 1:
Code:
for (int iii = 0; iii < 5; iii++)
{
    SomeStatement;
    SomeOtherStatement;
    AnotherStatement;
}
Method 2:
Code:
for (int iii = 0; iii < 5; iii++)
    SomeStatement;

for (int iii = 0; iii < 5; iii++)
    SomeOtherStatement;

for (int iii = 0; iii < 5; iii++)
    AnotherStatement;
As far as I can tell both methods are equivalent (please correct me if I'm wrong)

Assuming both do the exact same thing, which would generally be preferred? Thanks!

2. Originally Posted by Vekta
As far as I can tell both methods are equivalent (please correct me if I'm wrong)
Ah, but they are not. It is the difference between:
Code:
ABC
ABC
ABC
ABC
ABC
and
Code:
AAAAA
BBBBB
CCCCC

3. Originally Posted by Vekta
Assuming both do the exact same thing, which would generally be preferred?
It is as laserlight pointed out. The two don't really work in the same way.

Method 1:
Code:
for (int iii = 0; iii < 5; ++iii)
{
    cout << "A";
    cout << "B";
    cout << "C";
}

Method 2:
Code:
for (int iii = 0; iii < 5; ++iii)
    cout << "A";

for (int iii = 0; iii < 5; ++iii)
    cout << "B";

for (int iii = 0; iii < 5; ++iii)
    cout << "C";

But let's say they do indeed do the same thing (not the above example, but your actual statements). Then the obvious choice would be method 1, since it iterates just 5 times, whereas your second method iterates 15 times to achieve the same thing.

4. Method 1 is better, because you change the index fewer times, and because the loop body is bigger, which means the CPU can do more work before each branch. (A branch can slow down other instructions, unless the next instruction does not depend on the branch or the CPU correctly guesses whether the branch is taken.) It only matters if the three operations are very simple, however.

Mind, this would qualify as a micro-optimisation, and is not worth bothering with unless you've determined that particular sequence of statements to be performance critical.
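
A rough sketch of how one might check whether the difference matters at all, assuming a made-up workload where the three statements are independent element-wise updates (so both methods really do produce the same result). The names `fused`, `split`, and `time_us` are hypothetical and only serve the illustration; a single timed call like this is noisy, and an optimising compiler may well generate near-identical code for both versions.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Method 1: one loop, three statements per iteration.
void fused(std::vector<int>& v) {
    for (std::size_t i = 0; i < v.size(); ++i) {
        v[i] += 1;
        v[i] *= 2;
        v[i] -= 3;
    }
}

// Method 2: three loops, one statement each.
// Equivalent here only because each element is updated independently.
void split(std::vector<int>& v) {
    for (std::size_t i = 0; i < v.size(); ++i) v[i] += 1;
    for (std::size_t i = 0; i < v.size(); ++i) v[i] *= 2;
    for (std::size_t i = 0; i < v.size(); ++i) v[i] -= 3;
}

// Times a single call of f on a private copy of the data (microseconds).
template <typename F>
long long time_us(F f, std::vector<int> data) {
    auto start = std::chrono::steady_clock::now();
    f(data);
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

Calling `time_us(fused, data)` and `time_us(split, data)` on the same large vector, several times each, gives a first impression; a profiler on the real program is still the better judge.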

5. Originally Posted by King Mir
...which means the cpu can do more things before each branch. (Branches can slow down other instructions, unless the next instruction does not depend on the branch, or it can guess correctly if the branch is taken.)...
This would usually be optimized by the compiler by unrolling the loop, though, and on today's out-of-order (OOO) CPUs, the CPU would probably just keep feeding instructions from subsequent iterations of the loop into the pipeline, limiting the performance hit.
Also, on today's hardware, branches typically only slow down the CPU if they are mispredicted.
Nevertheless, I agree that it is a micro-optimization.

6. What?
How can it be micro/premature optimization to write similar iterations within the same loop?
It is as natural as, say, annotating while reading, not afterwards!

Or did you two mean something else?

7. I think they are talking about manual loop unrolling. Other than that, it is nonsense to say that method 1 is better than method 2 because they don't do the same thing.

8. I remember getting into an argument with Mario F. about for loops once, and it was nothing like this. I think the conclusion will be the same, though: the better way of writing a for loop is the one that works. For loops are so simple that if you can save time by doing something extra in the initializing part of the loop, and by using ++i instead of i++, it won't hurt.

9. Originally Posted by Elysia
This would usually be optimized by the compiler by unrolling the loop, though, and on today's OOO CPUs, the cpu would probably just continue to stuff down more and more instructions down the pipeline from subsequent iterations of the loop, limiting the performance hit.
Also, on today's hardware, branches typically only slow down the cpu if they are mispredicted.
Nevertheless, I agree that it is a micro-optimization.
Unrolling the loop wouldn't do much good if there's a data dependency with each iteration of the loop. So using 1 loop instead of 3 gives the compiler and CPU more opportunity for such optimisations.

10. Here's an example of what I'm talking about:
This code:
Code:
// SIZE is assumed to be a constant defined elsewhere.
int foo(int array1[], int array2[], int array3[]) {
    int product = 1;
    for (int i = 0; i < SIZE; i++)
        product *= array1[i];

    for (int i = 0; i < SIZE; i++)
        array3[i] = array1[i] + array2[i];

    return product;
}
could be better written like this:
Code:
int foo(int array1[], int array2[], int array3[]) {
    int product = 1;
    for (int i = 0; i < SIZE; i++) {
        product *= array1[i];
        array3[i] = array1[i] + array2[i];
    }
    return product;
}
The second will perform better, unless the compiler can itself realize that the two are functionally equivalent. I don't think most can.

11. Originally Posted by King Mir
The second will perform better, unless the compiler can itself realize that the two are functionally equivalent. I don't think most can.
Because of possible aliasing, the two are not necessarily functionally equivalent, e.g.,
Code:
#include <iostream>

int foo1(int array1[], int array2[], int array3[], int size) {
    int product = 1;
    for (int i = 0; i < size; i++)
        product *= array1[i];

    for (int i = 0; i < size; i++)
        array3[i] = array1[i] + array2[i];

    return product;
}

int foo2(int array1[], int array2[], int array3[], int size) {
    int product = 1;
    for (int i = 0; i < size; i++) {
        product *= array1[i];
        array3[i] = array1[i] + array2[i];
    }
    return product;
}

void print(int numbers[], int size)
{
    for (int i = 0; i < size; ++i)
    {
        std::cout << numbers[i] << " ";
    }
    std::cout << std::endl;
}

int main()
{
    {
        int x[] = {1, 2, 3, 4, 5};
        int y[] = {5, 6, 7, 8};
        // array3 overlaps array1: it starts one element into x
        int result = foo1(x, y, x + 1, 4);
        std::cout << "result = " << result << std::endl;
        print(x, 5);
        print(y, 4);
    }

    {
        int x[] = {1, 2, 3, 4, 5};
        int y[] = {5, 6, 7, 8};
        int result = foo2(x, y, x + 1, 4);
        std::cout << "result = " << result << std::endl;
        print(x, 5);
        print(y, 4);
    }
}
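
Tracing the program above by hand: foo1 finishes the product before any write happens, so it returns 1*2*3*4 = 24, while foo2 reads elements of x that earlier iterations have already overwritten and returns 1*6*12*19 = 1368. Both leave x as 1 6 12 19 27. A compact sketch of the same check follows; the function names here are made up for the illustration.

```cpp
// Same logic as foo1: finish the product first, then write the sums.
int product_then_sum(const int* a, const int* b, int* out, int size) {
    int product = 1;
    for (int i = 0; i < size; i++)
        product *= a[i];
    for (int i = 0; i < size; i++)
        out[i] = a[i] + b[i];
    return product;
}

// Same logic as foo2: multiply and write inside one loop.
int product_and_sum(const int* a, const int* b, int* out, int size) {
    int product = 1;
    for (int i = 0; i < size; i++) {
        product *= a[i];
        out[i] = a[i] + b[i];  // may overwrite a[i + 1] when out overlaps a
    }
    return product;
}

// Returns true when the overlapping call makes the two versions disagree.
bool aliasing_changes_result() {
    int x1[] = {1, 2, 3, 4, 5};
    int x2[] = {1, 2, 3, 4, 5};
    int y[] = {5, 6, 7, 8};
    int separate = product_then_sum(x1, y, x1 + 1, 4);
    int fused = product_and_sum(x2, y, x2 + 1, 4);
    return separate != fused;
}
```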

12. Originally Posted by laserlight
Because of possible aliasing, the two are not necessarily functionally equivalent, e.g.,
That's right, but in a given context that the programmer is aware of, they could be. So the compiler can't do the optimisation, because of the example you provide, and the CPU can't do the necessary code reshuffling, because it can't see far enough ahead to see the second loop.
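
One way a programmer who knows the arrays never overlap can pass that knowledge on is the `__restrict` qualifier. This is only a sketch: `__restrict` is a compiler extension (GCC, Clang, and MSVC accept it), not standard C++, the name `foo_no_alias` is made up here, and whether a given compiler actually merges the loops because of it is not guaranteed.

```cpp
// The two-loop version, with a promise that the three arrays do not overlap.
// Calling it with overlapping arrays (e.g. array3 == array1 + 1) breaks that
// promise and is undefined behaviour.
int foo_no_alias(const int* __restrict array1,
                 const int* __restrict array2,
                 int* __restrict array3,
                 int size) {
    int product = 1;
    for (int i = 0; i < size; i++)
        product *= array1[i];

    for (int i = 0; i < size; i++)
        array3[i] = array1[i] + array2[i];

    return product;
}
```

Writing the single fused loop by hand, as in the earlier post, needs no extension and makes the same assumption explicit in the code itself.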

13. Originally Posted by King Mir
Unrolling the loop wouldn't do much good if there's a data dependency with each iteration of the loop. So using 1 loop instead of 3 gives the compiler and CPU more opportunity for such optimisations.
That depends on what the loop looks like and on the type of hardware.
A modern OOO CPU will likely just keep looping while feeding ops into the pipeline, so it essentially "unrolls the loop" itself.
If there are too many data dependencies or too many loops (as in your example), then you'll have problems, though.
I'm just speaking from a theoretical perspective, though. I can't say from a practical one, as I am not well versed in compiler technologies.

14. Go with method 1 where it applies, since it should perform better than method 2, but the choice depends on the contents of the loop body. If the three statements depend on each other, go with method 1. If one statement has to be executed n times before the next statement runs at all, go with method 2. So we can't say method 1 = method 2 until the thread starter says what each statement is...

15. Screw performance. Do the thing that makes the code readable. You optimize only when the program isn't fast enough and profiling has shown you where the slow parts are.

In King Mir's example, you have one function doing two very different things (getting the product of an array and getting elementwise sums of two arrays), so using two loops, and two separate functions in fact, is better. And while you're at it, get rid of those loops entirely and use algorithms.

Code:
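#include <algorithm>   // std::transform
#include <functional>  // std::multiplies, std::plus
#include <numeric>     // std::accumulate

int product(int array[], int size) {
    return std::accumulate(array, array + size, 1, std::multiplies<int>());
}

void sum_elements(int input1[], int input2[], int output[], int size) {
    std::transform(input1, input1 + size, input2, output, std::plus<int>());
}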
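
The same two algorithms also work on standard containers. A small sketch with `std::vector`, where the names `product_of` and `elementwise_sum` are made up for the illustration and the second function assumes `b` has at least as many elements as `a`:

```cpp
#include <algorithm>   // std::transform
#include <functional>  // std::multiplies, std::plus
#include <numeric>     // std::accumulate
#include <vector>

// Product of all elements; an empty vector yields the initial value 1.
int product_of(const std::vector<int>& values) {
    return std::accumulate(values.begin(), values.end(), 1,
                           std::multiplies<int>());
}

// Element-wise a[i] + b[i]. Assumes b.size() >= a.size().
std::vector<int> elementwise_sum(const std::vector<int>& a,
                                 const std::vector<int>& b) {
    std::vector<int> out(a.size());
    std::transform(a.begin(), a.end(), b.begin(), out.begin(),
                   std::plus<int>());
    return out;
}
```

Because the sum is returned in a fresh vector, the output cannot overlap the inputs, which sidesteps the aliasing question raised earlier in the thread.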