# Better algorithm for this ? (generating combinations)

This is a discussion on Better algorithm for this ? (generating combinations) within the C++ Programming forums, part of the General Programming Boards category; The problem is to generate all possible unique selections of n numbers from a given set of size > n. ...

1. ## Better algorithm for this ? (generating combinations)

The problem is to generate all possible unique selections of n numbers from a given set of size > n.
I've pretty much done it by brute force, as the number of final results should be same whichever way I choose. If a better way is not possible, I'd like suggestions about simple optimizations.
Code:
```std::deque<std::set<int>> find_combinations(const std::set<int>& U,unsigned int n)
{
std::deque<std::set<int>> result;
if(n<1 || U.size()<n)//Nothing Happens Here !
{}
else if(n==1) // The Universal input set is broken up and redistributed.
{
for(auto x:U)
result.push_back(std::set<int>({x}));
}
else // Recursion FTW, no comments !!
{
std::set<int> temp(U);
for(auto x:U)
{
temp.erase(x);
auto partial_result = find_combinations(temp,n-1);
for(auto& y:partial_result)
y.insert(x);
result.insert
(
std::begin(result),
std::begin(partial_result),
std::end(partial_result)
);
}
}
return result;
}```
I noticed that having the return container a deque makes this much faster than if a vector or list is used. Why ?
Also, is it worth the trouble trying to do it iteratively?
Finally, can anyone help me determining the complexity of this function ?
(I think it is something near (log base n U.size() )^ n) but I'm probably very very wrong about that.)

Btw, here is a main you can use to test it.
Code:
```int main(int argc,char** argv)
{
std::set<int> input;
for(int i=0;i<atoi(argv[1]);i++)
input.insert(i);

auto foo = find_combinations(input,atoi(argv[2]));

//     for(auto x:foo) //displaying results
//     {
//         for(auto y:x)
//             std::cout<<y<<'\t';
//         std::cout<<std::endl;
//     }
}```

2. Also, is it worth the trouble trying to do it iteratively?
Iterative processing almost always wins over collection processing, but what happens behind the scenes isn't relevant to a client.

It is only worth it if you are in a situation where you need to preserve stack space or coding for a standard that prevents recursion of any type.

Soma

3. Originally Posted by phantomotap
It is only worth it if you are in a situation where you need to preserve stack space or coding for a standard that prevents recursion of any type.
With move constructors in work here, I'd say that the stack issue is absent.
(Other than making the temp set in line#13, but it'd be same if done with loops.)

4. With move constructors in work here, I'd say that the stack issue is absent.
Move constructors have nothing to do with stack space unless the allocator storage lives on the stack; the data for the `std::set<???>' and `std::deque<???>' live on the heap.

The overhead for saving registers is probably larger than the cost of the few local variables.

The metric for stack preservation in industry standards is targeted more at limiting stack depth than the size of any given frame.

In this case, the stack is linear with `n'. The size of any individual frame wouldn't even be considered at a lot of places; this recursion simply would not be allowed because it is linear.

I've had to argue for binary recursions that are practicably limited to a depth of only 19. I've lost every single time.

Soma

5. Returning an std::deque<std::set<int>> looks expensive to me; I'd pass in the std::deque as a reference parameter and add results to it there. Also, it may be cheaper to pass along a single std::set<int> &U, where each recursive call removes items from U and then adds them back in as necessary. It depends on how expensive lookups and removals are (and other containers may be better in this regard, although you'd have to be careful to make the container stable -- add back to same position removed from). Maybe I'm too much of a Prolog programmer, but it makes way more sense to me to maintain data structures that are updated at each recursive call than making copies all the time. Also, this becomes much easier to convert to an iterative version if you like.

So here's how I would approach the problem. Sorry there aren't many comments in the code, ask me to elaborate if any of it is confusing.
Code:
```#include <iostream>
#include <set>
#include <vector>
#include <deque>

std::deque<std::set<int>> find_combinations(const std::set<int>& U,unsigned int n)
{
std::deque<std::set<int>> result;
if(n<1 || U.size()<n)//Nothing Happens Here !
{}
else if(n==1) // The Universal input set is broken up and redistributed.
{
for(auto x:U)
result.push_back(std::set<int>({x}));
}
else // Recursion FTW, no comments !!
{
std::set<int> temp(U);
for(auto x:U)
{
temp.erase(x);
auto partial_result = find_combinations(temp,n-1);
for(auto& y:partial_result)
y.insert(x);
result.insert
(
std::begin(result),
std::begin(partial_result),
std::end(partial_result)
);
}
}
return result;
}

/** Generates combinations of @a data with @a n elements. Results will be added
to @a results, which should have @c data.size() choose @a n elements after
this function returns.

@a chosen and @a start indicate partial solutions, and should initially be
empty and zero, respectively.
*/
void find_combinations_dwk(std::deque<std::vector<int> > &results,
std::vector<int> &data, std::vector<int> &chosen,
unsigned start, unsigned n) {

if(n <= 0) {
results.push_front(chosen);
return;
}

for(std::vector<int>::size_type x = start; x < data.size(); x ++) {
int value = data[x];
data.erase(data.begin() + x);

chosen.push_back(value);
find_combinations_dwk(results, data, chosen, x, n - 1);
chosen.pop_back();

data.insert(data.begin() + x, value);
}
}

int main(int argc, char *argv[]) {
std::set<int> data;
std::vector<int> dataVector;
for(int x = 0; x < 20; x ++) {
data.insert(x);
dataVector.push_back(x);
}

if(argc == 1) {
auto size = find_combinations(data, 8).size();
std::cout << "find_combinations found: " << size << std::endl;
}
else {
std::deque<std::vector<int> > results;
std::vector<int> chosen;
find_combinations_dwk(results, dataVector, chosen, 0, 8);

std::cout << "find_combinations_dwk found: "
<< results.size() << std::endl;
}

return 0;
}```
Timing results:
Code:
```\$ g++ --std=c++0x choose.cpp -o choose
\$ time ./choose
find_combinations found: 125970

real	0m3.624s
user	0m3.576s
sys	0m0.044s
\$ time ./choose dwk
find_combinations_dwk found: 125970

real	0m0.320s
user	0m0.300s
sys	0m0.020s
\$```
So it looks like my method is roughly an order of magnitude faster, at least for this input size! Of course it's not exactly a fair comparison, as I needed to use std::vector to avoid invalidated iterators while your method uses std::set, but I think you'll find that coding in this style can often be more efficient.

If you wanted to make my version iterative, it wouldn't be too hard. Just notice the only part that relies on recursion: really, the values of x. So all you'd need would be an std::stack of values of x, put the whole function into a loop and push and pop values of x as necessary. At least in theory.

Cheers, dwk.

 Note I didn't actually check the output, but because the container has 20 choose 8 elements, I'm pretty sure it's correct. Or close enough. [/edit]

6. Originally Posted by dwks
Maybe I'm too much of a Prolog programmer ...
Cool. It's good to find someone else who also likes Prolog. Whilst I haven't used it in a few years, I've still got some of the games I made with it lying around, and still think of messing with it from time to time.

Nice use of C++11 there manasij7479!

7. Code:
```CombinationMatrix::CombinationMatrix(unsigned int numVars, unsigned int numStates):
numVars(numVars),
numStates(numStates),
Matrix(pow((double)numStates, (double)numVars), numVars)//(rows,columns)
{
unsigned int i, j;
std::vector<unsigned int> row(columns);
for(i=1;i<rows;++i)
{
baseConvert(row, i, numStates);
for(j=0;j<columns;++j)
{
(*this)(i,j)=row[j];
}
row.assign(row.size(),0);
}
}

//...

static void baseConvert(std::vector<unsigned int>& result, unsigned int value,unsigned int base)
{
unsigned int digits = (unsigned int)1+log((double)value)/log((double)base);
for(int i=0;i<digits-1;i++)
{
result[i]=value%base;
value/=base;
}
result[digits-1]=value;
}```
if i understand you correctly, for your case, i believe the number of states == number of variables.

8. Hmm, that's an interesting idea, m37h0d. There's not quite enough code there for me to figure it out at a quick glance but I think generating bit-patterns and using them as a kind of shadow-array reference is a good idea. Certainly you'd save on memory if you were generating permutations of more complex objects.

Although, if you're going to go that route there are bit-manipulation algorithms that generate the next n-bit number with k one bits, which is exactly what we need. That's probably the way to go, either chaining together real ints or just implementing this stuff in your own bignum kind of library. I imagine it would be faster than calling log() all the time. Bit Twiddling Hacks

@iMalc: I haven't used Prolog much in a while either, but it definitely trains you to think in terms of recursion and which lists you're adding stuff to and which lists you're removing stuff from... It's a good language to be familiar with, I think.

I echo the sentiment about C++11: nice job, manasij7479.

9. Originally Posted by phantomotap
Move constructors have nothing to do with stack space unless the allocator storage lives on the stack; the data for the `std::set<???>' and `std::deque<???>' live on the heap.
Sorry, I was confused about something else regarding move semantics here... and I always forget that the standard allocators do not store on the stack.

10. Originally Posted by dwks
Also, it may be cheaper to pass along a single std::set<int> &U, where each recursive call removes items from U and then adds them back in as necessary.
Neat idea.. doesn't seem to work very well for sets, though. (where, it should do better, iirc)

It took some effort for me to understand your code. (... and I'm nowhere near 'seeing' what m37h0d's is doing; yet.)
Thanks.

11. Originally Posted by iMalc
Nice use of C++11 there manasij7479!
Originally Posted by dwks
I echo the sentiment about C++11: nice job, manasij7479.
C++11 makes it remarkably easy to write naive code in an elegant way.
This was a nice example of that.

12. C++11 makes it remarkably easy to write naive code in an elegant way.
O_o

remarkably easy
o_O

That is one for the record books. ^_^

I can't even decide on which silly thing to say.

*shrug*

Okay.

*leans away from desk lamp*

*steeples fingers*

He is coming along nicely...

Bwahahahah!

Seriously though, if you are at the point where that flavor of code is "remarkably easy" you are going to be a great programmer.

Soma

13. >_<

Nope. I'm not happy with it; I should have went with the "Dr. Who" reference.

Soma

14. Originally Posted by phantomotap
remarkably easy
I taught my cousin (in the 1st week of his C++ experience) basics of using containers and ranged loops.

If he, being a physics student, could grasp that more easily than using counter variables and C strings, then it is definitely 'remarkably easy' .