# Incremental alphabet from charset

• 06-28-2008
Incremental alphabet from charset
I've been trying to do this for a while, with no luck.
I've got a charset,
Code:

`char charset[1024];`
That always contains 2 or more characters.
I've been trying to take the chars from the charset and make words from them, incrementally to a certain length.
It's hard to explain.
I have an array, AKA
Code:

`char charset[1024];`
Lets say it contains
Code:

`abc`
and the max length is 2, ill get:
Code:

```a b c aa ab ac ba bb bc ca cb cc```
----------------------------------------
Could somebody shed some light on this?

EDIT:
By the way, i'm not asking for code, just a few tips.
• 06-28-2008
C_ntua
Create a string of length 1. Store the first character. Repeat this 1024 times.

Create a string of length 2. Store the first character. Treat this character as the "main" character for this loop. Then put another character into the string. Repeat the above 1024-1 (or better 1024+1-length). Each time you repeat you keep the main character of course the same.
Create a string of length 2. Store the second character. Etc etc.

Create a string of length of 3. Store the first character. Treat this character as the "main" character for this loop. Then put another character into the string. Treat it as second "main". Then put another character into the string. Repeat the above 1024-2 (or better 1024+1-length). Each time you repeat you keep the main characters of course the same.
Create a string of length 3. Store the second character. Etc etc.

Do this until max length and you will have every possible sequence of characters.

You will need about O(n!) time for this.

Now this is kind of obvious. I suppose you have the idea but don't use the appropriate code? So you want to know how the above can be written in C/C++?
• 06-28-2008
Quote:

I suppose you have the idea but don't use the appropriate code? So you want to know how the above can be written in C/C++?
Yeah, but I definitely do not want a full source.
Just an idea of how to start it.
• 06-28-2008
C_ntua
Well, not much time to think about it, a lot of exams await me :). I m sure someone here already knows such an algorithm. It is a common think after all.
I ll get to this tomorrow maybe
• 06-28-2008
Thanks mate!
• 06-28-2008
foxman
I don't see how to implement C_ntua solution without having to write code specific to each string length. That's it, you would have to write some code for generating all the string of length 1, then write some other code for generating all the string of length 2, etc, until you are bored. That makes me think there should be a better way/algorithm for doing this -- but maybe there's a clever implementation of C_ntua I just don't see.

I see quite different ways to resolve your problem "elegantly", but most have poor spatial complexity. Let's say N is the maximum length of your generated strings and M the number of characters in your "charset". In the example you gave, N would be 2 and M would be 3. Well, the first way I tough off was about building a tree of depth N, where every node has M childs, the root node being kind of special. So, staying with your example, we would build a tree that would looks like this (sorry for the bad looking ASCII-art ;)

Code:

```          root       /    |    \       /    |    \     /      |      \   a        b        c  / | \    / | \    / | \ a  b  c  a  b  c  a  b  c```
Once built, to generate every "permutations", you would basically only had to do a breadth-first search, and on every node, print the path from the root to the current node. The problem is, even if it's not really complex to implement if you know trees, it's still a bit "overkill" and will take a good time to write, plus the (really) bad complexity. This just can't be a great solution.

So, I came up with another idea. How about using numbers (with a logical representation in base M), where each digit would be associated with a character from your set. So, to generates every possible combination, you would only have to increment your number, then "split" his digits and print the associated character. Staying with our example:

Code:

```Str  Number  Number     base 3  base 10 a    0      0 b    1      1 c    2      2 aa  00      0 ab  01      1 ac  02      2 ba  10      3 bb  11      4 bc  12      5 ca  20      6 cb  21      7 cc  22      8```
In our example, a is associated with the digit 0, b is associated with the digit 1 and c is associated with the digit 2. So, from a number in base M, you can "derive" a string.

But there's a little problem. How do we know if 0 should yield to the string "a" or "aa" or "aaaaaa". Well, we can't, but this isn't a big problem anyway, so let's not care.

Knowing that if you want to generate every possible strings of length I with M characters, you will get M^I different strings, you can easily calculate where your "number" has to stop for every length of string. This solution has a really good spatial complexity, is fairly easy to implement and will work whatever the length of the string you want to generate. Just be careful about the possible overflow of the integer you chose to hold the "number"; e.g, if you chose an unsigned int for holding this number, you need that UINT_MAX > M ^ N.

Anyway, I tested it, and it worked, like I tough. I'm giving the solution (everyone will hate me on this board :p), but --

-- well, in fact, no, I'll not be giving the solution. Except if you ask.

Also, if you can think of another solution, try it, there's numerous of them.
• 06-28-2008
iMalc
This is how you do it:

Start off with the empty string.
Repeat until you reach the maximum length desired:
For ALL strings produced so far:
Produce more strings with the new char in position 0 to strlen(x)
End repeat

So initially you have ""
The new char is 'a'. We put this in the only position that exists in the empty string, and produce "a".
Now we have two strings: "" and "a".

We now repeat with 'b'
Inserting 'b' into "" is trivial and gives us "b"
Then we insert 'b' into all possible placs in "a". This gives us "ab" and "ba"
We now have 5 strings: "", "a", "b", "ab", "ba"

We now repeat with 'c'
Inserting 'c' into "" is again trivial and gives us "c"
Then we insert 'c' into all possible placs in "a". This gives us "ac" and "ca"
Then we insert 'c' into all possible placs in "b". This gives us "bc" and "cb"
Then we insert 'c' into all possible placs in "ab". This gives us "abc", "acb", and "cab"
Then we insert 'c' into all possible placs in "ba". This gives us "bac", "bca", and "cba"
We now have 15 strings:
"", "a", "b", "ab", "ba", "ac", "ca", "bc", "cb", "abc", "acb", "cab", "bac", "bca", "cba".

You could now proceed to insert 'd' into every string produced so far, in every possible place.
You can sort the results however you wish. The above is simply the order they were created.
• 06-28-2008
foxman
Here's another way to do it:

Start with an empty string or with every different strings of length 1. Push them in a queue.
While the queue isn't empty
• Pop the next value of the queue. Do whatever you want with the generated string (print/store).
• Push in the queue the string you just popped, concatenated with every characters, if the original string is shorter than the max length.

Example. You start with a queue with the string "a", "b", "c".
You pop the string "a". Let's say you print it. Than you push "aa", "ab", "ac"
You pop the string "b". You print it. Than you push "ba", "bb", "bc"
You pop the string "c". You print it. Than you push "ca", "cb", "cc"
You pop the string "aa". You print it. Since "aa".length == 2 and you don't want to look for string longer than 2 characters, you don't push any new strings.
Etc...
• 06-28-2008
iMalc
Oh I missed that you wanted to allow duplicate letters. In that case it's trivial; foxman has the answer.
• 06-28-2008