World's most popular travel blog for travel bloggers.

[Solved]: Get the nth lexicographic string of "all" possible combinations of an alphabet

, , No Comments
Problem Detail: 

Is there a way to find the nth string of characters from an alphabet, without having to store "all" of the combinations?

Example:

Alphabet $A = \{a,b,c\}, n=12$.

All possible combinations in lexicographic order are $C = \{a, ab, abc, ac, acb, b, ba, bac, bc, bca, c, ca, cab, cb, cba\}$.

You can see that there's no repetition of characters in any string and the empty string is not a member of $A$.

So, the nth string is: $ca$.

Clearly, to find this I'm iterating over the set $C$.

The problem is that I have to generate all these strings, wich takes a really long time, and then search for the nth one. If the alphabet is large $(1 \leq A \leq 26 )$ the set $C$ grows too fast (not sure how much) and is impossible to store.

My questions is if exists a way to find these nth one, without generating all the strings.

Also, I'm not sure if this question goes in this StackExchange site.

Asked By : joanca

Answered By : Yuval Filmus

It is better if you include the empty string as the $0$th string in lexicographic order. Here is how you convert an index into a word.

First, we are going to find the first letter of the word, and an internal index among all words starting with this letter. We do this by counting how many words start with each letter, including the "empty letter". I'll let you figure out how to do that. In your case, we get $1,5,5,5$. This means that $[0,1)$ corresponds to the empty sword, $[1,6)$ to words starting with $a$, $[6,11)$ to words starting with $b$, $[11,16)$ to words starting with $c$.

Given an index, you locate which interval it belongs to, and record the starting character and the internal index inside the interval. In your example, $12$ belongs to $[11,16)$ (and so starts with $c$), and its internal index is $1$.

If the index is $0$, then we are done. Otherwise, we recurse. In your example, we are left with the alphabet $\{a,b\}$ for which the intervals are $[0,1),[1,3),[3,5)$. The current index is $1$, so the word at this level of recursion starts with $a$, and the new index is $0$. That means that the recursion ends here. The outputs is thus $ca$.

Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/43370

0 comments:

Post a Comment

Let us know your responses and feedback