**Problem Detail:**

I have a homework problem that is perplexing me because the math is beyond what I have done, although we were told that it was unnecessary to solve this mathematically. Just provide a close upper bound and justify it.

Let $$f(n) = |\{w ∈ \{a, b\}^n : aa \notin w \}|. $$ Provide an asymptotic upper bound on $f$ as $n\to\infty$.

So far: $$ \begin{array}{c|c|c} n & \text{strings} & \text{compared to } 2^n \\\hline 1 & 2 & 2^n - 0 \\ 2 & 3 & 2^n - 1 \\ 3 & 5 & 2^n - 3 \\ 4 & 8 & 2^n - 8 \\ 5 & 13 & 2^n - 18 \\ 6 & 21 & 2^n - 43 \\ \end{array} $$

The math that will give me an exact bound is beyond me. Obviously $O(2^n)$ is an upper bound, though it is not particularly tight.

Any suggestions on what I should try?

###### Asked By : Wawa

#### Answered By : Lee Gao

So I'm not completely sure, but I think you're asking to count the number of strings of size $n$ (over the alphabet $\{a, b\}$) where the factor/substring $aa$ does not appear right?

In this case, there are a few combinatorial approaches that you can take. Both Yuval and ADG have given simpler and more intuitive arguments, so I definitely suggest checking their answers out! Here's one of my favorites, it's a little strange, but it's a very general (and kind of a fun) approach.

Let's start with a simpler language, that of words which starts and ends with $b$ (also with no substrings of $aa$). We can look at an admissible string (e.g. $bbb\textbf{a}b\textbf{a}bbbb$) as a list of sequences of $b$s separated by singular $a$s. This gives the construction: $$ w = (b^+a)^*b^+ $$ Now, how do we count sentences that belongs to this language?

Let's imagine that we're expanding these expressions out. What does $e^*$ denote? Well, it's simply $$ e^* = \epsilon \mid e \mid ee \mid eee \mid eeee \mid \dots $$ Now, this will make very little sense, but let's imagine that $e$ is a variable over some numerical field. In particular, we'll treat $\epsilon \to 1$, $a \mid b \to a + b$, and $abc \to a \times b \times c$. This then says that $$ e^* \to 1 + e + ee + eee + \dots $$ Let's try to see the motivation behind this strange interpretation. This is *almost* a bijective transformation. In particular, we want to preserve the *count* of each $e^n$ word, which, as you can readily see, we do. However, there's one crucial difference between the string expressions and numerical expressions: the multiplication (concatenation in strings, $\times$ in numerical expressions) is now commutative! Intuitively, commutativity lets us treat all permutations of the same word as the same; that is, we don't disambiguate between the expression $bbbab$ and $bbabb$; they both represent a string with 4 $b$s and one $a$. Therefore, this transformation allows us to preserve the count of each word of a certain number of $a$s and $b$s, but it now allows us to turn a blind eye on the superfluous details that we don't care about.

If you go back to precalculus, you might recognize this series as $\frac{1}{1 - e}$. I know that it doesn't make any sense to rewrite this regular expression as a numerically-valued function, but just bare with me for a moment.

Similarly, $e^+ = ee^* \to \frac{e}{1 - e}$. Which means that we can translate $w$ into $$ w \to \frac{1}{1 - (\frac{b}{1 - b} \times a)} \times \frac{b}{1 - b} $$

In turn, we can simplify this down to $$ w(a, b) = b \times \frac{1}{1 - (b + ba)} $$

This tells us that the language $w$ is isomorphic to the language $b(b \mid ab)^*$ (whose direct translation is already $\frac{b}{1 - b - ba}$) without ever having to resort to any language-theoretic tools! This is one of the power of treating these series as closed-form functions: we can perform simplifications on them that are nearly impossible to perform otherwise, hence reducing it down to a simpler problem.

Now, if you still recall any of your calculus course, you'll recall that certain types of functions (being well-behaving enough) admits these series representations known as Taylor expansions. Don't worry, we won't really have to worry about those pesky Calc 1 problem-sets; I'm just merely pointing out that these functions can be represented as the sum \begin{align*} w(a, b) &= \sum_{i,j} w_{ij} a^i b^j \end{align*} so that $w_{ij}$ gives the number of words that satisfies $w$ such that it has exactly $i$ occurrences of $a$ and $j$ occurrences of $b$. However, we don't particularly care about whether something is an $a$ or a $b$. Rather, we just care about the total number of characters in the string. To turn a "blind eye" between $a$ and $b$, we can just (literally) treat them the same, e.g. let $z = a = b$ and get $$ w(z) = w(z, z) = \frac{z}{1 - z - z^2} = \sum_k w_k z^k $$

where $w_k$ counts the number of satisfiable words of length $k$.

Now, all that's left to do is to find $w_k$. The usual combinatorial approach here would be to decompose this rational function into its partial-fraction: that is, given the denominator $1 - z - z^2 = (z - \phi)(z - \psi)$, we can rewrite $\frac{z}{(z - \phi)(z - \psi)} = \frac{A}{z - \phi} + \frac{B}{z - \psi}$ (There's a bit of algebra involved here, but this is a universal property of rational (one polynomial dividing another) functions). To solve this, you can refactor $$ \frac{A}{z - \phi} + \frac{B}{z - \psi} = \frac{z}{(z - \phi)(z - \psi)} $$ which generates the constraints $A + B = 1, A\psi + B\phi = 0$. Irregardless of what $A$ and $B$ are, recall that $\frac{1}{1 - x} = 1 + x + x^2 + \dots$, well, we can rearrange \begin{align*} w(z) &= \frac{-A}{\phi - z} + \frac{-B}{\psi - z} \\ &= (-A\phi)\frac{1}{1 - \frac{z}{\phi}} + (-B\psi) \frac{1}{1 - \frac{z}{\psi}} \\ &= (-A\phi)\left(1 + \phi^{-1} z + \phi^{-2} z^2 + \dots\right) + (-B\psi)\left(1 + \psi^{-1} z + \psi^{-2} z^2 + \dots\right) \end{align*} therefore $$ w_k = (-A\phi)\phi^{-k} + (-B\psi)\psi^{-k} $$ Here, $\phi$ is the golden ratio $\frac{1 + \sqrt{5}}{2}$ and $\psi = -\phi^{-1}$ is its conjugate. We then have an easy description of the asymptotic behavior of the $w$ language: it runs in $\Theta(\phi^n)$. In fact, if you expand everything out, you'll find that $$ w_k = \frac{\phi^{k} - \psi^{k}}{\sqrt{5}} = \left\lceil \frac{\phi^{k}}{\sqrt{5}} \right\rceil $$ There's also an intricate connection to another common combinatorial class. This is just the Fibonacci numbers!

Now, suppose you have $w_k$, which counts the number of strings of size $k$ that starts and ends with $k$ (and also contains no $aa$ substrings), how can we build a string that can start or end with an $a$? Well, it's simple: an admissible string is either in $w$ (starts and ends with $b$), or it is $a w$ (starts with $a$), or it is $w a$ (ends with $a$), or it is $a w a$ (starts and ends with $a$). Therefore: $$ f(n) = w_n + w_{n-2} + 2 * w_{n - 1} $$ Recall that $w_n$ is the fibonacci sequence, so $w_{n-1} + w_{n-2} = w_n$, which means that \begin{align*} f(n) &= (w_n + w_{n-1}) + (w_{n-2} + w_{n-1}) \\ &= w_{n+1} + w_{n} \\ &= w_{n+2} \end{align*} Therefore, $f(n) = \text{fib}(n + 2) = \left\lceil \frac{\phi^{n+2}}{\sqrt{5}} \right\rceil$

Now you probably don't have to do this analysis, but just by having the insight that this sequence is a shifted Fibonacci sequence ought to give you an idea of some other combinatorial interpretations that you can try.

###### Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/66099

**3.2K people like this**

## 0 comments:

## Post a Comment

Let us know your responses and feedback