World's most popular travel blog for travel bloggers.

[Solved]: Why does a suffix tree have a linear number of nodes (relative to input string size)?

, , No Comments
Problem Detail: 

Aren't there $n^2$ unique substrings of a string (irrespective of the alphabet size)? Perhaps the number of unique suffix substrings is less than the number of unique substrings of a string.

Asked By : Wuschelbeutel Kartoffelhuhn

Answered By : A.Schulz

For a text of length $n$ we have up to $1+{ n+1 \choose 2}$ different substrings, however there are only $n+1$ suffixes (for every suffix you can pick the position where it starts).

I assume you consider the compressed suffix tree (edge labels are words). This is a tree with $n+1$ leaves and every internal node has at least two children. Thus we have less interior nodes than leaves an therefore the tree has size $O(n)$.

Notice that in the uncompressed version (edge labels a characters) with a large alphabet, you can have super-linear suffix trees. For example, consider the text abcdefghijk....

Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/6943

0 comments:

Post a Comment

Let us know your responses and feedback