World's most popular travel blog for travel bloggers.

What is the time complexity of calling successor $n$ times during tree traversal?

, , No Comments
Problem Detail: 

According to some sources, the time complexity of finding the successor of a node in a tree is $O(h)$. So, if the tree is well balanced, the height $h=\log n$, and the successor function takes time $O(\log n)$. Yet, according to this stackoverflow post on the time complexity of an inorder traversal of a binary search tree, if you call the successor function $n$ times, the time complexity is $O(n)$.

What resolves the apparent contradiction between:

If I call the sucessor function once, the time complexity is $O(h)$, which could be $O(n)$ or $O(\log n)$, depending on the kind of tree.

AND

If I call the successor $n$ times, the time complexity is $O(n)$ in a balanced tree.

Shouldn't tree traversal take $O(n^2)$ or $O(n\log n)$ time?

Asked By : user1675999

Answered By : Merbs

Gist: The time complexity of calling the sucessor function multiple times is not merely the product of the number of calls and the worst-case bound, though that product does encompass the worst case. Rather, if the function going to be called from every node, a more sophisticated analysis can establish a tighter worst-case bound for specific trees and implementations of the successor function.

At best, tree traversal takes at $O(n)$ time: $n-1$ calls are made to the successor function, which takes constant time. One simple example of this is the traversal of a linked list, which has a successor function that immediately returns a pointer to the next node with no additional computation.

Linked List

The worst "reasonable" tree traversal would take $O(n^2)$ time: $n-1$ calls are made to the successor function, which searches all $n$ nodes before determining the successor. For example, if we have an unsorted tree and the successor is defined to be the next largest element (rather than based on its position in the tree). Of course, we could sort the tree first in time $O(n\log n)$ and in general, if we expect to call the successor function $n$ times, it would be wise to preprocess the tree so that it can be traversed in one of the three canonical ways: inorder, preorder, or postorder.

Ok, time to clear a misconception! Let's ask what's the best-case and worst-case scenarios for the successor function are when we have an inorder traversal of a binary search tree:

Sorted Binary Tree

Remember that in inorder traversal, the left subtree is visited, then the parent node, then the right subtree. So an inorder traversal in the tree above gives us $[A,B,C,D,E,F,G,H]$. (For contrast, preorder traversal would have given us $[F,B,A,D,C,E,G,I,H]$). We start at the beginning of the tree at $A$. From $A$, the successor function checks to see that it has no right subtree (it doesn't) before finding its closest ancestor for which it is the descendant of the left child. But wait! I didn't tell you that each node had knowledge of its ancestors, so we potentially have to search the entire tree just to determine where $A$ is! So now we split into two cases:

  • Worst-case: Assuming that our node has no right subtree, if the tree was a binary search tree, we could find the node in $O(\log n)$ time. But worst-case, it's not a binary search tree; in fact, it has no structure, so every call takes $O(n)$ time, but if we're consistent about how we search, the total number of steps to find the node's successor is at least $\sum_{x=1}^{n}x=\frac{(n)(n+1)}{2}=O(n^2)$.

  • Best-case: Each node has a link to its ancestor, so from $A$ to $B$ takes two steps (check for right subtree; back up to ancestor for which it is the left child). The successor function that takes the most steps is from $E$ to $F$, for which $E$ first checks that it has no right subtree, then checks to see if it is the left child of $D$ (it isn't), or the descendent of the left child of $B$ (it isn't), and finally if it is the descendent of the left child of $F$ (it is). That took $O(h)=O(\log n)$ steps, but most calls to the function won't take that many steps. Quoting the SO post, "observe that each edge in the tree gets visited at most twice (once from parent to child and once from child to the parent)" so the total number of nodes visited is $2n$ and hence $O(n)$ worst-case bound.

Another way to appreciate that the average call to successor is $<\log(n)$ is to consider a balanced tree with $n$ nodes and roughly $n/2$ leaves. Half of all successor function calls will be to an ancestor, half of which are from the left child ($\frac{1}{4}n$) and thus takes only one generation/step, $\frac{1}{8}n$ go back two generations, $\frac{1}{16}n$ go back three. Hopefully you see the pattern: $\sum_{i=1}\frac{i}{2^{i+1}}$ converges to $1$. And the other half does the step-analogous $\min$ function (searching it's right subtree for the leftest child), so it also converges to $1$. The expected time is thus $2$ (if I've done my math carefully). So that's $n-1$ calls to a successor function that on average takes ~$2$ steps $\rightarrow O(n)$ time.

Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/6378

0 comments:

Post a Comment

Let us know your responses and feedback