World's most popular travel blog for travel bloggers.

What is the average height of a binary tree?

, , No Comments
Problem Detail: 

Is there any formal definition about the average height of a binary tree?

I have a tutorial question about finding the average height of a binary tree using the following two methods:

  1. The natural solution might be to take the average length of all possible paths from the root to a leaf, that is

    $\qquad \displaystyle \operatorname{avh}_1(T) = \frac{1}{\text{# leaves in } T} \cdot \sum_{v \text{ leaf of } T} \operatorname{depth}(v)$.

  2. Another option is to define it recursively, that is the average height for a node is the average over the average heights of the subtrees plus one, that is

    $\qquad \displaystyle \operatorname{avh}_2(N(l,r)) = \frac{\operatorname{avh}_2(l) + \operatorname{avh}_2(r)}{2} + 1$

    with $\operatorname{avh}_2(l) = 1$ for leafs $l$ and $\operatorname{avh}_2(\_) = 0$ for empty slots.

Based on my current understanding, for example the average height of the tree $T$

    1        / \   2   3  / 4 

is $\operatorname{avh}_2(T) = 1.25$ by the second method, that is using recursion.

However, I still don't quite understand how to do the first one. $\operatorname{avh}_1(T) = (1+2)/2=1.5$ is not correct.

Asked By : Timeless

Answered By : Raphael

There is no reason to believe that both definitions describe the same measure. You can write $\operatorname{avh}_1$ recursively, too:

$\qquad \displaystyle \operatorname{avh}_1(N(l,r)) = \frac{\operatorname{lv}(l)(\operatorname{avh_1}(l) + 1) + \operatorname{lv}(r)(\operatorname{avh_1}(r) + 1)}{\operatorname{lv}(l) + \operatorname{lv}(r)}$

with $\operatorname{avh}_1(l) = 0$ for leaves $l$. If you don't believe that this is the same, unfold the definition of $\operatorname{avh}_1$ on the right hand side, or perform an induction proof.

Now we see that $\operatorname{avh}_1$ works quite differently from $\operatorname{avh}_2$. While $\operatorname{avh}_2$ weighs the recursive heights of a nodes children equally (adding and dividing by two), $\operatorname{avh}_1$ weighs them according to the number of leaves they contain. So they are the same (modulo the anchor) for leaf-balanced trees, that is balanced in the sense that sibling trees have equally many leaves. If you simplify the recursive form of $\operatorname{avh}_1$ with $\operatorname{lv}(l) = \operatorname{lv}(r)$ this is immediately apparent. On unbalanced trees, however, they are different.

Your calculations are indeed correct (given your definition); note that the example tree is not leaf-balanced.

Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/2762

0 comments:

Post a Comment

Let us know your responses and feedback