World's most popular travel blog for travel bloggers.

[Solved]: What is the difference between radix trees and Patricia tries?

, , No Comments
Problem Detail: 

I am learning about radix trees (aka compressed tries) and Patricia tries, but I am finding conflicting information on whether or not they are actually the same. A radix tree can be obtained from a normal (uncompressed) trie by merging nodes with their parents when the nodes are the only child. This also holds for Patricia tries. In what ways are the two data structures different?

For example, NIST lists the two as the same:

Patricia tree

(data structure)

Definition: A compact representation of a trie in which any node that is an only child is merged with its parent.

Also known as radix tree.

Many sources on the web claim the same. However, apparently Patricia tries are a special case of radix trees. Wikipedia entry says:

PATRICIA tries are radix tries with radix equals 2, which means that each bit of the key is compared individually and each node is a two-way (i.e., left versus right) branch.

I don't really understand this. Is the difference only in the way comparisons are made when doing look-ups? How can each node be a "two-way branch"? Shouldn't there be at most ALPHABET_SIZE possible branches for a given node?

Can someone clarify this? For practical purposes, are radix tries typically implemented as Patricia tries (and, hence, often considered the same)? Or can no such generalizations be made?

Asked By : w128

Answered By : Mario Cervera

I found this post very helpful.

To see the difference between Patricia tries and radix trees, it is important to understand:

  • The notion of radix, since Patricia tries are radix trees with radix equal to 2.
  • The way keys are treated: as streams of bits. Keys are compared $r$ bits at a time, where $2^r$ is the radix of the trie.

Suppose that we insert the keys smile, smiled, and smiles (in this order) in a Patricia trie. The binary representation of these keys is the following:

Binary representation of the three example keys

Note that smile is a prefix of smiled, and, analyzing the binary representation, we can see that the first bit that differs (from left to right) is 0 (highlighted in red in the second row); for this reason, smiled will be the left child of smile. Similarly, smiles will be the right child of smiled because they share the same prefix up to a bit whose value is 1 (highlighted in red in the third row). The resulting Patricia trie after inserting the three keys is the following:

Patricia trie with 3 nodes

If the radix was, for example, 4, then internal nodes could have, at most, four children (with their edges labeled 00, 01, 10, and 11, respectively). In this case, keys would be compared by chunks of 2 bits, and not 1 (as in Patricia tries).


In what ways are the two data structures different?

To my understanding, the only difference is the radix, which is equal to 2 in the case of Patricia tries. This value can be any power of 2 in regular radix trees.

Is the difference only in the way comparisons are made when doing look-ups?

In both data structures, the comparison operation is bitwise. However, the number of bits that are checked atomically varies according to the radix. In the case of Patricia tries, bits are compared individually (since radix = 2). This is not necessarily the case in radix trees. In general, bits are checked in chunks of size $\log_2{R}$, where $R$ is the radix of the trie.

How can each node be a "two-way branch"? Shouldn't there be at most ALPHABET_SIZE possible branches for a given node?

The radix establishes the maximum number of children that the nodes of a radix tree can have. For example, when radix = 2, each node can have at most two children. This is the case of Patricia tries (also known as binary radix trees).

Are radix tries typically implemented as Patricia tries (and, hence, often considered the same)? Or can no such generalizations be made?

To be honest, I do not have an answer for this question. It seems that both data structures were proposed around the same time by different authors. For historical reasons that I am unaware of, both terms still live today.

Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/63048

0 comments:

Post a Comment

Let us know your responses and feedback