[Solved]: Why does the Count-Min Sketch require pairwise independent hash functions?


Problem Detail:

The Count-Min Sketch is an awesome data structure for estimating the frequencies of different elements in a data stream. Intuitively, it works by picking several hash functions, each paired with its own table of counters. To record an element, it hashes the element with every hash function and increments the slot that each one selects in its table. To estimate the frequency of an element, the Count-Min Sketch applies the hash functions to that element and takes the minimum value out of all the slots that are hashed to.
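The update and query steps described above can be sketched roughly as follows. This is only an illustrative sketch, not the paper's reference implementation: the class name `CountMinSketch`, the parameters `width`, `depth`, and `seed`, the restriction to non-negative integer keys, and the choice of hash family `((a*x + b) mod p) mod width` with the Mersenne prime `p = 2**61 - 1` are all assumptions made for the example.

```python
import random


class CountMinSketch:
    """Illustrative Count-Min Sketch over non-negative integer keys."""

    # Assumption: every key is smaller than this Mersenne prime.
    PRIME = (1 << 61) - 1

    def __init__(self, width, depth, seed=0):
        rng = random.Random(seed)
        self.width = width
        self.depth = depth
        # One table (row) of counters per hash function.
        self.tables = [[0] * width for _ in range(depth)]
        # One independently drawn (a, b) pair per row, defining
        # h(x) = ((a*x + b) mod PRIME) mod width.
        self.params = [
            (rng.randrange(1, self.PRIME), rng.randrange(self.PRIME))
            for _ in range(depth)
        ]

    def _slot(self, row, key):
        a, b = self.params[row]
        return ((a * key + b) % self.PRIME) % self.width

    def add(self, key, count=1):
        # Increment one slot in every row.
        for row in range(self.depth):
            self.tables[row][self._slot(row, key)] += count

    def estimate(self, key):
        # Collisions can only inflate a counter, so the minimum over
        # all rows is the least-contaminated estimate.
        return min(
            self.tables[row][self._slot(row, key)]
            for row in range(self.depth)
        )


sketch = CountMinSketch(width=64, depth=4)
for key in [7, 7, 7, 42, 42, 1000]:
    sketch.add(key)
print(sketch.estimate(7))  # at least 3: the sketch never underestimates
```

Because counts are only ever added, each slot holds the true count of the queried element plus whatever collided with it, which is why the estimate can be too high but never too low.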

The original paper on the Count-Min Sketch mentions that the data structure requires pairwise independent hash functions in order to get the necessary guarantees on its expected performance. However, looking over the structure, I don't see why pairwise independence is necessary. Intuitively, I would think that all that would be required would be that the hash function be a universal hash function, since universal hash functions are hash functions with low probabilities of collisions. The analysis of the collision probabilities in the Count-Min Sketch looks remarkably similar to the analysis of collision probabilities in a chained hash table (which only requires a family of universal hash functions, not pairwise independent hash functions), and I can't spot the difference in the analyses.

Why is it necessary for the hash functions in the Count-Min Sketch to be pairwise independent?

Thanks!

Asked By : templatetypedef

Answered By : Sasho Nikolov

You are right: universal hashing suffices. Pairwise independence is a stronger property, and it is the usual way to construct a universal hash family. The paper also contrasts pairwise independence with the 4-wise independence required by earlier methods, such as the AMS sketch.
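To see where the collision bound enters, here is a sketch of the standard error analysis, written so that the only property used of each hash function is universality. The notation is an assumption introduced for this illustration and does not appear in the post: $w$ is the table width, $d$ the number of rows, $f_y \ge 0$ the true frequency of element $y$, $N = \sum_y f_y$, and $X_j$ the overcount in row $j$ when querying $x$.

```latex
% Overcount in row j when querying x:
X_j = \sum_{y \neq x} f_y \cdot \mathbf{1}\left[ h_j(y) = h_j(x) \right]

% Linearity of expectation, then universality
% (\Pr[h_j(y) = h_j(x)] \le 1/w for all y \neq x):
\mathbb{E}[X_j]
  = \sum_{y \neq x} f_y \cdot \Pr\left[ h_j(y) = h_j(x) \right]
  \le \frac{1}{w} \sum_{y \neq x} f_y
  \le \frac{N}{w}

% With w = \lceil e / \varepsilon \rceil, Markov's inequality gives:
\Pr\left[ X_j \ge \varepsilon N \right]
  \le \frac{\mathbb{E}[X_j]}{\varepsilon N}
  \le \frac{1}{e}

% The d = \lceil \ln(1/\delta) \rceil rows use independently chosen hash functions:
\Pr\left[ \min_j X_j \ge \varepsilon N \right]
  \le e^{-d}
  \le \delta
```

The only step that refers to the hash family is the bound $\Pr[h_j(y) = h_j(x)] \le 1/w$, which is exactly the definition of a universal family. Independence is needed only between the rows, not between the values a single hash function assigns to different elements.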

Best Answer from Computer Science Stack Exchange

Question Source : http://cs.stackexchange.com/questions/7275
