[Solved]: Why do Bloom filters work?

posted by , on 2:08:00 AM , No Comments

Problem Detail:

Let's say I am using Bloom filters to create a function to check if a word exists in a document or not. If I pick a hash function to fill out a bit bucket for all words in my document. Then if for a given number of words, wouldn't the whole bit bucket be all 1s? If so then checking for any word will return true? What am I missing here?

Asked By : user220201

Answered By : Wandering Logic

Given that you want to insert $n$ words into the Bloom filter, and you want a false positive probability of $p$, the wikipedia page on Bloom filters gives the following formulas for choosing $m$, the number of bits in your table and $k$, the number of hash functions that you are going to use. They give $ m = - \frac{n \ln p}{(\ln 2)^2} $ and $$ k = \frac{m}{n}\ln 2=-\frac{\ln p}{\ln 2}=-\lg_2p, $$ so you should choose $$ m=\frac{nk}{\ln 2}. $$

That actually works out quite nicely. You are going to get a table with about half the bits set and half cleared, so the entropy per bit is going to be maximal, and the probability of a false positive is going to be $0.5^k$.

Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/12834

MCA IGNOU GROUP

[Solved]: Why do Bloom filters work?

Asked By : user220201

Answered By : Wandering Logic

Best Answer from StackOverflow

Akshay Mishra

0 comments:

Post a Comment

Search By Course Code

Blog Archive

Popular Posts

Label Cloud

Ad Blocker

[Solved]: Why do Bloom filters work?

Asked By : user220201

Answered By : Wandering Logic

Best Answer from StackOverflow

Akshay Mishra

0 comments:

Post a Comment

Search By Course Code

Blog Archive

Popular Posts

Label Cloud

Ad Blocker

Click to Download Now