World's most popular travel blog for travel bloggers.

[Solved]: Why do we need "Bloom Filters" if we can use hash tables?

, , No Comments
Problem Detail: 

A Bloom filter is a probabilistic data structure designed to tell, rapidly and memory-efficiently, whether an element is in the set or no.

If we can use hash tables where we have O(1) in best time, O(n) in a very bad situations. Why do we need a new abstract data structure that look-up for an element with less certainty. What are the scenarios where HashTables fails and demands the use of bloom filters?

When should bloom filters be used over hashtables and vice-versa?

Asked By : user3378649

Answered By : David Richerby

The second paragraph of the Wikipedia article on Bloom filters says the following, with a citation to Bloom's original 1970 paper.

Bloom proposed the technique for applications where the amount of source data would require an impracticably large hash area in memory if "conventional" error-free hashing techniques were applied. He gave the example of a hyphenation algorithm for a dictionary of 500,000 words, out of which 90% follow simple hyphenation rules, but the remaining 10% require expensive disk accesses to retrieve specific hyphenation patterns. With sufficient core memory, an error-free hash could be used to eliminate all unnecessary disk accesses; on the other hand, with limited core memory, Bloom's technique uses a smaller hash area but still eliminates most unnecessary accesses. For example, a hash area only 15% of the size needed by an ideal error-free hash still eliminates 85% of the disk accesses.


Burton H. Bloom, Space/Time Trade-offs in Hash Coding with Allowable Errors. Communications of the ACM 13(7): 422–426, 1970. (DOI)

Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/32264

3.2K people like this

 Download Related Notes/Documents

0 comments:

Post a Comment

Let us know your responses and feedback