The Bloom filter
Google Tech TalksNovember, 15 2007ABSTRACTThe Bloom filter, conceived by Burton H. Bloom in 1970, is aspace-efficient probabilistic data structure that is used to testwhether an element is a member of a set. False positives are possible,but false negatives are not. Elements can be added to the set, but notremoved (though this can be addressed with a counting filter). Themore elements that are added to the set, the larger the probability offalse positives.For example, one might use a Bloom filter to do spell-checking in aspace-efficient way. A Bloom filter to which a dictionary of correctwords has been added will accept all words in the dictionary andreject almost all words which are not, which is good enough in somecases. Depending on the false positive rate, the resulting datastructure can require as little as a byte per dictionary word.In the last few years Bloom filter become hot topic again and therewere several modifications and improvements. In this talk I willpresent my last few improvements in this topic.Speaker: Ely PoratEly Porat received his Doctorate from Bar-Ilan University in 2000.Following that, he fulfilled his military service and, in parallel,worked as a faculty member at Bar-Ilan University. Having spent thespring 2007 semester as a Visiting Scientist in Google, he is now backat Bar-Ilan University.The main body of Ely Porat's work concerns matching problems: stringmatching, pattern matching, subset matching. He also worked on thenearest pair problem in high-dimensional spaces as well as sketchingand edit distance.