A **Bloom filter** is a space-efficient [probabilistic] [dat...

A Bloom filter is a space-efficient probabilistic data structure, conceived by Burton Howard Bloom in 1970, that is used to test whether an element is a member of a set. False positive matches are possible, but false negatives are not – in other words, a query returns either “possibly in set” or “definitely not in set”. Elements can be added to the set, but not removed (though this can be addressed with the counting Bloom filter variant); the more items added, the larger the probability of false positives.

Hash each item (consistently), then take the result and map it to an array of N positions and set a bit at each position. Store this as a bit array at the point of entry.

To support deletion of items indexed, look at cuckoo filters.

www.joshbeckman.org/notes/662146112