docs/modules/bloom.md
| Since | Origin / Contributor | Maintainer | Source |
|---|---|---|---|
| 2017-11-13 | Philip Gladstone | Philip Gladstone | bloom.c |
This module implements a Bloom filter. This is a probabilistic data structure that is used to test for set membership. There are two operations -- add and check that allow
arbitrary strings to be added to the set or tested for set membership. Since this is a probabilistic data structure, the answer returned can be incorrect. However,
if the string is a member of the set, then the check operation will always return true.
Create a filter object.
bloom.create(elements, errorrate)
elements The largest number of elements to be added to the filter.errorrate The error rate (the false positive rate). This is represented as n where the false positive rate is 1 / n. This is the maximum rate of check returning true when the string is not in the set.A filter object.
filter = bloom.create(10000, 100) -- this will use around 11kB of memory
Adds a string to the set and returns an indication of whether the string was already present.
filter:add(string)
string The string to be added to the filter set.true if the string was already present in the filter. false otherwise.
if filter:add("apple") then
print ("Seen an apple before!")
else
print ("Noted that the first apple has been seen")
end
Checks to see if a string is present in the filter set.
present = filter:check(string)
string The string to be checked for membership in the set.true if the string was already present in the filter. false otherwise.
if filter:check("apple") then
print ("Seen an apple before!")
end
Empties the filter.
filter:reset()
Nothing
filter:reset()
Get some status information on the filter.
bits, fns, occupancy, fprate = filter:info()
bits The number of bits in the filter.fns The number of hash functions in use.occupancy The number of bits set in the filter.fprate The approximate chance that the next check will return true when it should return false. This is represented as the inverse of the probability -- i.e. as the n in 1-in-n chance. This value is limited to 1,000,000.bits, fns, occupancy, fprate = filter:info()