Bloom Filter

一个简单而高效的 Dart 实现 Bloom Filter,一种节省空间的概率性数据结构,用于测试一个元素是否是集合的成员。

特点

  • 可自定义的误报率
  • 高效的空间利用
  • 快速的元素插入和查询
  • 序列化和反序列化支持

安装

serializable_bloom_filter 包添加到您的 pubspec.yaml 文件中

dependencies:
  serializable_bloom_filter: ^1.0.0

然后运行 dart pub get 来安装该包。

用法

导入 serializable_bloom_filter.dart 文件,并使用所需的误报率和预期的插入项数量创建一个新的 BloomFilter 对象。

import 'package:serializable_bloom_filter/serializable_bloom_filter.dart';

void main() {
  // Create a new Bloom filter with a false positive probability of 1% and an expected number of items of 100
  BloomFilter bloomFilter =
      BloomFilter(falsePositiveProbability: 0.01, numItems: 100);

  // Add items to the filter
  bloomFilter.add("Alice");
  bloomFilter.add("Bob");

  // Check if an item is in the filter
  print(bloomFilter.contains("Alice")); // true
  print(bloomFilter.contains("Bob")); // true
  print(bloomFilter.contains("Charlie")); // false (might be true due to false positive)
}

序列化和反序列化

您可以序列化 Bloom Filter 的位数组以供存储,并在之后检索它。

import 'package:serializable_bloom_filter/serializable_bloom_filter.dart';

void main() {
  BloomFilter bloomFilter =
      BloomFilter(falsePositiveProbability: 0.01, numItems: 100);
  bloomFilter.add("Alice");
  bloomFilter.add("Bob");

  // Serialize the Bloom filter's bit array to a byte array
  List<int> byteArray = bloomFilter.serialize();
  int numHashFunctions = bloomFilter.numHashFunctions;

  // Save numHashFunctions and the byte array to a file or database, etc.
  // ...

  // Load numHashFunctions and the byte array from a file or database, etc.
  // ...

  // Deserialize the byte array back to the Bloom filter's bit array
  final loadedBloomFilter = BloomFilter.fromNumHashFunctionsAndByteArray(
      numHashFunctions: numHashFunctions, byteArray: byteArray);

  print(loadedBloomFilter.contains("Alice")); // true
  print(loadedBloomFilter.contains("Bob")); // true
  print(loadedBloomFilter
      .contains("Charlie")); // false (might be true due to false positive)
}

GitHub

查看 Github