Retrieval Modules Index

Trie

  • CEDAR — Dynamic trie with insert/delete support
  • DARTS — Static double-array trie
  • MARISA — Memory-mapped, static trie
  • TSL — Compact static trie

Bitmap

  • Roaring — Compressed, SIMD-friendly bitmap
  • Index — Overview and comparisons

Posting Lists

  • Index — Posting list structures and compression

Regular Expressions

  • PCRE — Full-featured regex engine
  • RE2 — Fast, safe regex engine
  • Index — Regex overview and selection guide

NLP

  • Hadar — Simplified / Traditional Chinese conversion
  • Jieba — Chinese word segmentation
  • SentencePiece — Subword tokenization
  • Index — NLP module overview

Vector Indexes

  • DiskANN — Disk-backed ANN search
  • FAISS — GPU / CPU vector search
  • HNSWLib — Graph-based ANN
  • NGT — Graph- and tree-based ANN
  • NMSLIB — Flexible ANN library
  • ScaNN — Quantization-based ANN with TensorFlow integration
  • SPTAG — Microsoft vector search
  • Index — ANN / vector index overview