January 28, 2026

Optimizing BM25 for the Next Generation of Semantic Search Technology

Learn how Exa is pushing the boundaries of search relevance by optimizing the BM25 algorithm for a modern neural search engine.

TL;DR

  • Exa reduced the size of its BM25 index by over 50% across billions of documents.
  • The optimization process focused on both algorithmic improvements and efficient data structures.
  • Key algorithmic optimizations include selective initial retrieval and threshold-based pruning.
  • Data structure optimizations targeted four sources of memory waste: Vec allocation overhead, uncompressed document IDs, struct padding, and redundant frequency storage.
  • Techniques used for data structure optimization include grouping documents by term frequency, delta encoding with variable-length representation, zstd compression, flattening inner lists, and special handling for singleton tokens.
  • The overall memory overhead was reduced by consolidating postings lists into a single buffer.
  • Query latency improved by 10% due to more predictable memory access patterns.
  • Benefits include greater retrieval capacity, lower operational costs, faster cold starts, and better hybrid search quality.
  • The optimization underscores the principle that 'every bit matters' in large-scale systems.
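
To make two of the techniques above concrete, here is a minimal Rust sketch of delta encoding with a variable-length representation, packed into a single shared buffer. It is an illustration under stated assumptions, not Exa's implementation: the names `write_varint`, `read_varint`, `encode_doc_ids`, `decode_doc_ids`, and `FlatPostings` are hypothetical, document IDs are assumed to be `u32` values sorted in ascending order, and the varint layout is a common LEB128-style scheme chosen for simplicity. Term-frequency grouping, zstd compression, and singleton handling are left out.

```rust
/// Append `value` as a LEB128-style varint: 7 payload bits per byte,
/// with the high bit set on every byte except the last.
fn write_varint(mut value: u32, buf: &mut Vec<u8>) {
    while value >= 0x80 {
        buf.push((value as u8 & 0x7F) | 0x80);
        value >>= 7;
    }
    buf.push(value as u8);
}

/// Read one varint starting at `*pos`, advancing `*pos` past it.
/// Returns `None` if the buffer ends mid-value or the encoding
/// runs past five bytes.
fn read_varint(buf: &[u8], pos: &mut usize) -> Option<u32> {
    let mut value: u32 = 0;
    let mut shift = 0;
    loop {
        let byte = *buf.get(*pos)?;
        *pos += 1;
        if shift >= 32 {
            return None;
        }
        value |= u32::from(byte & 0x7F) << shift;
        if byte & 0x80 == 0 {
            return Some(value);
        }
        shift += 7;
    }
}

/// Delta-encode document IDs: store each ID as the gap from the previous
/// one. Assumes `doc_ids` is sorted ascending (an assumption of this sketch).
fn encode_doc_ids(doc_ids: &[u32], buf: &mut Vec<u8>) {
    let mut prev = 0u32;
    for &id in doc_ids {
        write_varint(id - prev, buf);
        prev = id;
    }
}

/// Decode `count` document IDs by summing the stored gaps.
fn decode_doc_ids(buf: &[u8], count: usize) -> Option<Vec<u32>> {
    let mut pos = 0;
    let mut prev = 0u32;
    let mut out = Vec::with_capacity(count);
    for _ in 0..count {
        prev = prev.checked_add(read_varint(buf, &mut pos)?)?;
        out.push(prev);
    }
    Some(out)
}

/// All postings lists packed into one byte buffer instead of one `Vec`
/// per term (hypothetical layout for illustration).
struct FlatPostings {
    bytes: Vec<u8>,
    /// For each term: (byte offset into `bytes`, number of doc IDs).
    spans: Vec<(usize, usize)>,
}

impl FlatPostings {
    /// Encode every list back to back into the shared buffer.
    fn build(lists: &[Vec<u32>]) -> Self {
        let mut bytes = Vec::new();
        let mut spans = Vec::with_capacity(lists.len());
        for list in lists {
            spans.push((bytes.len(), list.len()));
            encode_doc_ids(list, &mut bytes);
        }
        FlatPostings { bytes, spans }
    }

    /// Decode the postings list for `term`, or `None` if it is out of range.
    fn doc_ids(&self, term: usize) -> Option<Vec<u32>> {
        let &(start, count) = self.spans.get(term)?;
        decode_doc_ids(&self.bytes[start..], count)
    }
}
```

Because sorted IDs in a dense postings list tend to be close together, most gaps fit in one or two bytes rather than a fixed four. Keeping every list in one buffer also replaces one heap allocation per term with a single allocation plus a small offset table, which is the kind of saving the "Vec allocation" and "single buffer" points refer to.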