Understanding the BM25 full text search algorithm

published on 2024/11/20

BM25, or Best Match 25, is a widely used ranking algorithm for full text search. It is the default in Lucene/Elasticsearch and SQLite, among others. Recently, it has become common to combine full text search and vector similarity search into "hybrid search". I wanted to understand how full text search works, and BM25 specifically, so this post is my attempt to understand it by re-explaining it.

https://emschwartz.me/understanding-the-bm25-full-text-search-algorithm/