Understanding the BM25 full text search algorithm
published on 2024/11/20
BM25, or Best Matching 25, is a widely used ranking algorithm for full text search. It is the default ranking function in Lucene/Elasticsearch and in SQLite's FTS5 extension, among others. Recently, it has become common to combine full text search and vector similarity search into "hybrid search". I wanted to understand how full text search works, and BM25 specifically, so this post is my attempt at understanding it by re-explaining it.
https://emschwartz.me/understanding-the-bm25-full-text-search-algorithm/