thesis

declarative recall for approximate nearest neighbor vector search

Declarative recall is a mechanism to elimate parameter tuning from approximate nearest neighbor vector search algorithms, since it allows users to specify a desired recall level instead of trying to tune the various hyperparameters of the underlying algorithm. Recently, declarative recall has been provided by using early termination techniques, which allow the search to stop early when a certain recall level is reached.

my thesis work (May 2025 - present)

Our work on declarative recall is based recall prediction, a simple1 but powerful early termination idea that estimates the recall of a query at runtime, and allows the search to stop early when the desired recall is reached.

Our work constitutes the first general solution to the problem: works for k-NN graph and IVF indexes, supports all common distance measures, allows users to choose a different target recall for each query at query time, and users do not need to set any parameters.

Simplicity is a great virtue but it requires hard work to achieve it and education to appreciate it. And to make matters worse: complexity sells better.

Edsger Wybe Dijkstra
  1. DARTH: Declarative recall through early termination for approximate nearest neighbor search [paper] [code]
    Manos Chatzakis, Yannis Papakonstantinou, and Themis Palpanas
    Proceedings of the ACM on Management of Data, 2025
    Volume 3, number 4, pages 1--26. ACM New York, NY, USA.
    DARTH is the basis of the declarative recall mechanism that we are currently developing. It proposes a recall prediction model that estimates the recall of individual queries at runtime, and allows the search to terminate early when the desired recall is reached or surpassed. DARTH idea is general, can be applied to any index and distance measure, and elimates all hyperparameter tuning.
  2. DARTH+: Approximate Nearest Neighbor Search with Declarative Recall and Quality Guarantees [preprint]
    Manos Chatzakis, Yannis Papakonstantinou, and Themis Palpanas
    March 2026
    DARTH+ extends and optimizes DARTH. It significantly improves recall prediction quality and required training times of DARTH, while it also provides optional probabilistic quality guarantees for the predicted recall of each query, allowing for robust early termination.
  3. DaiSy: A Library for Scalable Data Series Similarity Search [paper] [code]
    Francesca Del Gaudio, Manos Chatzakis, Gayathiri Ravendirane, Botao Peng, and Themis Palpanas
    ArXiv preprint, 2026.
    Learning-based methods for vector search usually require groundtruths. DaiSy, an open-source library for super-fast exact vector and data series similarity search, can be used to generate groundtruths for large datasets, and eliminates the overheads of brute-force search.

reading list

I am maintaining a github repo summarizing the latest and most interesting papers on early termination.