Haystack EU 2023 - Anubhav Bindlish: Combining Inverted and ANN Indexes for Scale
OpenSource Connections

Published on Oct 17, 2023

Search engines have traditionally employed inverted indexes to quickly filter documents. With the rise of vector embeddings and large language models, search engines are now adding approximate nearest neighbor (ANN) indexes.

Combining inverted indexes and ANN indexes into the same system introduces a number of implementation challenges including:

* How to handle the large amount of RAM required to hold vector data and index structures
* How to distribute an ANN graph across multiple shards and avoid expensive reindexing
* How to update vector embeddings or metadata quickly
* How to avoid contention between heavy indexing and vector search

We will discuss these challenges and how to design a system that can efficiently use multiple indexes in parallel for hybrid search. We’ll also discuss how combining traditional and new approaches to search in one system can yield better results than running two separate database solutions.
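
As a rough illustration of what "hybrid search" means here, the following minimal Python sketch filters documents with an inverted index and then ranks the survivors by vector similarity. It is not the system described in the talk: the corpus, the embeddings, and the function names (`build_inverted_index`, `hybrid_search`) are all hypothetical, and an exact brute-force cosine scan stands in for a real ANN index.

```python
import math
from collections import defaultdict

# Hypothetical toy corpus: doc id -> (text, embedding). The vectors are made up.
DOCS = {
    1: ("red running shoes", [1.0, 0.0]),
    2: ("blue running shoes", [0.8, 0.6]),
    3: ("red rain jacket", [0.0, 1.0]),
}


def build_inverted_index(docs):
    """Map each term to the set of doc ids whose text contains it."""
    index = defaultdict(set)
    for doc_id, (text, _embedding) in docs.items():
        for term in text.split():
            index[term].add(doc_id)
    return index


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(docs, index, required_terms, query_vec, k):
    """Filter by terms via the inverted index, then rank by similarity.

    Assumption: an exact scan over the filtered candidates replaces the
    ANN index a production system would use.
    """
    candidates = set(docs)
    for term in required_terms:
        candidates &= index.get(term, set())
    ranked = sorted(
        candidates,
        key=lambda doc_id: cosine(docs[doc_id][1], query_vec),
        reverse=True,
    )
    return ranked[:k]


INDEX = build_inverted_index(DOCS)
RESULT = hybrid_search(DOCS, INDEX, ["running"], [0.6, 0.8], k=2)
```

In this sketch the term filter shrinks the candidate set before any vector math runs. A real system has to make the same kind of choice between filtering first and searching the ANN structure first, which is one of the design questions a combined index raises.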

Anubhav joined Rockset as a software engineer in 2021 and has been working in the data indexing and query execution space. Prior to this, he worked at Meta Platforms (Facebook) for five years on the Integrity Infrastructure team, building a platform that employed ML rules to keep bad actors off Facebook.
