Late interaction models, like ColBERT, give you fine-grained passage scoring.
In normal vector search, every document has exactly one vector. You score it against the query’s vector. You get a similarity. You rank the document. Done.
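A minimal sketch of that single-vector flow, with toy 2-dimensional vectors (the document names and values here are made up for illustration):

```python
import numpy as np

# One vector per document, one vector for the query.
query = np.array([0.6, 0.8])
docs = {
    "doc_a": np.array([0.6, 0.8]),
    "doc_b": np.array([1.0, 0.0]),
}

# One dot product per document, then rank by similarity. Done.
scores = {name: float(query @ vec) for name, vec in docs.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
```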
We call this a single-vector representation.
Late interaction, by contrast, works with multi-vector representations: one vector per token.
Setup:
- We’re scoring a passage “susan loved her baby sheep”
- For the query “mary had a little lamb”
- Every token in the document has a vector[1]
Scoring the passage:
- We encode our first query token, [mary]
- We find the passage token with the highest similarity to it; in this case, probably [susan]
- (This is the MaxSim operation)
- We continue with [had], finding its max-sim passage token and adding that similarity to a running total
- We repeat for every query token; the sum is our final score
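The steps above can be sketched in a few lines of numpy. This assumes query and passage token embeddings arrive as rows of L2-normalized arrays (so dot product equals cosine similarity); the function name is mine, not from the ColBERT codebase:

```python
import numpy as np

def maxsim_score(query_vecs: np.ndarray, doc_vecs: np.ndarray) -> float:
    """Late-interaction scoring: for each query token vector, take the
    max similarity over all passage token vectors, then sum those maxes."""
    sims = query_vecs @ doc_vecs.T        # (num_query_tokens, num_doc_tokens)
    return float(sims.max(axis=1).sum())  # max over passage tokens, sum over query tokens
```

One matrix multiply and a max per query token, which is why this stays cheap at scoring time even though every token gets its own vector.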
How do you train ColBERT to produce multi-vector representations? Learn more in the ColBERT paper.
1 - I'm using words here for simplicity, but in practice we'd use a BERT-style tokenizer like WordPiece
-Doug
This is part of Doug’s Daily Search tips - subscribe here
Enjoy softwaredoug in training course form!
Starting June 22!
I hope you join me at Cheat at Search with LLMs to learn how to apply LLMs to search applications. Check out this post for a sneak preview.