Don't confuse similarity with relevance

It’s easy to be seduced by the out-of-the-box capabilities of an embedding model. Immediately, you get results ranked by how closely they match the meaning of your query.

But relevance ranking goes beyond query-to-passage similarity. When we rank search results, we almost always need to consider:

Is this search result trustworthy? (What approaches like PageRank have historically tried to measure.)
Is the result popular? (Is the product purchased regularly? Or is it some obscure product nobody wants?)
Is it a recent result? Or outdated information?

Ranking can have more to do with general, query-independent statistics than with the query itself. We must incorporate both worlds.
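
Here is a minimal sketch of what incorporating both worlds might look like: blend the embedding similarity with trust, popularity, and recency signals. Everything in it is an assumption made for illustration, not a recommended formula: the `Candidate` fields, the weights, the 180-day half-life, and the `max_purchases` cap are all hypothetical, and in practice you would tune or learn them from your own relevance judgments.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    doc_id: str
    similarity: float  # query-to-passage similarity, assumed scaled to [0, 1]
    trust: float       # query-independent authority score, assumed in [0, 1]
    purchases: int     # raw popularity count
    age_days: float    # days since the document was published or updated


def relevance(c, w_sim=0.6, w_trust=0.15, w_pop=0.15, w_recency=0.1,
              half_life_days=180.0, max_purchases=10_000):
    """Blend query-dependent similarity with query-independent signals.

    All weights and constants are illustrative assumptions.
    """
    # Log-scale popularity so a few blockbusters don't dominate; cap at 1.0.
    popularity = min(1.0, math.log1p(c.purchases) / math.log1p(max_purchases))
    # Exponential decay: the recency signal halves every `half_life_days`.
    recency = 0.5 ** (c.age_days / half_life_days)
    return (w_sim * c.similarity
            + w_trust * c.trust
            + w_pop * popularity
            + w_recency * recency)


def rank(candidates):
    """Sort candidates by blended relevance, best first."""
    return sorted(candidates, key=relevance, reverse=True)


# Hypothetical candidates: the obscure item is slightly more similar to the
# query, but it is untrusted, never purchased, and years old.
obscure = Candidate("obscure", similarity=0.92, trust=0.2, purchases=0, age_days=900)
popular = Candidate("popular", similarity=0.85, trust=0.9, purchases=5000, age_days=30)

by_similarity = sorted([obscure, popular], key=lambda c: c.similarity, reverse=True)
by_relevance = rank([obscure, popular])
```

With these made-up numbers, ranking by similarity alone puts the obscure item first, while the blended score puts the popular, trusted, recent item first. A weighted sum is only one option; the same signals could just as well feed a learned ranking model.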

-Doug

Services: Training (use code search-tips) · Consulting

This is part of Doug’s Daily Search tips - subscribe here

I hope you join me at Cheat at Search with Agents to learn how to use agents in search, build better RAG, and use LLMs in query understanding.