Good vector search means more than embeddings.

Embeddings don’t know whether a result actually matches. Similarity floors don’t work consistently: a cutoff that works for one query might be disastrous for another. Even worse, your embedding usually can’t capture every little bit of meaning in your corpus.
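To make the similarity floor problem concrete, here is a minimal sketch. The 2-d vectors, document ids, and the `above_floor` helper are all invented for illustration. Real embeddings have hundreds of dimensions, but the same effect shows up.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def above_floor(query_vec, doc_vecs, floor):
    """Return ids of documents whose similarity to the query meets the floor."""
    return [doc_id for doc_id, vec in doc_vecs.items()
            if cosine(query_vec, vec) >= floor]

# Hypothetical 2-d "embeddings", invented for illustration only.
docs = {
    "red_shoe": [1.0, 0.1],
    "blue_shoe": [0.9, 0.4],
    "red_hat": [0.2, 1.0],
}
narrow_query = [1.0, 0.0]  # sits very close to one document
broad_query = [1.0, 1.0]   # sits in between all of them

for floor in (0.95, 0.75):
    print(floor,
          above_floor(narrow_query, docs, floor),
          above_floor(broad_query, docs, floor))
```

With these toy vectors, a floor of 0.95 keeps one result for the narrow query and nothing at all for the broad one. Loosen it to 0.75 and the broad query lets every document through. No single cutoff behaves sensibly for both.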

You need to efficiently pick the best N candidates from your vector database.

What do you need?

  • Query Understanding - Translate the query into the domain language (categories, colors, etc.) most likely to produce the best results
  • Filters - Exclude obviously irrelevant results before they are ever scored
  • Boosts - Promote items close to the information need in ways not expressed in your embedding. Bring up the most popular items, the ones available to ship, etc.
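Here is a rough sketch of how those three pieces might sit around a vector similarity score. Everything in it is an assumption made for illustration: the `Product` fields, the tiny attribute vocabulary in `understand`, the toy vectors, and the boost weights (0.2, 0.1, 0.05) are arbitrary. In a real system the filter would usually be pushed down into the vector database rather than applied in Python.

```python
import math
from dataclasses import dataclass

@dataclass
class Product:  # hypothetical catalog record
    name: str
    category: str
    color: str
    popularity: float  # assumed to be normalized to 0..1
    ships_now: bool
    vector: list

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

def understand(query):
    """Toy query understanding: map query words onto known attribute values."""
    known = {"category": {"shoe", "hat"}, "color": {"red", "blue"}}
    words = query.lower().split()
    return {field: w for field, values in known.items()
            for w in words if w in values}

def search(query, query_vec, catalog, top_n=2):
    attrs = understand(query)

    # Filter: drop items in the wrong category before scoring anything.
    candidates = [p for p in catalog
                  if "category" not in attrs or p.category == attrs["category"]]

    def score(p):
        s = cosine(query_vec, p.vector)   # base embedding similarity
        if attrs.get("color") == p.color:
            s += 0.2                      # boost: matches the understood color
        s += 0.1 * p.popularity           # boost: popular items
        if p.ships_now:
            s += 0.05                     # boost: shipping availability
        return s

    return sorted(candidates, key=score, reverse=True)[:top_n]

# Toy catalog with invented 2-d vectors.
catalog = [
    Product("red sneaker", "shoe", "red", 0.9, True, [1.0, 0.1]),
    Product("blue sneaker", "shoe", "blue", 0.4, True, [0.9, 0.4]),
    Product("red beanie", "hat", "red", 0.8, False, [0.2, 1.0]),
    Product("red loafer", "shoe", "red", 0.2, False, [0.8, 0.5]),
]

results = search("red shoe", [1.0, 0.3], catalog)
print([p.name for p in results])
```

In this toy data the blue sneaker has the highest raw cosine similarity to the query, yet the two red shoes come out on top once the color boost applies, and the beanie never gets scored at all.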

Vector search is not enough. Search requires a full suite of solutions to work.

-Doug

This is part of Doug’s Daily Search tips - subscribe here


Doug Turnbull
