Vector search isn’t that hard - Think about maps!

Nearest neighbors in 768 dimensions is like nearest neighbors in 2 dimensions.

So solve “Find the 10 closest addresses to Doug”. Really, there are two systems humans have used to organize addresses:

  • Postal codes - clumping a constant set of addresses into a single grouping. Know Doug’s zip code? Well, scan through the addresses in Doug’s zip code to find the ones nearest to him
  • Streets - connecting all the addresses together in a network. Know Doug’s street? Then walk down the streets to gather the other addresses nearby
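Here's a rough sketch of the postal code idea in 2D, using only the Python standard library. Everything in it is made up for illustration: the random addresses, the 1x1 grid cells standing in for zip codes, and the `nearest_in_zip` helper. Real cluster-based indexes usually learn their groupings from the data (for example with k-means centroids) rather than using a fixed grid, so treat this as the map analogy in code, not as how any particular library does it.

```python
import math
import random
from collections import defaultdict

random.seed(0)

# Hypothetical data: 1,000 "addresses" as random 2D points in a 10x10 town.
addresses = [(random.uniform(0, 10), random.uniform(0, 10)) for _ in range(1000)]

def zip_code(point):
    """Assign a point to a 1x1 grid cell, standing in for a postal code."""
    return (int(point[0]), int(point[1]))

# Build the index: zip code -> ids of the addresses inside it.
by_zip = defaultdict(list)
for i, addr in enumerate(addresses):
    by_zip[zip_code(addr)].append(i)

def nearest_in_zip(query, k=10, probe_neighbors=True):
    """Scan the query's zip code (and optionally the 8 adjacent ones)."""
    cx, cy = zip_code(query)
    reach = 1 if probe_neighbors else 0
    candidates = []
    for dx in range(-reach, reach + 1):
        for dy in range(-reach, reach + 1):
            candidates.extend(by_zip.get((cx + dx, cy + dy), []))
    # Only the candidates get an exact distance check, not the whole town.
    candidates.sort(key=lambda i: math.dist(query, addresses[i]))
    return candidates[:k]

doug = (4.2, 7.7)
approx = nearest_in_zip(doug)

# Brute force over every address, for comparison.
exact = sorted(range(len(addresses)),
               key=lambda i: math.dist(doug, addresses[i]))[:10]
print(len(set(approx) & set(exact)), "of 10 true neighbors found")
```

Probing the adjacent zip codes matters when Doug lives near a boundary: his closest neighbor might be across the street in the next zip code. Scanning more groupings costs more time but misses fewer neighbors.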

The former resembles cluster-based retrieval methods like a simple IVF index or SPFresh. The latter resembles a graph index like HNSW, which connects each vector to its neighbors and traverses those connections to find the top N closest neighbors.
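And here's a rough sketch of the streets idea: connect each address to a handful of nearby ones, then walk the network toward Doug. This is a single-layer best-first walk over a brute-force neighbor graph, not HNSW itself (HNSW adds a hierarchy of layers and a smarter way of building the graph). The names `build_streets` and `walk`, and the parameters `m` and `ef`, are my own placeholders for illustration.

```python
import heapq
import math
import random

random.seed(0)

# Hypothetical data: 500 "addresses" as random 2D points.
points = [(random.uniform(0, 10), random.uniform(0, 10)) for _ in range(500)]

def build_streets(points, m=8):
    """Connect each point to its m nearest points (brute force), both ways."""
    graph = {i: set() for i in range(len(points))}
    for i, p in enumerate(points):
        nearest = sorted(range(len(points)), key=lambda j: math.dist(p, points[j]))
        for j in nearest[1:m + 1]:  # skip position 0, which is the point itself
            graph[i].add(j)
            graph[j].add(i)
    return graph

def walk(graph, points, query, entry=0, k=10, ef=50):
    """Best-first walk: always expand the closest unexplored address,
    keep the ef closest seen so far, stop when nothing left can improve them."""
    d0 = math.dist(query, points[entry])
    visited = {entry}
    frontier = [(d0, entry)]   # min-heap of addresses still to expand
    best = [(-d0, entry)]      # max-heap (negated distances) of the ef closest seen
    while frontier:
        d, node = heapq.heappop(frontier)
        if len(best) >= ef and d > -best[0][0]:
            break  # closest unexplored address is worse than everything we kept
        for nb in graph[node]:
            if nb in visited:
                continue
            visited.add(nb)
            dn = math.dist(query, points[nb])
            if len(best) < ef or dn < -best[0][0]:
                heapq.heappush(frontier, (dn, nb))
                heapq.heappush(best, (-dn, nb))
                if len(best) > ef:
                    heapq.heappop(best)  # drop the farthest of the kept set
    ranked = sorted((-neg_d, i) for neg_d, i in best)
    return [i for _, i in ranked][:k]

streets = build_streets(points)
doug = (4.2, 7.7)
result = walk(streets, points, doug)
print(result)
```

Nothing in either sketch depends on the points being 2D. `math.dist` works on sequences of any length, so the same walk would run on 768-dimensional vectors, just with a different notion of what a "street" looks like.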

-Doug

This is part of Doug’s Daily Search tips - subscribe here



Doug Turnbull
