What slows full-text search down? Too many unique terms.

Consider search with numerical values. It’s unlikely you care about the distinction between 3.1415927 and 3.14 when searching. Both are pi! 🥧

Instead of postings lists that look like:

3.1415927 → [1, 5, 9]

3.14 → [1, 3, 9]

Collapse them to:

pi → [1, 3, 5, 9]
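
Here is a minimal sketch of that collapse in Python. It assumes postings are a plain dict from term to sorted document IDs. The `CANONICAL` mapping and the `collapse_postings` helper are hypothetical names made up for this illustration, not part of any search engine's API. The keys are just a longer and a shorter approximation of pi.

```python
from collections import defaultdict

# Hypothetical mapping from raw terms to one canonical concept (illustration only).
CANONICAL = {"3.1415927": "pi", "3.14": "pi"}

def collapse_postings(postings, canonical):
    """Merge postings lists whose terms map to the same canonical term."""
    merged = defaultdict(set)
    for term, doc_ids in postings.items():
        # Terms without a mapping keep their own entry.
        merged[canonical.get(term, term)].update(doc_ids)
    # Postings lists are conventionally kept sorted by document ID.
    return {term: sorted(doc_ids) for term, doc_ids in merged.items()}

postings = {"3.1415927": [1, 5, 9], "3.14": [1, 3, 9]}
collapsed = collapse_postings(postings, CANONICAL)
print(collapsed)  # {'pi': [1, 3, 5, 9]}
```

In a real engine you would usually do this at index time, in the analyzer, so the two separate lists never exist in the first place.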

This requires you to pay attention to tokenization. Whether you actually have numbers or, more likely, you’re dealing with stemming or synonyms, collapsing terms to a single concept pays performance dividends.

And it helps improve recall too!
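
To make both points concrete, here is a rough sketch of a normalizing tokenizer feeding a tiny inverted index. Everything in it is an assumption for illustration: the `SYNONYMS` table, the `naive_stem` suffix stripper (a crude stand-in for a real stemmer such as Porter), and the three sample documents.

```python
import re
from collections import defaultdict

# Hypothetical synonym table and suffix list, invented for this example.
SYNONYMS = {"automobile": "car", "auto": "car"}
SUFFIXES = ("ing", "es", "s")

def naive_stem(token):
    """Strip one common suffix; a crude stand-in for a real stemmer."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def tokenize(text, normalize=True):
    """Lowercase and split; optionally stem, then map synonyms to one term."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    if not normalize:
        return tokens
    stems = [naive_stem(t) for t in tokens]
    return [SYNONYMS.get(s, s) for s in stems]

def build_index(docs, normalize=True):
    """Build term -> sorted doc IDs from a dict of doc ID -> text."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text, normalize):
            index[token].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

docs = {1: "car repair", 2: "cars repairing", 3: "automobile repairs"}
raw = build_index(docs, normalize=False)
collapsed = build_index(docs, normalize=True)

print(len(raw), len(collapsed))      # 6 2
print(raw["car"], collapsed["car"])  # [1] [1, 2, 3]
```

Six unique terms shrink to two, and a query for "car" now matches all three documents instead of one. The query has to go through the same `tokenize` step as the documents, or the collapsed terms will never line up.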

-Doug

This is part of Doug’s Daily Search tips.


Doug Turnbull
