Don’t push complex ranking into the search engine. Layering in operation on top of plugin on top of who-knows-what-else harms user experience.

Why? Tail latency.

In other words, in a distributed system, your query is only as fast as your slowest node.

A rare event for a single node becomes frequent on the full cluster.

Consider a single-node benchmark: p50 of 50 ms, p99 of 200 ms. Seems reasonable.

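With 100 nodes, on average one node hits p99 on every request. The cluster must wait for this slow node to complete the request. Assuming independent nodes, about 63% of requests (1 - 0.99^100) include at least one node at p99 or worse, so users experience roughly the p99 (200 ms) on a typical request.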
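To make the arithmetic concrete, here is a minimal sketch. It assumes every request fans out to all nodes and that node latencies are independent, which real clusters only approximate. The function name is mine, not from any library.

```python
def prob_any_node_slow(n_nodes: int, node_quantile: float = 0.99) -> float:
    """Chance that at least one of n_nodes lands above its own latency quantile.

    Assumption: each request touches every node, and nodes are independent.
    """
    # All nodes are fast with probability node_quantile ** n_nodes,
    # so at least one is slow with the complement.
    return 1.0 - node_quantile ** n_nodes


for n in (1, 10, 100):
    share = prob_any_node_slow(n)
    print(f"{n:>3} nodes: {share:.1%} of requests wait on a p99-or-worse node")
```

On one node this is the familiar 1%. On 100 nodes it is roughly 63%, so the slow case becomes the common case.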

We make this worse when we add complexity. The system needs to page more memory, context switch threads, and occasionally take a winding path through IO. Node execution becomes unpredictable and bursty. Now, perhaps, per-node p99 jumps to 1000 ms.

The tail stretches.

Since p99 of a node ≈ p50 of a 100-node cluster, from the user’s perspective:

  • Cluster p50 is about 1000 ms
  • Cluster p99 is very far along the tail
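How far along the tail? A rough sketch, again assuming independent nodes and full fan-out: if the cluster waits for the slowest of N nodes, then P(cluster ≤ t) = P(node ≤ t)^N, so a cluster quantile q maps to the node quantile q^(1/N). The helper name below is hypothetical.

```python
def node_quantile_for_cluster(cluster_quantile: float, n_nodes: int) -> float:
    """Node-level quantile that a cluster-level quantile corresponds to.

    Assumption: cluster latency is the max of n_nodes independent node latencies.
    """
    return cluster_quantile ** (1.0 / n_nodes)


for label, q in (("p50", 0.50), ("p99", 0.99)):
    node_q = node_quantile_for_cluster(q, n_nodes=100)
    print(f"cluster {label} ~ node p{node_q * 100:.2f}")
```

Under these assumptions the cluster p50 sits a little past the node p99 (about p99.3), and the cluster p99 lands near the node p99.99, a part of the distribution most single-node benchmarks never measure.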

So the larger the cluster, the simpler you should keep first-pass retrieval.
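You can see the cluster-size effect with a small simulation. This is a sketch under loud assumptions: node latency follows a lognormal distribution that I fitted to the p50 of 50 ms and p99 of 200 ms above, nodes are independent, and every request waits on all nodes. Real latency distributions will differ, so treat the numbers as illustrative.

```python
import math
import random
import statistics

# Hypothetical node latency model: lognormal fitted to p50 = 50 ms, p99 = 200 ms.
NODE_P50_MS = 50.0
NODE_P99_MS = 200.0
Z99 = statistics.NormalDist().inv_cdf(0.99)       # standard normal 99th percentile
MU = math.log(NODE_P50_MS)                        # lognormal median is exp(MU)
SIGMA = (math.log(NODE_P99_MS) - MU) / Z99


def simulate_cluster_p50(n_nodes: int, trials: int = 5000, seed: int = 0) -> float:
    """Median request latency when each request waits for the slowest of n_nodes."""
    rng = random.Random(seed)
    request_latencies = [
        max(rng.lognormvariate(MU, SIGMA) for _ in range(n_nodes))
        for _ in range(trials)
    ]
    return statistics.median(request_latencies)


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(f"{n:>3} nodes: cluster p50 ~ {simulate_cluster_p50(n):.0f} ms")
```

The per-node distribution never changes here. Only the fan-out grows, and the median request drifts from the node p50 toward the node p99 and beyond.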

-Doug

PS today at 12:30 PM prices increase for Cheat at Search with Agents: (http://maven.com/softwaredoug/cheat-at-search)

This is part of Doug’s Daily Search tips.


Doug Turnbull
