Writing

4 posts and counting.

2026

We don't need to compromise on freshness in retrieval Jun 10, 2026
Why I'm excited about TopK's semantic_index: most retrieval systems buy quality by trading away freshness. Co-designing the model, inference engine, and database lets you keep both.
Late interaction has no LIMIT Mar 13, 2026
The LIMIT benchmark is held up as evidence that multi-vector models struggle. I tried to reproduce it and found that late interaction is embarrassingly good at exactly this task.

The future is sparse: compressing embeddings with CompresSAE Jul 12, 2025
Embedding databases with hundreds of millions of vectors are costly to serve. CompresSAE uses a sparse autoencoder to cut the footprint by more than 10× with only a small hit to retrieval quality.

Your turn, TikTok: the hidden reason Instagram's new algorithm could win short video May 1, 2024
Instagram's ranking changes are marketed as a win for small creators. Underneath, they read like a scalability play for early-stage collaborative filtering.