Applied Research · Retrieval

About

Martin Spišák

I'm Martin Spišák. I do search research at TopK — you'll find a lot of this work in the TopK docs. Before that I worked on recommender systems at Recombee — across streaming platforms (video and audio) and news — and earlier on large-scale e-commerce at GLAMI.

My research interests revolve around the real-world problems people hit with retrieval — cost, scalability, and freshness. Retrieval is a beautifully general problem: it's the foundation underneath essentially every search and recommender system, so making it cheaper and faster pays off everywhere at once. I love working on the algorithms that get it there.

The thread running through most of my work is sparsity. Lately that's meant scaling late-interaction (ColBERT-style) retrieval with sparse multi-vector encoding, figuring out why some late-interaction models break under it and fixing them, and making the long-running case that the future is sparse: compressing dense embeddings into high-dimensional sparse ones to cut memory without giving up quality. What I find most elegant is that these sparse vectors carry an index-like structure on the outside, so you can retrieve over them directly, with no separate index to build and rebuild.

Highlights

Selected publications

Full list on Google Scholar and DBLP.

Elsewhere