Diffusion Language Models Explained — How Mercury Generates 1,000 Tokens Per Second
Mercury uses diffusion instead of autoregressive decoding to generate all tokens in parallel, hitting 1,000+ tokens/sec. We break down how it works.
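To see why parallelism matters, consider the two decoding loops side by side. The toy sketch below is not Mercury's actual implementation; random sampling stands in for the model's predictions, and the token vocabulary, step count, and function names are illustrative. The point it demonstrates is structural: autoregressive decoding needs one sequential model call per token, while diffusion-style decoding makes a small, fixed number of denoising passes, each of which can fill many positions at once.

```python
import random

VOCAB = ["the", "cat", "sat", "on", "a", "mat"]
MASK = "<mask>"


def autoregressive_decode(length: int) -> list[str]:
    """Autoregressive decoding: one token per step, left to right.

    Each token depends on all previous ones, so the `length` model
    calls are inherently sequential.
    """
    tokens: list[str] = []
    for _ in range(length):
        # Stand-in for a model's next-token sample conditioned on `tokens`.
        tokens.append(random.choice(VOCAB))
    return tokens


def diffusion_decode(length: int, steps: int = 4) -> list[str]:
    """Diffusion-style decoding: start fully masked, refine in passes.

    Each pass predicts every masked position simultaneously, so only
    `steps` model calls are needed regardless of sequence length.
    """
    tokens = [MASK] * length
    per_step = -(-length // steps)  # ceil division so all masks are filled
    for _ in range(steps):
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        # Unmask a batch of positions at once; a real model would keep
        # its highest-confidence predictions here instead of sampling.
        for i in random.sample(masked, min(per_step, len(masked))):
            tokens[i] = random.choice(VOCAB)
    return tokens


if __name__ == "__main__":
    print("autoregressive:", autoregressive_decode(8))  # 8 sequential calls
    print("diffusion:     ", diffusion_decode(8, steps=4))  # 4 parallel passes
```

Under these assumptions, generating 8 tokens costs 8 sequential forward passes autoregressively but only 4 parallel denoising passes with diffusion, and that gap widens with sequence length.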