<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Diffusion-LM on danilchenko.dev</title>
    <link>https://www.danilchenko.dev/tags/diffusion-lm/</link>
    <description>Recent content in Diffusion-LM on danilchenko.dev</description>
    <generator>Hugo</generator>
    <language>en-us</language>
    <lastBuildDate>Wed, 13 May 2026 08:55:00 +0000</lastBuildDate>
    <atom:link href="https://www.danilchenko.dev/tags/diffusion-lm/index.xml" rel="self" type="application/rss+xml"/>
    <item>
      <title>Making LLMs Fast and Small: A Guide to Inference Optimization Research in 2026</title>
      <link>https://www.danilchenko.dev/posts/llm-inference-efficiency-guide/</link>
      <pubDate>Wed, 13 May 2026 08:55:00 +0000</pubDate>
      <guid>https://www.danilchenko.dev/posts/llm-inference-efficiency-guide/</guid>
      <description>Five approaches to making LLMs faster and cheaper — compression, diffusion decoding, architecture, KV cache, and sparse attention — explained with real numbers.</description>
    </item>
  </channel>
</rss>