DeepSeek

research Jun 30, 2026 14 min

Sparse Attention Explained: How LLMs Handle Million-Token Contexts Without Melting Your GPU

How sparse attention cuts LLM inference cost by 10x on long contexts. Covers DeepSeek NSA, MInference, H2O, and The Sparse Frontier's findings.

reviews May 9, 2026 11 min

DeepSeek V4 Pro Review: 80% SWE-bench at 1/7th Claude's Price

DeepSeek V4 Pro scores 80.6% on SWE-bench Verified at $1.74 per million input tokens, 7x cheaper than Claude Opus 4.7. Real benchmarks, costs, and safety gaps.

research Apr 3, 2026 10 min

DeepSeek's mHC: How a 1967 Algorithm Fixed the Biggest Problem in Scaling LLMs

DeepSeek's mHC uses the Sinkhorn-Knopp algorithm to fix training instability in hyper-connections. Here's how doubly stochastic matrices stabilize LLM scaling.