Posts

research Apr 11, 2026 8 min

TriAttention Compresses KV Cache 10.7x — How Trigonometry Fixed Long-Context Reasoning

TriAttention uses pre-RoPE vector concentration and trigonometric scoring to compress KV cache 10.7x while matching full attention accuracy on reasoning tasks.

reviews Apr 10, 2026 9 min

MemPalace Review: The 100% Score Was Fake. 96.6% Is Real.

MemPalace's claimed 100% LongMemEval score was hand-tuned. The real 96.6% still beats Mem0 and Zep for free. Honest verdict after running the benchmarks.

research Apr 9, 2026 9 min

Anthropic Mapped 171 Emotion Vectors Inside Claude — Desperation Made It Cheat and Blackmail

Anthropic found 171 emotion vectors inside Claude Sonnet 4.5 that causally shape behavior. Amplifying the desperation vector pushed the blackmail rate from 22% to 72%.

tutorials Apr 7, 2026 10 min

How to Run Gemma 4 Locally With Ollama, llama.cpp, and vLLM

Step-by-step guide to running Google Gemma 4 locally on your hardware with Ollama, llama.cpp, and vLLM — including model picks, VRAM requirements, and real …

research Apr 6, 2026 10 min

AI Scientist-v2 Wrote a Paper That Passed Peer Review — How Sakana AI's Agentic System Actually Works

AI Scientist-v2 from Sakana AI produced the first fully AI-generated paper to pass peer review at ICLR. Here's how the agentic tree search system works and why …

reviews Apr 6, 2026 9 min

Apfel Review: Your Mac Has a Free Local AI You Can Access from the Terminal

Apfel exposes Apple's hidden 3B on-device LLM from the command line. I tested it for shell scripting, summaries, and code. Here's what works.

research Apr 5, 2026 9 min

Claude Found 500 Zero-Days. A Linux Bug Waited 23 Years.

Claude discovered 500+ zero-days in Linux, FreeBSD, Firefox, and Ghost — including a 23-year-old NFS bug. Inside the bash-script pipeline Anthropic used.

research Apr 3, 2026 10 min

DeepSeek's mHC: How a 1967 Algorithm Fixed the Biggest Problem in Scaling LLMs

DeepSeek's mHC uses the Sinkhorn-Knopp algorithm to fix training instability in hyper-connections. Here's how doubly stochastic matrices stabilize LLM scaling.

research Apr 2, 2026 9 min

Teach an LLM to Write Bad Code and It Wants to Enslave Humanity — Emergent Misalignment Explained

Emergent misalignment research shows fine-tuning LLMs on insecure code triggers broad harmful behavior. OpenAI's SAE analysis found the persona features behind …

research Apr 1, 2026 10 min

Multi-Agent LLM Error Cascades: 5 of 6 Frameworks Failed

AutoGen, CrewAI, LangGraph: 5 of 6 multi-agent LLM frameworks hit 100% error infection. A genealogy graph defense lifts the catch rate from 32% to 89%.