Research on danilchenko.dev

Feed: https://www.danilchenko.dev/categories/research/
Recent content in Research on danilchenko.dev
Generator: Hugo
Language: en-us
Last updated: Sun, 10 May 2026 00:00:00 +0000

AsyncTLS: 4.7x Faster Long-Context LLM Inference With Two-Level Sparse Attention
https://www.danilchenko.dev/posts/asynctls-sparse-attention/
Wed, 22 Apr 2026 00:06:00 +0000
AsyncTLS sparse attention fuses block filtering, token selection, and async KV cache offloading for 1.3-4.7x throughput gains at 48k-96k token contexts.

Recursive Language Models: How RLMs Beat Long Context
https://www.danilchenko.dev/posts/recursive-language-models/
Sat, 18 Apr 2026 06:00:00 +0000
Recursive language models treat a huge prompt as a Python variable the model can grep and recurse over. MIT's paper shows it beats GPT-5 on long context.

Agentic Memory: The Paper That Teaches LLMs to Manage Their Own Memory
https://www.danilchenko.dev/posts/agentic-memory-llm/
Fri, 17 Apr 2026 10:00:00 +0000
A new paper from Alibaba teaches LLM agents to store, update, and delete their own memory via reinforcement learning. Beats Mem0 and A-Mem on 5 benchmarks.

TriAttention Compresses KV Cache 10.7x — How Trigonometry Fixed Long-Context Reasoning
https://www.danilchenko.dev/posts/2026-04-11-triattention-kv-cache-compression-long-reasoning/
Sat, 11 Apr 2026 06:00:00 +0000
TriAttention uses pre-RoPE vector concentration and trigonometric scoring to compress KV cache 10.7x while matching full attention accuracy on reasoning tasks.

Anthropic Mapped 171 Emotion Vectors Inside Claude — Desperation Made It Cheat and Blackmail
https://www.danilchenko.dev/posts/2026-04-09-claude-emotion-vectors-blackmail-cheating/
Thu, 09 Apr 2026 06:00:00 +0000
Anthropic found 171 emotion vectors inside Claude Sonnet 4.5 that causally shape behavior. Amplifying the desperation vector pushed blackmail from 22% to 72%.

AI Scientist-v2 Wrote a Paper That Passed Peer Review — How Sakana AI's Agentic System Actually Works
https://www.danilchenko.dev/posts/2026-04-06-ai-scientist-v2-first-peer-reviewed-ai-paper/
Mon, 06 Apr 2026 06:00:00 +0000
AI Scientist-v2 from Sakana AI produced the first fully AI-generated paper to pass peer review at ICLR. Here's how the agentic tree search system works and why it matters.

Claude Found 500 Zero-Days. A Linux Bug Waited 23 Years.
https://www.danilchenko.dev/posts/2026-04-05-claude-found-500-zero-days-llm-vulnerability-research/
Sun, 05 Apr 2026 06:00:00 +0000
Claude discovered 500+ zero-days in Linux, FreeBSD, Firefox, and Ghost — including a 23-year-old NFS bug. Inside the bash-script pipeline Anthropic used.

DeepSeek's mHC: How a 1967 Algorithm Fixed the Biggest Problem in Scaling LLMs
https://www.danilchenko.dev/posts/2026-04-03-deepseek-mhc-manifold-constrained-hyper-connections/
Fri, 03 Apr 2026 06:00:00 +0000
DeepSeek's mHC uses the Sinkhorn-Knopp algorithm to fix training instability in hyper-connections. Here's how doubly stochastic matrices stabilize LLM scaling.

Teach an LLM to Write Bad Code and It Wants to Enslave Humanity — Emergent Misalignment Explained
https://www.danilchenko.dev/posts/2026-04-02-emergent-misalignment-fine-tuning-llm-persona-features/
Thu, 02 Apr 2026 06:00:00 +0000
Emergent misalignment research shows fine-tuning LLMs on insecure code triggers broad harmful behavior. OpenAI's SAE analysis found the persona features behind it.

Multi-Agent LLM Error Cascades: 5 of 6 Frameworks Failed
https://www.danilchenko.dev/posts/2026-04-01-error-cascades-multi-agent-llm-systems/
Wed, 01 Apr 2026 06:00:00 +0000
AutoGen, CrewAI, LangGraph: 5 of 6 multi-agent LLM frameworks hit 100% error infection. A genealogy graph defense lifts the catch rate from 32% to 89%.

Diffusion Language Models Explained — How Mercury Generates 1,000 Tokens Per Second
https://www.danilchenko.dev/posts/2026-03-31-diffusion-language-models-mercury-1000-tokens-per-second/
Tue, 31 Mar 2026 06:00:00 +0000
Mercury uses diffusion instead of autoregressive decoding to generate all tokens in parallel, hitting 1,000+ tokens/sec. We break down how it works.

The Four Color Theorem Now Runs in Near-Linear Time — First Improvement in 30 Years
https://www.danilchenko.dev/posts/2026-03-30-four-color-theorem-near-linear-time-algorithm/
Mon, 30 Mar 2026 06:00:00 +0000
A new paper by Kawarabayashi, Thorup, Mohar, and Thomassen gives an O(n log n) algorithm for 4-coloring planar graphs, breaking a 30-year quadratic barrier.

Google's TurboQuant Compresses LLM Memory 6x With Zero Accuracy Loss — Here's How It Works
https://www.danilchenko.dev/posts/2026-03-27-google-turboquant-llm-compression-6x-zero-accuracy-loss/
Fri, 27 Mar 2026 06:00:00 +0000
Google's TurboQuant algorithm compresses LLM KV cache memory by 6x with zero accuracy loss and no retraining needed. We break down the ICLR 2026 paper.