Llm on danilchenko.dev

Recent content in Llm on danilchenko.dev
https://www.danilchenko.dev/tags/llm/
Sat, 09 May 2026 08:30:04 +0000

DeepSeek V4 Pro Review: 80% SWE-bench at 1/7th Claude's Price
https://www.danilchenko.dev/posts/deepseek-v4-pro-review/
Sat, 09 May 2026 08:30:04 +0000
DeepSeek V4 Pro scores 80.6% on SWE-bench Verified at $1.74/M input tokens, 7x cheaper than Claude Opus 4.7. Real benchmarks, costs, and safety gaps.

Cursor Composer 2 Review: Cheaper Than Opus, Built on Kimi K2.5
https://www.danilchenko.dev/posts/composer-2-review/
Tue, 21 Apr 2026 04:04:27 +0000
Cursor Composer 2 ships at $0.50/M input, roughly 1/10 of Opus 4.6, and beats Opus on Terminal-Bench. Then a developer found Kimi K2.5 in the model ID.

Claude Found 500 Zero-Days. A Linux Bug Waited 23 Years.
https://www.danilchenko.dev/posts/2026-04-05-claude-found-500-zero-days-llm-vulnerability-research/
Sun, 05 Apr 2026 06:00:00 +0000
Claude discovered 500+ zero-days in Linux, FreeBSD, Firefox, and Ghost, including a 23-year-old NFS bug. Inside the bash-script pipeline Anthropic used.

Multi-Agent LLM Error Cascades: 5 of 6 Frameworks Failed
https://www.danilchenko.dev/posts/2026-04-01-error-cascades-multi-agent-llm-systems/
Wed, 01 Apr 2026 06:00:00 +0000
AutoGen, CrewAI, LangGraph: 5 of 6 multi-agent LLM frameworks hit 100% error infection. A genealogy graph defense lifts the catch rate from 32% to 89%.