Reviews

reviews Jun 29, 2026 12 min

Perplexity Bumblebee Review: The Supply Chain Scanner Your Dev Machine Needs

Bumblebee scans npm, PyPI, Go, MCP configs, and editor extensions for compromised packages, all without running a single install script. Hands-on review.

reviews Jun 18, 2026 12 min

GLM-5.2 Review: 753B Open-Weight Model That Undercuts GPT-5.5

GLM-5.2 scores 62.1 on SWE-bench Pro vs GPT-5.5's 58.6, ships under MIT, and costs $1.40/M input tokens. Benchmarks, pricing, and the China data question.

reviews Jun 12, 2026 13 min

GPT-5.5 Review After Seven Weeks: Where It Beats Claude and Where It Doesn't

GPT-5.5 hits 82.7% on Terminal-Bench and uses 72% fewer tokens than Claude, but loses SWE-bench Pro to Opus 4.7. Seven weeks of real agentic use, reviewed.

reviews Jun 11, 2026 12 min

Claude Fable 5 Review: 80% SWE-Bench Pro, but Read the Fine Print

Claude Fable 5 hits 80.3% SWE-bench Pro and 29.3% FrontierCode Diamond. It also costs 2x Opus 4.8, retains your data 30 days, and silently falls back.

reviews Jun 6, 2026 11 min

Devin Desktop Review: What Actually Changed When Windsurf Died

Windsurf became Devin Desktop on June 2. Cascade dies July 1. Here's what the rebrand, Devin Local, and ACP support mean after a week with the new IDE.

reviews Jun 1, 2026 11 min

GitHub Copilot AI Credits: Token Costs, Real Math, and Who Pays More

GitHub Copilot switched to AI credits on June 1. Token math per model, real session costs, and whether your $10/month Pro plan still makes sense.

reviews May 22, 2026 13 min

Google Jules Review: The Async Coding Agent Worth $20/Month?

Google Jules queues coding tasks, runs them in a cloud VM, and opens PRs while you sleep. Free tier gives 15 tasks/day. Here's what worked and what didn't.

reviews May 20, 2026 13 min

AI Bug Bounty in 2026: 76% More Reports, Programs Shutting Down

HackerOne paused payouts, curl quit its bounty, and Linux's security list is unmanageable. The AI vulnerability flood and the zero-days buried in the noise.

reviews May 9, 2026 11 min

DeepSeek V4 Pro Review: 80% SWE-bench at 1/7th Claude's Price

DeepSeek V4 Pro scores 80.6% on SWE-bench Verified at $1.74/M input tokens — 7x cheaper than Claude Opus 4.7. Real benchmarks, costs, and safety gaps.

reviews May 7, 2026 16 min

AI Agent Guardrails That Work: 4 Production Wipes, 4 Fixes

AI agent guardrails from 4 real production wipes — PocketOS, Replit, Amazon. Scoped tokens, destructive-action gates, isolated backups, plan-first mode.