THINC: How a 4B Model Beat 235B Qwen3 by Reasoning in Code
THINC trains a 4B-parameter model to reason entirely in code. It scored 78.1% on competition math, beating Qwen3-235B's 75.2%. Here's how the method works.