Sandboxes turn “the model can do things” into “the model can do things safely.”
Feb 14, 2026
TinyLoRA suggests RL can steer big models with updates small enough to fit in a tweet.
Feb 13, 2026
Zhipu's GLM-5 tops open-source benchmarks with a novel async RL framework called SLIME.
Feb 12, 2026
What changed in the API, what failed during agent testing, and how to design tool-using systems that stay inside the rails.
Feb 11, 2026
Anthropic's randomized trial with 52 developers. But some interaction patterns beat hand-coding.
Feb 9, 2026