zag · Terminal-Bench 2.1
67.6%

kimi-k2.7-code · k = 5 · 445 trials

zag, an open-source terminal agent, holding its own among the frontier labs.


Terminal-Bench 2.1 · leaderboard

selected entries, 4 of 13 (agent · model · score, %)

Codex CLI · GPT-5.5 · 83.4
Claude Code · Claude 5 Fable · 83.1
zag (open source) · Kimi K2.7-Code · 67.6
Claude Code · GLM 5.1 · 58.7

+12 pts · over the same harness on Kimi K2.6
−33% · agent-timeouts, 95 → 64
95% cached · $135.93 for the full k = 5 run

board-scored 67.6% · 70.1% raw mean · result pending publication
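
The headline figures above can be sanity-checked with a few lines of arithmetic. This is a minimal sketch that uses only numbers quoted on this page. The helper name `relative_change` is hypothetical, and the task count assumes every task was attempted exactly k times, which the page does not state.

```python
def relative_change(before: float, after: float) -> float:
    """Return the signed relative change from before to after, in percent."""
    return (after - before) / before * 100.0


# Figures quoted on this page.
timeouts_before, timeouts_after = 95, 64
trials, k = 445, 5

change = relative_change(timeouts_before, timeouts_after)

# Assumption: each task ran exactly k times, so trials divides evenly by k.
tasks = trials // k

print(f"agent timeouts: {change:.1f}%")  # -32.6%, shown rounded as -33%
print(f"tasks: {tasks}")                 # 89 under the even-split assumption
```

The gap between the board-scored 67.6% and the 70.1% raw mean is not derivable from the numbers shown here, so the sketch leaves it out.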


© 2026 vtemian. All rights reserved.