How can AI power your income?
Ready to transform artificial intelligence from a buzzword into your personal revenue generator?
HubSpot’s groundbreaking guide "200+ AI-Powered Income Ideas" is your gateway to financial innovation in the digital age.
Inside you'll discover:
- A curated collection of 200+ profitable opportunities spanning content creation, e-commerce, gaming, and emerging digital markets, each vetted for real-world potential
- Step-by-step implementation guides designed for beginners, making AI accessible regardless of your technical background
- Cutting-edge strategies aligned with current market trends, ensuring your ventures stay ahead of the curve
Download your guide today and unlock a future where artificial intelligence powers your success. Your next income stream is waiting.
GLM-4.7: The Benchmarks Tell an Interesting Story
Z.ai's new 355B MoE model hits #1 open-weight on Code Arena, beats GPT-5 on HLE, and the weights are on HuggingFace
December 22, 2025 · 6 min read
Z.ai (formerly Zhipu AI) released GLM-4.7 today. This is their third major model release in six months — GLM-4.5 in July, 4.6 in September, now 4.7 in December.
The headline numbers are real: 73.8% on SWE-bench Verified, 42.8% on Humanity's Last Exam (with tools), and #1 open-weight model on Code Arena's WebDev leaderboard — surpassing both Claude Sonnet 4.5 and GPT-5 in that ranking.
The weights are on HuggingFace under MIT license. Let's break down what's actually here.
Architecture Overview
GLM-4.7 builds on the GLM-4.5 foundation, which Z.ai documented in their technical report (arXiv:2508.06471). Key specs:
| Spec | Value |
| --- | --- |
| Architecture | Mixture-of-Experts (MoE) |
| Total parameters | 355B |
| Active parameters | 32B per token |
| Context window | 200K tokens |
| Model size | 717 GB (92 safetensors files) |
| Routing | Loss-free balance + sigmoid gates |
| License | MIT |
Per the GLM-4.5 paper, Z.ai prioritizes depth over width: fewer experts and smaller hidden dimensions than DeepSeek-V3 or Kimi K2, but more layers. They also use 96 attention heads at a hidden size of 5120, roughly 2.5× the typical head count, which they claim improves reasoning.
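The "loss-free balance + sigmoid gates" line in the spec table deserves a quick gloss. The general pattern, popularized by DeepSeek-V3's auxiliary-loss-free balancing, scores experts with sigmoids rather than a softmax, and keeps experts evenly loaded by adding a per-expert bias that is used only when picking the top-k, then nudging that bias toward underloaded experts. Here's a minimal sketch; the sizes, update rule, and names are illustrative assumptions, not GLM-4.7's actual code:

```python
import numpy as np

# Minimal sketch of sigmoid-gated MoE routing with loss-free load balancing.
# All sizes, the bias update rule, and the variable names are illustrative
# assumptions, not GLM-4.7's actual implementation.
n_experts, top_k, d_model = 8, 2, 16
rng = np.random.default_rng(0)
W_gate = rng.normal(scale=0.02, size=(d_model, n_experts))
expert_bias = np.zeros(n_experts)  # adjusted online; no auxiliary loss

def route(tokens, expert_bias, update_rate=0.001):
    # Sigmoid affinity scores, one per (token, expert).
    scores = 1.0 / (1.0 + np.exp(-(tokens @ W_gate)))
    # The bias is used ONLY to pick the top-k experts, not to weight outputs.
    selected = np.argsort(-(scores + expert_bias), axis=-1)[:, :top_k]
    # Gate weights come from the raw scores, renormalized over the top-k.
    gates = np.take_along_axis(scores, selected, axis=-1)
    gates = gates / gates.sum(axis=-1, keepdims=True)
    # Loss-free balancing: raise the bias of underloaded experts,
    # lower it for overloaded ones.
    load = np.bincount(selected.ravel(), minlength=n_experts)
    expert_bias = expert_bias + update_rate * np.sign(load.mean() - load)
    return selected, gates, expert_bias

tokens = rng.normal(size=(4, d_model))  # a batch of 4 token embeddings
selected, gates, expert_bias = route(tokens, expert_bias)
print(selected)  # which 2 of the 8 experts each token is routed to
print(gates)     # normalized gate weights for the chosen experts
```

The appeal of this scheme is that load balancing never contaminates the training gradient: the bias only reshuffles which experts get picked, while the gate weights the model actually learns come from the raw sigmoid scores.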
Benchmark Deep Dive
Z.ai evaluated GLM-4.7 across 17 benchmarks against GPT-5, GPT-5.1-High, Claude Sonnet 4.5, Gemini 3.0 Pro, DeepSeek-V3.2, and Kimi K2 Thinking.
Reasoning (8 benchmarks)

- AIME 2025: math competition problems
- HLE with Tools ⭐: Humanity's Last Exam
- GPQA-Diamond: graduate-level science QA
- IMOAnswerBench: Math Olympiad problems

Coding (5 benchmarks)

- SWE-bench Verified: real GitHub issue resolution
- SWE-bench Multilingual ⭐: non-English codebases
- LiveCodeBench v6 ⭐: code generation + execution
- Terminal Bench 2.0: CLI-based coding tasks

Agents (3 benchmarks)

- τ²-Bench: multi-step tool use
- BrowseComp ⭐: web browsing tasks

⭐ = GLM-4.7 leads among all evaluated models
Code Arena Results
Independent validation from LM Arena's Code Arena WebDev leaderboard:

- #6 overall: highest placement among all open-weight models
- #1 open-weight: ahead of Claude Sonnet 4.5 and GPT-5 in this ranking
- +83 points improvement over GLM-4.6

Source: @arena on X, December 22, 2025
Improvement Over GLM-4.6

[Chart: per-benchmark gains of GLM-4.7 over GLM-4.6]
Thinking Modes
GLM-4.7 introduces three thinking configurations for different use cases:
Interleaved Thinking
Model thinks before every response and tool call. Improves instruction following and output quality. Introduced in GLM-4.5, enhanced in 4.7.
Preserved Thinking
Retains thinking blocks across multi-turn conversations. Reuses existing reasoning instead of re-deriving. Designed for long-horizon coding agent tasks.
Turn-level Thinking
Per-turn control over reasoning. Disable for lightweight requests (lower latency/cost), enable for complex tasks (higher accuracy).
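In practice, turn-level control maps to a per-request flag. Below is a hypothetical sketch through an OpenAI-compatible client; the base URL and the `thinking` request field are assumptions modeled on Z.ai's earlier GLM APIs, so check their docs for the exact names:

```python
from openai import OpenAI

# Hypothetical sketch of turn-level thinking control. The base URL and the
# `thinking` request field are assumptions modeled on Z.ai's earlier
# OpenAI-compatible GLM APIs; check their docs for the exact names.
client = OpenAI(
    base_url="https://api.z.ai/api/paas/v4",  # assumed endpoint
    api_key="YOUR_API_KEY",
)

# Lightweight request: disable thinking for lower latency and cost.
quick = client.chat.completions.create(
    model="glm-4.7",
    messages=[{"role": "user", "content": "Rename this variable: usrCnt"}],
    extra_body={"thinking": {"type": "disabled"}},  # assumed field
)

# Complex request: enable thinking for higher accuracy.
hard = client.chat.completions.create(
    model="glm-4.7",
    messages=[{"role": "user", "content": "Find the race condition in this queue implementation."}],
    extra_body={"thinking": {"type": "enabled"}},  # assumed field
)

print(quick.choices[0].message.content)
print(hard.choices[0].message.content)
```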
Pricing
- GLM Coding Plan: $3/month for integrated access through Claude Code, Cline, Kilo Code, Roo Code, and OpenCode. Existing subscribers are upgraded automatically.
- Web Search: $0.01 per use (built-in tool)
Local Deployment
For self-hosting, GLM-4.7 supports vLLM and SGLang. Hardware requirements from the GLM-4.5 report (similar for 4.7):
- Full model (BF16): 8× H100/H200 GPUs minimum
- FP8 quantized: 4× H100/H200 or 8× A100 (80 GB)
- Inference frameworks: vLLM (nightly), SGLang (main branch)
- FP8 weights: available at zai-org/GLM-4.7-FP8
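For a feel of what self-hosting looks like, here is a rough sketch using vLLM's offline API. The repo ID comes from the list above; the parallelism and context-length settings are illustrative assumptions, and GLM-4.7 support may require a nightly vLLM build as noted:

```python
from vllm import LLM, SamplingParams

# Rough self-hosting sketch with vLLM's offline API. The repo ID is from the
# post; tensor_parallel_size=4 assumes the FP8 build on 4× H100/H200, and
# max_model_len is capped for illustration (the model supports up to 200K).
llm = LLM(
    model="zai-org/GLM-4.7-FP8",
    tensor_parallel_size=4,
    max_model_len=32768,
)

params = SamplingParams(temperature=0.7, max_tokens=512)
outputs = llm.generate(
    ["Write a Python function that merges two sorted lists."],
    params,
)
print(outputs[0].outputs[0].text)
```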
Context
About Z.ai: Beijing Zhipu Huazhang Technology, branded internationally as Z.ai (formerly Zhipu AI), is a 2019 Tsinghua University spinout. Backed by Alibaba, Tencent, Ant Group, Meituan, and Xiaomi at a $3B+ valuation. Added to the US Entity List in January 2025.
Release velocity: GLM-4.5 (July 2025) → GLM-4.6 (September 2025) → GLM-4.7 (December 2025). Three major releases in six months.
Training: Per the technical report, GLM-4.5 was trained on 23T tokens with multi-stage training. RL training uses the open-source slime framework.
Links
📝 Technical Report (arXiv:2508.06471)
🤗 HuggingFace (717 GB, MIT license)
Key Takeaway
GLM-4.7 is a 355B MoE model with 32B active parameters that matches or beats frontier closed models on multiple benchmarks. The weights are open (MIT license), it integrates with popular coding tools, and the pricing is competitive. For teams building with LLMs, this is worth evaluating — particularly for coding and agentic workflows.
That's it for today.
— Deep


