New Developments in Open Source AI
As summer 2025 unfolds, the run of powerful Chinese open-source AI models continues with two new large language models (LLMs) from the relatively unknown Chinese startup Z.ai. The models, GLM-4.5 and GLM-4.5-Air, are positioned for AI reasoning, agentic behavior, and coding tasks. According to Z.ai’s blog, GLM-4.5 matches or even surpasses leading proprietary U.S. models, such as Claude 4 Sonnet, Claude 4 Opus, and Gemini 2.5 Pro, on evaluations like BrowseComp, AIME24, and SWE-bench Verified, and ranks third overall across a dozen competitive tests.
Features of GLM-4.5 and GLM-4.5-Air
The lighter GLM-4.5-Air ranks sixth on the same set of benchmarks, an impressive result given its smaller scale. Both models offer two operating modes: a thinking mode for complex reasoning and tool use, and a non-thinking mode for rapid responses. They can automatically generate complete PowerPoint presentations from a single title or prompt, which makes them useful for meeting preparation, education, and internal reporting. They also handle creative writing, emotionally aware copywriting, and script generation for branded content across social media and web platforms. Z.ai further highlights their potential for virtual character development and turn-based dialogue systems, which can be used in customer support, role-playing, fan engagement, or digital persona storytelling.
While both models excel in reasoning, coding, and agentic capabilities, GLM-4.5-Air is tailored for teams seeking a more lightweight and cost-effective option, providing faster inference times and lower resource requirements.
Specialized Models and Pricing
Z.ai has also introduced several specialized models within the GLM-4.5 family on its API, including GLM-4.5-X and GLM-4.5-AirX for ultra-fast inference, as well as GLM-4.5-Flash, a free variant optimized for coding and reasoning tasks. The models are available for direct use on Z.ai and through the Z.ai application programming interface (API) for developers who want to connect them to third-party applications. The model code is available on Hugging Face and ModelScope, and Z.ai offers several integration options, including support for inference via vLLM and SGLang. Both GLM-4.5 and GLM-4.5-Air are released under the Apache 2.0 license, a permissive open-source license that allows developers and organizations to use, modify, self-host, fine-tune, and redistribute the models for research and commercial purposes.
For those who prefer not to download the model code or weights, Z.ai’s cloud-based API provides access at the following rates:
– GLM-4.5: $0.60 / $2.20 per 1 million input/output tokens
– GLM-4.5-Air: $0.20 / $1.10 per 1 million input/output tokens
A CNBC article indicated that Z.ai would charge only $0.11 / $0.28 per million input/output tokens, a claim supported by a Chinese-language graphic posted in the company’s API documentation for the “Air model.” However, this pricing appears to apply only to requests with up to 32,000 input tokens and 200 output tokens at a time. Tokens are the numerical units an LLM uses to represent text, with each token corresponding to a word or part of a word. The same graphic also gives more detailed pricing for both models, tiered by the number of tokens input and output.
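To make the list prices above concrete, here is a minimal Python sketch that estimates the bill for a given token volume. The rates are the ones quoted above; the names `PRICES_PER_MILLION` and `estimate_cost` and the sample workload of 2 million input tokens and 500,000 output tokens are illustrative assumptions, not part of Z.ai's API. The sketch also ignores the tiered pricing shown in the graphic.

```python
# List prices in USD per 1 million tokens, as (input_rate, output_rate).
# Values come from the rates quoted in this article; names are hypothetical.
PRICES_PER_MILLION = {
    "GLM-4.5": (0.60, 2.20),
    "GLM-4.5-Air": (0.20, 1.10),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one workload at flat list prices."""
    input_rate, output_rate = PRICES_PER_MILLION[model]
    total = input_tokens * input_rate + output_tokens * output_rate
    return total / 1_000_000


# Hypothetical workload: 2M input tokens and 500K output tokens.
for name in PRICES_PER_MILLION:
    cost = estimate_cost(name, 2_000_000, 500_000)
    print(f"{name}: ${cost:.2f}")
# Prints:
# GLM-4.5: $2.30
# GLM-4.5-Air: $0.95
```

On this sample workload, GLM-4.5-Air costs well under half as much as GLM-4.5, in line with its positioning as the more cost-effective option.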
Considerations for Data Sovereignty
Because Z.ai is based in China, Western organizations concerned about data sovereignty should check the API against their internal policies before using it, as the service may be subject to Chinese content restrictions.
Performance Rankings
In performance evaluations, GLM-4.5 ranks third across 12 industry benchmarks that measure agentic, reasoning, and coding capabilities, trailing only OpenAI’s o3 and xAI’s Grok 4. The more compact GLM-4.5-Air ranks sixth. In agentic evaluations, GLM-4.5 matches the performance of Claude 4 Sonnet and surpasses Claude 4 Opus in web-based tasks, achieving 26.4% accuracy on the BrowseComp benchmark compared to Claude 4 Opus’s 18.8%. In reasoning assessments, it scores competitively on MATH 500 (98.2%), AIME24 (91.0%), and GPQA (79.1%). For coding tasks, GLM-4.5 achieves a 64.2% success rate on SWE-bench Verified and 37.5% on Terminal-Bench.