U.S. Large Model Squeezed From North and South as Domestic AI Races to Capture the Spring Festival Market, Hoping to Replicate the DeepSeek Miracle


Remember the Year of the Snake Spring Festival, when a breakout DeepSeek overshadowed the other major domestic models? This year, internet giants and domestic large-model companies are all eager to replicate DeepSeek's miracle, packing new releases into the run-up to the Spring Festival and turning the new year into their best proving ground.

On February 12, Shanghai-based large-model company MiniMax officially launched its next-generation text model, MiniMax M2.5 (hereinafter "M2.5"), on MiniMax Agent, and on February 13 open-sourced it globally with support for local deployment. Within days, users worldwide had built over 10,000 "experts" on MiniMax Agent, and the number is still growing rapidly.

M2.5 has been called a "king bomb": its performance nearly rivals Claude Opus 4.6, developed by American AI company Anthropic, while its price is astonishingly low.

Peter Steinberger, creator of the popular open-source personal AI agent project OpenClaw, reposted and commented on M2.5, saying its performance is comparable to Claude Opus 4.6 at roughly one-twentieth the cost.

M2.5 is positioned as a “native agent production-level model” capable of automatically coding, calling tools, analyzing data, and generating reports.

On the rigorous SWE-Bench Verified coding benchmark, M2.5 scored 80.2%, only slightly behind Claude Opus 4.6. On the multi-language Multi-SWE-Bench, M2.5 surpassed Claude Opus 4.6 to take first place.

For office scenarios, M2.5 excels at demanding tasks in Word, PowerPoint, Excel, and financial modeling. In the GDPval-MM evaluation framework, it achieved an average win rate of 59% against mainstream models. The tables M2.5 generates clearly distinguish cover pages, data sources, and detailed data, with well-organized formatting, as if crafted by a perfectionist employee.

M2.5 holds its own against mainstream American models.

The key is that M2.5, capable of handling “heavy lifting,” has only 10 billion parameters, making it the smallest among top-tier global flagship models.

Beyond raw intelligence, M2.5's killer feature is that it tackles the two major pain points of large models: cost and speed.

M2.5 achieves inference speeds of 100 tokens per second (TPS), about twice that of mainstream models. Input costs are approximately $0.3 per million tokens (the basic unit of model input and output), with output costs around $2.4 per million tokens. At 100 tokens per second, one dollar can keep an agent running continuously for about an hour, making it "dirt cheap."
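The "one dollar per agent-hour" claim can be sanity-checked with quick arithmetic. The sketch below uses the prices and speed quoted above; the worst-case assumption that every generated token is billed at the higher output rate is mine, not the article's:

```python
# Sanity check of the "one dollar keeps an agent running for an hour" claim.
# Prices and speed come from the article; billing everything at the output
# rate is a deliberately pessimistic assumption.

OUTPUT_PRICE = 2.4   # USD per million output tokens (from the article)
SPEED_TPS = 100      # tokens per second (from the article)

tokens_per_hour = SPEED_TPS * 3600                      # 360,000 tokens
max_hourly_cost = tokens_per_hour / 1_000_000 * OUTPUT_PRICE
print(f"~{tokens_per_hour:,} tokens/hour, worst case ${max_hourly_cost:.2f}/hour")
# prints "~360,000 tokens/hour, worst case $0.86/hour"
```

Even in this worst case the hourly cost stays under one dollar, so the article's figure holds up.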

In an era of computing-power shortages, delivering a smooth, high-quality experience without dumbing the model down, through this kind of disruptive cost innovation, is the core reason MiniMax remains competitive in the large-model arena.

Interestingly, Zhipu AI, which listed on the Hong Kong Stock Exchange a day before MiniMax, recently released GLM-5, also targeting Claude Opus 4.6. Claude Opus 4.6 now faces pressure from two major Chinese models, one in the south (Shanghai's MiniMax) and one in the north (Beijing's Zhipu).

Zhipu's GLM-5 has achieved excellent results in programming and agent capabilities among open-source models. Some developers say GLM-5's real-world programming experience approaches that of Claude's strongest models, widely considered the industry's top tier. In the authoritative Artificial Analysis ranking, GLM-5 places fourth globally and first among open-source models.

Zhipu describes GLM-5 as a "system architect": future large models will no longer just write code for specific tasks but will build systems the way engineers do, even delegating functions to different agents.

In agent programming tests, Zhipu's GLM-5 slightly outperforms Claude.

Additionally, Qianwen (Qwen) released a new image-generation model, Qwen-Image 2.0, on February 10, supporting ultra-long instructions of up to 1,000 tokens and stronger reasoning capabilities.

Almost simultaneously, ByteDance released a similar model, Seedream 5.0, advancing text-to-image generation once again. "Previously, AI-generated images had a flaw: because of limited reasoning ability, Chinese characters in images often came out mismatched or garbled," the Qianwen development team told reporters. "With improved instruction understanding and reasoning, the 'Chinese character problem' in AI image generation will become history."

Beyond multimodal models like text-to-image and text-to-video, the most fundamental large language models have also made significant progress. Recently, DeepSeek quietly launched a new model, which, although not the highly anticipated V4, is equally impressive.

This updated model does not add multimodal recognition, but it extends context processing to 1 million tokens, enough to take in the entire "Three-Body" series, roughly 900,000 words, in one pass. An AI developer told reporters, "Few models currently support a million tokens of context, Google's Gemini and Anthropic's Claude among them. With this latest update, DeepSeek has joined the club."
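A similar back-of-envelope check applies to the million-token claim. The tokens-per-character ratios below are illustrative assumptions of mine, not measured values for DeepSeek's tokenizer, though modern Chinese tokenizers typically land in this range:

```python
# Does a ~900,000-character Chinese work fit in a 1,000,000-token window?
# The per-character token ratios are assumed for illustration only.

CHARS = 900_000
WINDOW = 1_000_000

for ratio in (0.8, 1.0, 1.2):  # assumed tokens per Chinese character
    tokens = int(CHARS * ratio)
    verdict = "fits" if tokens <= WINDOW else "exceeds"
    print(f"ratio {ratio}: ~{tokens:,} tokens, {verdict} the 1M window")
```

Under these assumptions the claim is plausible but tight: at the high end of the ratio range, the full text would slightly exceed the window.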

It is understood that this wave of large-model releases is far from over, with flagship models such as Doubao 2.0 and Qianwen 3.5 expected soon.

(Article source: Shangguan News)
