China’s latest wave of artificial intelligence releases – equal to or better than Anthropic and OpenAI?

China’s AI models emerge

MiniMax’s M2.5 model has emerged as the unexpected frontrunner in China’s latest wave of artificial intelligence releases, earning a clear endorsement from analysts.

While much of the recent global conversation has fixated on DeepSeek’s rapid evolution, China has quietly produced five new frontier‑level models in recent weeks.

Widening choice

Among them (Alibaba’s Qwen 3.5, ByteDance’s Seedance 2.0, Zhipu’s latest offerings, DeepSeek’s V3.2, and MiniMax’s M2.5), it is MiniMax that has reportedly captured institutional attention.

Some analysts reportedly cite its performance, pricing, and commercial readiness as the reasons it stands apart.

MiniMax, which listed publicly in Hong Kong in January, released M2.5 in mid‑February 2026. The model rivals Anthropic’s Claude Opus 4.6 in capability while costing a fraction of the price—an advantage that has driven a surge of developer adoption.

Data from OpenRouter reportedly shows developers increasingly choosing M2.5 over DeepSeek’s V3.2 and even several U.S.-based models.

Analysts argue that this combination of competitive performance and aggressive pricing positions MiniMax as the Chinese model with the strongest global commercial potential.

Productive and less expensive

The model’s technical profile reinforces that view. M2.5 is designed for real‑world productivity, with strengths in coding, agentic tool use, search, and office workflows.

It reportedly scores around 80.2% on SWE‑Bench Verified and outperforms leading Western models—including Claude Opus 4.6, GPT‑5.2, and Gemini 3 Pro—on tasks involving web search and office automation, all while operating at ten to twenty times lower cost.

MiniMax describes the model as delivering “intelligence too cheap to meter,” a claim supported by its lightweight Lightning variant, which generates 100 tokens per second and can run continuously for an hour at roughly one dollar.
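To put the Lightning figures in per-token terms, here is a rough back-of-the-envelope sketch in Python. It uses only the two numbers quoted above (100 tokens per second, roughly one dollar per hour) and assumes the model generates continuously at full speed with only output tokens counted. The function names are illustrative, and the result is an implied figure, not MiniMax’s published price list.

```python
# Figures quoted in the article for the Lightning variant.
TOKENS_PER_SECOND = 100
COST_PER_HOUR_USD = 1.00  # "roughly one dollar", so treat this as approximate
SECONDS_PER_HOUR = 3600


def tokens_per_hour(tokens_per_second: float) -> float:
    """Tokens generated in one hour of uninterrupted output (an assumption)."""
    return tokens_per_second * SECONDS_PER_HOUR


def cost_per_million_tokens(tokens_per_second: float, cost_per_hour_usd: float) -> float:
    """Implied price per one million generated tokens, in US dollars."""
    return cost_per_hour_usd / tokens_per_hour(tokens_per_second) * 1_000_000


hourly_tokens = tokens_per_hour(TOKENS_PER_SECOND)
implied_price = cost_per_million_tokens(TOKENS_PER_SECOND, COST_PER_HOUR_USD)

print(f"{hourly_tokens:,.0f} tokens per hour")                     # 360,000 tokens per hour
print(f"about ${implied_price:.2f} per million generated tokens")  # about $2.78
```

Under these assumptions, an hour of continuous generation yields 360,000 tokens, which works out to just under three dollars per million output tokens. Real bills would also depend on input tokens and actual utilisation, which the quoted claim does not cover.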

This shift signals a broader trend: China’s AI race is no longer defined by a single breakout model. Instead, a competitive ecosystem is emerging, with MiniMax demonstrating that cost‑efficient frontier performance can reshape developer behaviour and enterprise planning.

For global markets, UBS’s preference for MiniMax suggests that investors are beginning to look beyond headline‑grabbing releases and toward models with sustainable commercial trajectories.

Comparison of China’s Five New AI Models

| Model | Developer | Key Strengths | Performance Notes | Pricing Position |
| --- | --- | --- | --- | --- |
| MiniMax M2.5 | MiniMax | Coding, agentic tasks, office automation | Rivals Claude Opus 4.6; 80.2% SWE‑Bench Verified; outperforms GPT‑5.2 and Gemini 3 Pro on search/office tasks | Extremely low cost; “too cheap to meter” |
| DeepSeek V3.2 | DeepSeek | Reasoning, general chat | Strong but losing developer share to M2.5 | Low‑cost but not as aggressive as MiniMax |
| Alibaba Qwen 3.5 | Alibaba | Enterprise integration, multilingual capability | Part of Alibaba’s expanding Qwen family | Competitive mid‑range |
| ByteDance Seedance 2.0 | ByteDance | Video generation | Focused on multimodal creativity | Premium creative‑tool pricing |
| Zhipu (latest models) | Zhipu AI | Knowledge tasks, enterprise AI | Continues Zhipu’s push into LLM infrastructure | Mid‑range enterprise |

MiniMax M2.5 leads China’s AI surge, with performance rivalling Claude Opus 4.6 and Gemini 3 Pro at a fraction of the cost.

It excels in coding, search, and office automation, scoring 80.2% on SWE‑Bench Verified. DeepSeek V3.2 offers strong reasoning but lags in developer adoption.

Qwen 3.5 and Zhipu target enterprise AI, while ByteDance’s Seedance 2.0 focuses on video generation.

Compared with GPT‑4, Claude 2.1, and Gemini 1.5, China’s models are closing the gap in capability, with MiniMax M2.5 now outperforming Western leaders on several benchmarks, especially in speed and cost efficiency.

Comparison of leading Chinese and Western AI models

SWE‑Bench Verified scores from the latest public leaderboard, early 2026 (guide data only)

| Model | Developer | Primary Strengths | SWE‑Bench Verified | Notes |
| --- | --- | --- | --- | --- |
| Claude Opus 4.6 | Anthropic | High‑end reasoning, long‑context reliability | 76–77% | Current top performer on independent coding benchmarks. |
| Gemini 3 Flash | Google DeepMind | Fast reasoning, efficient tool use | ~75–76% | Extremely strong structured reasoning. |
| MiniMax M2.5 | MiniMax | Coding, agentic tasks, office automation | 75–76% (independent) / 80.2% (internal) | Strongest Chinese model with published results. |
| GPT‑4o (used in ChatGPT)* | OpenAI | Multimodal, real‑time interaction, broad generalist | ~72–74% | *ChatGPT is a product wrapper; GPT‑4o is the underlying model used for benchmarking. |
| Gemini 3 Pro Preview | Google DeepMind | Multimodal, search, office tools | ~74% | Strong generalist. |
| DeepSeek V3.2 | DeepSeek | Reasoning, general chat | No independent SWE‑Bench score | Not on the verified leaderboard. |
| Alibaba Qwen 3.5 | Alibaba | Enterprise integration, multilingual | No independent SWE‑Bench score | Not included in latest run. |
| Zhipu GLM‑5 | Zhipu AI | Knowledge tasks, enterprise AI | No independent SWE‑Bench score | Awaiting verified results. |
| Seedance 2.0 | ByteDance | Video generation | N/A | Not a coding model. |

*Note:

  • “ChatGPT” is not a single model and cannot be benchmarked.
  • GPT‑4o is the model that powers ChatGPT for most users, so it is the correct entry for comparison.

Comparison

  • Claude Opus 4.6 is the current top performer on independently verified coding tasks.
  • MiniMax M2.5 is the strongest Chinese model with published independent results and is now competitive with the best Western models.
  • DeepSeek, Qwen, and Zhipu have not yet been evaluated on the latest independent SWE‑Bench Verified run, so they cannot be directly compared.
  • Seedance 2.0 remains a video model and is not part of coding benchmarks.
  • Token speeds are intentionally excluded because no vendor publishes standardised, reproducible numbers.

Tables and data are provided as a guide to AI model status only.