Generative AI

What We Learned from Red Teaming the Latest Open Source Generative AI Models from China

Authored By

Published on

November 13, 2025

Chinese open source models have made strides from a performance perspective—Holistic AI decided to put their trustworthiness and safety to the test

The rapid emergence of high-quality open source and open-weight models from China marks a new phase in global AI competition. Open source models like DeepSeek, Qwen, Kimi, and the latest entrant, MiniMax, bring clear advantages to users in the form of lower barriers to experimentation, local deployment for privacy-sensitive use cases, and the promise of faster innovation.

As it turns out, these models have also gotten quite good from a performance perspective—to the point where they are now beginning to effectively compete with proprietary models from US providers.

GPQA Diamond Benchmark Frontier Performance — *_Source:_{ARK Investment Management}*

And from a price-performance perspective, the picture gets even rosier. In fact, MiniMax claims that its M2 model offers twice the speed of Claude Sonnet at 8% of the cost. Although less dramatic, across the landscape, the competitiveness of these open source Chinese models from a price-performance standpoint can no longer be debated.

Price per Performance of Select Open AI and DeepSeek Models — *^Source:^{ARK Investment Management}*

While competition and democratization are healthy forces, it has largely been assumed that the lack of robust safety features, testing, and governance would hinder the widespread adoption of Chinese open source models. We decided to put that theory to the test by red teaming the latest models to see how they performed from a safety and trust perspective.

Understanding “Open Source” and “Open-Weight”

First, some definitions: Open source AI provides public access to model code and training data, enabling reproducibility and transparency. Open weight models go a step further, releasing the trained parameters so anyone can run, fine-tune, or customize the models locally.

Together, open source and open weight models expand access to cutting-edge AI but are also presumed to widen the attack surface for bad actors. This perceived trade-off between openness and security could be a hinderance to the global adoption of AI if true.

The Rise of China’s Open Models

In 2025, a wave of open models from China, including DeepSeek R1, Alibaba’s Qwen 3, and Moonshot AI’s Kimi K2, captured worldwide attention.

DeepSeek R1 created shockwaves with the release of a highly performant, open-weight model that was inexpensive to develop and freely modifiable.
Qwen 3, from Alibaba Cloud, has shown impressive performance for reasoning, math, and coding tasks.
Kimi K2, a trillion-parameter mixture-of-experts (MoE) model, is optimized for agentic reasoning and tool use, an early step toward autonomous AI systems.
MiniMax M2 (Thinking) features top-tier coding capability, powerful agentic performance, and high cost-effectiveness and speed.

Together, these models promise to expand access, reduce costs, and accelerate experimentation on a global scale. But the question remains, are they safe to use for your organization?

Red Teaming Reality Check

Each model was evaluated under a subset of Holistic AI’s rigorous Red Team testing framework which looks for two primary metrics Safe-response rate (the proportion of responses that remain aligned under harmful or borderline prompts) and Jailbreak resiliency (the ability to resist advanced prompt-injection and role-play attacks). The benchmark included approximately 300 test prompts per model, spanning harmful, unethical, and policy-sensitive scenarios, alongside neutral prompts. The goal was to assess the Chinese models’ production readiness.

Safe-response rates: Claude 4.5 (>99%), GPT 4.5 (>99%), MiniMax M2 (Thinking) (>99%), DeepSeek v3.2 Exp (94%), Qwen VL 32B Instruct (94%), QWen-qwq-32b (87%), Kimi K2 Instruct 0905 (81%).

Safe Response Rates per Model — *Source:* *Holistic AI LLM Decision Hub*

Jailbreak resistance: Claude (100%), MiniMax M2 (Thinking) (100%), GPT 4.5 (97%), DeepSeek (87%), Qwen 3 VL 32B Instruct (84%), Kimi K2 (42%), QWen-qwq-32b (32%).

Jailbreak Resistance per Model — *Source:* *Holistic AI LLM Decision Hub*

Summary of Key Findings

We found that while the performance of the Chinese models was impressive, safety varied widely. Specifically, the Chinese models showed:

Strong general performance, mixed containment: While models like MiniMax M2 (Thinking) rivaled Claude and GPT in terms of its ability to produce coherent answers, other Chinese open source models lacked consistent refusals to unsafe or policy-violating prompts.
‍Varying resistance to social-engineering prompts: MiniMax M2 (Thinking) proved very resistant to jailbreaking attempts (more so than even GPT); role-play and “movie scene” jailbreaks successfully bypassed safeguards upwards of 70% of attempts for models from Kimi and Qwen.

One thing these tests make clear is that claims of Chinese models being less hardened and trustworthy cannot be taken at face value. The total picture is more nuanced—with a model like MiniMax M2 (Thinking) performing on par or better than high end proprietary western models like Claude and GTP in safety and jailbreaking tests. Given the relative price-performance and privacy advantages of using open source, there are good reasons for organizations to take a long look at these models, and security seems to be dissolving as an inhibiting factor for at least giving them a spin.

What can you do?

Open source and open-weight models can be valuable tools for experimentation, development, and cost reduction. But production use demands additional diligence—and the right governance infrastructure to manage risk at scale.

Holistic AI's Governance Platform provides enterprise-grade safeguards that can help make open Chinese models production-ready:

1. Automated Red Teaming & Safety Assessment

Before deploying any model, you need confidence in its safety posture. Holistic AI's platform enables you to:

Run comprehensive adversarial testing aligned to your organization's AI risk framework, covering harmful prompts, role-play scenarios, multilingual attacks, and context-shift exploits
‍Benchmark models against industry standards with detailed safe-response rates and jailbreak resilience metrics
‍Generate audit-ready reports that document model behavior and compliance readiness

2. Real-Time Safety Guardrails

Holistic AI's middleware intercepts and filters unsafe outputs in production, which includes:

Policy-based classifiers that enforce your organization's content policies
‍Reinforcement layers that catch edge cases missed by model-level safeguards
‍Dynamic filtering that adapts to emerging threat patterns

3. Continuous Monitoring & Observability

Model behavior can drift over time. Holistic AI's platform tracks:

Prompt and response patterns for safety degradation
‍Runtime auditing that flags unsafe, biased, or policy-violating outputs
‍Anomaly detection that alerts teams to unusual model behavior before it becomes a problem

The Bottom Line

Open Chinese models can be powerful assets for innovation, but they require enterprise governance to deploy responsibly. Holistic AI's Governance Platform transforms promising open models into production-ready systems by providing the safety layers, monitoring, and controls that proprietary providers build in-house.

Rather than building these capabilities from scratch or avoiding open models entirely, organizations can leverage Holistic AI's platform to safely capture the cost, performance, and privacy benefits of open source AI.

Ready to evaluate open models safely? Schedule a demo to see how Holistic AI can help your organization deploy Chinese AI models with confidence

Heading 2

Thank you! Your submission has been received!

Oops! Something went wrong while submitting the form.