Huawei denies Pangu model was based on Alibaba’s Qwen

Huawei’s newest large language model, Pangu Pro MoE 72B, was released under an open-source license in late June. A week later, the company found itself accused of copying parts of a rival model made by Alibaba.

The allegation came from a little-known entity called HonestAGI, which posted a research paper on GitHub. It claimed Huawei’s Pangu model showed unusually high similarity to Alibaba’s Qwen 2.5 14B, suggesting it may not have been trained from scratch. The paper used model fingerprinting techniques and statistical analysis to point out what it called “extraordinary correlation” between the two.
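HonestAGI's exact methodology and code are not detailed here, but the general idea behind this kind of fingerprinting can be illustrated with a minimal, hypothetical sketch: summarise each layer of two models with a simple weight statistic and check how strongly those per-layer signatures correlate. The function names and random stand-in weights below are assumptions for illustration, not HonestAGI's actual analysis.

```python
import numpy as np

# Illustrative sketch only: the arrays below are hypothetical placeholders,
# not real Pangu or Qwen checkpoints. The idea is to reduce each transformer
# layer to a summary statistic (here, the mean standard deviation of its
# weight matrices) and compare the resulting per-layer "signatures".

def layer_signature(layer_weights):
    """Reduce one layer's weight matrices to a single summary statistic."""
    return np.mean([w.std() for w in layer_weights])

def fingerprint_correlation(model_a_layers, model_b_layers):
    """Pearson correlation between the per-layer signatures of two models."""
    sig_a = np.array([layer_signature(layer) for layer in model_a_layers])
    sig_b = np.array([layer_signature(layer) for layer in model_b_layers])
    return np.corrcoef(sig_a, sig_b)[0, 1]

# Hypothetical usage with random stand-ins for real model weights:
rng = np.random.default_rng(0)
model_a = [[rng.normal(0, 0.02 * (i + 1), (64, 64)) for _ in range(4)] for i in range(12)]
model_b = [[rng.normal(0, 0.02 * (i + 1), (64, 64)) for _ in range(4)] for i in range(12)]
print(f"Per-layer signature correlation: {fingerprint_correlation(model_a, model_b):.3f}")
```

A high correlation on signatures like these is suggestive rather than conclusive, since independently trained models with similar architectures and initialisation schemes can also look alike, which is part of why the dispute turns on documentation as much as on numbers.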

The authors also raised concerns about how Huawei described the model’s training process and whether it had complied with open-source licensing rules. They said the similarities could point to copyright issues or overstated development claims.

Huawei pushed back the next day. Its research arm, Noah’s Ark Lab, said the model was built independently using the company’s in-house Ascend chips. It said the team made “key innovations” in architecture design and followed licensing rules for any third-party code involved, though it didn’t name which open-source models it may have used as a reference.

Scrutiny over model origin

As open-source models become more common in China’s AI race, scrutiny over where code and data come from is growing. In this case, the core issue isn’t just about attribution—it’s about whether the model was trained using original data or adapted from existing work without clear disclosure.

That matters for developers who rely on these models, especially when they are used in sectors like finance, government, or health. If a model closely resembles another but doesn’t follow licensing rules, users could face legal or reputational risk.

The dispute also exposes how hard it is to prove originality in AI training. With many companies pulling from the same open-source datasets or frameworks, overlaps are likely. But without clear documentation, those overlaps raise questions.

Alibaba hasn’t commented on the allegations. HonestAGI’s identity remains unknown.

Safety and speed

The Huawei-Alibaba conflict isn’t happening in isolation. China’s AI sector is seeing a wave of model launches, many of them open-sourced to increase adoption and avoid reliance on foreign tech.

In January, startup DeepSeek released its R1 model. It attracted interest for its low cost and performance, but also concern over how it handled harmful content. Tests found that it responded to dangerous prompts more often than expected, which drew attention to safety risks in newer models.

Huawei and Alibaba’s models haven’t been widely benchmarked for safety in the same way. But their use in enterprise and government projects means any vulnerability could have wider impact. If the dispute over Pangu’s origin slows down its adoption, it may also push users toward other local models perceived as more stable or transparent.

Chips and control

The other piece of this story is hardware. US export restrictions have limited China’s access to chips like Nvidia’s A100 and H100, which are widely used for training large models. In response, Chinese firms have been building around domestic chipsets such as Huawei’s Ascend.

By releasing an open-source model trained fully on Ascend, Huawei signals its push toward self-reliance. But that also raises questions about performance, efficiency, and transparency—especially if the training process isn’t fully disclosed.

These concerns aren’t just local. Governments in the US and Europe are reviewing export policies around AI and semiconductors, particularly when models could be repurposed for military use. That includes open-source models developed in China, such as DeepSeek’s R1, which has already appeared on some regulatory watchlists.

Trust still in question

The Pangu vs Qwen debate may fade in a few weeks, but it reflects bigger challenges in AI development—namely trust, safety, and ownership. With so many new players entering the space, and national strategies pushing for faster releases, questions about how these models are built are only going to grow.

China’s AI companies are now expected to balance fast development with clear disclosures and safer deployment. For open-source projects, that means more than just releasing code—it also means explaining how it was made and what’s inside.

Whether Huawei’s model was trained independently or not, the public conversation has already shifted toward how much transparency users should expect.

(Photo by BoliviaInteligente)

See also: Huawei Cloud rolls out Pangu Models 5.5 to cover more industries


Tags: ai, data, development, github, huawei, llm, open source, security
