
[China Semiconductor] ‘Cambricon’ Takes Aim at NVIDIA’s Monopoly Ecosystem — the Paradox of U.S. Sanctions

Picture

Member for

1 year 3 months
Real name
Anne-Marie Nicholson
Bio
Anne-Marie Nicholson is a fearless reporter covering international markets and global economic shifts. With a background in international relations, she provides a nuanced perspective on trade policies, foreign investments, and macroeconomic developments. Quick-witted and always on the move, she delivers hard-hitting stories that connect the dots in an ever-changing global economy.

Modified

Building a Fully Autonomous Software Stack
Featuring “One-Click” GPU Migration Capability
Simultaneous Hardware–Software Push Amid U.S. Sanctions
Cambricon’s MLU270 chip (Photo: Cambricon)

Chinese AI semiconductor company Cambricon has thrown down the gauntlet to NVIDIA’s CUDA ecosystem. China’s AI industry, long dependent on NVIDIA’s proprietary platform, is taking a decisive step toward technological self-reliance by developing an independent computing framework. The move reflects a confluence of factors: AMD’s open-source strategy, China’s state-led investment drive, and the U.S. government’s export restrictions. With NVIDIA chips in short supply because of Washington’s sanctions, Cambricon is rapidly expanding its foothold in the domestic market, helped by the simple fact that its chips are available to buy. The sharper the geopolitical confrontation between the U.S. and China grows, the stronger Chinese substitutes become.

AI Training Without CUDA — Ecosystem Transition via “One-Click” Migration

According to Taiwan-based IT outlet DigiTimes on November 7, Cambricon, an AI chip designer with roots in the Chinese Academy of Sciences, officially announced that its self-developed software suite NeuWare has reached a mature stage. For years, Chinese AI chipmakers had been catching up in hardware design but remained constrained by NVIDIA’s dominance in software. Cambricon’s latest progress is widely seen as narrowing that gap.

NeuWare covers the entire software stack, from drivers and compilers to operator libraries, development tools, and cluster management. This lets developers train and deploy AI models directly on Cambricon’s MLU chips without relying on CUDA. Cambricon said NeuWare fully supports the latest versions of PyTorch and of Triton, the programming language for operator development, allowing fast migration of both models and operators.

The newly introduced “one-click GPU migration” tool lets developers port existing GPU-based AI models to Cambricon’s MLU chips at minimal cost, significantly simplifying the transition out of the CUDA ecosystem. Cambricon says this joint optimization of hardware and software has already been proven in large-scale training and inference environments. NeuWare currently supports major Chinese AI models such as DeepSeek V3, Qwen 3, GLM 4.5, and Hunyuan-Video. It also supports the FP8 and FP4 low-precision computation formats, along with sparse and linear attention mechanisms, which the company says improve energy efficiency and reduce latency.
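The article does not describe how the migration tool works internally. As a rough, hypothetical sketch of what source-level porting could look like, the snippet below rewrites CUDA device references in PyTorch-style code so that they point at an `mlu` device. The `mlu` device string, the `.mlu()` method, and the `torch.mlu` namespace are assumptions made for illustration, as are the names `REWRITE_RULES` and `migrate_source`; none of them is confirmed by the source.

```python
import re

# Hypothetical rewrite rules. The "mlu" names are assumptions for illustration
# and are not confirmed by the article.
REWRITE_RULES = [
    # torch.cuda.is_available() -> torch.mlu.is_available()
    (re.compile(r"\btorch\.cuda\."), "torch.mlu."),
    # tensor.cuda() -> tensor.mlu()
    (re.compile(r"\.cuda\(\)"), ".mlu()"),
    # "cuda" or 'cuda:0' -> "mlu" or 'mlu:0', keeping the original quote style
    (re.compile(r"(['\"])cuda(:\d+)?\1"), r"\1mlu\2\1"),
]


def migrate_source(source: str) -> str:
    """Apply each rewrite rule in order and return the rewritten source text."""
    for pattern, replacement in REWRITE_RULES:
        source = pattern.sub(replacement, source)
    return source


if __name__ == "__main__":
    gpu_code = (
        "device = 'cuda:0' if torch.cuda.is_available() else 'cpu'\n"
        "model = Net().cuda()\n"
    )
    print(migrate_source(gpu_code))
```

Text rewriting of this kind only covers device selection in framework-level code. Hand-written CUDA kernels and vendor-specific libraries would need separate handling, which is presumably where an operator library and compiler toolchain come in.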

Cambricon has also released dedicated training and inference solutions for AI search, advertising, and recommendation systems. The platform supports large-scale cluster deployment and is tuned for key operators such as LayerNorm as well as for workloads compiled through XLA. The company expects this to advance commercial AI applications beyond passive recommendation toward active comprehension.

To meet industry demands for efficiency, Cambricon has optimized NeuWare for multi-node, multi-card cluster training. NeuWare supports all PyTorch versions from 2.1 to 2.8, and the company provides MLU updates within two weeks of every new release to stay aligned with the open-source community. Cambricon’s flagship Siyuan 590 chip is also optimized for inference, delivering about 80% of the performance of NVIDIA’s A100 at a lower price and with easier installation.
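A supported range like “2.1 to 2.8” is easy to check incorrectly, because comparing version strings as text would treat "2.10" as older than "2.8". The sketch below compares numeric (major, minor) pairs instead. The helper names `major_minor` and `is_supported` are hypothetical, and the bounds simply restate the range quoted above rather than any official compatibility matrix.

```python
from typing import Tuple

# Bounds taken from the range quoted in the article (PyTorch 2.1 through 2.8).
SUPPORTED_MIN = (2, 1)
SUPPORTED_MAX = (2, 8)


def major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '2.8.0+cu121'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def is_supported(version: str) -> bool:
    """Return True if the version falls inside the supported range, inclusive."""
    return SUPPORTED_MIN <= major_minor(version) <= SUPPORTED_MAX


if __name__ == "__main__":
    for candidate in ("2.0.1", "2.1.0", "2.8.0+cu121", "2.10.0"):
        print(candidate, is_supported(candidate))
```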

The Side Effect of AMD’s “Open-Source Strategy”

Industry experts cite AMD’s open-source framework as one key factor that enabled Cambricon to build its CUDA clone. CUDA, which transformed NVIDIA from a hardware vendor into an “AI infrastructure ruler,” gave developers a bridge to harness GPU computing power without needing graphics expertise. Consequently, the global AI research ecosystem naturally flowed into NVIDIA’s orbit.

Today, CUDA encompasses more than 15 years of accumulated libraries, frameworks, and code, effectively the history of AI development itself. To use a rival chip, developers would have to rebuild much of that software from scratch, an extremely costly and time-consuming task. This has turned CUDA into a formidable moat: not merely a technical edge but a barrier of market dependency.

AMD has emerged as the most serious challenger to NVIDIA’s dominance, leveraging its price advantage and open-source platform to pry open the CUDA ecosystem. Its core initiative is ROCm, the open-source software stack for GPU computing on AMD Instinct accelerators and Radeon cards. At the Advancing AI 2025 event in June, AMD unveiled ROCm 7, highlighting support for the latest models and algorithms, advanced AI-building functions, the MI350 series, distributed resource management, and industrial applications.

According to AMD, the new version delivers inference gains over ROCm 6 of 3.2× on Meta’s Llama 3.1 70B, 3.4× on Alibaba’s Qwen2-72B, and 3.8× on DeepSeek R1. Training performance has tripled on the Llama 2 70B, Llama 3.1 8B, and Qwen 1.5 7B models. AMD also says the combination of the Instinct MI355 and ROCm 7 achieves about 1.3× higher FP8 efficiency than NVIDIA’s B200.
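The three inference figures can be condensed into a single number, and the short sketch below does so in two ways. It uses only the speedups quoted above; the names `INFERENCE_SPEEDUPS` and `summarize` are illustrative. The geometric mean is included because it is a common choice for averaging ratios, not because the article or AMD is known to have used it.

```python
from statistics import fmean, geometric_mean

# ROCm 7 vs ROCm 6 inference speedups as quoted in the article.
INFERENCE_SPEEDUPS = {
    "Llama 3.1 70B": 3.2,
    "Qwen2-72B": 3.4,
    "DeepSeek R1": 3.8,
}


def summarize(speedups):
    """Return (arithmetic mean, geometric mean) of the speedup ratios."""
    values = list(speedups.values())
    return fmean(values), geometric_mean(values)


if __name__ == "__main__":
    arithmetic, geometric = summarize(INFERENCE_SPEEDUPS)
    print(f"arithmetic mean: {arithmetic:.2f}x")
    print(f"geometric mean:  {geometric:.2f}x")
```

Both averages land at roughly 3.5×, so the choice of mean does not change the headline here.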

However, with NVIDIA holding a market share of nearly 90% and a software ecosystem refined over nearly two decades, AMD still faces an uphill battle. Cambricon, by contrast, has been able to build a CUDA alternative rapidly by using AMD’s open-source architecture as a blueprint. Ironically, AMD’s open-source strategy, intended to weaken NVIDIA, has instead provided Cambricon with the roadmap to become “China’s NVIDIA.”

Cambricon’s MLU220 chip (Photo: Cambricon)

The Resurrection of a Fading Company

Cambricon’s rise to become “China’s NVIDIA” was made possible by massive state-backed funding. Founded in 2016, the company drew attention with the world’s first commercial AI chip, Cambricon-1A. Although NVIDIA GPUs were already popular for AI training, this was before NVIDIA had launched its dedicated AI GPU lineup. Riding that wave, Cambricon secured major clients like Alibaba and Huawei, sending its valuation soaring as investors rushed in. In July 2020, Cambricon debuted on Shanghai’s STAR Market, China’s equivalent of Nasdaq, with its share price nearly quadrupling from the IPO price of 64 yuan to over 250 yuan on day one.

The euphoria proved short-lived. Heavy R&D expenditures kept the company in the red for years. Its troubles deepened in late 2022 when the U.S. Department of Commerce added Cambricon to the Entity List, forcing it to suspend its autonomous driving chip project and lay off hundreds of employees. Even as the global “ChatGPT boom” erupted in early 2023, Cambricon’s stock collapsed. Venture capital firms that had invested before its listing spent 2023 offloading their holdings, while local media questioned whether the company could ever produce competitive AI chips.

Then came a twist of fate, delivered by U.S. semiconductor export restrictions. Washington had barred exports of high-end AI chips such as NVIDIA’s A100 and H100 to China in September 2022, and as supply dried up Cambricon suddenly found itself with a golden opportunity. It launched its upgraded Siyuan 590 chip, offering roughly 80% of the A100’s processing performance at a far lower cost. The high-performance yet affordable chip caught the attention of Chinese tech firms eager to reduce exposure to U.S. sanctions.

The turning point came when ByteDance, the parent company of TikTok, began large-scale investment in AI infrastructure and ordered over 20,000 Siyuan 590 chips in 2024, becoming Cambricon’s largest customer. This propelled the company to its first-ever quarterly profit in the fourth quarter of 2024.

By 2025, Cambricon’s potential has fully materialized. Its first-half revenue soared 4,348% year-on-year, from about $9 million in 2024 to roughly $400 million in 2025, while net income swung from a $75 million loss to a $150 million profit. The company continues to reap the windfall of U.S. export curbs. When Washington temporarily halted exports of NVIDIA’s lower-end H20 chips to China in April 2025, Cambricon swiftly filled the void by releasing its Siyuan 670, priced at roughly half the H20’s cost. Beijing has since moved to reinforce this momentum. Even after the U.S. reinstated export approval for the H20 in July, the Chinese government advised domestic firms not to use NVIDIA’s chip, citing “security concerns.” The message was clear: buy Chinese AI chips instead.
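Four-digit growth percentages are easy to misread, so it can help to convert them into a plain multiple. As a small worked example, the sketch below shows that a 4,348% increase means the later figure is about 44.5 times the earlier one, not 43 times. The function names are illustrative, and only the percentage quoted above is used as input.

```python
def growth_multiple(pct_increase: float) -> float:
    """Convert a percentage increase into new/old, e.g. 100% -> 2.0x."""
    return 1 + pct_increase / 100


def pct_increase(old: float, new: float) -> float:
    """Percentage increase from old to new; old must be non-zero."""
    return (new - old) / old * 100


if __name__ == "__main__":
    multiple = growth_multiple(4348)
    print(f"A 4,348% increase is a {multiple:.2f}x multiple of the base")
    # Round trip on an arbitrary base of 100 units.
    print(f"{pct_increase(100, 100 * multiple):.0f}%")
```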
