In one line: traditional "dense" models run every parameter for every query, while MoE models activate only the relevant specialists. The result is the same intelligence on tap at a fraction of the compute per answer, which is a large part of why Chinese models are 10–30x cheaper to run.
Mixture of Experts, in plain English
Picture a consultancy of 257 specialists. A dense model puts all 257 in the room for every question, which is expensive and slow. An MoE model has a router that picks the ~9 specialists who actually know the topic and asks only them. DeepSeek V3 does exactly this: 671B total parameters, ~37B active per token (about 5.5%). You keep the breadth of a huge model while paying for a small one each time.
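For readers who want to see the routing idea in code, here is a minimal sketch of generic top-k routing. It is an illustration, not DeepSeek's actual implementation: the split of the 257 specialists into 256 routed experts plus 1 always-on shared expert, the choice of 8 routed experts per token, the softmax weighting, and every name in the snippet are assumptions made for the example. Real experts are small neural networks and the router is learned; here both are toy stand-ins.

```python
import math
import random

NUM_ROUTED_EXPERTS = 256  # assumption: 256 routed experts + 1 shared expert = 257
TOP_K = 8                 # assumption: 8 routed experts per token, + shared = 9


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(router_scores, top_k=TOP_K):
    """Return (expert_index, weight) pairs for the top_k highest-scoring experts."""
    ranked = sorted(range(len(router_scores)),
                    key=lambda i: router_scores[i], reverse=True)
    chosen = ranked[:top_k]
    weights = softmax([router_scores[i] for i in chosen])
    return list(zip(chosen, weights))


def moe_layer(x, experts, shared_expert, router_scores):
    """Add the shared expert's output to a weighted sum of the chosen experts."""
    output = shared_expert(x)
    for index, weight in route(router_scores):
        output += weight * experts[index](x)
    return output


# Toy setup: each "expert" just scales its input by a fixed random factor.
rng = random.Random(0)
experts = [(lambda x, scale=rng.uniform(0.5, 1.5): scale * x)
           for _ in range(NUM_ROUTED_EXPERTS)]
shared_expert = lambda x: x
# Stand-in for the scores a learned router would produce for one token.
router_scores = [rng.gauss(0.0, 1.0) for _ in range(NUM_ROUTED_EXPERTS)]

selected = route(router_scores)
experts_run = len(selected) + 1  # routed picks plus the shared expert
result = moe_layer(2.0, experts, shared_expert, router_scores)
print(f"experts run for this token: {experts_run} of {NUM_ROUTED_EXPERTS + 1}")
```

The point of the sketch is the loop in `moe_layer`: only the chosen experts are ever called, so the other 248 cost nothing for that token.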
The numbers
| | Dense model (typical US) | MoE model (DeepSeek V3) |
|---|---|---|
| Total parameters | All activated | 671B |
| Active per token | 100% | ~37B (~5.5%) |
| Training cost | Baseline | ~5x faster, ~80% lower |
| Run cost | Baseline | 10–30x cheaper |
DeepSeek V3 matches GPT-3.5-class performance on benchmarks while training roughly 5x faster at about 80% lower cost, and it competes with GPT-4o on coding at a tiny fraction of the price.
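As a quick sanity check on the parameter figures, the arithmetic can be run directly. This is a back-of-the-envelope sketch: it assumes per-token compute scales roughly with active parameters, which is a rule of thumb rather than a measured cost, and the variable names are made up for the example.

```python
TOTAL_PARAMS_B = 671   # total parameters, in billions
ACTIVE_PARAMS_B = 37   # parameters active per token, in billions

# Share of the model that does work for any one token.
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B

# Assumption: compute per token is roughly proportional to active parameters,
# so a dense model of the same total size does about this much more work.
dense_to_moe_compute_ratio = TOTAL_PARAMS_B / ACTIVE_PARAMS_B

print(f"active share: {active_fraction:.1%}")
print(f"dense/MoE compute ratio: {dense_to_moe_compute_ratio:.1f}x")
```

This prints an active share of 5.5% and a ratio of 18.1x. Real-world pricing also depends on hardware, batching, and margins, so the ratio is only a rough guide to run cost.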
The twist: sanctions caused the breakthrough
China's open-weight dominance is partly a forced response to US export controls. Cut off from Nvidia's H100 and A100 chips since 2022, Chinese labs had to innovate on software efficiency instead of throwing more hardware at the problem. That constraint produced MoE refinements that now benefit the whole industry. The intended handicap became the edge.
Why US labs mostly use dense models
With abundant compute, US labs had less pressure to optimise — they could afford to run everything. MoE isn't unique to China (Western labs use it too), but the relentless efficiency focus that scarcity forced is why the cheapest capable models today come from Chinese labs. Background in the power map.
The catch
Cheap and clever doesn't mean risk-free. DeepSeek's hosted API stores data in China under Chinese law, and its safety guardrails trail Western frontier labs. The mitigation — because the weights are open — is self-hosting. See the risk assessment and self-host vs API.
Want the budget picture? Compare real costs in the cheapest AI API guide and the full Chinese models guide.