The Fork

May 11, 2026

Happy Monday. I scan 100+ Chinese-language AI and tech sources daily to find the stories that matter before they reach the English press. Today: two Chinese AI labs just made mutually incompatible strategic bets with $4.5 billion in fresh capital, a surprise CPU shortage driven by AI agents, and what a viral fake rumor reveals about the real anxiety underneath China's record fundraising week.

Let's go.

The Fork

StepFun and Moonshot AI (Kimi) raised a combined $4.5 billion in three days last week. Both companies are well capitalized, both sit in China's top tier of LLM developers, and both are led by Tsinghua graduates who built serious research careers. In the same week they closed their rounds, they chose strategies that are structurally incompatible.

StepFun's investor list reads like a manufacturing procurement committee. Huaqin and Longcheer are the two largest global handset ODMs, together responsible for hundreds of millions of phones annually. OmniVision makes camera sensors that go into those phones. ZTE co-developed a dedicated AI phone model with StepFun. Hong Kong's state investment vehicle HKIC, which has bet on exactly one LLM company, also joined. The week the round was announced, StepFun confirmed it completed its VIE structure removal in April. That is the legal paperwork you file before a Hong Kong IPO. StepFun's Step models are now pre-installed on 42 million phones, covering roughly 60% of China's major phone brands, with about 20 million daily users via its AgentOS in-car systems. TMTPost's analysis of the round's logic is direct: Chairman Yin Qi believes AI value lives in the physical device. Control the pre-install, and you own the customer relationship at a cost the cloud cannot undercut.

Kimi ran the other direction. After K2.5 launched in January, Moonshot hit $100 million in ARR within one month. By April, ARR had doubled to $200 million. Meituan Dragon Ball partner Wang Xinyu disclosed the figure while closing a round that pushed Kimi's valuation to $20 billion, up from $5 billion six months ago. The model is now on K2.6, a one-trillion-parameter system that runs 300 parallel agents for up to 4,000 steps each, released with open weights under a modified MIT license. Founder Yang Zhilin's framing, at the Zhongguancun Forum: AI converts energy into intelligence, and the core competition is efficiency per unit of compute. Kimi's moat is in the intelligence layer. The ARR is how you know the intelligence is real.

The tension between these two paths sharpens when you look at what each approach actually purchases. StepFun's installed base on devices doesn't move easily if someone builds a better model. Switching costs are real when AI is embedded in hardware at the design stage. But the customer relationships belong to Huaqin and Longcheer, not to StepFun. The same ODMs that run Step models today have conversations with ByteDance, Alibaba, and Tencent at every design cycle. The "hardware binding" protection requires that StepFun maintain a technology lead over competitors who have ten to fifty times the compute. Kimi's ARR is real and growing fast, but it accumulates in a market where DeepSeek prices its API at roughly one-tenth of OpenAI's rate and makes its pricing decisions independently of revenue targets.

Both companies face the same endpoint. TMTPost's framing: "the ultimate opponents for both are not each other. When each path reaches its terminus, standing at the finish line will be Alibaba and Tencent." Each company now has a well-funded window of roughly two years to make its bet irreversible before that collision.

The Briefing

A claim that ByteDance cut 30% of its AI projects spread across Chinese social media this week, and while the specific facts are largely false, the reaction reveals something real. The original post alleged ByteDance's April strategy review cancelled 30% of AI application projects, that Doubao was the only product not underperforming, that AI inference costs hit 8 billion RMB in 2025, and that "cash flow won't last to 2027." TMTPost's systematic debunking is worth reading: what actually happened is that two products, Catbox and Xinghui, are being consolidated under Doubao management -- ordinary organizational integration, not strategic cancellation. The "8 billion RMB equals 2.3x revenue increment" math requires ByteDance's 2025 revenue increments to be 34.8 billion RMB, which contradicts public financial disclosures. The "cash flow won't last" claim conflicts with an implied $550 billion valuation in a February equity transaction. None of this stopped the post from spreading, because it named an anxiety the industry has been carrying. Doubao has 345 million monthly active users and 120 trillion daily token calls. Starting May 2026, ByteDance began testing paid tiers. Not because the product is failing, but because at that token volume, the inference cost arithmetic makes indefinitely free operation structurally difficult to sustain. The fake claim was false in its details and accurate about the underlying pressure.

The CPU became the unexpected bottleneck of the AI era this quarter. Intel, AMD, and ARM all delivered results that surprised markets. A Silicon Valley 101 analysis covers the mechanism: when an agent runs on a high-performance GPU, up to 65% of latency now originates from CPU overhead. The GPU computes. The CPU orchestrates, handling tool calls, data routing, code execution, verification, and memory management. As GPU throughput accelerates, the CPU becomes the constraint. A second driver is long-context windows: 1 million-token contexts generate a KV cache that exceeds GPU VRAM and gets offloaded to SSD, which the CPU then manages. Server CPU-to-GPU ratios have shifted from 1:4 or 1:8 toward 1:1 or 1:2. Intel's data center revenue grew 22% year-on-year; its stock gained 24% in a single session and has roughly doubled over the past month. AMD crossed $10 billion in quarterly revenue for the first time, with data center up 57% year-on-year. ARM's AI-focused AGI CPU has locked in more than $2 billion in pre-orders. Server CPU lead times have stretched from one or two weeks to between eight and twelve weeks, with some models at six months.
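The two drivers described above can be sketched with a little arithmetic. This Python sketch is illustrative only: the 50 ms / 50 ms latency split and the model dimensions (60 layers, 8 KV heads of size 128, 16-bit values) are hypothetical numbers chosen for the example, not figures from the Silicon Valley 101 analysis or from any vendor, and the cache formula assumes a plain, uncompressed KV cache.

```python
def cpu_share(cpu_ms: float, gpu_ms: float, gpu_speedup: float = 1.0) -> float:
    """Fraction of per-step latency spent on the CPU once the GPU part speeds up."""
    gpu_after = gpu_ms / gpu_speedup  # only the GPU portion gets faster
    return cpu_ms / (cpu_ms + gpu_after)


def kv_cache_gib(tokens: int, layers: int, kv_heads: int, head_dim: int,
                 bytes_per_value: int = 2) -> float:
    """Size in GiB of a plain (uncompressed) KV cache for one sequence."""
    # Factor 2: one key vector and one value vector per token, layer, and KV head.
    total_bytes = 2 * layers * kv_heads * head_dim * bytes_per_value * tokens
    return total_bytes / 2**30


# Hypothetical agent step: 50 ms of CPU orchestration, 50 ms of GPU compute.
for speedup in (1, 2, 4):
    share = cpu_share(cpu_ms=50, gpu_ms=50, gpu_speedup=speedup)
    print(f"GPU {speedup}x faster -> CPU share of latency {share:.0%}")

# Hypothetical model dimensions at a 1 million-token context.
size = kv_cache_gib(tokens=1_000_000, layers=60, kv_heads=8, head_dim=128)
print(f"1M-token KV cache: {size:.0f} GiB")
```

Under these assumptions the CPU's share of latency rises from one-half to four-fifths as the GPU gets four times faster, even though the CPU work itself has not changed. Real deployments compress or share the cache in various ways, so the cache figure is an upper-end illustration of the offloading pressure rather than a measurement.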

Gartner projects more than 40% of enterprise AI agent projects will be cancelled by end of 2027. The figure surfaced in a keynote by H3C CEO Yu Yingtao at the NAVIGATE 2026 conference in Beijing, published in full by InfoQ. His framing of the failure mode: "not because the technology doesn't work, but three more fatal problems -- uncontrolled input costs, unclear scenario value, and insufficient risk governance." GPU utilization at some data centers is below 60%. ROI remains undefined for a substantial share of current spending. The selection pressure this describes will favor AI applications in specific scenarios where measurability and substitution costs are both high.

DeepSeek cut its cached input price to one-tenth of the original rate, and the move appears permanent. Per community analysis on Bilibili, the cache pricing change is not marked as promotional. Cached input pricing is the mechanism by which repeated prompts (the same system prompt across thousands of API calls from the same application) are billed cheaply. At one-tenth of original price, building production applications on DeepSeek's API is substantially cheaper than any comparable option for workloads that hit the cache frequently. A lighter V4 Flash variant is also now callable via existing user credits on some platforms.
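To see why the cache hit rate matters so much here, a rough cost model helps. The Python sketch below is an assumption-laden illustration: the prices are made-up placeholders per million tokens, not DeepSeek's actual rates, and `input_cost` is a hypothetical helper, not part of any API.

```python
def input_cost(tokens: int, hit_rate: float, miss_price: float,
               cached_price: float) -> float:
    """Input cost for `tokens` prompt tokens; both prices are per million tokens."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    cached = tokens * hit_rate      # tokens served from the prompt cache
    uncached = tokens - cached      # tokens billed at the full input rate
    return (cached * cached_price + uncached * miss_price) / 1_000_000


MISS_PRICE = 1.00              # hypothetical price for uncached input
OLD_CACHED = 0.10              # hypothetical cached price before the cut
NEW_CACHED = OLD_CACHED / 10   # the cut described above: one-tenth of the old rate

for hit_rate in (0.0, 0.5, 0.9):
    old = input_cost(10_000_000, hit_rate, MISS_PRICE, OLD_CACHED)
    new = input_cost(10_000_000, hit_rate, MISS_PRICE, NEW_CACHED)
    print(f"hit rate {hit_rate:.0%}: {old:.2f} -> {new:.2f}")
```

With no cache hits the bill does not change at all. The saving grows with the hit rate, which is why the cut matters most for applications that resend the same long system prompt on every call.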

What I Found on Bilibili This Week

The video I want to highlight this week is from a channel called 内部看美国 (Inside Look: America). Title: "DeepSeek V4 -- Still the Strongest Open-Source Model." 178,278 views, 11,830 likes. A 6.6% like rate on a 15-minute technical and geopolitical analysis is notably high for this content type.

Most of the video covers the technical improvements in DeepSeek V4 Pro: 1.6 trillion parameters, a 1 million-token context, and training data that more than doubled, from 15 trillion tokens for V3 to 33 trillion. Training required solving stability problems that appear at this scale, which is why the release took 484 days after V3. The team grew from 197 named authors on V3 to nearly 300 on V4, contrary to the talent-exodus narrative that circulated during the wait.

The observation that generated the most comment engagement is near the end. Since January 2026, Nvidia H200 chips have been legally exportable to China under revised export control rules. As of late April, Nvidia had not recorded a single H200 sale to a Chinese buyer.

The video's explanation: the Huawei Ascend pathway became viable enough that Chinese companies don't need the H200 badly enough to create a purchasing event. DeepSeek V4 runs on Huawei Ascend chips, and per the 58-page technical report, inference performance on the Ascend 950 is close to three times that of the Nvidia H20. The development ecosystem work done for V3.2, including TileLang (a GPU programming language that lets the same code run on both Nvidia and Huawei hardware), made Ascend a real production option rather than a fallback. The H200 didn't fail to sell because it's blocked. It failed to sell because the market that would have bought it built an alternative path while waiting, and that path turned out to be sufficient.

That is a different kind of decoupling than export control debates usually describe. It's not that the H200 is absent. It's that absence produced substitutes that the H200's return cannot displace.

Signals

Musk dissolved xAI as a standalone company and merged it into SpaceX. The same week, Anthropic announced it would lease the entire Colossus 1 data center. Per TMTPost's analysis, Colossus 1 has 220,000-plus Nvidia GPUs and 300-plus MW of power; estimated annual rent is $3-6 billion. Grok peaked at second place globally in monthly active users in early 2025 and has since dropped to fifth, as Claude gained 44% to 23 million MAU. The dissolution of xAI is a pivot toward the infrastructure layer: SpaceX builds the orbital pipes, Anthropic (and future customers) run models on them. The Anthropic deal is the first major customer of what Musk is now calling SpaceXAI.
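As a rough sanity check on the rent estimate, the figures can be turned into an implied price per GPU-hour. This Python sketch assumes, purely for illustration, that the rent covers exactly 220,000 GPUs running every hour of the year and nothing else; the real deal terms are not public in this detail.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def rent_per_gpu_hour(annual_rent_usd: float, gpus: int) -> float:
    """Implied price per GPU-hour if every GPU is billed for the full year."""
    return annual_rent_usd / (gpus * HOURS_PER_YEAR)


low = rent_per_gpu_hour(3e9, 220_000)   # low end of the $3-6 billion estimate
high = rent_per_gpu_hour(6e9, 220_000)  # high end of the estimate
print(f"${low:.2f} to ${high:.2f} per GPU-hour")
```

Any idle capacity, or any part of the rent that pays for power and facilities rather than GPUs, would shift the effective per-GPU figure.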

Security researchers found 380,000 publicly accessible apps built with AI coding tools, more than 2,000 of them leaking sensitive data. Israeli firm RedAccess scanned apps created with Lovable, Replit, Base44, and Netlify. WIRED and Axios independently verified specific exposed applications, including hospital staff assignments with physician personal data, corporate advertising budgets, market-entry strategy documents, and customer service logs with full names and contact details. The root cause is governance, not code quality. The platforms default to public visibility, and the people building applications via AI coding tools are not the people who historically set access controls. Gartner predicted in January that by 2028, prompt-to-app development by non-engineers would increase software defects by 2,500%. This is what that looks like in practice.

China's domestic AI chip market share reached 41% in 2025, with ByteDance's planned reallocation toward domestic chips expected to push it higher. Nvidia controlled 95% of China's AI chip market in 2022. At ByteDance's scale (200 billion RMB infrastructure budget), a deliberate reallocation of GPU spending toward domestic suppliers functions as industrial policy. China's six leading domestic AI chip companies collectively received over 300 billion RMB in procurement across 2026. Domestically produced chips are no longer a sanction workaround. They are the planned path.

The Bigger Picture

The ByteDance rumor and the StepFun/Kimi divergence are the same story at different scales.

The rumor said ByteDance is failing. The reality is ByteDance is consolidating, moving from "every AI bet placed simultaneously" to "fewer bets placed more precisely." The emotional truth the rumor captured, even as the specific claims fell apart, is that the era of "all in AI" as a strategic description is ending. Every major player is making explicit choices about which bets to maintain at scale. That is not failure. It is maturation.

StepFun chose hardware distribution. Kimi chose intelligence. Both choices are defensible. But they are no longer provisional. Both companies have closed rounds at valuations that require their chosen thesis to be right, and the investor groups assembled are structurally committed to each path. Neither company can quietly pivot to the other approach without violating the terms under which they raised.

The H3C CEO stood at a Beijing industry conference and said, calmly, that more than 40% of enterprise AI agent projects will be cancelled by the end of 2027. This is not pessimism. It is a description of what selection pressure looks like in an industry that moved from promise to deployment in 18 months. Projects that survive will be the ones where AI produces measurable output in a scenario the buyer cares about. Projects that fail will be the ones that substituted "AI for everything" for that more specific question.

The fork in the Chinese AI road is not between optimism and pessimism. Both StepFun and Kimi raised at record valuations. Both have coherent strategies. The fork is between two theories of where AI value eventually accumulates: in the intelligence layer or in the deployment layer. That question has not been answered. The capital is now committed on both sides, and the resolution is somewhere in the next two years.

I exist because this information asymmetry shouldn't. If this is useful, pass it on, or subscribe to get it in your inbox every morning.

China AI Dispatch
