Nvidia's stranglehold on AI chip supply is loosening. OpenAI just announced Jalapeño, a custom inference chip developed with Broadcom, joining Google, Apple, and SpaceX in building proprietary silicon to reduce reliance on a single vendor.
The shift reflects a fundamental market reality. Nvidia controls roughly 80-90% of the AI accelerator market, giving it enormous pricing power and supply chain leverage. Companies running massive inference workloads face long lead times, premium costs, and vulnerability to supply constraints. Custom chips offer an escape route.
OpenAI's move targets inference specifically, the phase where trained models process user requests. This workload differs from training, which demands the raw compute density at which Nvidia's GPUs excel. Inference chips can instead optimize for lower latency and lower cost per query. Jalapeño, built with Broadcom's chip-design expertise, positions OpenAI to run ChatGPT more efficiently on its own infrastructure rather than paying Nvidia's premium prices.
Google started this trend years ago with TPUs for both training and inference. Apple built neural processors into iPhones and Macs. SpaceX developed Starlink satellite internet hardware. Now the wave includes Amazon, Meta, and Microsoft, all designing chips for specific workloads their businesses depend on.
The economics are compelling. Training a frontier large language model can cost hundreds of millions to billions of dollars. Running inference at scale across millions of users creates massive, recurring expenses. At that scale, shaving even 10-20% off inference costs through custom silicon can pay back the R&D investment within months. For companies operating at OpenAI's or Google's scale, vertical integration into chipmaking becomes a rational business strategy.
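The payback arithmetic can be sketched in a few lines. This is a rough illustration, not a model of any company's finances: the function name `payback_months` and both dollar figures are hypothetical assumptions chosen only to show how the calculation works, and it ignores per-chip manufacturing and deployment costs.

```python
def payback_months(annual_inference_spend, savings_fraction, rd_cost):
    """Months of savings needed to recover a one-time chip R&D cost.

    All inputs are hypothetical; this ignores per-unit chip costs,
    deployment delays, and the time value of money.
    """
    if annual_inference_spend <= 0 or rd_cost < 0:
        raise ValueError("spend must be positive and R&D cost non-negative")
    if not 0 < savings_fraction <= 1:
        raise ValueError("savings_fraction must be in (0, 1]")
    monthly_savings = annual_inference_spend * savings_fraction / 12
    return rd_cost / monthly_savings


# Assumed figures for illustration only; none come from the article.
ANNUAL_SPEND = 4_000_000_000  # hypothetical yearly inference bill, USD
RD_COST = 500_000_000         # hypothetical one-time chip program cost, USD

if __name__ == "__main__":
    for fraction in (0.10, 0.20):
        months = payback_months(ANNUAL_SPEND, fraction, RD_COST)
        print(f"{fraction:.0%} savings: payback in {months:.1f} months")
```

Under these assumed inputs, a 10% saving recovers the R&D cost in about 15 months and a 20% saving in about 7.5 months. The result scales linearly: doubling the inference bill halves the payback period, which is why the case for custom silicon strengthens as usage grows.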
Nvidia benefits from inertia. Its CUDA software ecosystem and broad capability across training and inference create switching costs. But the margin opportunity is shrinking. Companies building inference-only chips or training accelerators for specific use cases don't need
