OpenAI Taped Out Jalapeño in Nine Months. The Custom-Silicon Loop Just Closed.
OpenAI and Broadcom went on stage Wednesday and walked the reveal step by step. The chip is called Jalapeño. The branding is "Intelligence Processor", not GPU or NPU. It is a custom ASIC, designed by OpenAI, built by Broadcom, fabbed at TSMC, packaged with Celestica, and aimed at a single workload: LLM inference at production scale. OpenAI is claiming roughly 50 percent lower cost per token than current Nvidia GPUs in early testing. Initial deployment lands by the end of 2026, and the multi-generation program targets 10 gigawatts of capacity across OpenAI facilities and partner data centers, with the build-out completing by 2029.
The number that does the real work here is not 10 gigawatts, and it is not 50 percent. It is nine months. That is how long OpenAI and Broadcom took to go from initial design to manufacturing tape-out on a reticle-sized chip on an advanced node. The companies are calling it the fastest ASIC development cycle ever in high-performance semiconductors, and the surrounding evidence supports the claim. Nine months is what changes the industry math, because it tells you the tape-out clock for a lab-led inference chip is now short enough to ship inside a single model generation.
The Math
| Number | Value | Notes |
|---|---|---|
| Tape-out time | 9 months | Initial design to manufacturing tape-out |
| Die size | ~840 mm² | 25.46 mm by 33 mm, near EUV reticle limit of 858 mm² |
| Fab | TSMC | Broadcom co-design, Celestica on packaging |
| Workload | LLM inference | Purpose-built, not a repurposed training accelerator |
| Cost-per-token claim | ~50% lower | Versus current Nvidia GPUs, OpenAI early testing |
| First deployment | Late 2026 | OpenAI facilities plus partner data centers |
| Total program scale | 10 GW | Multi-generation, completing by 2029 |
| Partnership announced | Oct 13, 2025 | 18 months of joint design preceded the announcement |
The cost-per-token number is the one to weight carefully. It is a vendor claim in pre-production silicon, comparing a single inference workload on an LLM-tuned ASIC against a general-purpose GPU. The right way to read it is as a directional statement on the per-token economics OpenAI now controls, not as a fixed multiple. The die size, the fab, and the timeline are harder numbers. Reticle-sized at TSMC, on an advanced node, in nine months, with deployment slated for inside the calendar year. The chip is real.
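The die-size row can be sanity-checked with simple arithmetic. A minimal sketch, using only the dimensions and the reticle limit quoted in the table above (the function names are illustrative):

```python
# Figures as quoted in the table above; nothing here is measured independently.
DIE_WIDTH_MM = 25.46
DIE_HEIGHT_MM = 33.0
RETICLE_LIMIT_MM2 = 858.0


def die_area_mm2(width_mm: float, height_mm: float) -> float:
    """Area of a rectangular die in square millimetres."""
    return width_mm * height_mm


def reticle_utilization(area_mm2: float, limit_mm2: float = RETICLE_LIMIT_MM2) -> float:
    """Fraction of the quoted reticle limit that the die occupies."""
    return area_mm2 / limit_mm2


area = die_area_mm2(DIE_WIDTH_MM, DIE_HEIGHT_MM)
util = reticle_utilization(area)
print(f"die area: {area:.1f} mm^2, {util:.1%} of the reticle limit")
# prints: die area: 840.2 mm^2, 97.9% of the reticle limit
```

At roughly 98 percent of the quoted limit, "reticle-sized" is a fair description of the die.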
The Nine-Month Story
The detail OpenAI volunteered on stage, and the one most worth chewing on, is that the design schedule got compressed because OpenAI used its own models inside the design loop. Greg Brockman framed the speed-up as "very surprising to us", and the implication is that frontier LLMs are now a productive layer of the ASIC tooling stack. Floorplanning, place-and-route exploration, verification test generation, RTL synthesis hint passes: all of these are now jobs you can hand to a model with enough context, and OpenAI was running them on its own internal models against its own chip.
This is the second OpenAI deliverable in a week that mentions the house models inside the loop. The two science results from June 17 and 18 were the cleaner story, but Jalapeño is the more expensive one. If a frontier model can shave even a quarter off the design schedule for an 840 mm² ASIC at TSMC, the next inference chip from any lab that has a similar in-house pipeline (Anthropic, Google DeepMind, Meta FAIR, and a handful of state labs) starts looking like a nine-month problem rather than a two-year one. That is the floor that moved this week, and it moves the entire custom-silicon roadmap of every competitor who can use the same trick.
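The schedule figures in this section imply two different levels of compression, and the arithmetic is worth making explicit. A rough sketch, taking the "two-year" figure from the text as a 24-month baseline (that baseline is the only input, and the function names are illustrative):

```python
BASELINE_MONTHS = 24  # the "two-year" schedule mentioned in the text


def compressed_schedule_months(baseline_months: float, fraction_saved: float) -> float:
    """Schedule length after removing a given fraction of the baseline."""
    return baseline_months * (1.0 - fraction_saved)


def fraction_saved(baseline_months: float, actual_months: float) -> float:
    """Fraction of the baseline schedule removed to reach the actual schedule."""
    return 1.0 - actual_months / baseline_months


# Shaving "a quarter" off a two-year schedule:
print(compressed_schedule_months(BASELINE_MONTHS, 0.25))  # 18.0

# Compression needed to land at nine months from a two-year baseline:
print(fraction_saved(BASELINE_MONTHS, 9))  # 0.625
```

A quarter off a two-year schedule gives eighteen months. Reaching nine months from that baseline means removing about 62 percent, so the nine-month result implies far more than a one-quarter saving.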
The Pattern Is Complete
Jalapeño is the entry that closes the table. Every top-three frontier lab and every top-three US hyperscaler now has custom inference silicon in production or in the pipe, either as an in-house chip or through named offtake on someone else's.
| Lab or cloud | Silicon | Status |
|---|---|---|
| Google | TPU v7 | In production, $200B Anthropic offtake through 2031 |
| Amazon | Trainium 2 and 3 | Project Rainier with Anthropic at multi-GW scale |
| Microsoft | Maia 2 | Inference focus, Anthropic on-ramp now public |
| Meta | MTIA | In production for ads and ranking inference |
| OpenAI | Jalapeño | First deployment late 2026, 10 GW program |
| Anthropic | TPU plus Trainium plus Maia | Three-silicon buyer, no in-house ASIC of its own |
Anthropic is the interesting line. It is the only top-three lab without its own ASIC, and the only one with named offtake at scale on three different custom platforms (TPU via the $200B Google commitment, Trainium via Project Rainier on AWS, and Maia via the Azure inference ramp). The Anthropic posture is buyer of last resort on custom silicon, on purpose: the company has been explicit that it does not want to spend founder-cycle bandwidth on a fab program, and it would rather sit at the procurement table on three platforms than commit to one. OpenAI just made the opposite call, in public, with a chip on the floor.
What Nvidia Loses
Less than the headlines will say, but in a specific place that matters. Nvidia keeps training. Nvidia keeps the long tail of buyers who cannot afford an ASIC program. Nvidia keeps the second-half Vera Rubin deployment with OpenAI itself, because the 10 GW Broadcom program does not replace the existing Nvidia commitment; it sits next to it. What Nvidia loses is the part of the inference workload at the frontier-lab top of the curve, where per-token economics are tightest and a claimed 50 percent cost-per-token gap compounds across billions of queries a day.
The cleaner way to say it: Nvidia's pricing power on inference GPUs at the top of the buyer list is now bounded by the cost-per-token of whatever custom silicon the buyer can stand up. For OpenAI, that bound is Jalapeño starting late 2026. For Anthropic, it is TPU v7 starting in 2027. For Meta, it is MTIA today. The GB300-and-up shelf still sells, because frontier training and the long tail of buyers without an ASIC program still need it, but the inference shelf has a ceiling on it that did not exist eighteen months ago.
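To see how a cost-per-token gap turns into dollars at volume, here is a minimal sketch. Every input in the example call is a hypothetical placeholder: neither OpenAI's daily token volume nor its baseline serving cost is given in the announcement, and the 0.5 reduction is the unverified vendor claim.

```python
def daily_savings_usd(
    tokens_per_day: float,
    baseline_usd_per_million_tokens: float,
    cost_reduction: float,
) -> float:
    """Daily serving cost avoided if cost per token falls by `cost_reduction`."""
    baseline_daily_cost = tokens_per_day / 1_000_000 * baseline_usd_per_million_tokens
    return baseline_daily_cost * cost_reduction


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
per_day = daily_savings_usd(
    tokens_per_day=1e12,                  # assumed: one trillion tokens served per day
    baseline_usd_per_million_tokens=1.0,  # assumed: $1 serving cost per million tokens
    cost_reduction=0.5,                   # the ~50% vendor claim, not a measured figure
)
print(f"${per_day:,.0f} per day, ${per_day * 365:,.0f} per year")
# prints: $500,000 per day, $182,500,000 per year
```

The result scales linearly with each input, so the savings are only as reliable as the claimed reduction and the real volume behind it.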
One more wrinkle: Broadcom now sits on both sides of the most expensive silicon contracts in the industry. The Anthropic $200B TPU commitment is Google plus Broadcom on the silicon. Jalapeño is OpenAI plus Broadcom on the silicon. The arms dealer of the custom-silicon era is not the lab and not the cloud. It is the partner that knows how to tape out a reticle-sized inference ASIC in nine months on an advanced node. Broadcom equity reflected this on Wednesday, and the repricing has further to go.
The 10-Gigawatt Floor
The capacity in this program does not show up in 2026 at scale. The first racks deploy late this year, and the full 10 gigawatts arrives in stages through 2029. That is the same physical constraint that governs the Anthropic TPU build-out and the Vera Rubin deployments: gigawatt-scale data centers need power purchase agreements, fiber, substations, and TSMC fab allocations that none of the labs can compress past 18 to 24 months. Every frontier lab is now pre-paid through 2029 for compute that physically does not yet exist, and Jalapeño just added another wedge to that aggregate.
What that does to the inference price floor is the part builders should be reading for. Our inference floor analysis in May tracked per-token prices coming down faster than capacity could explain, on the back of TPU and Trainium absorbing OpenAI-equivalent workloads at lower marginal cost. Jalapeño is the next term in that series. By late 2027, the marginal cost of an inference call against a frontier OpenAI model is set by an OpenAI-designed ASIC, not an Nvidia GPU margin. The price floor falls further, on schedule.
Our Take
Two signposts in the next ninety days. First, whether OpenAI publishes a real benchmark against the GB300 reference platform on a named inference workload (long-context Claude-style chat, agentic tool-calling, multimodal serving). The 50 percent number needs an apples-to-apples comparison before any buyer outside OpenAI sizes their next contract around it. Second, whether Anthropic responds by announcing the in-house chip program everyone in the procurement community has been waiting for, or by doubling down on the three-silicon buyer posture. The Jumper-to-Anthropic hire (June 19) and the wet-lab program suggest the second answer, but the equity conversation will press on the first.
The cleanest read on this week: Jalapeño is not an Nvidia killer and it is not a hyperscaler killer. It is the closing entry on a custom-silicon table the industry has been filling out one cell at a time since 2022, and it sets a new tape-out floor that every other lab is now going to be measured against. Nine months from design to TSMC tape-out, with the labs running their own models inside the design loop, is a different competitive regime from the one that produced MTIA v1, TPU v3, or Maia 1. We are tracking the deal cadence on the OpenAI provider page and the silicon side on the Broadcom page. Next data point to watch: the first independent Jalapeño bench in a customer hand, and the first OpenAI API price cut that quietly credits the new silicon for the headroom.
