Google I/O Is in Eight Days. Here Is What Gemini 4 Needs to Do to Matter.
Eight days. That is what stands between Google and the most consequential I/O keynote since Sundar Pichai walked on stage with Bard in 2023. The Android Show: I/O Edition opens the cycle tomorrow morning, the main keynote lands at 10 a.m. PT on Tuesday, May 19, and a Gemini 4 announcement is a near-lock on the rumor side. Google walks into the room a different competitor than the one that left I/O 2025.
The fortnight that just closed reshaped the board. Anthropic crossed a $30 billion revenue run rate after a Q1 that grew 80x year over year. It pre-bought $200 billion of Google Cloud and Broadcom TPU capacity over five years, then turned around and rented every accelerator at SpaceX's Colossus 1 facility in Memphis (more than 220,000 GPUs, 300 megawatts) just to keep Claude Code online. OpenAI shipped GPT-Realtime-2 and two adjacent voice models that put reasoning inside the audio loop. Apple confirmed that iOS 27 will let users pick Claude, Gemini, or any other compatible model to power Apple Intelligence, ending the OpenAI exclusive that defined the first year of that surface.
Google has the largest pool of Apple-distributed AI demand it has ever had a clean shot at, more compute on its own silicon than any other frontier vendor, and a Gemini 3.1 line that is already the best value answer on the market. It also walks into the keynote behind on agentic share, voice, and the cyber tier. Here is the actual punch list, and what each item on it costs Google to miss.
Item One: The Context Window Has to Cross 2 Million and Cost Less to Use
The most-cited Gemini 4 leak puts the context window at 2 million tokens or higher. Gemini 3.1 Pro already sits at 2M at $1.25/$5 per million tokens. Maintaining that lead is not optional; it is the one place where Google still holds a structural edge over every other lab's flagship.
The number to watch is not the context length itself. It is the cost per million tokens at long context. Claude Opus 4.7 charges $15/$75 with a 200K window, and OpenAI's GPT-5.5 charges $5/$30 with a 1M window. Both jack up effective pricing as context grows. If Gemini 4 holds Gemini 3.1 Pro's pricing curve across a 2M+ window, it stays the only economically viable choice for full-codebase reasoning and long-document agentic workflows. That is the moat to defend.
| Model | Context | Input ($/1M) | Output ($/1M) |
|---|---|---|---|
| Gemini 3.1 Pro | 2M | $1.25 | $5.00 |
| GPT-5.5 | 1M | $5.00 | $30.00 |
| Claude Opus 4.7 | 200K | $15.00 | $75.00 |
| Gemini 4 (rumored) | 2M+ | TBA | TBA |
A Gemini 4 priced at $1.50/$6 with a 3M window would put the rest of the frontier on notice. A Gemini 4 priced at $4/$20 to chase parity on benchmarks would forfeit the one quadrant Google still owns. The cost calculator on our site keeps the live comparison once the numbers drop.
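To make the gap concrete, here is a minimal sketch of the arithmetic behind the table above. It assumes flat list pricing with no long-context surcharge, caching, or batch discount, which understates the real spread given that the pricier models raise effective rates as context grows. The `Model` class, the `request_cost` helper, and the example workloads (150K and 900K input tokens, 10K output tokens) are our own illustrative choices, not vendor figures.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Model:
    name: str
    context_tokens: int   # maximum context window, in tokens
    input_per_m: float    # USD per 1M input tokens (list price from the table)
    output_per_m: float   # USD per 1M output tokens (list price from the table)


# Prices and windows copied from the comparison table in this article.
MODELS = [
    Model("Gemini 3.1 Pro", 2_000_000, 1.25, 5.00),
    Model("GPT-5.5", 1_000_000, 5.00, 30.00),
    Model("Claude Opus 4.7", 200_000, 15.00, 75.00),
]


def request_cost(model: Model, input_tokens: int, output_tokens: int) -> Optional[float]:
    """Return the USD cost of one request, or None if it exceeds the window.

    Assumption: flat per-token list pricing, with input and output tokens
    both counted against the context window.
    """
    if input_tokens + output_tokens > model.context_tokens:
        return None
    total = input_tokens * model.input_per_m + output_tokens * model.output_per_m
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workloads: a mid-size repo and a near-1M-token codebase dump.
    for input_tokens in (150_000, 900_000):
        print(f"{input_tokens:,} input tokens, 10,000 output tokens")
        for model in MODELS:
            cost = request_cost(model, input_tokens, 10_000)
            label = "does not fit" if cost is None else f"${cost:.4f}"
            print(f"  {model.name}: {label}")
```

Under those assumptions, the 150K-token request costs roughly $0.24 on Gemini 3.1 Pro, $1.05 on GPT-5.5, and $3.00 on Claude Opus 4.7, and the 900K-token request does not fit in the Opus window at all. Appending a hypothetical `Model("Gemini 4", 3_000_000, 1.50, 6.00)` to the list reproduces the first pricing scenario above.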
Item Two: An Agentic Coding Story That Closes the Anthropic Gap
Anthropic did not hit a $30 billion run rate on chat. Claude Code went from launch in mid-2025 to a billion-dollar annualized business in six months and is now the fastest-growing product in the company's history. The reason Anthropic had to rent every accelerator at Colossus 1 is that Claude Pro and Claude Max usage outran its own data center commits.
Google has the underlying capability. Jules has been shipping. Gemini Code Assist ships in every IDE Google can negotiate into. But there is no Google-published coding harness that wins on SWE-Bench Verified or Terminal-Bench at the level Claude Code does, and the harness gap is doing most of the work on those benchmarks anyway. A first-party agentic coding stack tied to Gemini 4, with measurable Terminal-Bench and SWE-Bench Verified numbers under a public harness, is the deliverable Google owes the developer track at this I/O.
The confirmed I/O agenda already lists agentic coding as a track. The question is whether what shows up is a wrapped Gemini API, or a genuine Claude Code competitor with its own loop, memory layer, and dreaming-style offline reflection. The bar Anthropic set on May 6 with the dreaming research preview is real, and we are watching for whether Google answers it at I/O or punts the response to I/O Connect in the fall.
Item Three: The Omni Video Model Has to Actually Compete With Veo, Sora, and Happy Horse
The Gemini UI leak that surfaced earlier this month showed an internal model named Omni sitting next to Toucan (the codename for Gemini's current Veo 3.1-backed video tool). The most likely path is that Omni ships at I/O as the public face of Google's next-generation video generation, possibly fused into Gemini 4 itself rather than living as a separate product surface.
The video crown is contested in a way it has not been in two years. Alibaba opened the public beta of Happy Horse 1.0 on April 27, and the 15B-parameter joint audio-video model now sits at the top of the Artificial Analysis Video Arena. OpenAI killed Sora in March, which removed one competitor entirely. Runway and Luma both shipped updates in the last two weeks. Google can take the leaderboard back, but only if Omni ships with credible numbers and a price point that does not embarrass the Gemini 3.1 Flash-Lite $0.25/M cost story Google has been building.
Item Four: An Answer to the Cyber Tier
The single biggest policy shift of 2026 is the formal recognition that a model can be capability-restricted as a product category. Anthropic's Claude Mythos Preview, restricted via Project Glasswing to a vetted set of about four dozen organizations including AWS, Apple, Cisco, JPMorgan, Microsoft, and NVIDIA, has forced every other frontier lab to decide whether it ships a cyber tier and on what governance terms.
OpenAI answered with GPT-5.5-Cyber to vetted security teams in the first week of May. The CAISI pre-launch evaluation framework now covers Google DeepMind alongside OpenAI, Microsoft, Anthropic, and xAI. Google has the policy plumbing in place. The unknown is whether DeepMind shows up at I/O with a Mythos-class capability behind a gated tier, or hands the cyber tier conversation to Anthropic for another quarter. Either is a tenable strategic choice. Saying nothing is not.
Item Five: A Distribution Story That Cashes in on Apple Intelligence Extensions
Bloomberg confirmed on May 5 that iOS 27, iPadOS 27, and macOS 27 will introduce an Extensions system letting users pick Claude, Gemini, or any other compatible model to power Apple Intelligence. Apple ships at WWDC on June 8, three weeks after I/O. The window for Google to set the default conversation about why a billion iPhone users should pick Gemini is now.
Google has structural advantages here that Anthropic and OpenAI do not. Google Workspace, Drive, Photos, Maps, and YouTube are already the data layer most consumers live inside. A Gemini surface that pulls cleanly across that footprint via Apple Intelligence Extensions is the first opportunity Google has had at the iPhone since Maps got booted in 2012. I/O is the venue to plant the flag, even if the plumbing ships at WWDC.
The Quadrant Map Going In
The model wars are now four-dimensional. Frontier capability (the FrontierMath, ARC-AGI, SWE-Bench Pro race) is one axis. Cost per useful task is the second. Agentic surface coverage is the third. Distribution into consumer and enterprise surfaces is the fourth. Each frontier lab owns a different corner of that map going into I/O week.
| Lab | Owns | Weak Spot |
|---|---|---|
| OpenAI | Top of leaderboards (GPT-5.5), voice stack, ChatGPT brand | Highest price per token, dependency on Microsoft/AWS compute |
| Anthropic | Agentic coding (Claude Code), enterprise vertical agents, cyber tier | Compute scarcity, consumer surface, no first-party hardware |
| Google | Cost per million tokens, context length, own silicon (TPU) | Agentic coding share, voice, cyber tier silence |
| Meta & Open Source | Inference floor (DeepSeek V4, Mistral Medium 3.5) | No first-party consumer assistant surface at scale |
Google's weak spots are precisely the surfaces that grew the most in the last two weeks for everyone else. That is what makes this I/O the highest-stakes one since 2023. Match the rumored numbers on Gemini 4 context and pricing, ship a real Claude Code answer, put Omni up against Happy Horse and Veo with shippable benchmarks, take a public position on the cyber tier, and stake the Apple Intelligence flag, and Google walks out of I/O with momentum carrying into WWDC. Punt any one of those and a competitor closes the gap further.
Our Take
The cost story is the one Google cannot afford to lose. Anthropic and OpenAI both run premium-priced flagships and accept that the price floor will eat the long tail. Gemini 3.1 Pro at $1.25/$5 with a 2M context is currently the only frontier-class model that scales economically into full codebase reasoning, multi-document research, and long-horizon agent workflows. Pricing Gemini 4 to chase OpenAI on output tokens would forfeit that. Pricing it to defend the value position would frame the second half of 2026 around who can ship the most useful work per dollar, and that is the conversation Google wins.
The agentic coding story is the one Google most needs to fix. Anthropic earned a $30 billion run rate on a product Google has all the ingredients to ship and has not. Eight days is enough to ship one keynote demo that closes the perception gap, then twelve months to back it up with usage. Skip the demo and the Claude Code lead compounds.
The cyber tier and the Omni model are the two surprise factors. Either could move the needle on the keynote independent of the Gemini 4 stat sheet. Both have credible paths from current public capability to a launch artifact in eight days. Neither is guaranteed.
We'll be tracking the keynote live across our Today feed and updating the models tracker, benchmarks page, and cost calculator as the numbers land on May 19. The verdict on whether Gemini 4 cleared the bar will be in the data within twenty-four hours of the keynote.
The one thing we are most confident about: this is the I/O that will redraw the quadrant map for the rest of the year. Google has the tools, the silicon, and the surfaces. Eight days from now we find out whether it had the will.
