AI Status Monitoring: How We Actually Track Claude, ChatGPT, and Gemini

Ripper · 7 min read

Every "is Claude down" site I have used over the past year has had the same problem: it lags the actual outage by 5 to 15 minutes. You hit the page, it says all systems operational, you go back to your terminal, the API is still throwing 500s. The site just mirrored the official status page, which is itself often the slowest source of truth, because incident response inside a frontier lab takes a few minutes to escalate before the public status page flips.

We built TensorFeed partly to fix that gap. The status pages at /is-claude-down, /is-chatgpt-down, /is-gemini-down, and seven others poll every two minutes, surface component-level detail, archive incident history, and let you compare uptime across providers. Here is the stack and what it caught last month.

The Three Failure Modes of "Is X Down" Sites

Most sites in this category fail in one of three ways:

  • They scrape the official status page. The official page is what you would have checked yourself. The site adds nothing. And when the official page lags real outages, the scraped mirror lags by even more.
  • They run a single ping every 5 to 15 minutes. At that cadence a 12-minute outage shows up as a single failed check labeled "degraded for 5 minutes" or, worse, falls between two polls and never shows at all.
  • They cover one product. "Is ChatGPT down" sites do not tell you that the OpenAI API is also down or that Claude is up and ready to serve as a fallback. The user has to visit five sites in a panic.

A monitoring stack that solves all three is not technically hard. It is just specific work that nobody who runs a generic uptime tool wants to do for the AI sector specifically.

How TensorFeed Polls Status

The pipeline behind /api/status is the source of truth. It runs a lightweight pass every 5 minutes and polls each provider's official status page every 2 minutes. Behind the scenes, a Cloudflare Worker fetches each provider's public status JSON, normalizes the response into a consistent schema, and writes the normalized result to KV. Component-level data (Claude API, Console, and Workbench are each tracked separately) flows straight through.
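For a concrete picture of that flow, here is a minimal Worker sketch in TypeScript. The ProviderStatus schema, the STATUS_KV binding, and the single hardcoded Statuspage-style source URL are illustrative assumptions for the example, not our production code; real providers differ in their status JSON shapes.

```ts
// Minimal sketch of the normalization pass (illustrative names throughout).

interface ComponentStatus {
  name: string;                                  // e.g. "Claude API", "Console"
  state: "operational" | "degraded" | "down";
}

interface ProviderStatus {
  provider: string;
  checkedAt: string;                             // ISO timestamp of this poll
  components: ComponentStatus[];
}

// Collapse Statuspage-style component states onto a three-value scale.
function normalizeState(raw: string): ComponentStatus["state"] {
  if (raw === "operational") return "operational";
  if (raw === "major_outage") return "down";
  return "degraded";                             // partial_outage, degraded_performance, ...
}

export default {
  async scheduled(_event: ScheduledEvent, env: { STATUS_KV: KVNamespace }) {
    // Hypothetical provider -> status-JSON mapping; real URLs vary per provider.
    const sources: Record<string, string> = {
      anthropic: "https://status.anthropic.com/api/v2/components.json",
    };

    for (const [provider, url] of Object.entries(sources)) {
      const res = await fetch(url);
      if (res.ok === false) continue;            // keep the last good snapshot
      const body = (await res.json()) as {
        components: { name: string; status: string }[];
      };

      const normalized: ProviderStatus = {
        provider,
        checkedAt: new Date().toISOString(),
        components: body.components.map((c) => ({
          name: c.name,
          state: normalizeState(c.status),
        })),
      };

      // One KV key per provider; /api/status reads these back on request.
      await env.STATUS_KV.put(`status:${provider}`, JSON.stringify(normalized));
    }
  },
};
```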

The fast path matters because frontier labs increasingly use the status page as a passive incident-acknowledgment tool. Engineers fix things first, update status second. By polling fast we shave a few minutes off the gap between "the system is broken" and "our page reflects it."

Beyond pure status, we run an active LLM endpoint probe every 15 minutes against each provider. We POST a tiny prompt to each provider's chat completions endpoint and measure TTFB (time to first byte), total response time, and HTTP status. That data goes into a 24-hour rolling buffer. The output is at /api/probe/latest: per-provider success rate, p50, p95, p99 latency, and the last error string. This is genuinely unique data: it is measured, not self-reported. Most status pages won't tell you about elevated latency unless it crosses some threshold; we just show you the percentiles.
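A sketch of what one probe cycle might look like, assuming an OpenAI-compatible chat completions endpoint. The prompt, the model parameter, and the percentile helper are illustrative; the real buffer and scheduling logic are omitted.

```ts
interface ProbeSample {
  ttfbMs: number;        // time to first byte, approximated at header arrival
  totalMs: number;       // full response time including body
  status: number;        // HTTP status code (0 = network failure)
  error?: string;        // last error string, if any
}

async function probeEndpoint(
  url: string,
  apiKey: string,
  model: string
): Promise<ProbeSample> {
  const start = Date.now();
  try {
    const res = await fetch(url, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({
        model,
        messages: [{ role: "user", content: "ping" }],  // the tiny prompt
        max_tokens: 1,
      }),
    });
    // fetch resolves once response headers arrive, a close proxy for TTFB.
    const ttfbMs = Date.now() - start;
    await res.text();    // drain the body so totalMs covers the full response
    return { ttfbMs, totalMs: Date.now() - start, status: res.status };
  } catch (err) {
    return { ttfbMs: -1, totalMs: Date.now() - start, status: 0, error: String(err) };
  }
}

// Nearest-rank percentile over the 24-hour buffer of latency samples.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) return NaN;
  const sorted = [...samples].sort((a, b) => a - b);
  const idx = Math.max(0, Math.ceil((p / 100) * sorted.length) - 1);
  return sorted[idx];
}
```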

Real Incidents the Stack Caught

A handful of examples from the last quarter that show why this design works:

  • Claude Workbench-only degradation, March 2026. Anthropic's overall status read "operational" for 18 minutes while the Workbench component was returning 5xx. Our component-level view caught it immediately because we surface each subsystem separately.
  • OpenAI auth outage, April 2026. chat.openai.com was throwing "you are being rate limited" errors to authenticated users for 22 minutes. The OpenAI status page took 9 minutes to acknowledge it. Our active probe caught it at minute 1 because we hit the API directly, not the auth-gated chat surface.
  • Gemini regional outage, late April 2026. Vertex AI in us-central1 went degraded for ~30 minutes; europe-west4 was fine. We surface this at the component level so a user worrying about "is Gemini down" could see the answer was "in your region, yes; somewhere else, no" rather than a single misleading global flag.

None of these were dramatic outages. None made the news. All of them affected someone's production workload. That is the kind of signal a real status monitor should surface.

The Cross-Provider Comparison

Single-provider status pages are useful but incomplete. The harder question is "Claude is down, what should I switch to?" To answer that you need every provider on one screen. Our /status page lists ten major AI providers in a single grid, color-coded with a consistent status indicator and updated on the same poll cadence. When Claude is degraded, you can see at a glance whether ChatGPT and Gemini are healthy enough to absorb the load.

For developers, the same data is at /api/status as JSON. If you have an agent that needs to fall back from Claude to GPT-5.5 when Claude goes down, you can poll our endpoint directly rather than building ten separate scrapers. We track Anthropic, OpenAI, Google, Microsoft, Mistral, Cohere, Replicate, Hugging Face, Perplexity, and Midjourney.
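If you build that fallback, the client side can stay very small. A hedged sketch follows; the response shape and the tensorfeed.example host are assumptions for illustration, not a documented contract.

```ts
interface StatusResponse {
  providers: { name: string; state: "operational" | "degraded" | "down" }[];
}

async function pickProvider(preferred: string[]): Promise<string | null> {
  const res = await fetch("https://tensorfeed.example/api/status"); // placeholder host
  const { providers } = (await res.json()) as StatusResponse;
  const healthy = new Set(
    providers.filter((p) => p.state === "operational").map((p) => p.name)
  );
  // Walk the preference order; take the first provider that is fully up.
  return preferred.find((name) => healthy.has(name)) ?? null;
}

// e.g. fall back from Claude to alternatives when Anthropic is degraded:
const provider = await pickProvider(["anthropic", "openai", "google"]);
```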

Notification, Not Just Display

A status page you have to refresh is half a product. Most users want to be told. That is what /alerts is for. Subscribe with an email, choose the providers you care about, and we send a single email when any of them flips from operational to degraded or down, plus a recovery email when it clears. No SMS spam, no app to install, no account.

Internally, alerts route through the same staleness watchdog we use for news polling: if a feed lags past its threshold, we throttle to one email per hour so a sustained outage does not turn into an inbox flood. Status changes get the same treatment; we email once per state transition, not once per poll while the state stays degraded.
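The transition-only logic is simple to sketch. The ALERT_KV binding and the sendAlertEmail callback below are stand-ins for whatever the real pipeline wires up.

```ts
type State = "operational" | "degraded" | "down";

async function maybeAlert(
  env: { ALERT_KV: KVNamespace },
  provider: string,
  current: State,
  sendAlertEmail: (provider: string, from: State, to: State) => Promise<void>
) {
  const key = `last-state:${provider}`;
  const previous = ((await env.ALERT_KV.get(key)) ?? "operational") as State;

  if (previous === current) return;   // same state as last poll: stay silent

  await sendAlertEmail(provider, previous, current);
  await env.ALERT_KV.put(key, current);
}
```

Storing only the last observed state per provider is enough to guarantee one email per transition in each direction, including the recovery email when the state returns to operational.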

Why We Make This Free

Status data is the front door. It is what a stressed user types into Google when their workflow just broke, and it is the highest-trust moment we can earn with a new visitor. So we keep it free, we keep it fast, and we keep it complete.

The bet is that when we show up for you on the worst day of your week, you remember us on the best ones too. You start using our other free feeds (news, model catalog, papers, the morning brief). You bookmark the homepage. You install the MCP server. The free status page earns the right to exist by paying its way in trust.

That feels like the right deal for everyone. You get a status page that actually catches outages. We get the chance to be useful on the rest of your workflow too. Both sides win.

Live status: /status. Per-provider: Claude, ChatGPT, Gemini, Perplexity, Copilot, Cohere, Mistral, Hugging Face, Replicate, Midjourney.