
We Made Our AI Bot Traffic Public. Here's What We're Seeing.

Ripper · 6 min read

Most sites hide their bot traffic. They treat it like a janitor problem, something to rate-limit and forget. We just published ours at /agent-traffic: a live dashboard of every AI crawler hitting TensorFeed, refreshed every 30 seconds, no auth required, with per-bot breakdown, top hit endpoints, and a live tail.

Two reasons we did it. First, TensorFeed was built for AI agents. Hiding the agent footprint while telling agents we welcome them is incoherent. Second, the data itself is interesting in a way that nobody else has surfaced yet. Which crawlers index which surfaces? Where does the agent web actually live? You cannot answer that from your own server logs alone, but you can answer it from a network of public dashboards. So we are publishing the first one.

What we track

Twenty-five user-agent patterns at the Cloudflare Worker layer. ClaudeBot and anthropic-ai for Anthropic. GPTBot, ChatGPT-User, and OAI-SearchBot for OpenAI. PerplexityBot. Google-Extended (the Gemini training opt-in crawler) and Googlebot. Bingbot. Applebot. Bytespider for ByteDance. Amazonbot. cohere-ai. YouBot. Plus generic patterns for Scrapy, python-requests, axios, node-fetch, and any user agent containing "bot", "crawler", "spider", or "agent".

We do not block any of them. We do not rate-limit them. We log a hit and move on. The identification is a string match, nothing fancier. Sophisticated agents that want to impersonate a browser can do so; we are not playing that game.
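A string-match classifier of this shape is small enough to show inline. This is a hedged sketch, not the site's actual pattern table: the entries and vendor labels below are illustrative, and the real list has twenty-five patterns.

```typescript
// Illustrative subset of the user-agent pattern table described above.
// Specific entries, more specific first; generic fallbacks last.
type BotPattern = { match: string; vendor: string };

const BOT_PATTERNS: BotPattern[] = [
  { match: "claudebot", vendor: "Anthropic" },
  { match: "anthropic-ai", vendor: "Anthropic" },
  { match: "gptbot", vendor: "OpenAI" },
  { match: "oai-searchbot", vendor: "OpenAI" },
  { match: "perplexitybot", vendor: "Perplexity" },
  { match: "bytespider", vendor: "ByteDance" },
  // ...nineteen more in the real table, then the generic fallbacks:
  { match: "crawler", vendor: "Generic" },
  { match: "spider", vendor: "Generic" },
  { match: "bot", vendor: "Generic" },
];

// Returns the matching vendor, or null for presumed-human traffic.
function classify(userAgent: string): string | null {
  const ua = userAgent.toLowerCase();
  const hit = BOT_PATTERNS.find((p) => ua.includes(p.match));
  return hit ? hit.vendor : null;
}
```

Ordering matters: the specific patterns sit above the generic ones so that GPTBot resolves to OpenAI rather than to the catch-all "bot" fallback.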

What the dashboard shows

Today's running counter (resets at 00:00 UTC). The most recent fifty bot hits as a rolling buffer. A derived breakdown by bot with vendor labels and a one-line description of what each crawler does. The top eight endpoints those bots are pulling. A ten-deep live tail with timestamps. The breakdown updates every thirty seconds.
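One way to picture that payload is as a typed record. The field names below are guesses inferred from the description, not the actual schema of /api/agents/activity:

```typescript
// Hypothetical shape for the dashboard payload; names are illustrative.
interface BotHit {
  ts: string;     // ISO timestamp
  bot: string;    // matched pattern, e.g. "ClaudeBot"
  vendor: string; // e.g. "Anthropic"
  path: string;   // endpoint hit, e.g. "/feed.xml"
}

interface AgentActivity {
  todayTotal: number; // running counter, resets at 00:00 UTC
  recent: BotHit[];   // rolling buffer of the last fifty hits
  breakdown: Record<string, { vendor: string; count: number; note: string }>;
  topEndpoints: { path: string; hits: number }[]; // top eight
  tail: BotHit[];     // ten-deep live tail
}

// A minimal sample record conforming to the sketch:
const sample: AgentActivity = {
  todayTotal: 1,
  recent: [{ ts: "2025-01-01T00:00:00Z", bot: "ClaudeBot", vendor: "Anthropic", path: "/feed.xml" }],
  breakdown: { ClaudeBot: { vendor: "Anthropic", count: 1, note: "Anthropic's web crawler" } },
  topEndpoints: [{ path: "/feed.xml", hits: 1 }],
  tail: [],
};
```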

The data already tells a story, even before we have a multi-day archive built up. OpenAI's OAI-SearchBot is the most frequent visitor. ClaudeBot and GPTBot are steady but lower volume. PerplexityBot shows up, but only irregularly. Bytespider is a surprise, given that we are an English-language site. The most-hit endpoints are the obvious ones: /feed.xml, /feed.json, /api/news, and /api/payment/info. That last one is interesting. Agents are scraping our wallet address, presumably to verify before sending USDC.

Why this is on-brand

We have been saying for months that TensorFeed is built for AI agents. Concretely that means we welcome them in robots.txt by name, ship a discovery manifest at /llms.txt, publish an x402 V2 manifest at /.well-known/x402, maintain an MCP server, and accept payment in USDC. All of that is plumbing.
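For reference, welcoming crawlers by name in robots.txt looks like this. This is a generic illustration of the convention, not a copy of our actual file:

```txt
User-agent: ClaudeBot
Allow: /

User-agent: GPTBot
Allow: /

User-agent: PerplexityBot
Allow: /
```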

The dashboard is the first surface that makes the thesis visceral. You land on /agent-traffic and see a live ticker of AI agents pulling data from us. Right now. The story we have been telling becomes a thing you can watch. That kind of demonstration compounds: every screenshot of the dashboard is implicit proof that the agent web exists and that we are part of it.

The data moat angle

Bot traffic patterns are themselves a dataset. Which crawler indexes which surface? How often? When did a new crawler show up? When did an existing one go quiet? Today we have one site's worth, but the daily snapshots also land in our public history at /api/history, so the archive grows by a row per day. In thirty days we will have a real time series. In a year we will have a meaningful longitudinal view of how the AI crawler population shifts.

Sister sites running the same Worker pattern can publish their own dashboards. Aggregate across enough of them and you have a public ledger of how the agent web behaves, which is information that does not exist anywhere right now. We are not racing to build that aggregator today, but the per-site dashboards are the building block.

The KV math, since people will ask

We do not write to KV on every bot hit. We buffer in a Worker isolate's in-memory array and flush once per fifty hits or once per sixty seconds, whichever comes first. That keeps us inside the 100,000 ops/day budget on the Cloudflare free tier. The dashboard reads from a 30-second in-memory cache fronted by the Cache API, so the /api/agents/activity endpoint costs roughly two KV ops per minute regardless of how many people are watching it. We covered the full pattern in The 100,000 KV Ops Daily Budget and What Fits in It. This dashboard is just one more thing that fits inside it.
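The buffer-and-flush logic reduces to a few lines. This is a sketch of the pattern as described, with the KV write itself left to the caller; names and thresholds mirror the prose but are not lifted from worker/src/activity.ts:

```typescript
// Flush thresholds from the prose: 50 hits or 60 seconds, whichever first.
const FLUSH_EVERY_HITS = 50;
const FLUSH_EVERY_MS = 60_000;

type Hit = { ts: number; bot: string; path: string };

// Module-level state lives as long as the Worker isolate does.
let buffer: Hit[] = [];
let lastFlush = Date.now();

// Records a hit; returns a batch to write to KV (one op) when a flush is
// due, or null when the hit was only buffered.
function recordHit(bot: string, path: string): Hit[] | null {
  buffer.push({ ts: Date.now(), bot, path });
  const due =
    buffer.length >= FLUSH_EVERY_HITS ||
    Date.now() - lastFlush >= FLUSH_EVERY_MS;
  if (!due) return null;
  const batch = buffer;
  buffer = [];
  lastFlush = Date.now();
  return batch;
}
```

Worst case, the time threshold alone produces one write per minute, or 1,440 KV ops a day, which is why the pattern sits comfortably inside the free-tier budget. The trade-off is that an isolate evicted mid-buffer drops its unflushed hits, which is acceptable for a traffic counter.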

If you run a site, you should do this too

Whatever your stack is, you can probably spend a Saturday afternoon parsing user-agent strings on the request path and writing the breakdown to a stable URL. Most operators are nervous about exposing bot traffic because they think it makes them look small or gives crawlers a target. The opposite turns out to be true. Publishing your bot traffic is the strongest signal you can send that your site is built for the agent web. Crawlers themselves index it, your humans find it interesting, and your prospective AI users treat it as a credibility marker.
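The Saturday-afternoon version really is just a fold over your request log. A hedged sketch, with a deliberately crude bot test standing in for a real pattern table:

```typescript
// Turn a list of (userAgent, path) pairs into a per-bot hit count,
// the core of what a dashboard like this serves. Names are illustrative.
function botBreakdown(hits: { userAgent: string; path: string }[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const h of hits) {
    const ua = h.userAgent.toLowerCase();
    // Crude test mirroring the generic patterns mentioned earlier.
    if (!/bot|crawler|spider|agent/.test(ua)) continue;
    const key = ua.includes("gptbot") ? "GPTBot"
      : ua.includes("claudebot") ? "ClaudeBot"
      : "other";
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}
```

Serve the result as JSON from a stable URL and you have the minimum viable dashboard.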

The Worker code that does this is roughly 200 lines, lives in worker/src/activity.ts, and is open source under the same MIT license as the rest of the repo. Take it. Run your own dashboard. Send us the URL and we will link it from ours.

What the dashboard means in one sentence

The agent web is real, it is hitting our servers right now, and we believe the right response to that is to count and publish, not to count and hide.

The dashboard is at /agent-traffic. Free, no auth, refreshed every thirty seconds. The raw data is at /api/agents/activity. The MCP shortcut is get_agent_activity (no token required). Welcome to the agent web. Start watching.