
15 Paid AI Agent API Endpoints in 24 Hours: What Made It Possible

Ripper · 8 min read

This morning at 5 AM Pacific I had one paid AI agent endpoint live: routing recommendations. By 1 PM I had fifteen. Same codebase, same machine, same coffee. The difference between hour zero and hour twelve was not effort. It was that the first endpoint paid for the next fourteen.

Shipping-velocity stories on the internet are usually fake. Either the writer is embellishing, or they had a team they failed to mention, or what they shipped does not actually work. So before I tell you what got done today, here is the receipt: every endpoint mentioned in this post is live on tensorfeed.ai right now, every one is paid in real USDC on Base mainnet, every one passes its unit tests, and the end-to-end mainnet validation this morning used the same code path agents will hit. You can verify all of this from public sources before you keep reading.

Now the part worth writing about. Why was today fast?

The first endpoint is the only hard one

When I shipped routing recommendations yesterday it took the better part of a week. The payment middleware was new. The KV layout for credits had to be designed from scratch. The verification path that reads a USDC Transfer event from a Base RPC and parses the recipient out of the topics array was something I had never written before. The bearer token format, the replay protection storage, the credit decrement under contention, the x402 fallback for one-shot calls without pre-flight: all of this was first-time work.
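That Transfer-event parse is the kind of first-time work that never has to be written twice. As a hedged sketch (the helper name and shapes here are illustrative, not the actual tensorfeed code), the core of it looks like:

```typescript
// Illustrative sketch: extract the recipient from an ERC-20 Transfer log.
// topics[0] is keccak256("Transfer(address,address,uint256)"); topics[1] and
// topics[2] are the indexed from/to addresses, left-padded to 32 bytes.
const TRANSFER_SIG =
  "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef";

interface LogEntry {
  topics: string[];
  data: string; // uint256 amount, hex-encoded
}

function parseTransfer(log: LogEntry): { to: string; amount: bigint } | null {
  if (log.topics.length < 3 || log.topics[0] !== TRANSFER_SIG) return null;
  // An address is the low 20 bytes (last 40 hex chars) of the 32-byte topic.
  const to = ("0x" + log.topics[2].slice(-40)).toLowerCase();
  return { to, amount: BigInt(log.data) };
}
```

Once the recipient and amount come out of the log, the rest is an equality check against your deposit address and a credit grant.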

Then I shipped the second paid endpoint today. It took an hour. The third took 45 minutes. The fourth took 30. The pattern, by then, was a copy-paste from the routing handler with a different feature behind it. Every premium endpoint after the first is a variation on the same shape:

if (path === '/api/premium/X') {
  const payment = await requirePayment(request, env, 1);
  if (!payment.paid) return payment.response!;

  const result = await doTheActualWork(env, parseQueryParams(url));

  ctx.waitUntil(
    logPremiumUsage(env, '/api/premium/X', userAgent, 1, payment.token),
  );
  return premiumResponse(result, payment, 1);
}

The only line that mattered for me to write was the call to doTheActualWork. Everything else was infrastructure that already existed. Every endpoint costs one credit, every credit is recorded against the bearer token, and every response includes billing metadata and the X-Payment-Token-Balance header. None of that needed thinking. It was a pattern.
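premiumResponse itself is not shown above; a plausible minimal shape for it, with illustrative field names rather than the real tensorfeed payload, is:

```typescript
// Hypothetical sketch of a premiumResponse-style helper: wrap the feature's
// result with billing metadata and surface the remaining balance in a header.
interface PaymentContext {
  token: string;
  balance: number; // credits remaining after this call's debit
}

function premiumResponse(
  result: unknown,
  payment: PaymentContext,
  cost: number,
): Response {
  const body = JSON.stringify({
    data: result,
    billing: { creditsCharged: cost, creditsRemaining: payment.balance },
  });
  return new Response(body, {
    headers: {
      "Content-Type": "application/json",
      "X-Payment-Token-Balance": String(payment.balance),
    },
  });
}
```

Putting the balance in both the body and a header means agents can track spend without parsing the payload.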

What I shipped today

Counting from this morning's first commit at 5 AM:

  • Four history-series endpoints (pricing series, benchmark series, status uptime, snapshot diff)
  • Webhook watches with HMAC-signed delivery, SSRF guard, fire-cap, and 90-day TTL
  • Enriched agents directory with derived trending score, six sort modes, and four filters
  • Per-token usage tracking and a free /api/payment/usage endpoint
  • News search with full-text relevance scoring, recency boost, and date/provider filters
  • Cost projection across 1-10 models with daily/weekly/monthly/yearly horizons and cheapest-monthly ranking
  • A human-facing credits dashboard with sessionStorage-only token handling and one-click watch deletion
  • MCP server expansion from 5 free tools to 17 total (12 premium tools added)
  • Python SDK shipped four versions (1.3.0 through 1.8.0) with full coverage of every new endpoint
  • TypeScript SDK shipped four versions (1.2.0 through 1.7.0) with discriminated-union response types
  • FAQPage and HowTo JSON-LD schema on the agent-payments docs page for AI Overviews and rich results
  • An originals article documenting the mainnet validation, with the actual on-chain tx hash
  • A distribution playbook with copy-paste-ready submissions for 12 MCP and x402 directories
  • Worker test suite from 15 to 105 vitest cases across 7 files

Every one of those was a Git commit on main. Every commit was deployed automatically by Cloudflare Pages and Workers. Every one passes tsc --noEmit and the test suite. None of them are mocked or stubbed. None of them are behind a feature flag waiting for the "real" rollout.

Why the cadence held

A few things compounded. None of them were heroic.

The data layer was already there. We have been snapshotting pricing, models, benchmarks, status, and agent activity to dated KV keys for weeks. By today the historical dataset had real depth, which meant new endpoints (history series, news search, cost projection) could read from existing storage rather than waiting for ingestion. Phase 0 of agent payments was about capturing data we could not backfill. Today that decision paid back across half a dozen features.
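The dated-key pattern is simple enough to sketch. Assuming keys like pricing:2026-02-14 (the prefix and date format here are illustrative, not the actual key schema), a history endpoint reduces to enumerating keys over a range and doing N point reads:

```typescript
// Sketch of the dated-KV-key pattern: each day's snapshot lives under a
// "<prefix>:<YYYY-MM-DD>" key, so a history series is just N point reads.
function dateKeys(prefix: string, start: Date, days: number): string[] {
  const keys: string[] = [];
  for (let i = 0; i < days; i++) {
    const d = new Date(start.getTime() + i * 86_400_000); // +i days in ms
    keys.push(`${prefix}:${d.toISOString().slice(0, 10)}`);
  }
  return keys;
}
```

A handler then maps each key through a KV get and skips missing days, which is why the series endpoints needed no new ingestion at all.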

The payment primitive was solved once. The requirePayment(request, env, tier) middleware does all the auth, balance-check, debit, and 402-with-instructions response logic. Every new paid endpoint adds two lines: call the middleware, return on failure. New endpoints inherit the entire payment surface for free, including the x402 fallback, the bearer-token rotation, the daily revenue rollup, and the per-token usage log.
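For readers who want the shape of that primitive, here is a hedged sketch of what a requirePayment-style middleware can look like. Everything here is illustrative, with an in-memory Map standing in for the KV namespace; it is not the production implementation:

```typescript
// Sketch: authenticate the bearer token, check and debit the credit balance,
// and return a 402 with purchase instructions when either step fails.
interface PaymentResult {
  paid: boolean;
  token?: string;
  balance?: number;
  response?: Response;
}

async function requirePayment(
  request: Request,
  env: { CREDITS: Map<string, number> }, // stand-in for a KV namespace
  cost: number,
): Promise<PaymentResult> {
  const auth = request.headers.get("Authorization") ?? "";
  const token = auth.startsWith("Bearer ") ? auth.slice(7) : null;
  const balance = token ? env.CREDITS.get(token) ?? 0 : 0;
  if (!token || balance < cost) {
    const body = JSON.stringify({
      error: "payment_required",
      instructions: "POST /api/payment/purchase to buy credits", // illustrative
    });
    return { paid: false, response: new Response(body, { status: 402 }) };
  }
  env.CREDITS.set(token, balance - cost); // real code must handle contention
  return { paid: true, token, balance: balance - cost };
}
```

The 402 body doubling as machine-readable purchase instructions is what lets an agent recover from "out of credits" without a human.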

Tests were written alongside, not after. When I shipped the routing engine yesterday I wrote 15 vitest cases for it. That suite gave me confidence to refactor freely. When I added each new endpoint today, I wrote its tests in the same module-and-test pair pattern, which kept the test count at 1.5x to 2x the production code count. By the end of the day we had 105 tests. None of them run against real RPCs, real KV, or real wallets; they run in ~500ms total against in-memory mocks. That speed is why they actually get run.
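The in-memory mocks are most of why the suite stays at ~500ms. A minimal KV stand-in of the kind such a suite can run against (interface trimmed to the methods handlers typically use; this is an assumed shape, not copied from the real suite) is just a Map with an async face:

```typescript
// Sketch of an in-memory stub matching the async get/put/delete subset of a
// KV namespace, so handlers can be exercised without network or persistence.
class MemoryKV {
  private store = new Map<string, string>();

  async get(key: string): Promise<string | null> {
    return this.store.get(key) ?? null;
  }
  async put(key: string, value: string): Promise<void> {
    this.store.set(key, value);
  }
  async delete(key: string): Promise<void> {
    this.store.delete(key);
  }
}
```

Because the handlers only depend on this interface, swapping the real binding for the stub is a one-line change in test setup.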

SDKs followed the same shape. The Python and TypeScript SDKs share a structural pattern. Each free endpoint has a method. Each paid endpoint has a method that auto-attaches the token, throws a typed error if credits are insufficient, and surfaces billing metadata in the response. Adding a new endpoint to both SDKs takes ten minutes once the worker side is done, including updating the README API table.
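That paid-method shape is easy to sketch in TypeScript. Everything below is illustrative; the error class, header name, and return shape are assumptions, not the published SDK API:

```typescript
// Sketch of the paid-endpoint SDK method shape: attach the bearer token,
// map 402 to a typed error, and surface billing metadata from the header.
class InsufficientCreditsError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "InsufficientCreditsError";
  }
}

async function callPaid<T>(
  baseUrl: string,
  path: string,
  token: string,
  fetchFn: typeof fetch = fetch, // injectable for tests
): Promise<{ data: T; balance: number }> {
  const res = await fetchFn(`${baseUrl}${path}`, {
    headers: { Authorization: `Bearer ${token}` },
  });
  if (res.status === 402) throw new InsufficientCreditsError("out of credits");
  const balance = Number(res.headers.get("X-Payment-Token-Balance") ?? -1);
  return { data: (await res.json()) as T, balance };
}
```

The typed error is the part that matters for agents: a catchable InsufficientCreditsError is an actionable signal, while a generic HTTP failure is not.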

The MCP server reuses the SDK pattern. Each MCP tool is a thin wrapper over a worker endpoint. The fetch helper handles auth and 402 with friendly error messages. New tools were five-minute additions per endpoint.

Documentation was generated alongside, not after. Every commit that adds an endpoint also updates public/llms.txt, CLAUDE.md, the /api/meta manifest, the /developers/agent-payments page, both SDK READMEs, and the MCP README. Six docs locations updated per commit, every commit, no skips. That is the only way docs stay honest at this cadence.

The lesson worth keeping

The first endpoint is the platform. Everything after the first endpoint is content. When the platform work is real, content scales fast. When the platform work is fake or rushed, every new endpoint reopens the same questions you thought you had answered.

Every team that ships slowly is paying the platform tax over and over. They are rebuilding auth in each new feature because their auth primitive was never abstracted. They are debating credit accounting in every endpoint because there is no shared credit primitive. They are writing one-off tests because their test infrastructure does not generalize. The cure is to spend the disproportionate time up front on the bones, then let the surface area compound.

That is what made today possible. The bones got built last week. Today was just hanging features off them.

What is live

Fifteen pay-per-call premium endpoints. Free preview tier on the highest-value recommendation engine. Daily snapshots of pricing, models, benchmarks, status, and agent activity going back weeks. Webhook watches that fire HMAC-signed POSTs on price and status transitions. Full-text news search over our article corpus. Cost projection across any 1-10 models. SDKs in Python and TypeScript, an MCP server for Claude Desktop and Claude Code, and a human dashboard at /account that uses sessionStorage only.

And one verified transaction on Base mainnet that proves all of it works. The full trace is in the mainnet validation post if you want the receipts.

Try it: pip install tensorfeed, buy a dollar of credits via tf.buy_credits(), and call any of the fifteen endpoints. Or drop the MCP server into Claude Desktop and ask it to project your monthly cost across three models. Either way the loop closes end-to-end without a human in the credential path.

That is what fast looks like when the foundation is real.