Coding agents

Pick the right model for code work, stay current on SWE-bench leaders, catch price drops, and integrate everything in Claude Code or Claude Desktop. Specific TensorFeed endpoints for the jobs a coding agent actually does.

The four jobs of a coding agent

When a coding agent boots up to handle a task, it does some subset of four things: pick a model, fetch project context, generate code, and verify the output. TensorFeed sits in the "pick a model" step; the other three are your application logic. Jobs 1-4 below break the model-selection work into its parts, and Job 5 is an optional integration path.
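Put together, the loop might look like the sketch below. Only tf.routing is a TensorFeed call (documented under Job 1); fetch_context, generate, and verify are hypothetical stubs standing in for your own application logic.

from tensorfeed import TensorFeed

tf = TensorFeed(token="tf_live_...")

def fetch_context(task: str) -> str:
    ...  # your retrieval logic: repo files, docs, issue text

def generate(model: str, task: str, context: str) -> str:
    ...  # your call to the chosen model's API

def verify(code: str) -> bool:
    ...  # your tests, linters, or sandbox run

def handle_task(task: str) -> str:
    # Step 1: pick a model (the only step TensorFeed covers).
    rec = tf.routing(task="code", budget=5.0, top_n=1)
    model = rec["recommendations"][0]["model"]["name"]
    # Steps 2-4: your application logic.
    context = fetch_context(task)
    code = generate(model, task, context)
    if not verify(code):
        raise RuntimeError("generated code failed verification")
    return code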

Job 1: Pick the right model for the task

Use the premium routing endpoint with task=code:

from tensorfeed import TensorFeed

tf = TensorFeed(token="tf_live_...")

# Pick the best model for a code task under $5/1M tokens
rec = tf.routing(task="code", budget=5.0, top_n=3)
for r in rec["recommendations"]:
    print(f"#{r['rank']}: {r['model']['name']} (score: {r['composite_score']:.2f})")
    print(f"  Quality {r['components']['quality']:.2f}, Cost {r['components']['cost']:.2f}")

The composite score weights quality (SWE-bench + HumanEval scores from live benchmarks), availability (current provider status), cost (normalized blended $/1M across the candidate set), and latency. Tweak the weights via w_quality=, w_cost=, etc. for your specific tradeoff. Costs 1 credit per call.
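For example, a high-volume agent that cares more about cost than peak quality might bump w_cost. Only w_quality= and w_cost= are named above; any other weight names would be assumptions, so this sketch sticks to those two.

# Favor cost over raw quality for a high-volume agent.
rec = tf.routing(
    task="code",
    budget=5.0,
    top_n=3,
    w_quality=0.3,
    w_cost=0.7,
)
print(rec["recommendations"][0]["model"]["name"])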

Job 2: Stay current on the SWE-bench leaderboard

New frontier models ship every 2-4 weeks now, and last month's SWE-bench leader may not be this month's. Two paths to stay current, both sketched in code below:

  • Free: /benchmarks/swe_bench for the public leaderboard, refreshed weekly.
  • Paid: tf.benchmark_series(model="Claude Opus 4.7", benchmark="swe_bench", lookback=90) for the daily score evolution over the last 90 days. 1 credit per call.
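A sketch of both paths. The free path assumes the leaderboard is a plain HTTPS GET against the API host; the /benchmarks/swe_bench path is documented above, but the base URL below is a placeholder, not a documented host.

import requests
from tensorfeed import TensorFeed

# Free path: public SWE-bench leaderboard, refreshed weekly.
# Replace the host with TensorFeed's actual API base URL.
leaderboard = requests.get(
    "https://api.tensorfeed.example/benchmarks/swe_bench"
).json()

# Paid path (1 credit): daily score evolution over the last 90 days.
tf = TensorFeed(token="tf_live_...")
series = tf.benchmark_series(
    model="Claude Opus 4.7",
    benchmark="swe_bench",
    lookback=90,
)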

Job 3: Catch price drops as they happen

Coding workloads burn through tokens fast, so pricing matters. Register a price watch on the model you currently use:

tf.create_watch(
    spec={
        "type": "price",
        "model": "Claude Opus 4.7",
        "field": "blended",
        "op": "lt",
        "threshold": 30,  # cents per blended 1M tokens
    },
    callback_url="https://your-agent.example.com/webhooks/price",
    secret="any-shared-secret",
)

Or use a digest watch for a daily/weekly summary of pricing changes regardless of whether anything dramatic happened: tf.create_digest_watch(cadence="daily", callback_url=...).
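Either watch delivers by POSTing to your callback_url. Neither the payload shape nor the signing scheme is documented on this page, so the receiver below is a minimal sketch that just logs whatever arrives; check the delivered headers against your shared secret before trusting events.

from flask import Flask, request

app = Flask(__name__)

@app.post("/webhooks/price")
def price_watch():
    # Payload shape is an assumption: log it and inspect before
    # wiring up real logic, and verify the shared secret however
    # TensorFeed signs its deliveries.
    event = request.get_json(force=True)
    print("watch fired:", event)
    return "", 204

if __name__ == "__main__":
    app.run(port=8080)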

Job 4: Cost projection for a workload

Before committing to a model, project the cost across a few options:

tf.cost_projection(
    models=["Claude Opus 4.7", "GPT-5.5", "DeepSeek V4 Pro"],
    input_tokens_per_day=2_000_000,   # ~2M input/day for a busy coding agent
    output_tokens_per_day=500_000,
)

Returns daily / weekly / monthly / yearly projections per model with a cheapest-monthly ranking. Useful for picking a budget tier or for monthly board slides.
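As a sanity check on whatever comes back, one model's projection is easy to reproduce by hand. At Opus 4.7's $15 input / $75 output per 1M tokens, the workload above works out to:

# Hand-computed daily cost for Claude Opus 4.7 at $15/$75 per 1M tokens.
input_cost = 2_000_000 / 1_000_000 * 15    # $30.00/day on input
output_cost = 500_000 / 1_000_000 * 75     # $37.50/day on output
daily = input_cost + output_cost           # $67.50/day
print(f"~${daily:.2f}/day, ~${daily * 30:,.2f}/month")   # ~$2,025/month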

Job 5 (optional): Run inside Claude Code

If your coding agent runs inside Claude Code, you can call all of the above through the MCP server directly. Add this to your Claude Code config:

{
  "mcpServers": {
    "tensorfeed": {
      "command": "npx",
      "args": ["-y", "@tensorfeed/mcp-server"],
      "env": { "TENSORFEED_TOKEN": "tf_live_..." }
    }
  }
}

Then in a Claude Code session, ask: "Recommend the best AI model for code under $5 per million tokens, then project the monthly cost at 2M input tokens per day." The MCP tools fire under the hood; you get the answer in chat.

Recommended TensorFeed endpoints (in priority order)