
Routing Recommendations

1 credit
GET /api/premium/routing

The premium routing endpoint synthesizes live pricing, benchmark scores, and provider status into a single composite score per model and returns the top-N ranked recommendations for your task. Custom weights via query params let you tune the tradeoff (quality-heavy, cost-heavy, etc.) without re-deriving the scoring yourself.

When to use this endpoint

Use it when your agent needs to pick a model for a task and you want the decision to weigh quality, cost, and provider status simultaneously, without writing and continually re-tuning your own scoring code.

Parameters

Name            In     Type     Description
task            query  string   Task family: code, reasoning, creative, or general (default: general)
budget          query  number   Maximum blended USD per 1M tokens
min_quality     query  number   Minimum quality score in [0, 1]
top_n           query  integer  Number of models to return (1-10; default: 5)
w_quality       query  number   Custom quality weight (default: 0.4)
w_cost          query  number   Custom cost weight (default: 0.2)
w_availability  query  number   Custom availability weight (default: 0.3)
w_latency       query  number   Custom latency weight (default: 0.1)

All parameters are optional.
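If you are not using an SDK, the endpoint can be called directly over HTTP. The sketch below builds the request URL from the query parameters in the table above; the base URL and the bearer-token auth header are assumptions, not confirmed by this page.

```python
# Sketch of a raw HTTP call to the routing endpoint (no SDK).
# BASE is a hypothetical base URL; the query parameter names
# come from the parameters table above.
from urllib.parse import urlencode
from urllib.request import Request

BASE = "https://api.tensorfeed.example"  # hypothetical

def build_routing_request(token, task="general", budget=None,
                          min_quality=None, top_n=5):
    # Only include optional filters when the caller sets them.
    params = {"task": task, "top_n": top_n}
    if budget is not None:
        params["budget"] = budget
    if min_quality is not None:
        params["min_quality"] = min_quality
    url = f"{BASE}/api/premium/routing?{urlencode(params)}"
    return Request(url, headers={"Authorization": f"Bearer {token}"})

req = build_routing_request("tf_live_...", task="code", budget=5.0, top_n=3)
print(req.full_url)
```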

Example response

{
  "ok": true,
  "task": "code",
  "weights": { "quality": 0.4, "availability": 0.3, "cost": 0.2, "latency": 0.1 },
  "recommendations": [
    {
      "rank": 1,
      "model": { "name": "Claude Opus 4.7", "provider": "anthropic" },
      "pricing": { "input": 15, "output": 75 },
      "status": "operational",
      "composite_score": 0.87,
      "components": { "quality": 0.94, "availability": 1.0, "cost": 0.65, "latency": 0.5 }
    }
  ],
  "billing": { "credits_charged": 1, "credits_remaining": 49 }
}

Code samples

Python SDK

from tensorfeed import TensorFeed

tf = TensorFeed(token="tf_live_...")
rec = tf.routing(task="code", budget=5.0, top_n=3)
for r in rec["recommendations"]:
    print(f"#{r['rank']}: {r['model']['name']} ({r['composite_score']:.2f})")

TypeScript SDK

import { TensorFeed } from 'tensorfeed';

const tf = new TensorFeed({ token: 'tf_live_...' });
const rec = await tf.routing({ task: 'code', budget: 5.0, topN: 3 });

MCP tool

Available via the TensorFeed MCP server as premium_routing. Add npx -y @tensorfeed/mcp-server to your Claude Desktop or Claude Code MCP config.
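For Claude Desktop, the server entry would look roughly like the following, assuming the standard mcpServers config format; the server name "tensorfeed" is illustrative.

```json
{
  "mcpServers": {
    "tensorfeed": {
      "command": "npx",
      "args": ["-y", "@tensorfeed/mcp-server"]
    }
  }
}
```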

FAQ

How is the composite score computed?

The composite is a weighted sum of four sub-scores in [0, 1]: quality (per-task benchmark blend), availability (provider status), cost (normalized blended price across the candidate set), and latency (a placeholder 0.5 in v1). Default weights: 40% quality, 30% availability, 20% cost, 10% latency. Override via w_quality, w_availability, w_cost, w_latency.
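As an illustrative sketch (not the service's actual implementation), the weighted sum looks like this, using the default weights and the component values from the example response above. Note that these inputs give roughly 0.856, slightly below the 0.87 shown in the example, so the service presumably applies some rounding or adjustment not documented here.

```python
# Illustrative composite score: weighted sum of four sub-scores
# in [0, 1], using the documented default weights.
DEFAULT_WEIGHTS = {"quality": 0.4, "availability": 0.3, "cost": 0.2, "latency": 0.1}

def composite_score(components, weights=DEFAULT_WEIGHTS):
    return sum(weights[k] * components[k] for k in weights)

# Components from the example response above.
score = composite_score(
    {"quality": 0.94, "availability": 1.0, "cost": 0.65, "latency": 0.5}
)
```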

How does the quality score change by task?

The code task blends HumanEval (40%), SWE-bench (40%), and MMLU-Pro (20%). Reasoning blends GPQA-Diamond (40%), MATH (40%), and MMLU-Pro (20%). Creative blends MMLU-Pro (50%) with HumanEval and MATH (25% each). General is a balanced blend across all five benchmarks.
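The per-task blends above can be written out as weight tables. This is a sketch of how a client might reproduce the quality sub-score locally; the equal 20% split for "general" is an assumption based on "balanced blend across all five".

```python
# Per-task benchmark blends from the FAQ above. The equal 20%
# weights for "general" are an assumption (the docs say only
# "a balanced blend across all five").
TASK_BLENDS = {
    "code":      {"HumanEval": 0.40, "SWE-bench": 0.40, "MMLU-Pro": 0.20},
    "reasoning": {"GPQA-Diamond": 0.40, "MATH": 0.40, "MMLU-Pro": 0.20},
    "creative":  {"MMLU-Pro": 0.50, "HumanEval": 0.25, "MATH": 0.25},
    "general":   {"HumanEval": 0.20, "SWE-bench": 0.20, "MMLU-Pro": 0.20,
                  "GPQA-Diamond": 0.20, "MATH": 0.20},
}

def quality_score(task, bench_scores):
    """Blend normalized benchmark scores (each in [0, 1]) for a task family."""
    blend = TASK_BLENDS[task]
    return sum(w * bench_scores[b] for b, w in blend.items())
```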

What is the difference between /api/premium/routing and /api/preview/routing?

Preview is free but returns only the top-1 result with no score breakdown, and is rate-limited to 5 calls per UTC day per IP. Premium returns the top-N with full component breakdowns, has no rate limit, and costs 1 credit per call.

Related endpoints