
Routing Recommendations

1 credit
GET /api/premium/routing

The premium routing endpoint synthesizes live pricing, benchmark scores, and provider status into a single composite score per model and returns the top-N ranked recommendations for your task. Custom weights via query params let you tune the tradeoff (quality-heavy, cost-heavy, etc.) without re-deriving the scoring yourself.

When to use this endpoint

Use it when your agent needs to pick a model for a task and you want the decision to weigh quality, cost, and provider status simultaneously, without writing and continually re-tuning your own scoring code.

Parameters

Name            In     Type     Description
task            query  string   Task family: code, reasoning, creative, or general (default: general)
budget          query  number   Maximum blended USD per 1M tokens
min_quality     query  number   Minimum quality score in [0, 1]
top_n           query  integer  Number of models to return (1-10; default: 5)
w_quality       query  number   Custom quality weight (default: 0.4)
w_cost          query  number   Custom cost weight (default: 0.2)
w_availability  query  number   Custom availability weight (default: 0.3)
w_latency       query  number   Custom latency weight (default: 0.1)

All parameters are optional.
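If you are not using an SDK, the endpoint can be called directly over HTTP. The sketch below builds the request URL from the query parameters in the table above; the base URL and the bearer-token auth header are assumptions, not confirmed by this page.

```python
# Sketch of a raw HTTP call to the routing endpoint (no SDK).
# BASE is a hypothetical base URL; the query parameter names
# come from the parameters table above.
from urllib.parse import urlencode
from urllib.request import Request

BASE = "https://api.tensorfeed.example"  # hypothetical

def build_routing_request(token, task="general", budget=None,
                          min_quality=None, top_n=5):
    # Only include optional filters when the caller sets them.
    params = {"task": task, "top_n": top_n}
    if budget is not None:
        params["budget"] = budget
    if min_quality is not None:
        params["min_quality"] = min_quality
    url = f"{BASE}/api/premium/routing?{urlencode(params)}"
    return Request(url, headers={"Authorization": f"Bearer {token}"})

req = build_routing_request("tf_live_...", task="code", budget=5.0, top_n=3)
print(req.full_url)
```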

Example response

{
  "ok": true,
  "task": "code",
  "weights": { "quality": 0.4, "availability": 0.3, "cost": 0.2, "latency": 0.1 },
  "recommendations": [
    {
      "rank": 1,
      "model": { "name": "Claude Opus 4.7", "provider": "anthropic" },
      "pricing": { "input": 15, "output": 75 },
      "status": "operational",
      "composite_score": 0.87,
      "components": { "quality": 0.94, "availability": 1.0, "cost": 0.65, "latency": 0.5 }
    }
  ],
  "billing": { "credits_charged": 1, "credits_remaining": 49 }
}

Code samples

Python SDK

from tensorfeed import TensorFeed

tf = TensorFeed(token="tf_live_...")
rec = tf.routing(task="code", budget=5.0, top_n=3)
for r in rec["recommendations"]:
    print(f"#{r['rank']}: {r['model']['name']} ({r['composite_score']:.2f})")

TypeScript SDK

import { TensorFeed } from 'tensorfeed';

const tf = new TensorFeed({ token: 'tf_live_...' });
const rec = await tf.routing({ task: 'code', budget: 5.0, topN: 3 });

MCP tool

Available via the TensorFeed MCP server as premium_routing. Add npx -y @tensorfeed/mcp-server to your Claude Desktop or Claude Code MCP config.
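For Claude Desktop, the server entry would look roughly like the following, assuming the standard mcpServers config format; the server name "tensorfeed" is illustrative.

```json
{
  "mcpServers": {
    "tensorfeed": {
      "command": "npx",
      "args": ["-y", "@tensorfeed/mcp-server"]
    }
  }
}
```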

FAQ

How is the composite score computed?

The composite is a weighted sum of four sub-scores in [0, 1]: quality (per-task benchmark blend), availability (provider status), cost (normalized blended price across the candidate set), and latency (a placeholder 0.5 in v1). Default weights: 40% quality, 30% availability, 20% cost, 10% latency. Override via w_quality, w_availability, w_cost, w_latency.
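As an illustrative sketch (not the service's actual implementation), the weighted sum looks like this, using the default weights and the component values from the example response above. Note that these inputs give roughly 0.856, slightly below the 0.87 shown in the example, so the service presumably applies some rounding or adjustment not documented here.

```python
# Illustrative composite score: weighted sum of four sub-scores
# in [0, 1], using the documented default weights.
DEFAULT_WEIGHTS = {"quality": 0.4, "availability": 0.3, "cost": 0.2, "latency": 0.1}

def composite_score(components, weights=DEFAULT_WEIGHTS):
    return sum(weights[k] * components[k] for k in weights)

# Components from the example response above.
score = composite_score(
    {"quality": 0.94, "availability": 1.0, "cost": 0.65, "latency": 0.5}
)
```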

How does the quality score change by task?

The code task blends HumanEval (40%), SWE-bench (40%), and MMLU-Pro (20%). Reasoning blends GPQA-Diamond (40%), MATH (40%), and MMLU-Pro (20%). Creative blends MMLU-Pro (50%) with HumanEval and MATH (25% each). General is a balanced blend across all five benchmarks.
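The per-task blends above can be written out as weight tables. This is a sketch of how a client might reproduce the quality sub-score locally; the equal 20% split for "general" is an assumption based on "balanced blend across all five".

```python
# Per-task benchmark blends from the FAQ above. The equal 20%
# weights for "general" are an assumption (the docs say only
# "a balanced blend across all five").
TASK_BLENDS = {
    "code":      {"HumanEval": 0.40, "SWE-bench": 0.40, "MMLU-Pro": 0.20},
    "reasoning": {"GPQA-Diamond": 0.40, "MATH": 0.40, "MMLU-Pro": 0.20},
    "creative":  {"MMLU-Pro": 0.50, "HumanEval": 0.25, "MATH": 0.25},
    "general":   {"HumanEval": 0.20, "SWE-bench": 0.20, "MMLU-Pro": 0.20,
                  "GPQA-Diamond": 0.20, "MATH": 0.20},
}

def quality_score(task, bench_scores):
    """Blend normalized benchmark scores (each in [0, 1]) for a task family."""
    blend = TASK_BLENDS[task]
    return sum(w * bench_scores[b] for b, w in blend.items())
```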

What is the difference between /api/premium/routing and /api/preview/routing?

Preview is free but returns only the top-1 result with no score breakdown, and is rate-limited to 5 calls per UTC day per IP. Premium returns the top-N with full component breakdowns, has no rate limit, and costs 1 credit per call.

Related endpoints