Routing Recommendations
1 creditGET /api/premium/routingThe premium routing endpoint synthesizes live pricing, benchmark scores, and provider status into a single composite score per model and returns the top-N ranked recommendations for your task. Custom weights via query params let you tune the tradeoff (quality-heavy, cost-heavy, etc.) without re-deriving the scoring yourself.
When to use this endpoint
When your agent needs to pick a model for a task and wants the decision to consider quality, cost, and provider status simultaneously. Faster and more accurate than maintaining your own scoring code.
Parameters
| Name | In | Type | Description |
|---|---|---|---|
| task | query | string | Task family: code, reasoning, creative, general (default general) |
| budget | query | number | Max blended USD per 1M tokens |
| min_quality | query | number | Minimum quality score in [0, 1] |
| top_n | query | integer | How many models to return (1-10, default 5) |
| w_quality | query | number | Custom quality weight (defaults to 0.4) |
| w_cost | query | number | Custom cost weight (defaults to 0.2) |
* required
Example response
{
"ok": true,
"task": "code",
"weights": { "quality": 0.4, "availability": 0.3, "cost": 0.2, "latency": 0.1 },
"recommendations": [
{
"rank": 1,
"model": { "name": "Claude Opus 4.7", "provider": "anthropic" },
"pricing": { "input": 15, "output": 75 },
"status": "operational",
"composite_score": 0.87,
"components": { "quality": 0.94, "availability": 1.0, "cost": 0.65, "latency": 0.5 }
}
],
"billing": { "credits_charged": 1, "credits_remaining": 49 }
}Code samples
Python SDK
from tensorfeed import TensorFeed
tf = TensorFeed(token="tf_live_...")
rec = tf.routing(task="code", budget=5.0, top_n=3)
for r in rec["recommendations"]:
print(f"#{r['rank']}: {r['model']['name']} ({r['composite_score']:.2f})")TypeScript SDK
import { TensorFeed } from 'tensorfeed';
const tf = new TensorFeed({ token: 'tf_live_...' });
const rec = await tf.routing({ task: 'code', budget: 5.0, topN: 3 });MCP tool
Available via the TensorFeed MCP server as premium_routing. Add npx -y @tensorfeed/mcp-server to your Claude Desktop or Claude Code MCP config.
FAQ
How is the composite score computed?
The composite is a weighted sum of four sub-scores in [0, 1]: quality (per-task benchmark blend), availability (provider status), cost (normalized blended price across the candidate set), latency (placeholder 0.5 in v1). Default weights: 40% quality, 30% availability, 20% cost, 10% latency. Override via w_quality, w_availability, w_cost, w_latency.
How does the quality score change by task?
Code task weights HumanEval (40%) and SWE-bench (40%) plus MMLU-Pro (20%). Reasoning weights GPQA-Diamond (40%) and MATH (40%) plus MMLU-Pro (20%). Creative weights MMLU-Pro (50%) plus HumanEval and MATH (25% each). General is a balanced blend across all five.
What is the difference between /api/premium/routing and /api/preview/routing?
Preview is free, returns the top-1 result with no score breakdown, and is rate-limited to 5 calls per UTC day per IP. Premium returns top-N with full component breakdown, no rate limit, 1 credit per call.