Inference Cheapest
Free · GET /api/inference-providers/cheapest
Agent-friendly entry point into the inference provider matrix. Pass a canonical model id and get back the top 3 cheapest offers (default sort is blended price). This skips the full matrix payload, which is useful when an agent is just picking the cheapest path and does not need every column.
When to use this endpoint
Use this endpoint when your agent needs the cheapest inference path for a specific open-weight model in a single call. For a different sort order, pass ?sort=input, ?sort=output, or ?sort=tps_desc.
Parameters
| Name | In | Type | Description |
|---|---|---|---|
| model* | query | string | Canonical model id (llama-4-maverick, llama-4-scout, llama-3.1-70b, llama-3.1-405b, deepseek-v4-pro, deepseek-v4-flash, mixtral-8x22b, qwen-2.5-72b). Example: llama-4-scout |
| sort | query | string | Sort order: blended (default), input, output, tps_desc. Example: tps_desc |
* required
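As a sketch of how the two parameters combine into a request URL, assuming the base URL shown in the TypeScript code sample below (the helper name is illustrative, not part of the SDK):

```python
from urllib.parse import urlencode

BASE = "https://tensorfeed.ai/api/inference-providers/cheapest"

def cheapest_url(model: str, sort: str = "blended") -> str:
    """Build the request URL. `model` is required; `sort` defaults to blended."""
    params = {"model": model}
    if sort != "blended":
        params["sort"] = sort  # omit the default to keep the URL short
    return f"{BASE}?{urlencode(params)}"

print(cheapest_url("llama-4-scout", "tps_desc"))
```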
Example response
```json
{
  "ok": true,
  "modelId": "llama-4-scout",
  "modelName": "Llama 4 Scout",
  "family": "Meta",
  "sortBy": "blended",
  "cheapest": { "provider": "DeepInfra", "blendedPrice": 0.355, "inputPrice": 0.16, "outputPrice": 0.55, "contextWindow": 10000000, "outputTPS": 170 },
  "top3": [
    { "provider": "DeepInfra", "blendedPrice": 0.355 },
    { "provider": "OpenRouter", "blendedPrice": 0.385 },
    { "provider": "Groq", "blendedPrice": 0.385 }
  ]
}
```

Code samples
Python SDK

```python
from tensorfeed import TensorFeed

tf = TensorFeed()
result = tf.inference_cheapest("llama-4-scout")
print(f"Cheapest: {result['cheapest']['provider']} at ${result['cheapest']['blendedPrice']:.3f}/1M blended")
```

TypeScript SDK
```typescript
const res = await fetch("https://tensorfeed.ai/api/inference-providers/cheapest?model=llama-4-scout");
const result = await res.json();
console.log(`Cheapest: ${result.cheapest.provider} @ $${result.cheapest.blendedPrice}`);
```

FAQ
What if my model is not in the matrix?
The endpoint returns 404 model_not_found. List available models at /api/inference-providers. We track the most-served open-weight models; the catalog is editorial, and we add new models on demand if you need one we do not cover.
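A minimal way to handle the 404 path in client code. This is a sketch: it assumes the 404 body carries the code in an `error` field, which the docs above do not specify, and the helper name is illustrative.

```python
import json

def pick_cheapest(status_code: int, body: str) -> dict:
    """Return the `cheapest` offer, or raise on an unknown model.

    Assumes a 404 body like {"ok": false, "error": "model_not_found"};
    the `error` field name is an assumption, not confirmed by the docs.
    """
    payload = json.loads(body)
    if status_code == 404:
        raise LookupError(payload.get("error", "model_not_found"))
    return payload["cheapest"]
```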
Why is the default sort blended and not input?
Because real workloads have non-zero output usage. Blended price at a 1:1 input:output ratio is a better proxy for actual cost than input price alone. If your workload is heavy-input or heavy-output, pass ?sort=input or ?sort=output.
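The 1:1 blend can be checked against the example response above. A sketch of the arithmetic; the generalized ratio parameter is an illustrative extension, not an endpoint feature:

```python
def blended_price(input_price: float, output_price: float, output_ratio: float = 1.0) -> float:
    """Blended $/1M tokens, assuming `output_ratio` output tokens per input token.

    At the default 1:1 ratio this matches the endpoint's blendedPrice.
    """
    return (input_price + output_price * output_ratio) / (1 + output_ratio)

# Llama 4 Scout on DeepInfra, from the example response above:
print(f"{blended_price(0.16, 0.55):.3f}")  # prints 0.355
```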