API cost monitoring
Project workload cost across providers, catch price drops as they happen, audit per-call spend. The endpoints FinOps and platform teams use to keep AI API spend predictable.
Three jobs of cost monitoring
Most cost-monitoring work falls into three buckets: project future cost, react to price changes, audit historical spend. TensorFeed has a dedicated endpoint for each.
Job 1: Project future cost across providers
The projection itself is pure math, but agents pay 1 credit for the canonical abstraction so they do not have to maintain pricing tables of their own:
```python
from tensorfeed import TensorFeed

tf = TensorFeed(token="tf_live_...")

projection = tf.cost_projection(
    models=["Claude Opus 4.7", "GPT-5.5", "DeepSeek V4 Pro"],
    input_tokens_per_day=5_000_000,  # busy production agent fleet
    output_tokens_per_day=1_500_000,
    horizon="monthly",
)

# Cheapest at the top
for r in projection["ranked_cheapest_monthly"]:
    print(f"{r['model']}: ${r['monthly_total']}/mo")
```

Use the same call to validate budget changes: re-run with new token volume estimates before signing off on a quarterly budget. The response includes daily/weekly/monthly/yearly breakdowns, so finance teams can plug the numbers into their existing reporting cadence.
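Turning a projection into a sign-off check is a few lines on top of the response shape shown above. A minimal sketch: the `ranked_cheapest_monthly` field comes from the example, while the helper name and the budget-threshold logic are our own additions, not part of the TensorFeed API:

```python
def models_within_budget(projection: dict, monthly_budget: float) -> list[str]:
    """Return the models whose projected monthly total fits under the budget.

    Assumes the ranked_cheapest_monthly shape shown in the example above.
    """
    return [
        r["model"]
        for r in projection["ranked_cheapest_monthly"]
        if r["monthly_total"] <= monthly_budget
    ]

# Stubbed projection response, for illustration only:
projection = {
    "ranked_cheapest_monthly": [
        {"model": "DeepSeek V4 Pro", "monthly_total": 1800.0},
        {"model": "GPT-5.5", "monthly_total": 5200.0},
        {"model": "Claude Opus 4.7", "monthly_total": 9400.0},
    ]
}
print(models_within_budget(projection, monthly_budget=6000))
# ['DeepSeek V4 Pro', 'GPT-5.5']
```

Re-running the real `tf.cost_projection` call with next quarter's token estimates and feeding the result through a check like this gives a yes/no answer before the budget is signed.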
Job 2: React to price changes
The AI pricing market moves fast. New models drop every few weeks; existing models get repriced; budget tiers compete on input cost. Three reactive patterns:
Realtime price watch:
```python
tf.create_watch(
    spec={
        "type": "price",
        "model": "GPT-5.5",
        "field": "blended",
        "op": "lt",
        "threshold": 20,  # blended $/1M tokens
    },
    callback_url="https://your-finops.example.com/webhooks/price",
    secret="any-shared-secret",
)
```

Daily/weekly digest:
```python
tf.create_digest_watch(
    cadence="weekly",
    callback_url="https://your-finops.example.com/webhooks/weekly",
)
# Fires every 7 days with a curated summary of pricing changes and
# new/removed models, regardless of whether anything dramatic happened.
```

Forecast:
```python
forecast = tf.forecast(
    target="price",
    model="Claude Opus 4.7",
    field="blended",
    lookback=60,
    horizon=14,
)
# forecast["confidence"]["label"] - low / medium / high
# forecast["forecast"] - day-by-day predictions with a 95% CI
```

The forecast is a conservative linear-regression projection with explicit "statistical inference, not a guarantee" disclaimers. Treat a low-confidence forecast as no signal rather than a directional call.
Job 3: Audit historical spend
For every TensorFeed bearer token your agents use, the per-token usage history is one free call away:
```python
usage = tf.usage()
# usage["total_calls"]
# usage["total_credits_spent"]
# usage["by_endpoint"] - per-endpoint count and credits
# usage["recent"] - last 25 calls with timestamps
```

The same data renders in the human-facing /account dashboard if you prefer a UI. Token-scoped: no audit dance, no extra access requests.
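For audit reports, the `by_endpoint` breakdown can be ranked to show where credits actually go. A sketch under the assumption that `by_endpoint` maps endpoint path to a dict with `count` and `credits` keys (the exact per-endpoint shape is not spelled out above):

```python
def top_spenders(usage: dict, n: int = 3) -> list[tuple[str, int]]:
    """Endpoints ranked by credits spent, highest first.

    Assumes by_endpoint maps path -> {"count": ..., "credits": ...}.
    """
    ranked = sorted(
        usage["by_endpoint"].items(),
        key=lambda kv: kv[1]["credits"],
        reverse=True,
    )
    return [(path, stats["credits"]) for path, stats in ranked[:n]]

# Stubbed usage response, for illustration:
stub_usage = {
    "by_endpoint": {
        "/api/premium/forecast": {"count": 12, "credits": 36},
        "/api/premium/watches": {"count": 4, "credits": 8},
        "/api/payment/usage": {"count": 30, "credits": 0},
    }
}
print(top_spenders(stub_usage, n=2))
# [('/api/premium/forecast', 36), ('/api/premium/watches', 8)]
```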
Job 4 (optional): Compare cost-effectiveness
When picking between two models for a workload, compare them side by side:
```python
tf.compare_models(ids=[
    "Claude Sonnet 4.6",
    "GPT-4o",
    "DeepSeek V4 Flash",
    "Gemini 2.0 Flash",
])
# Returns rankings.cheapest_blended, plus benchmarks side by side
```

For the daily TensorFeed Slack channel
A common pattern: a digest watch fires every morning to a Slack webhook. The digest includes any overnight pricing changes and any newly launched models. The channel gets one daily message at 7am UTC that takes the team ten seconds to read, and no human has to remember to check the pricing page.
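The relay itself is small: receive the digest payload, format it, post it to a Slack incoming webhook. A sketch with stdlib only; the digest field names (`price_changes`, `new_models`) are illustrative assumptions, not the documented schema:

```python
import json
import urllib.request


def format_digest(digest: dict) -> str:
    """Turn an assumed digest payload into a one-glance Slack message."""
    lines = ["*TensorFeed daily digest*"]
    for change in digest.get("price_changes", []):
        lines.append(f"- {change['model']}: {change['old']} -> {change['new']} $/1M blended")
    for model in digest.get("new_models", []):
        lines.append(f"- new model: {model}")
    if len(lines) == 1:
        lines.append("- no changes overnight")
    return "\n".join(lines)


def post_to_slack(webhook_url: str, text: str) -> None:
    """POST the message to a Slack incoming webhook."""
    req = urllib.request.Request(
        webhook_url,
        data=json.dumps({"text": text}).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)


msg = format_digest({
    "price_changes": [{"model": "GPT-5.5", "old": 22, "new": 18}],
    "new_models": ["DeepSeek V4 Flash"],
})
print(msg)
```

Wire `post_to_slack` into whatever endpoint you registered as the digest watch's `callback_url`, and the 7am message writes itself.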
Recommended TensorFeed endpoints (in priority order)
- /api/premium/cost/projection — workload cost across 1-10 models, 4 horizons
- /api/premium/watches — realtime price-drop notifications
- /api/premium/forecast — linear-regression price forecasting with CI
- /api/payment/usage — per-token audit log (free with bearer token)
- /account — human-facing dashboard for the same data