TensorFeed Intelligence Index
The TensorFeed Intelligence Index (TFII) is a versioned composite score from 0 to 100 that summarizes how capable a model is across public benchmarks. It is the same quality signal that powers our model comparisons and the Route Verdict routing decision. Free headline scores are listed on /models.
How we compute it
- Inputs: MMLU-Pro (knowledge), HumanEval and SWE-bench (code), GPQA-Diamond (science reasoning), MATH (math).
- Each raw score is discounted for contamination risk and benchmark saturation, so a saturated or gameable benchmark counts for less.
- The discounted scores are combined by category weight into the headline composite. Per-task subscores (code, reasoning, creative, general) are available in the premium breakdown.
- Models scored on too few benchmarks are flagged low coverage and held out of the ranking.
- The exact weights and discount multipliers are proprietary; the index is an editorial derivation over public inputs.
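The steps above can be sketched in a few lines of Python. This is only an illustration of the shape of the calculation, not the production formula: every number in DISCOUNT and WEIGHT, the MIN_BENCHMARKS threshold, the per-category averaging, and the renormalization over present categories are assumptions made for the example, since the real weights and multipliers are proprietary.

```python
from dataclasses import dataclass
from typing import Dict, List

# Benchmark -> category, following the inputs listed above.
CATEGORY_OF = {
    "MMLU-Pro": "knowledge",
    "HumanEval": "code",
    "SWE-bench": "code",
    "GPQA-Diamond": "science_reasoning",
    "MATH": "math",
}

# HYPOTHETICAL values: the real discount multipliers and weights are not public.
DISCOUNT = {"MMLU-Pro": 0.9, "HumanEval": 0.7, "SWE-bench": 1.0,
            "GPQA-Diamond": 0.95, "MATH": 0.8}
WEIGHT = {"knowledge": 0.25, "code": 0.35, "science_reasoning": 0.2, "math": 0.2}
MIN_BENCHMARKS = 3  # hypothetical low-coverage threshold


@dataclass
class IndexResult:
    score: float        # composite on a 0-100 scale
    low_coverage: bool  # True means the model would be held out of the ranking


def composite_score(raw: Dict[str, float]) -> IndexResult:
    """Discount each raw score (0-100), then combine by category weight."""
    by_category: Dict[str, List[float]] = {}
    for bench, value in raw.items():
        discounted = value * DISCOUNT[bench]
        by_category.setdefault(CATEGORY_OF[bench], []).append(discounted)

    # Assumption: average within a category, then renormalize the weights
    # over the categories that actually have data.
    total_weight = sum(WEIGHT[c] for c in by_category)
    weighted = sum(WEIGHT[c] * sum(v) / len(v) for c, v in by_category.items())
    score = weighted / total_weight if total_weight else 0.0
    return IndexResult(score=score, low_coverage=len(raw) < MIN_BENCHMARKS)
```

Under these made-up numbers, a model scored only on MATH gets a value but is flagged low_coverage, which mirrors the hold-out rule described above.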
For agents
The free ranked table is at /api/intelligence. The signed per-benchmark breakdown is at /api/premium/model-intelligence (with an optional ?model= query parameter), and the historical series is at /api/premium/model-intelligence/history?model=. Premium responses carry an AFTA-signed receipt and are not charged when the data is stale.
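As a rough sketch, an agent could assemble the three endpoint URLs as follows. The paths and the model query parameter come from the text above; BASE_URL is a placeholder host and the helper names are hypothetical. Request headers, payment handling, and the response schema are not covered here because they are not described on this page.

```python
from typing import Optional
from urllib.parse import urlencode, urljoin

# Placeholder origin (an assumption): substitute the real site origin.
BASE_URL = "https://example.invalid"


def endpoint(path: str, **params: Optional[str]) -> str:
    """Join a path onto BASE_URL, appending only the parameters that are set."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = urljoin(BASE_URL, path)
    return f"{url}?{query}" if query else url


def free_table_url() -> str:
    # Free ranked table.
    return endpoint("/api/intelligence")


def breakdown_url(model: Optional[str] = None) -> str:
    # Premium signed per-benchmark breakdown; the model filter is optional.
    return endpoint("/api/premium/model-intelligence", model=model)


def history_url(model: str) -> str:
    # Premium historical series for one model.
    return endpoint("/api/premium/model-intelligence/history", model=model)
```

Keeping URL construction separate from the HTTP call makes it easy to test the query encoding without touching the network.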
FAQ
What is the TensorFeed Intelligence Index?
TFII is a composite score from 0 to 100 that summarizes a model's capability across public benchmarks (MMLU-Pro, HumanEval, GPQA-Diamond, MATH, SWE-bench). Raw scores are discounted for contamination risk and benchmark saturation, then combined by category weight.
Is the methodology public?
The inputs, categories, the discount approach, and the version are public. The exact category weights and discount multipliers are proprietary. The score is an editorial derivation over public inputs, not a guarantee.
How fresh is it?
The index is recomputed daily. The free headline is at /api/intelligence; the signed per-benchmark breakdown and the historical series are premium, under /api/premium/model-intelligence.