Harnesses
Free · GET /api/harnesses

The /api/harnesses endpoint returns the full cross-harness coding agent leaderboard: every tracked harness (Claude Code, Cursor Agent, Codex CLI, Aider, OpenHands, Devin, Cline, Windsurf Cascade, Amp, Continue, Roo Code) cross-joined against every base model its vendor has published a score for, on SWE-bench Verified, Terminal-Bench, Aider Polyglot, and SWE-Lancer. Each harness object also carries metadata: vendor, type (cli, ide, agent-platform), open-source status, and model lock-in.
When to use this endpoint
When your agent needs to know which coding harness leads which agentic benchmark, or to surface "the harness gap" (same model, different harness, different score). The response also includes a `rollups` field with each harness's best base-model score per benchmark for quick "who wins SWE-bench" queries.
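For instance, a minimal `rollups`-based "who wins SWE-bench" lookup, in the spirit of the code samples below and assuming the field names shown in the example response:

import urllib.request, json

with urllib.request.urlopen("https://tensorfeed.ai/api/harnesses") as r:
    data = json.loads(r.read())

# Each rollup carries the harness's best base-model score per benchmark.
leaders = sorted(
    (
        (entry["harness"], entry["best"]["swe_bench_verified"])
        for entry in data["rollups"]
        if "swe_bench_verified" in entry["best"]
    ),
    key=lambda pair: pair[1]["score"],
    reverse=True,
)
for harness, best in leaders:
    print(f'{harness}: {best["model"]} at {best["score"]}% resolved')

Reading `rollups` avoids re-scanning `results` when you only need each harness's single best score.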
Example response
{
  "ok": true,
  "lastUpdated": "2026-04-30",
  "benchmarks": [
    { "id": "swe_bench_verified", "name": "SWE-bench Verified", "maxScore": 100, "unit": "% resolved", "sourceUrl": "https://www.swebench.com/" }
  ],
  "harnesses": [
    { "id": "claude-code", "name": "Claude Code", "vendor": "Anthropic", "type": "cli", "openSource": false, "modelLockIn": "Anthropic models only" }
  ],
  "results": [
    { "harness": "claude-code", "model": "Claude Opus 4.7", "scores": { "swe_bench_verified": 74.5, "terminal_bench": 52.3, "aider_polyglot": 84.2, "swe_lancer": 41.8 } }
  ],
  "rollups": [
    { "harness": "claude-code", "best": { "swe_bench_verified": { "model": "Claude Opus 4.7", "score": 74.5 } } }
  ]
}

Code samples
Python SDK
import urllib.request, json

with urllib.request.urlopen("https://tensorfeed.ai/api/harnesses") as r:
    data = json.loads(r.read())

# Rank harness/model rows by SWE-bench Verified score, highest first.
ranked = sorted(
    [row for row in data["results"] if row["scores"].get("swe_bench_verified") is not None],
    key=lambda row: row["scores"]["swe_bench_verified"],
    reverse=True,
)
print(ranked[0]["harness"], ranked[0]["model"], ranked[0]["scores"]["swe_bench_verified"])

TypeScript SDK
const res = await fetch("https://tensorfeed.ai/api/harnesses");
const data = await res.json();

// Rank harness/model rows by SWE-bench Verified score, highest first.
type Row = { harness: string; model: string; scores: Record<string, number | null> };
const ranked = (data.results as Row[])
  .filter((r) => typeof r.scores.swe_bench_verified === "number")
  .sort((a, b) => (b.scores.swe_bench_verified as number) - (a.scores.swe_bench_verified as number));
console.log(ranked[0]);

FAQ
What is a coding harness?
The agent scaffolding around a base LLM: tool-use loop, file-edit primitives, shell sandbox, planning logic, retrieval, and approval gating. The same model can score 5-15 percentage points apart on the same benchmark depending on which harness wraps it.
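That gap is easy to measure from this endpoint. A minimal sketch over the `results` array, assuming the field names shown in the example response (SWE-bench Verified picked for illustration):

import urllib.request, json
from collections import defaultdict

with urllib.request.urlopen("https://tensorfeed.ai/api/harnesses") as r:
    data = json.loads(r.read())

# Collect every harness's score for each base model on one benchmark.
by_model = defaultdict(list)
for row in data["results"]:
    score = row["scores"].get("swe_bench_verified")
    if score is not None:
        by_model[row["model"]].append(score)

# The harness gap only exists where multiple harnesses ran the same model.
for model, scores in by_model.items():
    if len(scores) > 1:
        print(f"{model}: harness gap of {max(scores) - min(scores):.1f} points")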
Are these scores measured by TensorFeed?
No. Each row is the harness vendor's best published score for the named base model on the named benchmark. We aggregate, normalize, and link back to the upstream report. The exception is our LLM Probe data (provider latency and availability), which we measure independently at /api/probe/latest.
How often does the harness data update?
Refreshed on each redeploy rather than on a fixed schedule: vendor publish cadences vary, so a daily cron would not track them, and the editorial cadence works out to roughly weekly. The `lastUpdated` field in the response reflects the most recent refresh.
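If staleness matters to your agent, a minimal client-side check against that `lastUpdated` field (the two-week threshold here is an arbitrary illustration, not a TensorFeed guarantee):

import urllib.request, json
from datetime import date, timedelta

with urllib.request.urlopen("https://tensorfeed.ai/api/harnesses") as r:
    data = json.loads(r.read())

# lastUpdated is an ISO date string, e.g. "2026-04-30".
last_updated = date.fromisoformat(data["lastUpdated"])
if date.today() - last_updated > timedelta(days=14):
    print(f"warning: harness data last updated {last_updated}; treat rankings as stale")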