CVE Watch: Security Data for AI Agents
The security data stack is fragmented across MITRE, CISA, FIRST.org, Google, GitHub, and a dozen smaller registries, each with its own schema and refresh cadence. TensorFeed aggregates them into one agent-callable layer with LLM-ready transforms, cross-database corroboration, and republish-friendly licensing throughout.
AI-driven vulnerability discovery has gone from theoretical to load-bearing. Anthropic Claude Mythos surfaced 271 Firefox zero-days in one cycle. A third major Linux kernel flaw in two weeks was attributed to AI-assisted research. The agents finding and triaging these vulns need a clean security data layer to call from. That layer is what we provide.
2026 incidents the data layer touched
Curated registry of recently disclosed incidents with structural relevance to the agent security data layer. Entries are added when an authoritative source confirms them; each entry preserves source URLs.
Typosquat / dependency-confusion worm targeting the @mistralai namespace on npm. Surfaced via GitHub Security Advisories shortly after public package publication. Part of the broader Shai-style npm worm wave expanding into AI-tagged packages.
Third major Linux kernel flaw in two weeks, with public reporting attributing the discovery cadence to AI-assisted vulnerability research. The cluster reset community assumptions about how fast a single discovery pipeline can move.
Anthropic Claude Mythos surfaced 271 Firefox zero-days in a single autonomous discovery cycle. Sets a new floor for how productive an AI cyber tier can be when given the right harness. Companion to OpenAI Daybreak, which targets workflow integration over autonomous discovery.
OpenAI launched the Daybreak cyber stack: GPT-5.5, GPT-5.5 with Trusted Access for Cyber, and GPT-5.5-Cyber, plus the Codex Security harness and 20-plus security partners (Cisco, Palo Alto Networks, CrowdStrike, Cloudflare, Trail of Bits, SpecterOps). OpenAI's explicit answer to Claude Mythos. Structural event for the AI-cyber data layer.
Self-propagating npm worm pattern expanding into AI-tagged packages beyond the original target set. Triggered the cadence bump on TensorFeed's AI supply-chain IOC feed from daily to every 6 hours on 2026-05-13.
Data sources aggregated
Each source is fetched on its own cadence, transformed into a consistent shape, and exposed through the endpoints below. Attribution blocks ship on every response. Commercial redistribution is permitted on every source we use.
The canonical CVE identifier registry. TensorFeed walks the cvelistV5 GitHub commit history daily at 04:30 UTC, harvesting CVE IDs added or modified in the last 36h. Single-CVE record fetches are lazy from MITRE. License: MITRE CVE Terms of Use, commercial redistribution permitted.
The catalog of CVEs that CISA confirms are actively exploited in the wild. Refreshed daily at 06:30 UTC. 1,500+ entries. License: US Government public domain (17 USC 105), commercial redistribution explicitly permitted.
EPSS gives a daily-updated probability that a given CVE will be exploited in the next 30 days. Used by agents that triage by likelihood rather than severity alone. Free for any use per FIRST.org policy, commercial redistribution permitted.
Google-stewarded OSV records across npm, PyPI, Maven, Cargo, Go, NuGet, and OS distros. Per-CVE OSV records are corroboration signals for the verified-CVE fact card.
CISA-issued SSVC (Stakeholder-Specific Vulnerability Categorization) scores layered on top of CVE entries. Adds Decision, Automatable, and TechnicalImpact signals an agent can use without scoring vulns from scratch.
GitHub Security Advisories filtered down to AI/MCP/LLM-relevant entries, refreshed every 6 hours. Republish + cite posture. The feed that caught the @mistralai npm worm on day one (2026-05-12) before most defenders had it.
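One way an agent might combine these signals for patch prioritization is sketched below. It is an illustration only: the `VulnSignal` type, its field names, and the ordering rule (KEV-listed first, then EPSS descending, then severity) are assumptions made for this example, not TensorFeed's response schema or a recommended policy. The CVE IDs are placeholders.

```python
from dataclasses import dataclass


@dataclass
class VulnSignal:
    """Hypothetical per-CVE signal bundle; field names are illustrative."""
    cve_id: str
    epss: float      # EPSS probability of exploitation in the next 30 days (0..1)
    in_kev: bool     # True if the CVE appears in the CISA KEV catalog
    severity: float  # any severity score on a 0..10 scale


def triage_order(signals):
    """Order CVEs for review: confirmed exploitation first, then likelihood,
    then severity as the final tie-breaker."""
    return sorted(signals, key=lambda s: (not s.in_kev, -s.epss, -s.severity))


signals = [
    VulnSignal("CVE-0000-0001", epss=0.02, in_kev=False, severity=9.8),
    VulnSignal("CVE-0000-0002", epss=0.91, in_kev=False, severity=7.5),
    VulnSignal("CVE-0000-0003", epss=0.40, in_kev=True, severity=6.1),
]

ordered = [s.cve_id for s in triage_order(signals)]
print(ordered)
```

In this ordering, the highest-severity CVE lands last because nothing indicates it is being exploited or is likely to be, which is the difference between triaging by likelihood and triaging by severity alone.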
The cross-database verified-CVE call
One paid call composes MITRE CVE + CISA KEV + FIRST.org EPSS + OSV.dev + CISA Vulnrichment into one LLM-ready fact card with a confirmed_by array and corroboration_count. Reduces 5 fan-out calls and 5 different parsing schemas to one. The anti-hallucination lookup for security agents: sources that do not have the CVE simply do not appear in confirmed_by; the call still returns whatever is available.
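The composition rule can be sketched in a few lines. This is a hypothetical illustration, not the service's implementation: the source labels in `SOURCES`, the `build_fact_card` helper, and the `records` field are invented for the example. Only `confirmed_by` and `corroboration_count` come from the description of the call.

```python
# Hypothetical source labels; the real values in confirmed_by may differ.
SOURCES = ("mitre_cve", "cisa_kev", "first_epss", "osv", "cisa_vulnrichment")


def build_fact_card(cve_id, lookups):
    """Merge per-source lookups into one card.

    `lookups` maps a source label to that source's record, or to None
    (or a missing key) when the source has no entry for the CVE.
    """
    confirmed_by = [s for s in SOURCES if lookups.get(s) is not None]
    return {
        "cve_id": cve_id,
        "confirmed_by": confirmed_by,
        "corroboration_count": len(confirmed_by),
        # Only sources that actually returned a record are included.
        "records": {s: lookups[s] for s in confirmed_by},
    }


# Placeholder data: two of the five sources know this (made-up) CVE ID.
card = build_fact_card(
    "CVE-0000-0001",
    {"mitre_cve": {"state": "PUBLISHED"}, "cisa_kev": None, "first_epss": {"epss": 0.12}},
)
print(card["confirmed_by"], card["corroboration_count"])

# A CVE no source recognizes still yields a card, with zero corroboration.
empty = build_fact_card("CVE-0000-0002", {})
print(empty["corroboration_count"])
```

An agent can treat a `corroboration_count` of zero as a signal that the identifier may have been hallucinated, rather than relying on an error from any single source.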
Why TensorFeed covers security data
We are not building a security agent. We are the data layer underneath any agent that needs vuln signals: SOC triage, patch prioritization, red-team research, exploit chain discovery, dashboards. The agents have different goals but the same data problem.
The data problem is that the canonical sources do not agree on schema, refresh cadence, or what a CVE record contains. MITRE publishes the CVE ID. CISA publishes which CVEs are actively exploited. FIRST.org publishes a 30-day exploitation probability. OSV catalogs cross-ecosystem records. CISA Vulnrichment adds SSVC scores. Agents stitching these together waste tokens and time on five different schemas before they can decide anything.
One verified-CVE call returns the merged fact card. Pair with the LLM-ready transforms (clean/cve, clean/kev, clean/epss) for ~80% token reduction vs the raw upstream records.
FAQ
What does this page cover?
The security data layer TensorFeed provides for AI agents. We aggregate MITRE CVE, CISA KEV, FIRST.org EPSS, OSV.dev, CISA Vulnrichment, and AI-filtered GHSA advisories into one agent-callable surface with LLM-ready transforms and a cross-database verified-CVE fact card.
Why does TensorFeed cover security data at all?
AI agents now account for a meaningful share of vulnerability discovery (Anthropic Claude Mythos finding 271 Firefox zero-days in one cycle, a third major Linux kernel flaw in two weeks attributed to AI). The agents doing this need a single clean security data layer instead of parsing five different schemas across five different sources. That layer is what we ship.
What is the cross-database verified-CVE call?
One paid call (/api/premium/security/verified/{id}) composes MITRE CVE Record + CISA KEV + FIRST.org EPSS + OSV.dev + CISA Vulnrichment into one LLM-ready fact card with a confirmed_by array and corroboration_count. Reduces 5 fan-out calls and 5 different parsing schemas to one. The anti-hallucination lookup for security agents.
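A minimal sketch of how a client might build the request URL for this call, checking the identifier's shape before spending a paid call on it. The host `tensorfeed.example` is a placeholder, the `verified_url` helper is hypothetical, and normalizing the ID to upper case is an assumption about what the endpoint expects. Authentication is omitted because its mechanism is not described here.

```python
import re
from urllib.parse import quote

# CVE IDs have the form CVE-<4-digit year>-<sequence of 4 or more digits>.
CVE_ID = re.compile(r"^CVE-\d{4}-\d{4,}$")

BASE_URL = "https://tensorfeed.example"  # placeholder host, not the real one


def verified_url(cve_id, base_url=BASE_URL):
    """Return the verified-CVE URL for a well-formed CVE ID, else raise ValueError."""
    normalized = cve_id.strip().upper()
    if not CVE_ID.match(normalized):
        raise ValueError(f"not a well-formed CVE ID: {cve_id!r}")
    return f"{base_url}/api/premium/security/verified/{quote(normalized)}"


url = verified_url(" cve-2024-3094 ")
print(url)

try:
    verified_url("CVE-24-1")
    rejected = False
except ValueError:
    rejected = True
```

Rejecting malformed identifiers locally is a cheap first filter; the corroboration fields in the response then cover well-formed IDs that no source recognizes.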
How current is the data?
CVE additions: daily 04:30 UTC walk of cvelistV5 commit history covering the last 36h. KEV: daily 06:30 UTC full refresh. EPSS: daily refresh against FIRST.org. AI-supply-chain IOCs: every 6 hours via authenticated GHSA. Single-CVE Record v5.2 fetches are lazy from MITRE on read. Editorial content on this page updates when material structural changes land.
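Because the walk runs daily but looks back 36 hours, consecutive runs overlap by 12 hours, so the same CVE ID can show up more than once. The sketch below shows one way such a window filter with de-duplication could look. The `(cve_id, modified_at)` input shape and the `ids_in_window` helper are assumptions for illustration, not the actual harvester, and the IDs are placeholders.

```python
from datetime import datetime, timedelta, timezone

LOOKBACK = timedelta(hours=36)


def ids_in_window(changes, now):
    """Return CVE IDs changed within the lookback window, first occurrence only.

    `changes` is an iterable of (cve_id, modified_at) pairs with
    timezone-aware datetimes, e.g. derived from commit metadata.
    """
    cutoff = now - LOOKBACK
    seen = set()
    result = []
    for cve_id, modified_at in changes:
        if cutoff <= modified_at <= now and cve_id not in seen:
            seen.add(cve_id)
            result.append(cve_id)
    return result


run_time = datetime(2026, 5, 13, 4, 30, tzinfo=timezone.utc)
changes = [
    ("CVE-0000-0001", run_time - timedelta(hours=2)),
    ("CVE-0000-0002", run_time - timedelta(hours=40)),  # older than 36h, skipped
    ("CVE-0000-0001", run_time - timedelta(hours=30)),  # duplicate ID, skipped
    ("CVE-0000-0003", run_time - timedelta(hours=35)),
]

harvested = ids_in_window(changes, run_time)
print(harvested)
```

The overlap means a run that starts late or fails once does not necessarily leave a gap, at the cost of needing the de-duplication step.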
What are the licensing terms?
Every source we use permits commercial redistribution. MITRE CVE: Terms of Use, commercial redistribution permitted. CISA KEV: US Government public domain (17 USC 105). FIRST.org EPSS: free for any use. OSV.dev: CC-BY-4.0. GHSA: republish + cite posture per GitHub Terms. Attribution blocks ship on every response. We do not paywall data we cannot legally paywall.
Is TensorFeed building a security agent?
No. We are the data layer underneath any agent that needs vuln signals. Agents calling our endpoints can be defensive (SOC triage, patch prioritization), offensive (red-team research, exploit chain discovery), or analytical (researchers, vendors building dashboards). We do not take a position on what gets built on top.
How does this connect to the rest of TensorFeed?
Security data joins the AI-ecosystem coverage at the inference layer: agents calling our /models, /pricing, and /routing endpoints to pick the right LLM for a security task can also pull the underlying vuln data from /api/premium/security/*. Cross-linked with /originals coverage of the AI cyber tier (Claude Mythos, OpenAI Daybreak) and the cyber-tier data-layer post.