Multimodal Models
GET /api/multimodal (Free)

The /api/multimodal endpoint returns the catalog of production image generation, video generation, text-to-speech (TTS), and speech-to-text (STT) models. Pricing is expressed in modality-native units (per image, per second of video, per 1k characters, per minute of audio), so prices are not comparable across modalities; the primary use is sorting by price within a single modality.
When to use this endpoint
When your agent needs to pick a multimodal model for image, video, TTS, or STT work. The /api/models endpoint covers chat models; this endpoint is its peer for the other modalities.
Parameters
| Name | In | Type | Description |
|---|---|---|---|
| modality | query | string | Filter to "image", "video", "tts", or "stt" (e.g. modality=video) |
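If the SDKs are unavailable, the request URL can be built by hand. A minimal sketch in Python's standard library; the base URL is taken from the TypeScript sample below, and the allowed modality values come from the table above:

```python
from urllib.parse import urlencode

BASE = "https://tensorfeed.ai/api/multimodal"
MODALITIES = {"image", "video", "tts", "stt"}

def multimodal_url(modality=None):
    """Build the request URL; modality is the only documented query parameter."""
    if modality is None:
        return BASE
    if modality not in MODALITIES:
        raise ValueError(f"unknown modality: {modality!r}")
    return f"{BASE}?{urlencode({'modality': modality})}"
```

Omitting modality returns the full catalog across all four modalities.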
Example response
{
  "ok": true,
  "lastUpdated": "2026-04-30",
  "count": 6,
  "models": [
    {
      "id": "veo-3",
      "name": "Veo 3",
      "provider": "Google",
      "modality": "video",
      "pricingUnit": "per_second_video",
      "pricingAmount": 0.50,
      "maxOutput": "8s @ 1080p with audio",
      "features": ["native audio", "lip-sync", "image-to-video"]
    }
  ]
}

Code samples
Python SDK
from tensorfeed import TensorFeed

tf = TensorFeed()
data = tf.multimodal(modality="video")

# Sort cheapest first; treat a missing price as 0 so it sorts to the top.
for m in sorted(data["models"], key=lambda x: x["pricingAmount"] or 0):
    print(f"{m['name']:<24} {m['pricingAmount']}/sec ({m['provider']})")

TypeScript SDK
const res = await fetch("https://tensorfeed.ai/api/multimodal?modality=video");
const { models } = await res.json();
for (const m of models) console.log(`${m.name}: ${m.pricingAmount}/sec`);

FAQ
How are different modalities priced?
Image: per image. Video: per second of generated video. TTS: per 1k characters of input text. STT: per minute of input audio. The pricingUnit field on each entry says exactly which unit applies. Cross-modality price comparison is not meaningful; sort within a modality.
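These per-unit rules translate into a small cost estimator. A sketch, assuming hypothetical pricingUnit strings for the non-video modalities (only per_second_video appears in the example response above):

```python
# Quantity each documented unit bills on. Only "per_second_video" is
# confirmed by the example response; the other three strings are assumptions.
UNIT_QUANTITY = {
    "per_image": "images",
    "per_second_video": "seconds",
    "per_1k_characters": "characters",  # billed per 1,000 characters
    "per_minute_audio": "minutes",
}

def estimate_cost(model, quantity):
    """Estimate job cost for one catalog entry.

    `quantity` is in the unit's native measure: images, seconds of video,
    characters of input text, or minutes of input audio.
    """
    unit = model["pricingUnit"]
    if unit not in UNIT_QUANTITY:
        raise ValueError(f"unknown pricingUnit: {unit!r}")
    amount = model["pricingAmount"]
    if unit == "per_1k_characters":
        return amount * quantity / 1000  # price is quoted per 1k characters
    return amount * quantity

veo = {"pricingUnit": "per_second_video", "pricingAmount": 0.50}
print(estimate_cost(veo, 8))  # 8 s at $0.50/s -> 4.0
```

Because the units differ, comparing these estimates only makes sense for jobs within the same modality, matching the guidance above.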