Provider Capabilities
Feature support by provider and tier. A request that uses a modifier the selected provider and tier do not support returns HTTP 400.
Capability Matrix
| Provider | Tier | SSML | Markup | Speed | Multi-Speaker | Model Selection | Prompt | Bitrate Config |
|---|---|---|---|---|---|---|---|---|
| Google[1] | Premium | Conditional | Conditional | Conditional | No | No | No | Yes |
| Google | Ultra | Yes | No | No | Yes | Yes | Yes | Yes |
| Polly | Premium | Yes | No | No | No | No | No | No |
| Polly | Ultra | Yes | No | No | No | No | No | No |
| Kokoro | Premium | No | No | Yes | No | No | No | Yes |
[1] SSML, markup, and speed vary by voice family — check GET /api/v1/voices.
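A client can mirror the matrix locally and reject unsupported modifiers before a request is sent, rather than waiting for the 400. The following is a minimal sketch, not part of the service's SDK: the dictionary keys (`ssml`, `multi_speaker`, and so on) and both helper functions are hypothetical names chosen for illustration, and the values are transcribed from the table above.

```python
# Capability matrix transcribed from the table above.
# True = "Yes", False = "No", "conditional" = varies by voice family.
# Key names are illustrative, not the API's request parameter names.
NO_EXTRAS = {"markup": False, "speed": False, "multi_speaker": False,
             "model_selection": False, "prompt": False}

CAPABILITIES = {
    ("google", "premium"): {"ssml": "conditional", "markup": "conditional",
                            "speed": "conditional", "multi_speaker": False,
                            "model_selection": False, "prompt": False,
                            "bitrate": True},
    ("google", "ultra"): {"ssml": True, "markup": False, "speed": False,
                          "multi_speaker": True, "model_selection": True,
                          "prompt": True, "bitrate": True},
    ("polly", "premium"): {"ssml": True, **NO_EXTRAS, "bitrate": False},
    ("polly", "ultra"): {"ssml": True, **NO_EXTRAS, "bitrate": False},
    ("kokoro", "premium"): {"ssml": False, "markup": False, "speed": True,
                            "multi_speaker": False, "model_selection": False,
                            "prompt": False, "bitrate": True},
}


def unsupported_modifiers(provider, tier, modifiers):
    """Return the requested modifiers that the matrix marks as "No"."""
    caps = CAPABILITIES[(provider.lower(), tier.lower())]
    return [m for m in modifiers if caps[m] is False]


def needs_voice_check(provider, tier, modifiers):
    """Return the requested modifiers whose support depends on the voice."""
    caps = CAPABILITIES[(provider.lower(), tier.lower())]
    return [m for m in modifiers if caps[m] == "conditional"]
```

For example, `unsupported_modifiers("kokoro", "premium", ["ssml", "speed"])` flags only `ssml`, while `needs_voice_check("google", "premium", ["ssml"])` signals that the voice list should be consulted before SSML is sent.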
Models By Tier
These are the public voice families you will see in GET /api/v1/voices, grouped by provider and tier.
| Provider | Tier | Models |
|---|---|---|
| Google | Premium | Casual, Chirp-HD, Chirp3-HD, Neural2, News, Polyglot, Standard, Studio, Wavenet |
| Google | Ultra | Gemini |
| Polly | Premium | Generative, Neural, Standard |
| Polly | Ultra | Long-Form |
| Kokoro | Premium | Kokoro |
Looking for supported languages and live voice availability? Browse the Voice Library for a visual view, or use GET /api/v1/voices for the full API response.
Text And Prompt Limits
The limits below show the maximum accepted text for each provider/tier combination, plus prompt limits where prompt is supported.
| Provider | Tier | Max Text (bytes) | Prompt Limit (bytes) |
|---|---|---|---|
| Google | Premium | 500,000 | Not supported |
| Google | Ultra | 4,000 | 4,000 |
| Polly | Premium | 100,000 | Not supported |
| Polly | Ultra | 100,000 | Not supported |
| Kokoro | Premium | 5,000 | Not supported |
When prompt is used, chars_charged includes both text and prompt bytes.
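Because the limits are expressed in bytes rather than characters, non-ASCII text reaches the limit sooner than its character count suggests. The sketch below checks a request against the table and estimates the billed size. It assumes sizes are measured on the UTF-8 encoding, which this page does not state, and `check_limits` is a hypothetical helper; the authoritative figure is the `chars_charged` value the service returns.

```python
# Limits transcribed from the table above: (max text bytes, max prompt bytes).
# None means prompt is not supported for that provider/tier.
LIMITS = {
    ("google", "premium"): (500_000, None),
    ("google", "ultra"): (4_000, 4_000),
    ("polly", "premium"): (100_000, None),
    ("polly", "ultra"): (100_000, None),
    ("kokoro", "premium"): (5_000, None),
}


def check_limits(provider, tier, text, prompt=None):
    """Return (errors, estimated_bytes) for a request.

    Assumption: byte counts are taken on the UTF-8 encoding.
    estimated_bytes is text bytes plus prompt bytes, following the
    rule that chars_charged includes both.
    """
    max_text, max_prompt = LIMITS[(provider.lower(), tier.lower())]
    text_bytes = len(text.encode("utf-8"))
    prompt_bytes = len(prompt.encode("utf-8")) if prompt is not None else 0

    errors = []
    if text_bytes > max_text:
        errors.append(f"text is {text_bytes} bytes, limit is {max_text}")
    if prompt is not None:
        if max_prompt is None:
            errors.append("prompt is not supported for this provider/tier")
        elif prompt_bytes > max_prompt:
            errors.append(f"prompt is {prompt_bytes} bytes, limit is {max_prompt}")
    return errors, text_bytes + prompt_bytes
```

Under the UTF-8 assumption, `check_limits("google", "ultra", "héllo", prompt="calm")` reports no errors and an estimate of 10 bytes: 6 for the text, since `é` takes two bytes, plus 4 for the prompt.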
Voice-Level Capabilities
The matrix above covers provider/tier-level features. For voice-specific capabilities, use GET /api/v1/voices. Each voice includes:
| Field | Description |
|---|---|
| supports_ssml | Whether this voice supports SSML input format |
| supports_markup | Whether this voice supports markup input format |
| supports_multispeaker | Whether this voice supports multi-speaker mode |
| model_type | Voice tier: "premium" or "ultra" |
| language | Voice language code (e.g., en-US) |
| provider | Voice provider (e.g., google, polly, kokoro) |
Always check voice-level fields before sending optional modifiers. See GET /api/v1/voices for the full response shape.
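The voice-level check can be sketched as a small filter over a voice record. This example assumes each voice is available as a plain dictionary carrying the fields listed in the table above; the modifier names and the `blocked_modifiers` helper are hypothetical, and the sample record is made up for illustration rather than copied from a real response.

```python
# Maps an illustrative modifier name to the voice field that gates it.
FLAG_FOR_MODIFIER = {
    "ssml": "supports_ssml",
    "markup": "supports_markup",
    "multispeaker": "supports_multispeaker",
}


def blocked_modifiers(voice, wanted):
    """Return the wanted modifiers that this voice does not support.

    A missing flag is treated as unsupported, which is the cautious
    reading when the response shape is not known in full.
    """
    return [m for m in wanted if not voice.get(FLAG_FOR_MODIFIER[m], False)]


# Made-up voice record using only the documented field names.
sample_voice = {
    "provider": "google",
    "model_type": "premium",
    "language": "en-US",
    "supports_ssml": True,
    "supports_markup": False,
    "supports_multispeaker": False,
}

print(blocked_modifiers(sample_voice, ["ssml", "markup"]))
```

Treating an absent flag as unsupported trades a possible missed feature for fewer rejected requests; a client that prefers the opposite can change the default passed to `voice.get`.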
© 2026 AI TTS Microservice. All rights reserved.