How to Use a TTS API: From First Request to Production

Integrating a text-to-speech API into your app is straightforward once you understand the pattern: authenticate, submit text, poll for the result, and retrieve the audio. This tutorial walks through each step with working examples in curl, Python, and Node.js — using the AI TTS Microservice API as the reference. It covers the async flow (submit, poll, download) and introduces streaming delivery for real-time playback.
Step 1: Get an API Key
API access requires a key. In AI TTS Microservice, you create one from the Dashboard under the API tab. The key is shown once — copy it immediately. Keys are prefixed with tts_.
API keys are available to pay-as-you-go, Pro, and Enterprise accounts.
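Because the key is shown only once, it is worth keeping it out of your source code. The following is a minimal sketch, assuming you store the key in an environment variable named TTS_API_KEY. That name, and the helper names load_api_key and auth_headers, are hypothetical choices for this example and not part of the API. The only check taken from the text above is the tts_ prefix.

```python
import os


def load_api_key(env=None, var_name="TTS_API_KEY"):
    """Read the API key from the environment and sanity-check its prefix.

    TTS_API_KEY is a hypothetical variable name chosen for this sketch.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"{var_name} is not set")
    if not key.startswith("tts_"):
        # Keys are described above as being prefixed with tts_.
        raise ValueError("API key should start with 'tts_'")
    return key


def auth_headers(api_key):
    """Build the Authorization header used by the requests in this tutorial."""
    return {"Authorization": f"Bearer {api_key}"}
```

Failing fast on a missing or malformed key gives a clearer message than a 401 from the first request.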
Step 2: Submit Your First Request
The generation endpoint accepts text and a voice ID, and returns a job ID immediately. The audio is generated asynchronously — you don't wait for it in the same request.
curl
curl -X POST https://aitts.theproductivepixel.com/api/v1/tts \
-H "Authorization: Bearer tts_YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{
"text": "Hello world! This is my first TTS request.",
"voice_id": "kokoro:en-US-Kokoro-Bella"
}'
Python
import requests
API_KEY = "tts_YOUR_KEY"
BASE = "https://aitts.theproductivepixel.com/api/v1"
response = requests.post(
    f"{BASE}/tts",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "text": "Hello world! This is my first TTS request.",
        "voice_id": "kokoro:en-US-Kokoro-Bella"
    }
)
data = response.json()["data"]
print(f"Job ID: {data['job_id']}")
print(f"Characters charged: {data['chars_charged']}")
Node.js
const API_KEY = "tts_YOUR_KEY";
const BASE = "https://aitts.theproductivepixel.com/api/v1";
const response = await fetch(`${BASE}/tts`, {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${API_KEY}`,
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    text: "Hello world! This is my first TTS request.",
    voice_id: "kokoro:en-US-Kokoro-Bella"
  })
});
const { data } = await response.json();
console.log(`Job ID: ${data.job_id}`);
The response comes back immediately with HTTP 202:
{
"success": true,
"data": {
"job_id": "550e8400-e29b-41d4-a716-446655440000",
"status": "pending",
"poll_url": "/api/v1/tts/550e8400-e29b-41d4-a716-446655440000",
"audio_endpoint": "/api/v1/tts/550e8400-e29b-41d4-a716-446655440000/audio",
"chars_charged": 43
}
}
Step 3: Poll for the Result
Use the job ID to check status. Poll until the status is completed or failed.
curl
curl https://aitts.theproductivepixel.com/api/v1/tts/YOUR_JOB_ID \
-H "Authorization: Bearer tts_YOUR_KEY"
Python
import time
job_id = data["job_id"]
while True:
    status_resp = requests.get(
        f"{BASE}/tts/{job_id}",
        headers={"Authorization": f"Bearer {API_KEY}"}
    )
    status = status_resp.json()["data"]
    if status["status"] == "completed":
        print(f"Audio endpoint: {status['audio_endpoint']}")
        break
    elif status["status"] == "failed":
        print(f"Failed: {status['error']}")
        break
    time.sleep(1)
Node.js
async function pollForResult(jobId) {
  while (true) {
    const res = await fetch(`${BASE}/tts/${jobId}`, {
      headers: { "Authorization": `Bearer ${API_KEY}` }
    });
    const { data: status } = await res.json();
    if (status.status === "completed") return status.audio_endpoint;
    if (status.status === "failed") throw new Error(status.error?.message);
    await new Promise(r => setTimeout(r, 1000));
  }
}
const audioEndpoint = await pollForResult(data.job_id);
console.log(`Audio endpoint: ${audioEndpoint}`);
When complete, the response includes a signed audio URL:
{
"success": true,
"data": {
"job_id": "550e8400-e29b-41d4-a716-446655440000",
"status": "completed",
"audio_url": "https://storage.googleapis.com/...",
"audio_url_expires_at": "2026-01-08T10:00:00.000Z",
"audio_endpoint": "/api/v1/tts/550e8400-e29b-41d4-a716-446655440000/audio",
"chars_charged": 43
}
}
The audio_url is a temporary signed URL that expires after 24 hours — it's provided for backward compatibility. For production use, prefer audio_endpoint (e.g. /api/v1/tts/{job_id}/audio), which returns a fresh signed URL via redirect and never expires as long as the audio is retained.
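If you do hold on to audio_url, for example to hand it to a client that cannot send an Authorization header, it helps to check audio_url_expires_at before using it. The sketch below falls back to the durable audio_endpoint when the signed URL is expired or about to expire. The helper names and the 60-second safety margin are assumptions of this example, not API behavior.

```python
from datetime import datetime, timezone
from urllib.parse import urljoin

API_ORIGIN = "https://aitts.theproductivepixel.com"


def parse_expiry(value):
    """Parse an ISO 8601 timestamp such as 2026-01-08T10:00:00.000Z."""
    # Older Python versions reject a trailing "Z" in fromisoformat.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def pick_audio_url(job, now=None, margin_seconds=60):
    """Return audio_url while it is still valid, else the durable endpoint URL.

    `job` is the "data" object from a completed status response.
    margin_seconds is an arbitrary safety margin chosen for this sketch.
    """
    now = now or datetime.now(timezone.utc)
    url = job.get("audio_url")
    expires = job.get("audio_url_expires_at")
    if url and expires:
        remaining = (parse_expiry(expires) - now).total_seconds()
        if remaining > margin_seconds:
            return url
    # audio_endpoint is a relative path, so resolve it against the API origin.
    return urljoin(API_ORIGIN, job["audio_endpoint"])
```

Note that the two results are used differently: the signed audio_url is fetched directly, while the audio_endpoint request carries your API key, as in the download examples in Step 4.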
Step 4: Download the Audio
# Using the temporary audio_url (expires in 24h):
curl -o output.wav "AUDIO_URL_FROM_RESPONSE"
# Using the durable audio_endpoint (recommended):
curl -L -o output.wav \
https://aitts.theproductivepixel.com/api/v1/tts/JOB_ID/audio \
-H "Authorization: Bearer tts_YOUR_KEY"
# Python
audio = requests.get(f"{BASE}/tts/{job_id}/audio",
    headers={"Authorization": f"Bearer {API_KEY}"},
    allow_redirects=True)
with open("output.wav", "wb") as f:
    f.write(audio.content)
Choosing a Voice
Voice IDs follow the format provider:language-Family-Name. You can browse available voices programmatically:
curl "https://aitts.theproductivepixel.com/api/v1/voices?language=en-US" \
-H "Authorization: Bearer tts_YOUR_KEY"
The response includes voice metadata — provider, language, gender (clean enum: male, female, neutral, unknown), characteristics (styles, pitch, age, and more), tier, and capability flags like supports_ssml and supports_multispeaker. You can filter by provider, model_type (premium or ultra), and gender.
Or browse the voice gallery visually — no account required.
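To work with voices in code, it can help to split a voice ID into its parts and to build the filtered voices URL in one place. The sketch below makes two assumptions that the text above does not confirm: that the language part always has two segments (as in en-US), and that the provider, model_type, and gender filters are passed as query parameters with those exact names, like language in the curl example. The function names are hypothetical.

```python
from urllib.parse import urlencode

BASE = "https://aitts.theproductivepixel.com/api/v1"
GENDERS = {"male", "female", "neutral", "unknown"}
MODEL_TYPES = {"premium", "ultra"}


def parse_voice_id(voice_id):
    """Split provider:language-Family-Name into its components.

    Assumes a two-segment language tag such as en-US.
    """
    provider, sep, rest = voice_id.partition(":")
    parts = rest.split("-")
    if not sep or not provider or len(parts) < 4 or not all(parts):
        raise ValueError(f"Unexpected voice ID format: {voice_id!r}")
    return {
        "provider": provider,
        "language": "-".join(parts[:2]),
        "family": parts[2],
        "name": "-".join(parts[3:]),
    }


def voices_url(language=None, provider=None, model_type=None, gender=None):
    """Build the voices listing URL; query parameter names are assumed."""
    if gender is not None and gender not in GENDERS:
        raise ValueError(f"gender must be one of {sorted(GENDERS)}")
    if model_type is not None and model_type not in MODEL_TYPES:
        raise ValueError(f"model_type must be one of {sorted(MODEL_TYPES)}")
    filters = {
        "language": language,
        "provider": provider,
        "model_type": model_type,
        "gender": gender,
    }
    query = urlencode({k: v for k, v in filters.items() if v is not None})
    return f"{BASE}/voices?{query}" if query else f"{BASE}/voices"
```

For example, parse_voice_id("kokoro:en-US-Kokoro-Bella") yields the provider kokoro, the language en-US, the family Kokoro, and the name Bella.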
Output Formats
The default output is WAV. You can request MP3 or OGG Opus instead, with optional bitrate and sample rate control:
curl -X POST https://aitts.theproductivepixel.com/api/v1/tts \
-H "Authorization: Bearer tts_YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{
"text": "MP3 output with custom bitrate.",
"voice_id": "polly:en-GB-Generative-Brian",
"output_format": "mp3",
"sample_rate_hertz": 24000,
"output_bitrate_kbps": 128
}'
Supported formats and defaults:
- wav — lossless, no bitrate config. Default.
- mp3 — lossy, default 128 kbps. Configurable: 32–320 kbps.
- ogg_opus — lossy, default 64 kbps. Configurable: 6–320 kbps.
Sample rates available: 8000, 12000, 16000, 22050, 24000, 44100, 48000 Hz (OGG Opus excludes 12000, 22050, and 44100).
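The limits above are easy to encode as a client-side check, so an out-of-range value is caught before it costs a request. This is a sketch under the assumption that the stated bitrate ranges are inclusive. The function name build_tts_payload is hypothetical; the payload field names are the ones used in the curl example.

```python
BITRATE_RANGE_KBPS = {"mp3": (32, 320), "ogg_opus": (6, 320)}
SAMPLE_RATES_HZ = {8000, 12000, 16000, 22050, 24000, 44100, 48000}
OPUS_EXCLUDED_HZ = {12000, 22050, 44100}


def build_tts_payload(text, voice_id, output_format="wav",
                      sample_rate_hertz=None, output_bitrate_kbps=None):
    """Validate output options against the documented limits and build the JSON body."""
    if output_format not in ("wav", "mp3", "ogg_opus"):
        raise ValueError(f"Unsupported output_format: {output_format}")
    payload = {"text": text, "voice_id": voice_id, "output_format": output_format}

    if sample_rate_hertz is not None:
        allowed = SAMPLE_RATES_HZ
        if output_format == "ogg_opus":
            allowed = SAMPLE_RATES_HZ - OPUS_EXCLUDED_HZ
        if sample_rate_hertz not in allowed:
            raise ValueError(f"sample_rate_hertz must be one of {sorted(allowed)}")
        payload["sample_rate_hertz"] = sample_rate_hertz

    if output_bitrate_kbps is not None:
        if output_format == "wav":
            raise ValueError("wav is lossless and has no bitrate setting")
        low, high = BITRATE_RANGE_KBPS[output_format]
        # Range bounds are treated as inclusive (an assumption).
        if not low <= output_bitrate_kbps <= high:
            raise ValueError(f"{output_format} bitrate must be {low}-{high} kbps")
        payload["output_bitrate_kbps"] = output_bitrate_kbps

    return payload
```

The server remains the source of truth: a request that passes this check can still be rejected, for instance if a particular voice does not support the chosen format.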
Safe Retries With Idempotency Keys
Network failures happen. If your request times out and you're not sure whether it was processed, retrying could create a duplicate job and charge you twice. Idempotency keys prevent this.
Add an Idempotency-Key header with a unique value (UUID v4 recommended). If you retry with the same key and the same request body, you get the same response — no duplicate job, no double charge.
curl -X POST https://aitts.theproductivepixel.com/api/v1/tts \
-H "Authorization: Bearer tts_YOUR_KEY" \
-H "Content-Type: application/json" \
-H "Idempotency-Key: 7c9e6679-7425-40de-944b-e07fc1f90ae7" \
-d '{
"text": "Safe to retry this request.",
"voice_id": "kokoro:en-US-Kokoro-Bella"
}'
If the key is still being processed, you'll get a 409 REQUEST_IN_PROGRESS. If the key was already used with a different request body, you'll get 409 IDEMPOTENCY_KEY_REUSE.
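The key point is that one logical request keeps one key across all of its attempts. A minimal sketch of that retry loop follows. The send callable is a placeholder you supply (it performs the POST and returns the status code and parsed JSON body), and the attempt count and backoff delays are arbitrary choices for this example.

```python
import time
import uuid


def submit_with_retry(send, payload, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Submit a job, reusing a single Idempotency-Key for every attempt.

    `send(payload, headers)` is a caller-supplied function (hypothetical) that
    performs the POST and returns (status_code, json_body).
    """
    headers = {"Idempotency-Key": str(uuid.uuid4())}
    for attempt in range(max_attempts):
        try:
            status, body = send(payload, headers)
        except OSError:
            # Network-level errors from urllib and requests derive from OSError.
            status, body = None, None
        if status is not None:
            code = ((body or {}).get("error") or {}).get("code")
            if not (status == 409 and code == "REQUEST_IN_PROGRESS"):
                # Success or a non-retryable error: hand it back to the caller.
                return status, body
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
    raise RuntimeError(f"No final response after {max_attempts} attempts")
```

Generating a new key inside the loop would defeat the purpose, since each attempt would then look like a new request.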
Error Handling
All errors follow a consistent shape:
{
"success": false,
"error": {
"code": "INSUFFICIENT_CREDITS",
"message": "Need 5.00 credits, have 2.50"
}
}
Common errors you should handle:
- 401 UNAUTHORIZED: invalid or missing API key.
- 400 VALIDATION_ERROR: malformed request. Check the message for details.
- 400 INVALID_VOICE: voice ID not found or not available.
- 402 INSUFFICIENT_CREDITS: not enough balance. The message tells you how much you need.
- 429 RATE_LIMIT_EXCEEDED: too many requests. Check the Retry-After header.
- 500 GENERATION_FAILED: audio generation failed on the provider side.
- 503 MAINTENANCE: API temporarily down. Retry after 300 seconds.
Request validation runs before voice/provider checks. A malformed payload gets VALIDATION_ERROR before INVALID_VOICE.
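Because every error shares one shape, a single helper can turn any failed response into an exception that carries the code and a retry hint. The sketch below is one way to do it. The class name, the UNKNOWN placeholder code, and the choice of which codes count as retryable are assumptions of this example. It also reads Retry-After only when the value is a whole number of seconds.

```python
RETRYABLE_CODES = {"RATE_LIMIT_EXCEEDED", "MAINTENANCE", "REQUEST_IN_PROGRESS"}


class TTSApiError(Exception):
    """Raised for a response that carries the error shape shown above."""

    def __init__(self, status, code, message, retry_after=None):
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.message = message
        self.retry_after = retry_after
        # Retry policy is a choice made for this sketch, not an API guarantee.
        self.retryable = code in RETRYABLE_CODES


def error_from_response(status, body, headers=None):
    """Build a TTSApiError from a status code, parsed JSON body, and headers."""
    error = (body or {}).get("error") or {}
    code = error.get("code", "UNKNOWN")  # placeholder when no code is present
    message = error.get("message", "")
    # A plain dict is case-sensitive; the headers object in requests is not.
    raw = (headers or {}).get("Retry-After")
    retry_after = None
    if raw is not None and str(raw).isdigit():
        retry_after = int(raw)
    elif code == "MAINTENANCE":
        retry_after = 300  # the documented wait for 503 MAINTENANCE
    return TTSApiError(status, code, message, retry_after)
```

In calling code you would raise the result when the response is not successful, then branch on err.retryable and err.retry_after rather than parsing messages.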
Rate Limits
Rate limits are per-account and depend on your plan:
| Plan | Requests | Window |
|---|---|---|
| Pay-as-you-go | 10 | 15 minutes |
| Pro | 100 | 15 minutes |
| Enterprise | 1,000 | 15 minutes |
Every response includes rate limit headers: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, and X-RateLimit-Bucket.
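On the lower tiers it can be simpler to pace requests on the client than to react to 429 responses. Below is a sketch of a sliding-window limiter built from the table above. It assumes nothing about how the server counts its window; a sliding window never exceeds the limit in any 15-minute span, so it is the conservative choice. The class name and plan keys are hypothetical, and whether status polls count toward the limit is not stated here.

```python
import time
from collections import deque

PLAN_LIMITS = {"pay-as-you-go": 10, "pro": 100, "enterprise": 1000}
WINDOW_SECONDS = 15 * 60


class SlidingWindowLimiter:
    """Track request timestamps and report how long to wait before the next one."""

    def __init__(self, max_requests, window_seconds=WINDOW_SECONDS, clock=time.monotonic):
        self.max_requests = max_requests
        self.window = window_seconds
        self.clock = clock
        self.sent = deque()

    def wait_time(self):
        """Seconds until another request fits in the window (0.0 if it fits now)."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()
        if len(self.sent) < self.max_requests:
            return 0.0
        return self.window - (now - self.sent[0])

    def record(self):
        """Call once per request actually sent."""
        self.sent.append(self.clock())
```

A typical use is to create SlidingWindowLimiter(PLAN_LIMITS["pro"]), sleep for wait_time() before each call, and call record() afterwards. The X-RateLimit-Remaining header is still the authoritative count.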
Webhooks (Enterprise)
Instead of polling, Enterprise accounts can receive webhook notifications when jobs complete or fail. Configure your webhook endpoint once, and every job sends a callback:
# Configure your webhook URL
curl -X PUT https://aitts.theproductivepixel.com/api/user/enterprise/webhook \
-H "Authorization: Bearer FIREBASE_ID_TOKEN" \
-H "Content-Type: application/json" \
-d '{"url": "https://your-server.com/tts-webhook"}'
When a job completes, your endpoint receives:
{
"event": "job.completed",
"created_at": "2026-01-07T12:00:00.000Z",
"data": {
"job_id": "550e8400-e29b-41d4-a716-446655440000",
"status": "completed",
"audio_url": "https://storage.googleapis.com/...",
"audio_endpoint": "/api/v1/tts/550e8400-e29b-41d4-a716-446655440000/audio",
"chars_charged": 150
}
}
Webhooks are signed with HMAC-SHA256 via the X-TTS-Signature header. Always verify the signature before processing. Deliveries retry with exponential backoff up to 8 attempts over approximately 38 hours.
See the webhook documentation for full setup, signature verification examples, and retry policy details.
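As a rough illustration of what verification involves, here is a sketch using Python's standard hmac module. It assumes the X-TTS-Signature header carries a hex-encoded HMAC-SHA256 digest of the raw request body, keyed with your webhook secret. The encoding and the exact signed content are assumptions here, so confirm them against the webhook documentation before relying on this.

```python
import hashlib
import hmac


def verify_signature(secret, raw_body, signature_header):
    """Return True if signature_header matches the HMAC-SHA256 of raw_body.

    Assumes a hex-encoded digest of the exact body bytes (not confirmed here).
    """
    if not signature_header:
        return False
    expected = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    received = signature_header.strip().lower()
    # compare_digest runs in constant time, which avoids leaking how many
    # leading characters matched.
    return hmac.compare_digest(expected.encode("ascii"), received.encode("utf-8"))
```

Verify against the raw bytes as received, before any JSON parsing, because re-serializing the body can change whitespace and break the comparison.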
Durable Audio Retrieval
The audio_url in the status response is a temporary signed URL that expires after 24 hours. For production integrations, use audio_endpoint instead. It is a stable API path that returns a fresh signed URL via HTTP redirect:
curl -L https://aitts.theproductivepixel.com/api/v1/tts/YOUR_JOB_ID/audio \
-H "Authorization: Bearer tts_YOUR_KEY" \
-o output.wav
This endpoint returns 307 with a Location header pointing to a fresh signed URL. Use -L in curl to follow the redirect. The endpoint works as long as the audio is retained — no expiry to worry about.
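If you want the fresh signed URL itself rather than the audio bytes, for example to pass it to an audio player, you can read the Location header without following the redirect. This is a minimal sketch using the standard library's http.client, which does not follow redirects on its own. The function name and the connection_factory parameter (there to make the function testable) are inventions of this example.

```python
import http.client
from urllib.parse import urlsplit


def fresh_signed_url(audio_endpoint_url, api_key,
                     connection_factory=http.client.HTTPSConnection):
    """Request the audio endpoint and return the redirect target."""
    parts = urlsplit(audio_endpoint_url)
    conn = connection_factory(parts.netloc, timeout=30)
    try:
        conn.request(
            "GET",
            parts.path,
            headers={"Authorization": f"Bearer {api_key}"},
        )
        resp = conn.getresponse()
        location = resp.getheader("Location")
        if resp.status != 307 or not location:
            raise RuntimeError(f"Expected 307 with a Location header, got {resp.status}")
        return location
    finally:
        conn.close()
```

The returned URL is temporary like any other signed URL, so request a new one when you need it instead of storing it.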
Streaming Delivery
The async flow above works well for batch processing and background generation. For use cases where you want audio to start playing as it's generated — interactive previews, chatbot responses, live demos — the API also supports streaming delivery.
Streaming uses a two-step flow: POST to create the job and get a one-time stream_url, then GET that URL for chunked audio. The same voices, billing, and quality apply — streaming is a delivery mode, not a different product.
# Step 1: Create streaming job
curl -X POST https://aitts.theproductivepixel.com/api/v1/tts/stream \
-H "Authorization: Bearer tts_YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{
"text": "Streaming lets you hear audio as it is generated.",
"voice_id": "google:en-US-Chirp3HD-Charon"
}'
# Step 2: Open the stream URL from the response
curl -N "STREAM_URL_FROM_RESPONSE" --output audio.ogg
After streaming completes, the durable audio is available at audio_endpoint just like an async job. For the full streaming guide — format selection, transport details, and when to use streaming vs async — see our streaming delivery guide.
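The second step of the curl example can be sketched in Python with the standard library alone. The code below assumes you have already read stream_url from the response to the POST, and that the URL needs no Authorization header, as the curl command suggests. The function names are hypothetical.

```python
import urllib.request


def copy_stream(source, sink, chunk_size=8192):
    """Copy bytes from a file-like source to a sink as they arrive.

    Returns the number of bytes written.
    """
    # read1 returns as soon as some data is available instead of waiting
    # to fill the whole buffer, which suits incremental delivery.
    read = getattr(source, "read1", source.read)
    total = 0
    while True:
        chunk = read(chunk_size)
        if not chunk:
            break
        sink.write(chunk)
        total += len(chunk)
    return total


def download_stream(stream_url, output_path, timeout=60):
    """GET the one-time stream_url and write the audio to disk incrementally."""
    with urllib.request.urlopen(stream_url, timeout=timeout) as resp:
        with open(output_path, "wb") as out:
            return copy_stream(resp, out)
```

For live playback you would pass an audio player's input as the sink instead of a file. Since stream_url is one-time, a failed download means creating a new streaming job or falling back to audio_endpoint.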
Putting It All Together
Here's a complete Python example that handles the full flow — submit, poll, download, with error handling and idempotency:
import requests
import time
import uuid
API_KEY = "tts_YOUR_KEY"
BASE = "https://aitts.theproductivepixel.com/api/v1"
def generate_and_download(text, voice_id, output_path="output.wav"):
    # Submit with idempotency key
    idem_key = str(uuid.uuid4())
    resp = requests.post(
        f"{BASE}/tts",
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Idempotency-Key": idem_key
        },
        json={"text": text, "voice_id": voice_id}
    )
    if not resp.ok:
        error = resp.json().get("error", {})
        raise Exception(f"{error.get('code')}: {error.get('message')}")
    job_id = resp.json()["data"]["job_id"]
    print(f"Job submitted: {job_id}")
    # Poll for completion
    for _ in range(60):
        status_resp = requests.get(
            f"{BASE}/tts/{job_id}",
            headers={"Authorization": f"Bearer {API_KEY}"}
        )
        status = status_resp.json()["data"]
        if status["status"] == "completed":
            # Download via durable audio_endpoint
            audio = requests.get(
                f"{BASE}/tts/{job_id}/audio",
                headers={"Authorization": f"Bearer {API_KEY}"},
                allow_redirects=True)
            # Fail loudly instead of writing an error body to the audio file
            audio.raise_for_status()
            with open(output_path, "wb") as f:
                f.write(audio.content)
            print(f"Saved to {output_path}")
            return
        if status["status"] == "failed":
            raise Exception(f"Generation failed: {status.get('error')}")
        time.sleep(1)
    raise TimeoutError("Job did not complete within 60 seconds")
# Usage
generate_and_download(
    text="This is a complete example with error handling.",
    voice_id="polly:en-GB-Generative-Brian",
    output_path="demo.wav"
)
Next Steps
- Browse the voice gallery to find voices for your project.
- Read the full API reference for all endpoints and parameters.
- Check provider capabilities to understand what each provider and tier supports.
- Read the streaming delivery guide for real-time audio playback via the API.
- Download the OpenAPI spec for code generation in any language.
Get started: Create an API key from the Dashboard and make your first request in under two minutes.