How to Use a TTS API: From First Request to Production

Integrating a text-to-speech API into your app is straightforward once you understand the pattern: authenticate, submit text, poll for the result, download the audio. This tutorial walks through each step with working examples in curl, Python, and Node.js — using the AI TTS Microservice API as the reference, though the patterns apply to most async TTS APIs.
Step 1: Get an API Key
API access requires a key. In AI TTS Microservice, you create one from the Dashboard under the API tab. The key is shown once — copy it immediately. Keys are prefixed with tts_.
API keys are available to pay-as-you-go, Pro, and Enterprise accounts.
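Because the key is shown only once, it is usually safer to keep it out of source code. The sketch below reads it from an environment variable; the variable name `TTS_API_KEY` and both helper functions are illustrative assumptions, not part of the API. Only the `tts_` prefix and the Bearer header come from this tutorial.

```python
import os


def load_api_key(env_var: str = "TTS_API_KEY") -> str:
    """Read the API key from the environment and sanity-check its prefix.

    The environment variable name is a hypothetical choice for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    if not key.startswith("tts_"):
        # Keys are documented as being prefixed with tts_
        raise ValueError("Expected an API key starting with 'tts_'")
    return key


def auth_headers(key: str) -> dict:
    """Build the Authorization header used by every request in this tutorial."""
    return {"Authorization": f"Bearer {key}"}
```

With this in place, the later examples could call `auth_headers(load_api_key())` instead of hard-coding `tts_YOUR_KEY`.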
Step 2: Submit Your First Request
The generation endpoint accepts text and a voice ID, and returns a job ID immediately. The audio is generated asynchronously — you don't wait for it in the same request.
curl
curl -X POST https://aitts.theproductivepixel.com/api/v1/tts \
  -H "Authorization: Bearer tts_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Hello world! This is my first TTS request.",
    "voice_id": "kokoro:en-US-Kokoro-Bella"
  }'

Python
import requests

API_KEY = "tts_YOUR_KEY"
BASE = "https://aitts.theproductivepixel.com/api/v1"

response = requests.post(
    f"{BASE}/tts",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "text": "Hello world! This is my first TTS request.",
        "voice_id": "kokoro:en-US-Kokoro-Bella"
    }
)
data = response.json()["data"]
print(f"Job ID: {data['job_id']}")
print(f"Characters charged: {data['chars_charged']}")

Node.js
const API_KEY = "tts_YOUR_KEY";
const BASE = "https://aitts.theproductivepixel.com/api/v1";

const response = await fetch(`${BASE}/tts`, {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${API_KEY}`,
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    text: "Hello world! This is my first TTS request.",
    voice_id: "kokoro:en-US-Kokoro-Bella"
  })
});
const { data } = await response.json();
console.log(`Job ID: ${data.job_id}`);

The response comes back immediately with HTTP 202:
{
  "success": true,
  "data": {
    "job_id": "550e8400-e29b-41d4-a716-446655440000",
    "status": "pending",
    "poll_url": "/api/v1/tts/550e8400-e29b-41d4-a716-446655440000",
    "chars_charged": 43
  }
}

Step 3: Poll for the Result
Use the job ID to check status. Poll until the status is completed or failed.
curl
curl https://aitts.theproductivepixel.com/api/v1/tts/YOUR_JOB_ID \
  -H "Authorization: Bearer tts_YOUR_KEY"

Python
import time

job_id = data["job_id"]

while True:
    status_resp = requests.get(
        f"{BASE}/tts/{job_id}",
        headers={"Authorization": f"Bearer {API_KEY}"}
    )
    status = status_resp.json()["data"]
    if status["status"] == "completed":
        print(f"Audio ready: {status['audio_url']}")
        break
    elif status["status"] == "failed":
        print(f"Failed: {status['error']}")
        break
    time.sleep(1)

Node.js
async function pollForResult(jobId) {
  while (true) {
    const res = await fetch(`${BASE}/tts/${jobId}`, {
      headers: { "Authorization": `Bearer ${API_KEY}` }
    });
    const { data: status } = await res.json();
    if (status.status === "completed") return status.audio_url;
    if (status.status === "failed") throw new Error(status.error?.message);
    await new Promise(r => setTimeout(r, 1000));
  }
}

const audioUrl = await pollForResult(data.job_id);
console.log(`Audio URL: ${audioUrl}`);

When complete, the response includes a signed audio URL:
{
  "success": true,
  "data": {
    "job_id": "550e8400-e29b-41d4-a716-446655440000",
    "status": "completed",
    "audio_url": "https://storage.googleapis.com/...",
    "audio_url_expires_at": "2026-01-08T10:00:00.000Z",
    "chars_charged": 43
  }
}

Audio URLs are signed and expire after 24 hours. Download the file promptly or re-poll to get a fresh URL.
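If you store job results and download later, one way to decide whether to re-poll is to compare `audio_url_expires_at` against the current time. The sketch below assumes the timestamp always has the ISO 8601 shape shown in the example response; the function names and the five-minute safety margin are illustrative choices, not API behavior.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def parse_expiry(expires_at: str) -> datetime:
    """Parse a timestamp such as 2026-01-08T10:00:00.000Z into an aware datetime."""
    # Rewrite the trailing "Z" as an explicit UTC offset so that
    # datetime.fromisoformat accepts it on Python versions before 3.11.
    return datetime.fromisoformat(expires_at.replace("Z", "+00:00"))


def needs_fresh_url(
    expires_at: str,
    now: Optional[datetime] = None,
    margin: timedelta = timedelta(minutes=5),  # assumed safety margin
) -> bool:
    """Return True when the signed URL has expired or is about to."""
    now = now or datetime.now(timezone.utc)
    return parse_expiry(expires_at) - now <= margin
```

When `needs_fresh_url(...)` returns True, polling the job again should return a new signed URL, as described above.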
Step 4: Download the Audio
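# curl
curl -o output.wav "AUDIO_URL_FROM_RESPONSE"

# Python
audio_url = status["audio_url"]  # from the completed poll response in Step 3
audio = requests.get(audio_url)
audio.raise_for_status()  # stop here if the signed URL has expired
with open("output.wav", "wb") as f:
    f.write(audio.content)

Choosing a Voice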
Voice IDs follow the format provider:language-Family-Name. You can browse available voices programmatically:
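To make the format concrete, here is a small parser for voice IDs. It is a sketch that assumes the family and name segments never contain hyphens, which holds for the IDs used in this tutorial but is not guaranteed by the API; `VoiceId` and `parse_voice_id` are hypothetical names.

```python
from typing import NamedTuple


class VoiceId(NamedTuple):
    provider: str
    language: str
    family: str
    name: str


def parse_voice_id(voice_id: str) -> VoiceId:
    """Split provider:language-Family-Name into its four parts."""
    provider, sep, rest = voice_id.partition(":")
    if not sep or not provider:
        raise ValueError(f"Missing provider prefix in {voice_id!r}")
    # Split from the right so a hyphenated language tag such as en-US stays whole.
    parts = rest.rsplit("-", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"Expected language-Family-Name after the colon in {voice_id!r}")
    language, family, name = parts
    return VoiceId(provider, language, family, name)
```

For example, `parse_voice_id("kokoro:en-US-Kokoro-Bella")` yields provider `kokoro`, language `en-US`, family `Kokoro`, and name `Bella`.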
curl "https://aitts.theproductivepixel.com/api/v1/voices?language=en-US" \
-H "Authorization: Bearer tts_YOUR_KEY"
The response includes voice metadata — provider, language, gender, tier, and capability flags like supports_ssml and supports_multispeaker. You can also filter by provider and model_type (premium or ultra).
Or browse the voice gallery visually — no account required.
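The filters can be combined in one query string. The helper below only builds the URL; it does not call the API, and the response shape is not assumed. `voices_url` is a hypothetical name, while the parameter names `language`, `provider`, and `model_type` are the ones mentioned above.

```python
from typing import Optional
from urllib.parse import urlencode

BASE = "https://aitts.theproductivepixel.com/api/v1"


def voices_url(
    language: Optional[str] = None,
    provider: Optional[str] = None,
    model_type: Optional[str] = None,
) -> str:
    """Build the voices listing URL, including only the filters that were given."""
    if model_type is not None and model_type not in ("premium", "ultra"):
        raise ValueError("model_type must be 'premium' or 'ultra'")
    candidates = (("language", language), ("provider", provider), ("model_type", model_type))
    params = {key: value for key, value in candidates if value is not None}
    query = urlencode(params)
    return f"{BASE}/voices" + (f"?{query}" if query else "")
```

Pass the result to `requests.get` together with the usual Authorization header.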
Output Formats
The default output is WAV. You can request MP3 or OGG Opus instead, with optional bitrate and sample rate control:
curl -X POST https://aitts.theproductivepixel.com/api/v1/tts \
  -H "Authorization: Bearer tts_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "MP3 output with custom bitrate.",
    "voice_id": "polly:en-GB-Generative-Brian",
    "output_format": "mp3",
    "sample_rate_hertz": 24000,
    "output_bitrate_kbps": 128
  }'

Supported formats and defaults:
- wav — lossless, no bitrate config. Default.
- mp3 — lossy, default 128 kbps. Configurable: 32–320 kbps.
- ogg_opus — lossy, default 64 kbps. Configurable: 6–320 kbps.
Sample rates available: 8000, 12000, 16000, 22050, 24000, 44100, 48000 Hz (OGG Opus excludes 12000, 22050, and 44100).
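Checking these limits on the client avoids a round trip that would end in a validation error. The sketch below encodes the ranges listed above; `build_output_options` is a hypothetical helper, and the server remains the source of truth if its limits change.

```python
from typing import Optional

BITRATE_RANGE_KBPS = {"mp3": (32, 320), "ogg_opus": (6, 320)}
SAMPLE_RATES_HZ = {8000, 12000, 16000, 22050, 24000, 44100, 48000}
OPUS_EXCLUDED_HZ = {12000, 22050, 44100}


def build_output_options(
    output_format: str = "wav",
    sample_rate_hertz: Optional[int] = None,
    output_bitrate_kbps: Optional[int] = None,
) -> dict:
    """Validate output settings and return the fields to merge into the request body."""
    if output_format not in ("wav", "mp3", "ogg_opus"):
        raise ValueError(f"Unsupported output_format: {output_format!r}")
    options = {"output_format": output_format}

    if output_bitrate_kbps is not None:
        if output_format == "wav":
            raise ValueError("wav is lossless and takes no bitrate")
        low, high = BITRATE_RANGE_KBPS[output_format]
        if not low <= output_bitrate_kbps <= high:
            raise ValueError(f"{output_format} bitrate must be {low}-{high} kbps")
        options["output_bitrate_kbps"] = output_bitrate_kbps

    if sample_rate_hertz is not None:
        allowed = SAMPLE_RATES_HZ
        if output_format == "ogg_opus":
            allowed = SAMPLE_RATES_HZ - OPUS_EXCLUDED_HZ
        if sample_rate_hertz not in allowed:
            raise ValueError(f"{sample_rate_hertz} Hz is not available for {output_format}")
        options["sample_rate_hertz"] = sample_rate_hertz

    return options
```

The returned dictionary can be merged into the JSON body alongside `text` and `voice_id`.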
Safe Retries With Idempotency Keys
Network failures happen. If your request times out and you're not sure whether it was processed, retrying could create a duplicate job and charge you twice. Idempotency keys prevent this.
Add an Idempotency-Key header with a unique value (UUID v4 recommended). If you retry with the same key and the same request body, you get the same response — no duplicate job, no double charge.
curl -X POST https://aitts.theproductivepixel.com/api/v1/tts \
-H "Authorization: Bearer tts_YOUR_KEY" \
-H "Content-Type: application/json" \
-H "Idempotency-Key: 7c9e6679-7425-40de-944b-e07fc1f90ae7" \
  -d '{
    "text": "Safe to retry this request.",
    "voice_id": "kokoro:en-US-Kokoro-Bella"
  }'
If the key is still being processed, you'll get a 409 REQUEST_IN_PROGRESS. If the key was already used with a different request body, you'll get 409 IDEMPOTENCY_KEY_REUSE.
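The important detail is that every retry of one logical request reuses the same key. The sketch below shows one way to structure that. The `send` callable is a hypothetical transport hook (for example a thin wrapper around `requests.post`) that returns the status code and parsed JSON; the attempt count and backoff values are assumptions.

```python
import time
import uuid
from typing import Callable, Tuple

# send(headers, body) -> (status_code, parsed_json). Hypothetical hook so the
# retry logic can be exercised without a network connection.
Send = Callable[[dict, dict], Tuple[int, dict]]


def submit_with_retries(
    send: Send,
    body: dict,
    max_attempts: int = 4,
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, dict]:
    """Submit once, retrying timeouts and REQUEST_IN_PROGRESS with the same key."""
    headers = {"Idempotency-Key": str(uuid.uuid4())}  # generated once, reused below
    last_error: Exception = RuntimeError("no attempts made")
    for attempt in range(max_attempts):
        try:
            status, payload = send(headers, body)
        except OSError as exc:
            # Covers built-in TimeoutError and ConnectionError; requests'
            # exceptions also derive from OSError.
            last_error = exc
        else:
            code = (payload.get("error") or {}).get("code")
            if not (status == 409 and code == "REQUEST_IN_PROGRESS"):
                # Success, or an error that retrying will not fix
                # (including 409 IDEMPOTENCY_KEY_REUSE).
                return status, payload
            last_error = RuntimeError("REQUEST_IN_PROGRESS")
        if attempt < max_attempts - 1:
            sleep(delay_seconds * 2 ** attempt)  # simple exponential backoff
    raise RuntimeError(f"Gave up after {max_attempts} attempts") from last_error
```

Generating a new UUID inside the loop would defeat the purpose: the server would see each retry as a new request.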
Error Handling
All errors follow a consistent shape:
{
  "success": false,
  "error": {
    "code": "INSUFFICIENT_CREDITS",
    "message": "Need 5.00 credits, have 2.50"
  }
}

Common errors you should handle:
- 401 UNAUTHORIZED: invalid or missing API key.
- 400 VALIDATION_ERROR: malformed request. Check the message for details.
- 400 INVALID_VOICE: voice ID not found or not available.
- 402 INSUFFICIENT_CREDITS: not enough balance. The message tells you how much you need.
- 429 RATE_LIMIT_EXCEEDED: too many requests. Check the Retry-After header.
- 500 GENERATION_FAILED: audio generation failed on the provider side.
- 503 MAINTENANCE: API temporarily down. Retry after 300 seconds.
Request validation runs before voice/provider checks. A malformed payload gets VALIDATION_ERROR before INVALID_VOICE.
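In client code it helps to separate errors worth retrying from errors that need a changed request. The sketch below is one possible policy, not documented API behavior: it retries only RATE_LIMIT_EXCEEDED and MAINTENANCE, treats everything else (including GENERATION_FAILED) as non-retryable, and assumes a 60-second fallback when a 429 response carries no usable Retry-After value.

```python
from typing import Mapping, Optional


def _retry_after(headers: Mapping[str, str], default: float) -> float:
    """Read Retry-After as seconds, falling back to a default if absent or not numeric."""
    raw = headers.get("Retry-After")
    try:
        return max(0.0, float(raw))
    except (TypeError, ValueError):
        # Missing header, or an HTTP-date value this sketch does not parse.
        return default


def retry_delay_seconds(error_code: str, headers: Mapping[str, str]) -> Optional[float]:
    """Return how long to wait before retrying, or None if a retry will not help."""
    if error_code == "RATE_LIMIT_EXCEEDED":
        return _retry_after(headers, default=60.0)  # assumed fallback
    if error_code == "MAINTENANCE":
        return _retry_after(headers, default=300.0)  # 300 seconds per the list above
    return None
```

Pass `response.headers` from `requests` directly: its header mapping is case-insensitive, whereas a plain dictionary needs the exact `Retry-After` spelling.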
Rate Limits
Rate limits are per-account and depend on your plan:
| Plan | Requests | Window |
|---|---|---|
| Pay-as-you-go | 10 | 15 minutes |
| Pro | 100 | 15 minutes |
| Enterprise | 1,000 | 15 minutes |
Every response includes rate limit headers: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, and X-RateLimit-Bucket.
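As a rough guide, spreading requests evenly across the window gives the spacing computed below. This is plain arithmetic on the table above; the plan keys and function names are illustrative, and the sketch deliberately reads only `X-RateLimit-Remaining`, since the format of the other headers is not described here.

```python
from typing import Mapping, Optional

WINDOW_SECONDS = 15 * 60
PLAN_LIMITS = {"pay-as-you-go": 10, "pro": 100, "enterprise": 1000}


def min_spacing_seconds(plan: str) -> float:
    """Average gap between requests that stays within the plan's limit."""
    return WINDOW_SECONDS / PLAN_LIMITS[plan]


def remaining_requests(headers: Mapping[str, str]) -> Optional[int]:
    """Parse X-RateLimit-Remaining, or return None if it is missing or malformed."""
    try:
        return int(headers.get("X-RateLimit-Remaining"))
    except (TypeError, ValueError):
        return None
```

On the pay-as-you-go plan this works out to one request every 90 seconds, which is worth keeping in mind when choosing a polling interval.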
Webhooks (Enterprise)
Instead of polling, enterprise accounts can receive webhook notifications when jobs complete or fail. Configure your webhook endpoint once, and every job sends a callback:
# Configure your webhook URL
curl -X PUT https://aitts.theproductivepixel.com/api/user/enterprise/webhook \
  -H "Authorization: Bearer FIREBASE_ID_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://your-server.com/tts-webhook"}'

When a job completes, your endpoint receives:
{
  "event": "job.completed",
  "created_at": "2026-01-07T12:00:00.000Z",
  "data": {
    "job_id": "550e8400-e29b-41d4-a716-446655440000",
    "status": "completed",
    "audio_url": "https://storage.googleapis.com/...",
    "chars_charged": 150
  }
}
Webhooks are signed with HMAC-SHA256 via the X-TTS-Signature header. Always verify the signature before processing. Deliveries retry with exponential backoff up to 8 attempts over approximately 38 hours.
See the webhook documentation for full setup, signature verification examples, and retry policy details.
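For orientation, HMAC-SHA256 verification generally looks like the sketch below. It assumes the `X-TTS-Signature` header carries a hex digest of the raw request body computed with a shared secret. That encoding, the signed payload, and how the secret is obtained are all assumptions here, so check the webhook documentation for the actual scheme before relying on it.

```python
import hashlib
import hmac


def verify_signature(secret: str, raw_body: bytes, signature_header: str) -> bool:
    """Check a webhook signature, assuming hex(HMAC-SHA256(secret, raw_body))."""
    expected = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    received = signature_header.strip()
    # compare_digest runs in constant time, which avoids leaking how many
    # leading characters of the signature matched.
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))
```

Verify against the raw bytes of the request body, before any JSON parsing. Re-serializing the parsed payload can change whitespace or key order and break the comparison.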
Putting It All Together
Here's a complete Python example that handles the full flow — submit, poll, download, with error handling and idempotency:
import requests
import time
import uuid

API_KEY = "tts_YOUR_KEY"
BASE = "https://aitts.theproductivepixel.com/api/v1"


def generate_and_download(text, voice_id, output_path="output.wav"):
    # Submit with idempotency key
    idem_key = str(uuid.uuid4())
    resp = requests.post(
        f"{BASE}/tts",
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Idempotency-Key": idem_key
        },
        json={"text": text, "voice_id": voice_id}
    )
    if not resp.ok:
        error = resp.json().get("error", {})
        raise Exception(f"{error.get('code')}: {error.get('message')}")
    job_id = resp.json()["data"]["job_id"]
    print(f"Job submitted: {job_id}")

    # Poll for completion
    for _ in range(60):
        status_resp = requests.get(
            f"{BASE}/tts/{job_id}",
            headers={"Authorization": f"Bearer {API_KEY}"}
        )
        status = status_resp.json()["data"]
        if status["status"] == "completed":
            # Download audio
            audio = requests.get(status["audio_url"])
            audio.raise_for_status()  # do not write an error page to disk
            with open(output_path, "wb") as f:
                f.write(audio.content)
            print(f"Saved to {output_path}")
            return
        if status["status"] == "failed":
            raise Exception(f"Generation failed: {status.get('error')}")
        time.sleep(1)
    raise TimeoutError("Job did not complete within 60 seconds")


# Usage
generate_and_download(
    text="This is a complete example with error handling.",
    voice_id="polly:en-GB-Generative-Brian",
    output_path="demo.wav"
)

Next Steps
- Browse the voice gallery to find voices for your project.
- Read the full API reference for all endpoints and parameters.
- Check provider capabilities to understand what each provider and tier supports.
- Download the OpenAPI spec for code generation in any language.
Get started: Create an API key from the Dashboard and make your first request in under two minutes.