How to Use a TTS API: From First Request to Production

AI TTS Microservice Team · 4 min read

Integrating a text-to-speech API into your app is straightforward once you understand the pattern: authenticate, submit text, poll for the result, download the audio. This tutorial walks through each step with working examples in curl, Python, and Node.js — using the AI TTS Microservice API as the reference, though the patterns apply to most async TTS APIs.

Step 1: Get an API Key

API access requires a key. In AI TTS Microservice, you create one from the Dashboard under the API tab. The key is shown once — copy it immediately. Keys are prefixed with tts_.

API keys are available to pay-as-you-go, Pro, and Enterprise accounts.
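
Because the key is shown only once, one common approach is to keep it out of source code and read it from the environment. The following is a minimal sketch, assuming an environment variable named TTS_API_KEY (a name chosen here for illustration, not something the API requires). The helper names are likewise hypothetical.

```python
import os


def load_api_key(env_var: str = "TTS_API_KEY") -> str:
    """Read the API key from the environment and sanity-check its prefix."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} to your API key first")
    if not key.startswith("tts_"):
        # Keys are prefixed with tts_, so anything else is likely a paste error.
        raise ValueError(f"{env_var} does not look like a TTS API key")
    return key


def auth_headers(key: str) -> dict:
    """Build the Authorization header used by the requests in this tutorial."""
    return {"Authorization": f"Bearer {key}"}
```

The examples that follow hard-code API_KEY for brevity; swapping in load_api_key() would leave the rest of the code unchanged.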

Step 2: Submit Your First Request

The generation endpoint accepts text and a voice ID, and returns a job ID immediately. The audio is generated asynchronously — you don't wait for it in the same request.

curl

curl -X POST https://aitts.theproductivepixel.com/api/v1/tts \
  -H "Authorization: Bearer tts_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Hello world! This is my first TTS request.",
    "voice_id": "kokoro:en-US-Kokoro-Bella"
  }'

Python

import requests

API_KEY = "tts_YOUR_KEY"
BASE = "https://aitts.theproductivepixel.com/api/v1"

response = requests.post(
    f"{BASE}/tts",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "text": "Hello world! This is my first TTS request.",
        "voice_id": "kokoro:en-US-Kokoro-Bella"
    }
)
response.raise_for_status()  # surface HTTP errors (401, 402, 429, ...) before parsing
data = response.json()["data"]
print(f"Job ID: {data['job_id']}")
print(f"Characters charged: {data['chars_charged']}")

Node.js

const API_KEY = "tts_YOUR_KEY";
const BASE = "https://aitts.theproductivepixel.com/api/v1";

const response = await fetch(`${BASE}/tts`, {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${API_KEY}`,
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    text: "Hello world! This is my first TTS request.",
    voice_id: "kokoro:en-US-Kokoro-Bella"
  })
});
if (!response.ok) throw new Error(`Request failed with HTTP ${response.status}`);
const { data } = await response.json();
console.log(`Job ID: ${data.job_id}`);

The response comes back immediately with HTTP 202:

{
  "success": true,
  "data": {
    "job_id": "550e8400-e29b-41d4-a716-446655440000",
    "status": "pending",
    "poll_url": "/api/v1/tts/550e8400-e29b-41d4-a716-446655440000",
    "chars_charged": 43
  }
}
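
The poll_url in this response is a path rather than a full URL. A small sketch of resolving it with the standard library follows, assuming the path is always relative to the same host the request was sent to. The function name is hypothetical.

```python
from urllib.parse import urljoin

API_HOST = "https://aitts.theproductivepixel.com"


def absolute_poll_url(submit_response: dict, host: str = API_HOST) -> str:
    """Turn the relative poll_url from a 202 response into a full URL."""
    # urljoin replaces the path on the host when the second argument starts with "/".
    return urljoin(host, submit_response["data"]["poll_url"])
```

Building the URL from job_id, as the polling examples in this tutorial do, gives the same result; using poll_url simply avoids hard-coding the path.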

Step 3: Poll for the Result

Use the job ID to check status. Poll until the status is completed or failed.

curl

curl https://aitts.theproductivepixel.com/api/v1/tts/YOUR_JOB_ID \
  -H "Authorization: Bearer tts_YOUR_KEY"

Python

import time

job_id = data["job_id"]

while True:
    status_resp = requests.get(
        f"{BASE}/tts/{job_id}",
        headers={"Authorization": f"Bearer {API_KEY}"}
    )
    status = status_resp.json()["data"]

    if status["status"] == "completed":
        print(f"Audio ready: {status['audio_url']}")
        break
    elif status["status"] == "failed":
        print(f"Failed: {status['error']}")
        break

    time.sleep(1)

Node.js

async function pollForResult(jobId) {
  while (true) {
    const res = await fetch(`${BASE}/tts/${jobId}`, {
      headers: { "Authorization": `Bearer ${API_KEY}` }
    });
    const { data: status } = await res.json();

    if (status.status === "completed") return status.audio_url;
    if (status.status === "failed") throw new Error(status.error?.message);

    await new Promise(r => setTimeout(r, 1000));
  }
}

const audioUrl = await pollForResult(data.job_id);
console.log(`Audio URL: ${audioUrl}`);
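
The polling loops above sleep for a fixed second between requests. For longer jobs a gentler schedule may be preferable. The following is a minimal sketch of capped exponential backoff, where fetch_status is a hypothetical callable that returns the parsed data object from the status endpoint. The interval values are illustrative, not recommendations from the API.

```python
import itertools
import time


def poll_delays(initial=1.0, factor=1.5, cap=10.0):
    """Yield sleep intervals that grow geometrically up to a cap."""
    delay = initial
    while True:
        yield min(delay, cap)
        delay *= factor


def wait_for_job(fetch_status, max_polls=30, sleep=time.sleep):
    """Poll until the job is completed or failed, backing off between polls."""
    for delay in itertools.islice(poll_delays(), max_polls):
        status = fetch_status()
        if status["status"] in ("completed", "failed"):
            return status
        sleep(delay)
    raise TimeoutError(f"Job still not finished after {max_polls} polls")
```

Passing sleep as a parameter keeps the function easy to test without real waiting.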

When complete, the response includes a signed audio URL:

{
  "success": true,
  "data": {
    "job_id": "550e8400-e29b-41d4-a716-446655440000",
    "status": "completed",
    "audio_url": "https://storage.googleapis.com/...",
    "audio_url_expires_at": "2026-01-08T10:00:00.000Z",
    "chars_charged": 43
  }
}

Audio URLs are signed and expire after 24 hours. Download the file promptly or re-poll to get a fresh URL.
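
If you store the URL rather than downloading right away, it can help to check audio_url_expires_at before using it. Below is a sketch under the assumption that the timestamp always has the ISO 8601 shape shown in the response above. The five-minute safety margin and the function names are choices made for this example.

```python
from datetime import datetime, timedelta, timezone


def parse_expiry(expires_at: str) -> datetime:
    """Parse a timestamp such as 2026-01-08T10:00:00.000Z into an aware datetime."""
    # Older Python versions do not accept a trailing "Z" in fromisoformat.
    return datetime.fromisoformat(expires_at.replace("Z", "+00:00"))


def needs_fresh_url(expires_at: str, margin=timedelta(minutes=5), now=None) -> bool:
    """Return True when the signed URL has expired or is about to."""
    now = now or datetime.now(timezone.utc)
    return now + margin >= parse_expiry(expires_at)
```

When needs_fresh_url returns True, re-polling the job endpoint should yield a new signed URL, as described above.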

Step 4: Download the Audio

# curl
curl -o output.wav "AUDIO_URL_FROM_RESPONSE"

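# Python
audio_url = status["audio_url"]  # from the completed poll response in Step 3
audio = requests.get(audio_url)
audio.raise_for_status()  # fail fast instead of saving an error response as audio
with open("output.wav", "wb") as f:
    f.write(audio.content)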

Choosing a Voice

Voice IDs follow the format provider:language-Family-Name. You can browse available voices programmatically:

curl "https://aitts.theproductivepixel.com/api/v1/voices?language=en-US" \
  -H "Authorization: Bearer tts_YOUR_KEY"

The response includes voice metadata — provider, language, gender, tier, and capability flags like supports_ssml and supports_multispeaker. You can also filter by provider and model_type (premium or ultra).

Or browse the voice gallery visually — no account required.
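
To make the ID format and the filters concrete, here is a small sketch that splits a voice ID into its parts and builds a filtered /voices URL. It assumes that the family and name segments never contain hyphens and that the provider and model_type filters are passed as query parameters with those exact names; neither assumption is confirmed beyond the examples in this tutorial. VoiceId, parse_voice_id, and voices_url are hypothetical names.

```python
from typing import NamedTuple
from urllib.parse import urlencode

BASE = "https://aitts.theproductivepixel.com/api/v1"


class VoiceId(NamedTuple):
    provider: str
    language: str
    family: str
    name: str


def parse_voice_id(voice_id: str) -> VoiceId:
    """Split provider:language-Family-Name into its four parts."""
    provider, sep, rest = voice_id.partition(":")
    if not sep:
        raise ValueError(f"Missing provider prefix in {voice_id!r}")
    # The language tag itself contains a hyphen (en-US), so split from the right.
    parts = rest.rsplit("-", 2)
    if len(parts) != 3:
        raise ValueError(f"Unexpected voice ID shape: {voice_id!r}")
    language, family, name = parts
    return VoiceId(provider, language, family, name)


def voices_url(**filters: str) -> str:
    """Build a /voices URL, skipping filters that were not supplied."""
    query = urlencode({k: v for k, v in filters.items() if v})
    return f"{BASE}/voices?{query}" if query else f"{BASE}/voices"
```

For example, voices_url(language="en-US", model_type="premium") produces a URL that can be fetched with the same Authorization header as the curl command above.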

Output Formats

The default output is WAV. You can request MP3 or OGG Opus instead, with optional bitrate and sample rate control:

curl -X POST https://aitts.theproductivepixel.com/api/v1/tts \
  -H "Authorization: Bearer tts_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "MP3 output with custom bitrate.",
    "voice_id": "polly:en-GB-Generative-Brian",
    "output_format": "mp3",
    "sample_rate_hertz": 24000,
    "output_bitrate_kbps": 128
  }'

Supported formats and defaults:

  • wav — lossless, no bitrate config. Default.
  • mp3 — lossy, default 128 kbps. Configurable: 32–320 kbps.
  • ogg_opus — lossy, default 64 kbps. Configurable: 6–320 kbps.

Sample rates available: 8000, 12000, 16000, 22050, 24000, 44100, 48000 Hz (OGG Opus excludes 12000, 22050, and 44100).
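
The ranges above can be checked on the client before a request is sent, which saves a round trip on obvious mistakes. This is a sketch only: the server remains the source of truth, and treating a bitrate on wav as an error is an assumption made here (the API may simply ignore it). The function name is hypothetical.

```python
BITRATE_RANGES_KBPS = {"mp3": (32, 320), "ogg_opus": (6, 320)}
SAMPLE_RATES_HZ = {8000, 12000, 16000, 22050, 24000, 44100, 48000}
OPUS_EXCLUDED_HZ = {12000, 22050, 44100}


def check_output_options(output_format="wav", sample_rate_hertz=None, output_bitrate_kbps=None):
    """Return a list of problems; an empty list means the options look valid."""
    if output_format not in ("wav", "mp3", "ogg_opus"):
        return [f"unsupported output_format: {output_format}"]
    problems = []
    if output_bitrate_kbps is not None:
        if output_format == "wav":
            problems.append("wav is lossless and takes no bitrate")
        else:
            low, high = BITRATE_RANGES_KBPS[output_format]
            if not low <= output_bitrate_kbps <= high:
                problems.append(f"{output_format} bitrate must be {low}-{high} kbps")
    if sample_rate_hertz is not None:
        allowed = SAMPLE_RATES_HZ
        if output_format == "ogg_opus":
            allowed = SAMPLE_RATES_HZ - OPUS_EXCLUDED_HZ
        if sample_rate_hertz not in allowed:
            problems.append(f"sample rate {sample_rate_hertz} Hz is not available for {output_format}")
    return problems
```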

Safe Retries With Idempotency Keys

Network failures happen. If your request times out and you're not sure whether it was processed, retrying could create a duplicate job and charge you twice. Idempotency keys prevent this.

Add an Idempotency-Key header with a unique value (UUID v4 recommended). If you retry with the same key and the same request body, you get the same response — no duplicate job, no double charge.

curl -X POST https://aitts.theproductivepixel.com/api/v1/tts \
  -H "Authorization: Bearer tts_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -H "Idempotency-Key: 7c9e6679-7425-40de-944b-e07fc1f90ae7" \
  -d '{
    "text": "Safe to retry this request.",
    "voice_id": "kokoro:en-US-Kokoro-Bella"
  }'

If the key is still being processed, you'll get a 409 REQUEST_IN_PROGRESS. If the key was already used with a different request body, you'll get 409 IDEMPOTENCY_KEY_REUSE.
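
The important detail in a retry loop is that the key is generated once and reused on every attempt. A minimal sketch follows, in which send is a hypothetical callable supplied by the caller (for example, a thin wrapper around requests.post). It is assumed to raise the built-in ConnectionError or TimeoutError when the outcome of a request is unknown, so a wrapper around an HTTP library would need to translate that library's own exceptions.

```python
import time
import uuid


def submit_with_retry(send, payload, max_attempts=3, base_delay=1.0):
    """Call send(payload, headers) until it succeeds, reusing one idempotency key."""
    # Generated once, outside the loop, so every attempt carries the same key.
    headers = {"Idempotency-Key": str(uuid.uuid4())}
    for attempt in range(1, max_attempts + 1):
        try:
            return send(payload, headers)
        except (ConnectionError, TimeoutError):
            if attempt == max_attempts:
                raise
            # Exponential backoff: base_delay, then 2x, then 4x, ...
            time.sleep(base_delay * 2 ** (attempt - 1))
```

Generating a new key inside the loop would defeat the purpose, since the server would then see each attempt as a distinct request.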

Error Handling

All errors follow a consistent shape:

{
  "success": false,
  "error": {
    "code": "INSUFFICIENT_CREDITS",
    "message": "Need 5.00 credits, have 2.50"
  }
}

Common errors you should handle:

  • 401 UNAUTHORIZED — invalid or missing API key.
  • 400 VALIDATION_ERROR — malformed request. Check the message for details.
  • 400 INVALID_VOICE — voice ID not found or not available.
  • 402 INSUFFICIENT_CREDITS — not enough balance. The message tells you how much you need.
  • 429 RATE_LIMIT_EXCEEDED — too many requests. Check Retry-After header.
  • 500 GENERATION_FAILED — audio generation failed on the provider side.
  • 503 MAINTENANCE — API temporarily down. Retry after 300 seconds.

Request validation runs before voice/provider checks. A malformed payload gets VALIDATION_ERROR before INVALID_VOICE.
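
Because every error shares one shape, a single helper can turn a parsed response body into an exception. The sketch below also marks some codes as retryable; that split is a judgment call made for this example rather than documented API behavior, and TTSApiError and error_from_response are hypothetical names.

```python
# Codes treated as worth retrying after a wait (an assumption of this sketch).
RETRYABLE_CODES = {"RATE_LIMIT_EXCEEDED", "MAINTENANCE", "REQUEST_IN_PROGRESS"}


class TTSApiError(Exception):
    """An API error carrying the HTTP status, error code, and retry hint."""

    def __init__(self, http_status, code, message):
        super().__init__(f"{http_status} {code}: {message}")
        self.http_status = http_status
        self.code = code
        self.retryable = code in RETRYABLE_CODES


def error_from_response(http_status, body):
    """Build a TTSApiError from a parsed body, or return None on success."""
    if body.get("success", False):
        return None
    error = body.get("error") or {}
    return TTSApiError(http_status, error.get("code", "UNKNOWN"), error.get("message", ""))
```

A caller might raise the returned error immediately when retryable is False and schedule another attempt otherwise.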

Rate Limits

Rate limits are per-account and depend on your plan:

Plan            Requests    Window
Pay-as-you-go   10          15 minutes
Pro             100         15 minutes
Enterprise      1,000       15 minutes

Every response includes rate limit headers: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, and X-RateLimit-Bucket.
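
These headers, together with Retry-After on a 429, can drive a simple wait policy. The sketch below assumes Retry-After carries a plain number of seconds and falls back to a default otherwise. The format of X-RateLimit-Reset is not specified here, so the sketch does not try to interpret it. Both function names are hypothetical.

```python
def retry_delay_seconds(headers, default=60.0):
    """Pick a wait time after a 429, preferring the Retry-After header."""
    value = headers.get("Retry-After")
    try:
        return max(0.0, float(value))
    except (TypeError, ValueError):
        # Header missing, or not a plain number of seconds.
        return default


def remaining_requests(headers):
    """Return X-RateLimit-Remaining as an int, or None if absent or malformed."""
    try:
        return int(headers["X-RateLimit-Remaining"])
    except (KeyError, ValueError):
        return None
```

Header lookups on a plain dict are case-sensitive; the headers object returned by the requests library is case-insensitive, so it can be passed in directly.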

Webhooks (Enterprise)

Instead of polling, Enterprise accounts can receive webhook notifications when jobs complete or fail. Configure your webhook endpoint once, and every job sends a callback:

# Configure your webhook URL
curl -X PUT https://aitts.theproductivepixel.com/api/user/enterprise/webhook \
  -H "Authorization: Bearer FIREBASE_ID_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://your-server.com/tts-webhook"}'

When a job completes, your endpoint receives:

{
  "event": "job.completed",
  "created_at": "2026-01-07T12:00:00.000Z",
  "data": {
    "job_id": "550e8400-e29b-41d4-a716-446655440000",
    "status": "completed",
    "audio_url": "https://storage.googleapis.com/...",
    "chars_charged": 150
  }
}

Webhooks are signed with HMAC-SHA256 via the X-TTS-Signature header. Always verify the signature before processing. Deliveries retry with exponential backoff up to 8 attempts over approximately 38 hours.

See the webhook documentation for full setup, signature verification examples, and retry policy details.
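
As a rough illustration of what verification involves, the sketch below recomputes an HMAC-SHA256 over the raw request body and compares it to the header value in constant time. It assumes the header contains the bare hex digest of the body, keyed with your webhook secret; the actual encoding, any prefix, and how the secret is obtained are defined in the webhook documentation, not here. The function names are hypothetical.

```python
import hashlib
import hmac


def sign_body(secret: str, raw_body: bytes) -> str:
    """Compute the hex HMAC-SHA256 of the raw body (assumed signature format)."""
    return hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()


def is_valid_signature(secret: str, raw_body: bytes, signature_header) -> bool:
    """Compare the expected signature to the X-TTS-Signature header value."""
    expected = sign_body(secret, raw_body).encode("utf-8")
    received = (signature_header or "").encode("utf-8")
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected, received)
```

Verification should run on the exact bytes received, before any JSON parsing, since re-serializing the payload can change whitespace or key order and invalidate the signature.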

Putting It All Together

Here's a complete Python example that handles the full flow — submit, poll, download, with error handling and idempotency:

import requests
import time
import uuid

API_KEY = "tts_YOUR_KEY"
BASE = "https://aitts.theproductivepixel.com/api/v1"

def generate_and_download(text, voice_id, output_path="output.wav"):
    # Submit with idempotency key
    idem_key = str(uuid.uuid4())
    resp = requests.post(
        f"{BASE}/tts",
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Idempotency-Key": idem_key
        },
        json={"text": text, "voice_id": voice_id}
    )

    if not resp.ok:
        error = resp.json().get("error", {})
        raise Exception(f"{error.get('code')}: {error.get('message')}")

    job_id = resp.json()["data"]["job_id"]
    print(f"Job submitted: {job_id}")

    # Poll for completion
    for _ in range(60):
        status_resp = requests.get(
            f"{BASE}/tts/{job_id}",
            headers={"Authorization": f"Bearer {API_KEY}"}
        )
        status = status_resp.json()["data"]

        if status["status"] == "completed":
            # Download audio
            audio = requests.get(status["audio_url"])
            audio.raise_for_status()  # don't save an error response as audio
            with open(output_path, "wb") as f:
                f.write(audio.content)
            print(f"Saved to {output_path}")
            return

        if status["status"] == "failed":
            raise Exception(f"Generation failed: {status.get('error')}")

        time.sleep(1)

    raise TimeoutError("Job did not complete after 60 polls (about 60 seconds)")

# Usage
generate_and_download(
    text="This is a complete example with error handling.",
    voice_id="polly:en-GB-Generative-Brian",
    output_path="demo.wav"
)

Next Steps


Get started: Create an API key from the Dashboard and make your first request in under two minutes.