API Overview

One API across every TTS provider. OpenAI-compatible schemas, one-line provider switching.

Alpha

VoxRouter's request and response schemas mirror the OpenAI Audio API, with one addition: the model field carries a provider prefix, "{provider}/{model_id}", so a single VoxRouter key routes across every supported TTS provider. Swapping providers means changing one string: no client-code rewrites and no new credentials.

Base URL: api.voxrouter.ai/v1
Auth: Bearer pk_…
Content-Type: application/json

OpenAPI specification

The machine-readable spec lives in the repo at voxrouter/router/openapi.yaml. Feed it into Swagger UI, Postman, or any OpenAPI code generator. We also use it as the source of truth for the first-party voxrouter SDK — the published TypeScript types are generated from this file on every spec change.

bash
# Fetch the spec directly from GitHub
curl -L https://raw.githubusercontent.com/voxrouter/voxrouter/main/voxrouter/router/openapi.yaml \
  -o voxrouter.openapi.yaml

Authentication

Every request carries a Bearer token in the Authorization header. Keys start with pk_ and are created from the console.

bash
curl https://api.voxrouter.ai/v1/audio/speech \
  -H "Authorization: Bearer $VOXROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"elevenlabs/eleven_turbo_v2_5","voice":"EXAVITQu4vr4xnSDxMaL","input":"hi"}'

401 Unauthorized means the key is missing or invalid. 429 Too Many Requests means the key has hit its rate limit or its concurrency limit; see Rate limits.
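The same call can be sketched in TypeScript. This is a minimal illustration rather than the SDK's implementation: `buildSpeechRequest`, `SPEECH_BASE_URL`, and `SpeechBody` are hypothetical names, and the helper only assembles the URL and fetch options so the header layout is easy to inspect.

```typescript
// Hypothetical helper: assembles the URL and fetch options for POST /v1/audio/speech.
const SPEECH_BASE_URL = "https://api.voxrouter.ai/v1";

type SpeechBody = { model: string; voice: string; input: string };

function buildSpeechRequest(apiKey: string, body: SpeechBody) {
  if (!apiKey.startsWith("pk_")) {
    // Keys start with pk_, so fail early instead of waiting for a 401.
    throw new Error("VoxRouter API keys are expected to start with pk_");
  }
  return {
    url: `${SPEECH_BASE_URL}/audio/speech`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    },
  };
}
```

Passing the result to `fetch(url, init)` sends the request. Keeping the assembly step separate makes it straightforward to unit-test without network access.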

Requests

The router exposes endpoints for speech synthesis, catalog discovery, and wallet inspection. Each is documented in the API Reference sidebar; the prose below is a quick orientation.

POST /v1/audio/speech

Synthesize speech from a text input. The response body is raw audio — audio/mpeg for response_format: "mp3", audio/l16 (16-bit LE PCM @ 24 kHz) for "pcm".

typescript
// Request body (application/json)
type SpeechRequest = {
  /** Provider-prefixed model id, e.g. "elevenlabs/eleven_turbo_v2_5". */
  model: string;
  /** Text to synthesize. */
  input: string;
  /** Provider-local voice id. Use GET /v1/voices to discover. */
  voice: string;
  /** Output encoding. Defaults to "mp3". */
  response_format?: "mp3" | "pcm";
  /** Passthrough provider-specific options. */
  provider_options?: Record<string, unknown>;
};
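For the "pcm" format, the clip length can be derived from the byte count: 16-bit samples at 24 kHz come to 2 × 24,000 = 48,000 bytes per second per channel. The sketch below assumes mono audio, which the description above does not state; `pcmDurationSeconds` is a hypothetical helper name.

```typescript
// Assumption: mono output. The format description gives 16-bit samples at 24 kHz
// but does not state the channel count, so it is a parameter here.
const PCM_SAMPLE_RATE_HZ = 24_000;
const PCM_BYTES_PER_SAMPLE = 2; // 16-bit samples

function pcmDurationSeconds(byteLength: number, channels = 1): number {
  return byteLength / (PCM_SAMPLE_RATE_HZ * PCM_BYTES_PER_SAMPLE * channels);
}
```

Under the mono assumption, a 48,000-byte response body corresponds to one second of audio.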

GET /v1/voices

Return the voice catalog across every configured provider. Filter with query params:

typescript
// Query params
type VoicesQuery = {
  /** Comma-separated provider list, e.g. "elevenlabs,cartesia". */
  provider?: string;
  /** ISO language prefix (case-insensitive), e.g. "en" or "en-US". */
  language?: string;
  /** Exact gender label (case-insensitive), e.g. "female". */
  gender?: string;
};
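One way to turn these filters into a request URL is sketched below. `buildVoicesUrl` and `VoicesFilter` are hypothetical names; the helper takes the provider list as an array and joins it with commas, as the query parameter expects.

```typescript
// Hypothetical helper: builds the GET /v1/voices URL from optional filters.
type VoicesFilter = { provider?: string[]; language?: string; gender?: string };

function buildVoicesUrl(filter: VoicesFilter = {}): string {
  const params = new URLSearchParams();
  if (filter.provider && filter.provider.length > 0) {
    params.set("provider", filter.provider.join(","));
  }
  if (filter.language) params.set("language", filter.language);
  if (filter.gender) params.set("gender", filter.gender);
  const query = params.toString();
  return "https://api.voxrouter.ai/v1/voices" + (query ? `?${query}` : "");
}
```

`URLSearchParams` percent-encodes the comma as `%2C`; it decodes back to the same comma-separated value on the receiving side.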

GET /v1/providers

Return the catalog of routable providers and the models each exposes. Near-static — safe to cache for hours. For live availability use /v1/status.

GET /v1/status

Per-provider live health (available / degraded / unavailable), plus the reason when a provider is not available (missing_api_key, circuit_open, circuit_half_open). Cheap to poll.

GET /v1/credits

Wallet snapshot for the authenticated key's account: balanceMicros (available credit) and reservedMicros (in-flight reservations). Both in USD micro-dollars (1_000_000 = $1).
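Converting micro-dollars for display is a single division. The helper below is a small sketch with hypothetical names (`formatMicrosAsUsd`, `MICROS_PER_USD`); it rounds to cents for display only, so keep the integer micros for any arithmetic.

```typescript
const MICROS_PER_USD = 1_000_000;

// Formats a micro-dollar amount as a dollar string, rounded to cents.
function formatMicrosAsUsd(micros: number): string {
  const sign = micros < 0 ? "-" : "";
  return `${sign}$${(Math.abs(micros) / MICROS_PER_USD).toFixed(2)}`;
}
```

For example, a balanceMicros of 2_500_000 formats as "$2.50".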

GET /v1/credits/activity

Recent ledger entries for the wallet (newest first). Each row records a wallet mutation (top-up, reserve, commit, refund) with the signed microsDelta and resulting microsBalanceAfter.
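Because every row carries both the delta and the resulting balance, a client can sanity-check a page of activity. The sketch below assumes that the rows are contiguous and that each microsBalanceAfter equals the previous balance plus that row's microsDelta; neither is spelled out above. `findLedgerGap` and `LedgerRow` are hypothetical names, and only the two fields named above are modeled.

```typescript
// Only the two documented fields are modeled; real rows carry more.
type LedgerRow = { microsDelta: number; microsBalanceAfter: number };

// Rows are newest first. Returns the index of the first row whose balance does not
// equal the next-older row's balance plus its own delta, or -1 if every pair agrees.
function findLedgerGap(rowsNewestFirst: LedgerRow[]): number {
  for (let i = 0; i + 1 < rowsNewestFirst.length; i++) {
    const newer = rowsNewestFirst[i];
    const older = rowsNewestFirst[i + 1];
    if (older.microsBalanceAfter + newer.microsDelta !== newer.microsBalanceAfter) {
      return i;
    }
  }
  return -1;
}
```

A non-negative result suggests the page is missing rows or that the assumption about how deltas apply does not hold for that entry type.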

Model strings

The model field always uses the "provider/model_id" shape. The part before the slash picks the provider; the part after is the provider-native model id (passed through unchanged).

text
elevenlabs/eleven_turbo_v2_5
cartesia/sonic-2
openai/gpt-4o-mini-tts
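A client-side parser for this shape might look like the sketch below. `parseModelString` and `ModelRef` are hypothetical names, and splitting on the first slash only is an assumption, chosen so that a provider-native id containing a slash would survive intact.

```typescript
type ModelRef = { provider: string; modelId: string };

// Splits "provider/model_id" on the first slash; both halves must be non-empty.
function parseModelString(model: string): ModelRef {
  const slash = model.indexOf("/");
  if (slash <= 0 || slash === model.length - 1) {
    throw new Error(`expected "provider/model_id", got "${model}"`);
  }
  return { provider: model.slice(0, slash), modelId: model.slice(slash + 1) };
}
```

Validating the string locally catches typos before they come back as a 400 invalid_model response.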

Responses

Successful POST /v1/audio/speech returns the raw audio stream. The provider that served the request is in the X-VoxRouter-Provider response header. Successful GET /v1/voices returns a JSON object with a voices array.

typescript
// Voice catalog response
type VoicesResponse = {
  voices: Array<{
    id: string;
    provider: string;
    name: string;
    language: string;
    labels: Record<string, string>;
    preview_url?: string;
    model_compatibility: string[];
  }>;
};
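Since voice ids are provider-local, switching the model string to another provider generally means picking a voice from that provider as well. The sketch below narrows a catalog to the voices that belong to a model's provider. `voicesForModel` and `CatalogVoice` are hypothetical names, and the language match assumes the language field holds a tag such as "en" or "en-US".

```typescript
// Subset of the voice fields needed for filtering.
type CatalogVoice = { id: string; provider: string; name: string; language: string };

// Keeps voices whose provider matches the prefix of a "provider/model_id" string,
// optionally narrowed by a case-insensitive language prefix.
function voicesForModel(
  voices: CatalogVoice[],
  model: string,
  languagePrefix?: string,
): CatalogVoice[] {
  const provider = model.split("/")[0];
  return voices.filter(
    (v) =>
      v.provider === provider &&
      (languagePrefix === undefined ||
        v.language.toLowerCase().startsWith(languagePrefix.toLowerCase())),
  );
}
```

The same narrowing can be done server-side with the provider and language query params; filtering locally is mainly useful when the catalog is already cached.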

Errors

Non-2xx responses return a JSON error body with a machine-readable error code and an optional human-readable details field. The first-party SDK surfaces these as VoxRouterError with .status, .code, and .details.

json
{
  "error": "invalid_model",
  "details": "unknown_provider: bad"
}

| Status | Code | Meaning |
| --- | --- | --- |
| 400 | invalid_body | JSON body failed schema validation |
| 400 | invalid_model | Malformed model string or unknown provider |
| 401 | unauthorized | Missing or invalid API key |
| 402 | insufficient_credit | Wallet does not have enough credit to cover the estimated cost. Top up and retry. |
| 402 | spend_limit_exceeded | The API key tripped its per-key daily or monthly spend cap. |
| 429 | rate_limited | Per-key rate limit exceeded. Retry-After header indicates seconds to wait. |
| 429 | concurrency_limited | Too many in-flight requests for this key. Slots free on completion. |
| 500 | internal_error | Unexpected server error. |
| 502 | upstream_error | Provider returned an unrecoverable error after automatic retries. |
| 503 | provider_unavailable | Provider's circuit-breaker is open. Retry-After indicates expected reset. |
| 504 | upstream_error | Provider did not respond within the per-attempt deadline. |
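When calling the API without the SDK, the error body can be narrowed with a type guard before branching on the code. This is a sketch: `VoxRouterErrorBody`, `isErrorBody`, and `isRetryable` are hypothetical names, and the set of codes treated as retryable is a judgment call for illustration, not documented policy.

```typescript
type VoxRouterErrorBody = { error: string; details?: string };

// Narrows an unknown JSON value to the documented error shape.
function isErrorBody(value: unknown): value is VoxRouterErrorBody {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.error === "string" &&
    (v.details === undefined || typeof v.details === "string")
  );
}

// Assumption: these codes describe transient conditions worth retrying.
const RETRYABLE_CODES = new Set([
  "rate_limited",
  "concurrency_limited",
  "upstream_error",
  "provider_unavailable",
]);

function isRetryable(body: VoxRouterErrorBody): boolean {
  return RETRYABLE_CODES.has(body.error);
}
```

Codes such as invalid_model or insufficient_credit are left out of the set because repeating the same request would fail the same way until the input or the wallet changes.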

Rate limits

Requests are rate-limited per API key. When you exceed the limit, the router returns 429 with {"error":"rate_limited"} and a Retry-After header giving the number of seconds to wait. Retry with backoff. Concrete per-key limits are not yet published; reach out if you need a higher ceiling.
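A retry delay can be computed as in the sketch below: use the Retry-After value when it parses as a number of seconds, and otherwise fall back to capped exponential backoff. `retryDelayMs` is a hypothetical helper, and the 500 ms base and 30 s cap are illustrative defaults rather than recommended values.

```typescript
// Delay before the next attempt. `attempt` is zero-based; `retryAfter` is the raw
// Retry-After header value, or null when the header is absent.
function retryDelayMs(
  attempt: number,
  retryAfter: string | null,
  baseMs = 500,
  capMs = 30_000,
): number {
  if (retryAfter !== null && retryAfter.trim() !== "") {
    const seconds = Number(retryAfter);
    if (Number.isFinite(seconds) && seconds >= 0) {
      return seconds * 1000;
    }
  }
  // No usable header: exponential backoff, capped.
  return Math.min(capMs, baseMs * 2 ** attempt);
}
```

In a fetch-based client the header value would come from `response.headers.get("Retry-After")`. Adding random jitter to the fallback delay is a common refinement when many clients share a key.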

Streaming

POST /v1/audio/speech returns the audio body as a chunked HTTP response. In the SDK, use audio.speech.createRaw(…) to get the raw Response and read .body as a ReadableStream. With plain fetch, read response.body as a stream, or buffer the whole clip with response.blob() or response.arrayBuffer(); see the Quickstart.
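Reading the stream chunk by chunk can be sketched as follows. `readAll`, `concatChunks`, and `ChunkReader` are hypothetical names; `ChunkReader` is a minimal structural type intended to match the reader returned by `response.body.getReader()` in the Fetch API, so the sketch does not depend on any SDK type.

```typescript
// Joins chunks into one contiguous buffer.
function concatChunks(chunks: Uint8Array[]): Uint8Array {
  const total = chunks.reduce((n, c) => n + c.byteLength, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.byteLength;
  }
  return out;
}

// Minimal shape of a stream reader: read() resolves with { done, value }.
type ChunkReader = { read(): Promise<{ done: boolean; value?: Uint8Array }> };

// Drains the reader, handing each chunk to onChunk as it arrives (for example to
// feed an audio player), and returns the full audio once the stream ends.
async function readAll(
  reader: ChunkReader,
  onChunk?: (chunk: Uint8Array) => void,
): Promise<Uint8Array> {
  const chunks: Uint8Array[] = [];
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    if (value) {
      chunks.push(value);
      onChunk?.(value);
    }
  }
  return concatChunks(chunks);
}
```

If playback does not need to start before the download finishes, `response.arrayBuffer()` is the simpler choice.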