
Text-to-Speech

Convert text to natural-sounding speech using OpenAI TTS models, including the new gpt-4o-mini-tts with voice instructions.

POST /v1/audio/speech

Supported Models

Model	Provider	Description
`tts-1`	OpenAI	Standard quality TTS — low latency
`tts-1-hd`	OpenAI	High-definition TTS — richer audio
`gpt-4o-mini-tts`	OpenAI	Next-gen TTS with voice instructions support

Voices

Built-in voices: alloy, ash, ballad, coral, echo, fable, onyx, nova, sage, shimmer, verse, marin, cedar.

You can also use a custom voice by passing an object: { "id": "voice_abc123" }.
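Because the voice field accepts two shapes, it can help to check a value on the client before sending it. The following is a minimal sketch, not part of any SDK: the helper name normalize_voice is hypothetical, and it assumes the built-in list above is exhaustive and that a custom voice object carries only an id.

```python
# Built-in voice names copied from the list above.
BUILT_IN_VOICES = {
    "alloy", "ash", "ballad", "coral", "echo", "fable", "onyx",
    "nova", "sage", "shimmer", "verse", "marin", "cedar",
}


def normalize_voice(voice):
    """Return a value suitable for the `voice` field, or raise ValueError.

    Hypothetical helper: accepts a built-in voice name (str) or a
    custom voice object of the form {"id": "..."}.
    """
    if isinstance(voice, str):
        if voice not in BUILT_IN_VOICES:
            raise ValueError(f"unknown built-in voice: {voice!r}")
        return voice
    if isinstance(voice, dict) and isinstance(voice.get("id"), str) and voice["id"]:
        # Keep only the documented key so stray fields are not sent.
        return {"id": voice["id"]}
    raise ValueError("voice must be a built-in name or an object like {'id': '...'}")
```

For example, normalize_voice("nova") returns "nova" unchanged, while normalize_voice({"id": "voice_abc123"}) returns the custom voice object.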

Request

Body Parameters

model (string, required)

TTS model — tts-1, tts-1-hd, or gpt-4o-mini-tts

input (string, required)

Text to convert to speech (max 4096 chars)

voice (string | object, required)

Voice name (e.g. "nova") or custom voice object {"id":"voice_abc123"}

Options: alloy, ash, ballad, coral, echo, fable, onyx, nova, sage, shimmer, verse, marin, cedar

speed (number)

Speech speed (0.25-4.0)

Default: 1.0

response_format (string)

Output audio format

Default: mp3

Options: mp3, opus, aac, flac, wav, pcm

instructions (string)

Instructions to control voice style/emotion (gpt-4o-mini-tts only)

stream_format (string)

Streaming format (gpt-4o-mini-tts only)

Options: sse, audio
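The constraints listed above (input length, speed range, allowed formats, and the parameters limited to gpt-4o-mini-tts) can be checked before a request is sent. The sketch below is illustrative only: build_speech_payload is a hypothetical helper, not an SDK function, and the server may apply rules beyond those documented here.

```python
MODELS = {"tts-1", "tts-1-hd", "gpt-4o-mini-tts"}
RESPONSE_FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}
STREAM_FORMATS = {"sse", "audio"}
MAX_INPUT_CHARS = 4096


def build_speech_payload(model, text, voice, *, speed=None,
                         response_format=None, instructions=None,
                         stream_format=None):
    """Validate arguments against the documented limits and return the JSON body."""
    if model not in MODELS:
        raise ValueError(f"unsupported model: {model!r}")
    if not text or len(text) > MAX_INPUT_CHARS:
        raise ValueError(f"input must be 1 to {MAX_INPUT_CHARS} characters")
    if speed is not None and not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")
    if response_format is not None and response_format not in RESPONSE_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    if stream_format is not None and stream_format not in STREAM_FORMATS:
        raise ValueError(f"unsupported stream_format: {stream_format!r}")
    # instructions and stream_format are documented as gpt-4o-mini-tts only.
    if model != "gpt-4o-mini-tts" and (instructions is not None
                                       or stream_format is not None):
        raise ValueError("instructions and stream_format require gpt-4o-mini-tts")

    payload = {"model": model, "input": text, "voice": voice}
    optional = {
        "speed": speed,
        "response_format": response_format,
        "instructions": instructions,
        "stream_format": stream_format,
    }
    # Omit unset options so the server-side defaults (1.0, mp3) apply.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

Leaving unset options out of the body, rather than sending explicit defaults, keeps the request minimal and lets the documented defaults apply.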

cURL

curl https://api.metriqual.com/v1/audio/speech \
  -H "Authorization: Bearer mql_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tts-1-hd",
    "input": "Hello, welcome to Metriqual!",
    "voice": "nova"
  }' --output speech.mp3
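The same request can be made without an SDK. The sketch below mirrors the cURL call using only the Python standard library; the function names are hypothetical, and it assumes the endpoint returns the raw audio bytes in the response body, as the --output flag above suggests.

```python
import json
import urllib.request

API_URL = "https://api.metriqual.com/v1/audio/speech"


def make_speech_request(api_key, payload):
    """Build (but do not send) the POST request for the speech endpoint."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize_to_file(api_key, payload, path):
    """Send the request and write the returned audio bytes to `path`."""
    request = make_speech_request(api_key, payload)
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    with urllib.request.urlopen(request) as response, open(path, "wb") as f:
        f.write(response.read())
```

Calling synthesize_to_file("mql_your_key", {"model": "tts-1-hd", "input": "Hello, welcome to Metriqual!", "voice": "nova"}, "speech.mp3") would be the equivalent of the cURL command.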

TypeScript SDK

import { writeFileSync } from 'node:fs';

// Assumes `mql` is an already-initialized Metriqual client.
const audio = await mql.audio.speech({
  model: 'tts-1-hd',
  input: 'Hello, welcome to Metriqual!',
  voice: 'nova'
});

// audio is an ArrayBuffer; wrap it in a Buffer to write it to disk
writeFileSync('speech.mp3', Buffer.from(audio));

Python SDK

audio_bytes = mql.audio.speech(
    input="Hello, welcome to Metriqual!",
    voice="nova",
    model="tts-1-hd",
)

with open("speech.mp3", "wb") as f:
    f.write(audio_bytes)
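Since input is limited to 4096 characters, longer text has to be split across several requests. The following is one possible approach, not something the SDK is documented to provide: the hypothetical split_for_tts helper breaks text at sentence boundaries where it can and falls back to a hard cut for any single sentence that exceeds the limit.

```python
import re

MAX_INPUT_CHARS = 4096


def split_for_tts(text, limit=MAX_INPUT_CHARS):
    """Split text into chunks of at most `limit` characters.

    Prefers breaking after sentence-ending punctuation (., !, ?).
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-split any single sentence that is itself over the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            # Adding this sentence would overflow, so start a new chunk.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be passed as input to its own mql.audio.speech call and saved as a separate file. Joining the resulting audio segments into one file is format-dependent and is not covered here.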

Voice Instructions (gpt-4o-mini-tts)

The gpt-4o-mini-tts model supports an instructions parameter that lets you control the voice's tone, emotion, cadence, and speaking style.

TypeScript SDK — with instructions

const audio = await mql.audio.speech({
  model: 'gpt-4o-mini-tts',
  input: 'Welcome aboard! We are thrilled to have you.',
  voice: 'coral',
  instructions: 'Speak in a warm, enthusiastic, friendly tone.'
});

Python SDK — with instructions

audio = mql.audio.speech(
    input="Welcome aboard! We are thrilled to have you.",
    voice="coral",
    model="gpt-4o-mini-tts",
    instructions="Speak in a warm, enthusiastic, friendly tone.",
)

TypeScript SDK — custom voice

// Use a voice created via createVoice()
const audio = await mql.audio.speech({
  model: 'gpt-4o-mini-tts',
  input: 'Hello from your custom voice!',
  voice: { id: 'voice_abc123' },
  instructions: 'Speak naturally and clearly.'
});

Quick Start