Audio · MiniMax
Text-to-Speech
Convert text to natural-sounding speech using MiniMax speech-02-hd with voice cloning, voice design, and streaming support.
POST /v1/audio/speech

Supported Models
| Model | Provider |
|---|---|
| speech-02-hd | MiniMax |
Request
Body Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Use "speech-02-hd" |
| text | string | Yes | Text to convert to speech |
| voice_setting | object | Yes | Voice configuration: voice_id, speed, vol, pitch |
| audio_setting | object | No | Output audio config: sample_rate, bitrate, format, channel |
voice_id accepts a built-in voice such as male-qn-qingse, or a custom voice created via Voice Cloning or Voice Design.

voice_setting Fields
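| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| voice_id | string | Yes | | Built-in or cloned voice identifier |
| speed | number | No | 1.0 | Speech speed multiplier |
| vol | number | No | 1.0 | Volume level |
| pitch | number | No | 0 | Pitch adjustment |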
audio_setting Fields
| Field | Type | Default | Description |
|---|---|---|---|
| sample_rate | integer | 32000 | Sample rate in Hz |
| bitrate | integer | 128000 | Bitrate in bps |
| format | string | mp3 | Audio format. Options: mp3, wav, pcm, flac |
| channel | integer | 1 | Number of audio channels |
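As an illustration of how the fields above fit together, here is a minimal sketch that assembles a request body using the documented defaults and rejects a format outside the listed options. The helper name `build_speech_payload` and its keyword arguments are hypothetical and not part of any Metriqual SDK; only the field names, defaults, and format options come from the tables above.

```python
import json

# Options listed for audio_setting.format in this section.
ALLOWED_FORMATS = {"mp3", "wav", "pcm", "flac"}


def build_speech_payload(text, voice_id, *, speed=1.0, vol=1.0, pitch=0,
                         sample_rate=32000, bitrate=128000, fmt="mp3", channel=1):
    """Hypothetical helper: build a /v1/audio/speech body with documented defaults."""
    if not text:
        raise ValueError("text is required")
    if not voice_id:
        raise ValueError("voice_setting.voice_id is required")
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {sorted(ALLOWED_FORMATS)}")
    return {
        "model": "speech-02-hd",
        "text": text,
        "voice_setting": {
            "voice_id": voice_id,
            "speed": speed,
            "vol": vol,
            "pitch": pitch,
        },
        "audio_setting": {
            "sample_rate": sample_rate,
            "bitrate": bitrate,
            "format": fmt,
            "channel": channel,
        },
    }


if __name__ == "__main__":
    payload = build_speech_payload("Hello, welcome to Metriqual!", "male-qn-qingse")
    print(json.dumps(payload, indent=2))
```

Checking the format on the client side is a convenience only; the API remains the authority on which values it accepts.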
curl https://api.metriqual.com/v1/audio/speech \
  -H "Authorization: Bearer mql_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "speech-02-hd",
    "text": "Hello, welcome to Metriqual!",
    "voice_setting": {
      "voice_id": "male-qn-qingse",
      "speed": 1.0,
      "vol": 1.0,
      "pitch": 0
    },
    "audio_setting": {
      "sample_rate": 32000,
      "bitrate": 128000,
      "format": "mp3"
    }
  }' --output speech.mp3

const audio = await mql.audio.speech({
  model: 'speech-02-hd',
  text: 'Hello, welcome to Metriqual!',
  voice_setting: {
    voice_id: 'male-qn-qingse',
    speed: 1.0,
    vol: 1.0,
    pitch: 0
  },
  audio_setting: {
    sample_rate: 32000,
    bitrate: 128000,
    format: 'mp3'
  }
});

audio = mql.audio.speech(
    model="speech-02-hd",
    text="Hello, welcome to Metriqual!",
    voice_setting={
        "voice_id": "male-qn-qingse",
        "speed": 1.0,
        "vol": 1.0,
        "pitch": 0,
    },
    audio_setting={
        "sample_rate": 32000,
        "bitrate": 128000,
        "format": "mp3",
    },
)

Async Speech (Long-form)
POST /v1/audio/speech/async

For long text, use async speech generation: submit the request, poll the task status, and download the audio once the task completes.
Related Endpoints
| Endpoint | Description |
|---|---|
| GET /v1/audio/speech/async/:task_id | Check async task status |
| GET /v1/audio/speech/async/:task_id/download | Download completed audio |
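The submit, poll, download flow can be sketched as a small polling loop. This is an illustration under stated assumptions, not the SDK's implementation: the function `poll_until_done` is hypothetical, and because this section does not document the status response schema, the status check is passed in as a callable instead of being hard-coded. The status strings in the demo ("processing", "success") are placeholders.

```python
import time


def poll_until_done(fetch_status, is_done, is_failed=lambda status: False,
                    interval=2.0, timeout=300.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until is_done(status) is true.

    Raises RuntimeError if is_failed(status) is true, and TimeoutError if the
    task is still pending once `timeout` seconds have elapsed.
    """
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if is_done(status):
            return status
        if is_failed(status):
            raise RuntimeError(f"task failed: {status!r}")
        if clock() >= deadline:
            raise TimeoutError("task did not complete before the timeout")
        sleep(interval)


if __name__ == "__main__":
    # Stand-in for GET /v1/audio/speech/async/:task_id; the status values
    # here are placeholders, not the documented response format.
    fake_statuses = iter(["processing", "processing", "success"])
    final = poll_until_done(
        fetch_status=lambda: next(fake_statuses),
        is_done=lambda status: status == "success",
        interval=0.1,
    )
    print("final status:", final)
```

In real use, `fetch_status` would wrap the status request for a task ID, and the download endpoint would be called only after the loop returns.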
# Start async generation
curl -X POST https://api.metriqual.com/v1/audio/speech/async \
  -H "Authorization: Bearer mql_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "speech-02-hd",
    "text": "Very long text content goes here...",
    "voice_setting": {"voice_id": "male-qn-qingse"}
  }'
# Check status
curl https://api.metriqual.com/v1/audio/speech/async/TASK_ID \
  -H "Authorization: Bearer mql_your_key"
# Download when ready
curl https://api.metriqual.com/v1/audio/speech/async/TASK_ID/download \
  -H "Authorization: Bearer mql_your_key" --output output.mp3

# Start async generation
task = mql.audio.speech_async(
    model="speech-02-hd",
    text="Very long text content goes here...",
    voice_setting={"voice_id": "male-qn-qingse"},
)
# Or submit and wait for completion in one call
audio = mql.audio.speech_async_and_wait(
    model="speech-02-hd",
    text="Very long text content goes here...",
    voice_setting={"voice_id": "male-qn-qingse"},
)

WebSocket Streaming
/v1/audio/speech/stream

Real-time TTS streaming over WebSocket. Connect, send text chunks, and receive audio as it is generated. Ideal for interactive voice applications.
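Since the streaming endpoint takes text in chunks, a client needs some way to split longer input before sending. The sketch below shows one possible approach, splitting on sentence boundaries with a size cap. The helper `chunk_text` and the 200-character default are hypothetical; this section does not specify the WebSocket message format or any chunk size limit, so the sketch covers only text preparation, not the wire protocol.

```python
import re


def chunk_text(text, max_chars=200):
    """Split text into chunks of at most max_chars, preferring sentence boundaries."""
    # Break after '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        # Hard-wrap a single sentence that is longer than the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    for chunk in chunk_text("Hello there. Welcome to Metriqual! How are you?", max_chars=25):
        print(repr(chunk))
```

Keeping chunks aligned to sentence boundaries is a design choice aimed at natural prosody; whether the service benefits from it is an assumption.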