Audio Transcription
Transcribe audio files to text using OpenAI Whisper and GPT-4o transcription models through the Metriqual gateway.
POST
/v1/audio/transcriptions

Supported Models
| Model | Provider |
|---|---|
| whisper-1 | OpenAI |
| gpt-4o-transcribe | OpenAI |
| gpt-4o-mini-transcribe | OpenAI |
Request
Multipart Form Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| file | file | required | Audio file (mp3, mp4, mpeg, mpga, m4a, wav, webm; max 25 MB) |
| model | string | required | Model ID |
| language | string | optional | ISO-639-1 language code (improves accuracy) |
| prompt | string | optional | Hint text to guide transcription style |
| response_format | string | optional | Output format. Default: json. Options: json, text, srt, verbose_json, vtt, diarized_json |
| temperature | number | optional | Sampling temperature (0-1). Default: 0 |
| include[] | string | optional | Extra fields to include. Options: logprobs |
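The constraints in the table above can be checked client-side before uploading, avoiding a round trip for requests the API would reject. A minimal sketch; the helper name and error messages are ours, not part of the gateway:

```python
import os

# Constraints from the parameter table above.
ALLOWED_EXTENSIONS = {"mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm"}
ALLOWED_FORMATS = {"json", "text", "srt", "verbose_json", "vtt", "diarized_json"}
MAX_BYTES = 25 * 1024 * 1024  # 25 MB upload limit

def validate_transcription_params(path, model, response_format="json", temperature=0.0):
    """Raise ValueError if the request would be rejected by the API."""
    if not model:
        raise ValueError("model is required")
    ext = os.path.splitext(path)[1].lstrip(".").lower()
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported audio format: {ext!r}")
    if os.path.getsize(path) > MAX_BYTES:
        raise ValueError("file exceeds the 25 MB limit")
    if response_format not in ALLOWED_FORMATS:
        raise ValueError(f"unknown response_format: {response_format!r}")
    if not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature must be between 0 and 1")
```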
cURL — Whisper
```bash
curl https://api.metriqual.com/v1/audio/transcriptions \
  -H "Authorization: Bearer mql_your_key" \
  -F file=@audio.mp3 \
  -F model=whisper-1 \
  -F language=en
```

cURL — GPT-4o Transcribe
```bash
curl https://api.metriqual.com/v1/audio/transcriptions \
  -H "Authorization: Bearer mql_your_key" \
  -F file=@meeting.mp3 \
  -F model=gpt-4o-transcribe \
  -F response_format=verbose_json \
  -F "include[]=logprobs"
```

TypeScript SDK
```typescript
const result = await mql.audio.transcribe({
  file: audioBuffer,
  model: 'gpt-4o-transcribe',
  language: 'en',
  response_format: 'verbose_json',
  include: ['logprobs']
});

console.log(result.text);
console.log(result.logprobs); // token-level probabilities
```

Python SDK
```python
with open("audio.mp3", "rb") as f:
    result = mql.audio.transcribe(
        file=f,
        model="gpt-4o-transcribe",
        language="en",
    )

print(result["text"])
```

Response
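Outside the SDKs, the same endpoint accepts a plain multipart POST from any HTTP client. A sketch using the third-party `requests` library; the form-building helper is our own convenience, and the key placeholder is illustrative:

```python
def build_form(model, **fields):
    """Multipart text fields for the request (the file part is attached separately)."""
    return {"model": model, **{k: v for k, v in fields.items() if v is not None}}

def transcribe_raw(path, api_key, model="whisper-1", **fields):
    """POST an audio file directly to the transcriptions endpoint."""
    import requests  # third-party; pip install requests

    with open(path, "rb") as f:
        resp = requests.post(
            "https://api.metriqual.com/v1/audio/transcriptions",
            headers={"Authorization": f"Bearer {api_key}"},
            files={"file": f},
            data=build_form(model, **fields),  # language, prompt, response_format, ...
        )
    resp.raise_for_status()
    return resp.json()
```

`requests` encodes the `files` and `data` arguments as one multipart/form-data body, matching the cURL examples above.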
The response format depends on the `response_format` parameter.
`diarized_json` (gpt-4o models only) includes speaker labels for each segment.
200
json (default)
```json
{
  "text": "Hello, this is a test transcription of audio content."
}
```

200
verbose_json
```json
{
  "text": "Hello world.",
  "task": "transcribe",
  "language": "english",
  "duration": 3.42,
  "segments": [
    {
      "id": 0,
      "start": 0.0,
      "end": 3.42,
      "text": " Hello world.",
      "temperature": 0
    }
  ],
  "logprobs": [
    { "token": "Hello", "logprob": -0.12 },
    { "token": " world", "logprob": -0.05 }
  ]
}
```
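A verbose_json payload like the one above can be post-processed locally. A sketch, assuming only the fields shown in the example response, that renders `segments` as SRT cues and averages token confidence (`exp(logprob)` converts a log probability back to a probability):

```python
import math

def _ts(seconds):
    """Format seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = int(round(seconds * 1000))
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02}:{m:02}:{s:02},{ms:03}"

def segments_to_srt(segments):
    """Render verbose_json segments as an SRT cue list."""
    cues = []
    for i, seg in enumerate(segments, start=1):
        cues.append(f"{i}\n{_ts(seg['start'])} --> {_ts(seg['end'])}\n{seg['text'].strip()}\n")
    return "\n".join(cues)

def mean_token_probability(logprobs):
    """Average per-token probability: mean of exp(logprob)."""
    if not logprobs:
        return None
    return sum(math.exp(t["logprob"]) for t in logprobs) / len(logprobs)
```

On the example response this yields one SRT cue (`00:00:00,000 --> 00:00:03,420`) and a mean token probability of about 0.92.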