Audio Transcription - quickstart

The Audio Transcriptions API provides a dedicated speech-to-text endpoint for uploaded audio files. It features:

Performant audio transcription with low word error rate (WER)
A simple multipart upload flow
Support for 14 languages

Prerequisites

A free trial API key saved in a TRIAL_KEY environment variable

Setting up the audio file

The Audio Transcriptions endpoint accepts FLAC, MP3, MPEG, MPGA, OGG, and WAV files of 25 MB or less. For this quickstart, we’ll use the following WAV example.

If you don’t have an audio file already, download the example to follow along.
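
Before uploading, it can help to check a file against the limits stated above. The following is a minimal sketch, not part of the API: the helper names `is_acceptable` and `check_audio_file` are hypothetical, the check looks only at the file extension rather than the actual audio encoding, and it assumes "25 MB" means 25,000,000 bytes, which is the more conservative reading.

```python
import os

# Extensions taken from the formats listed in this quickstart.
ALLOWED_EXTENSIONS = {".flac", ".mp3", ".mpeg", ".mpga", ".ogg", ".wav"}

# Assumption: 25 MB is read as 25,000,000 bytes (the conservative reading).
MAX_BYTES = 25 * 1000 * 1000


def is_acceptable(filename, size_bytes):
    """Return True if the name and size fit the documented limits."""
    extension = os.path.splitext(filename)[1].lower()
    return extension in ALLOWED_EXTENSIONS and 0 < size_bytes <= MAX_BYTES


def check_audio_file(path):
    """Apply is_acceptable to a file on disk."""
    return is_acceptable(path, os.path.getsize(path))
```

For example, `is_acceptable("sample.wav", 3_000_000)` is true, while a 30,000,000-byte WAV file or a `.txt` file is rejected.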

Passing the audio file

Use the following code to pass the audio file to the Audio Transcriptions endpoint.

curl

curl -X POST "https://api.cohere.com/v2/audio/transcriptions" \
  -H "Authorization: Bearer $TRIAL_KEY" \
  -F "model=cohere-transcribe-03-2026" \
  -F "language=en" \
  -F "file=@./transcribe-model-sample-derrida-mashup.wav"
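
The same multipart upload can be sketched in Python using only the standard library. This is an illustration rather than an official client: the helpers `build_multipart` and `transcribe` are hypothetical names, the form fields mirror the curl request above, and the `audio/wav` content type is an assumption about what the endpoint expects for WAV files.

```python
import json
import os
import urllib.request
import uuid

API_URL = "https://api.cohere.com/v2/audio/transcriptions"


def build_multipart(fields, file_field, filename, file_bytes, content_type="audio/wav"):
    """Encode text fields plus one file as a multipart/form-data body.

    Returns the body bytes and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        text_part = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
        parts.append(text_part.encode("utf-8"))
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    )
    parts.append(file_header.encode("utf-8") + file_bytes + b"\r\n")
    # Closing boundary marks the end of the form.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(path, api_key, model="cohere-transcribe-03-2026", language="en"):
    """Upload the audio file at `path` and return the parsed JSON response."""
    with open(path, "rb") as audio_file:
        audio = audio_file.read()
    body, content_type = build_multipart(
        {"model": model, "language": language},
        "file",
        os.path.basename(path),
        audio,
    )
    request = urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": content_type,
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the key from the prerequisites, a call would look like `transcribe("./transcribe-model-sample-derrida-mashup.wav", os.environ["TRIAL_KEY"])`.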

Receiving the response

The endpoint should return a response similar to the following.

{"text":"I speak only one language, and it's not my own, but the poet is a man of metaphor."}
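
The response body is JSON, so the transcript can be pulled out with a standard JSON parser. The sketch below parses the sample response shown above; it assumes only the `text` field, since that is the only field this quickstart shows.

```python
import json

# Sample response body copied from this quickstart.
raw_response = (
    '{"text":"I speak only one language, and it\'s not my own, '
    'but the poet is a man of metaphor."}'
)

payload = json.loads(raw_response)

# Fall back to an empty string if the field is missing.
transcript = payload.get("text", "")
print(transcript)
```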