Audio Transcription Quickstart
The Audio Transcriptions API provides a dedicated speech-to-text endpoint for uploaded audio files. It features:
- Performant audio transcription with low word error rate (WER)
- A simple multipart upload flow
- Support for 14 languages
Setting up the audio file
The Audio Transcriptions endpoint accepts FLAC, MP3, MPEG, MPGA, OGG, and WAV files of 25 MB or less. For this quickstart, we’ll use a WAV example file.
If you don’t already have an audio file, download the example to follow along.
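Before uploading, it can help to confirm locally that a file meets the format and size constraints described above. The following is a minimal sketch using only the Python standard library. The helper name `check_audio_file` is hypothetical and not part of any SDK. The check matches on file extension only, and it reads "25 MB" as 25 × 1024 × 1024 bytes; both are assumptions, since the service may inspect the file contents or count the limit differently.

```python
from pathlib import Path

# Formats listed in this quickstart. Matching on the file extension alone is an
# assumption; the endpoint may validate the actual file contents instead.
SUPPORTED_EXTENSIONS = {".flac", ".mp3", ".mpeg", ".mpga", ".ogg", ".wav"}

# "25 MB" is interpreted here as 25 * 1024 * 1024 bytes (an assumption).
MAX_BYTES = 25 * 1024 * 1024


def check_audio_file(path):
    """Return a list of problems found; an empty list means the file looks acceptable."""
    p = Path(path)
    if not p.is_file():
        return [f"{p} does not exist or is not a regular file"]

    problems = []
    if p.suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {p.suffix or '(none)'}")

    size = p.stat().st_size
    if size > MAX_BYTES:
        problems.append(f"file is {size} bytes, over the {MAX_BYTES}-byte limit")
    return problems
```

For example, `check_audio_file("sample.wav")` (where `sample.wav` is a placeholder for your own file) returns an empty list when the file exists, has a listed extension, and is within the size limit.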