Audio Transcription - quickstart

The Audio Transcriptions API provides a dedicated speech-to-text endpoint for uploaded audio files. It features:

  • Performant audio transcription with low word error rate (WER)
  • A simple multipart upload flow
  • Support for 14 languages

1. Prerequisites

  • A free trial API key saved in a TRIAL_KEY environment variable

2. Setting up the audio file

The Audio Transcriptions endpoint accepts FLAC, MP3, MPEG, MPGA, OGG, and WAV files of 25MB or less. For this quickstart, we’ll use the following WAV example.

If you don’t have an audio file already, download the example to follow along.
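
Before uploading, it can help to confirm locally that a file meets the format and size constraints listed above. The following is a minimal sketch, not part of the API: `check_audio_file`, `ALLOWED_EXTENSIONS`, and `MAX_BYTES` are hypothetical names, the check looks only at the file extension (not the actual audio encoding), and "25MB" is assumed to mean 25 × 1000 × 1000 bytes.

```python
from pathlib import Path

# Extensions taken from the formats listed above.
ALLOWED_EXTENSIONS = {".flac", ".mp3", ".mpeg", ".mpga", ".ogg", ".wav"}
# Assumption: "25MB" is read as 25 * 1000 * 1000 bytes (the stricter reading).
MAX_BYTES = 25 * 1000 * 1000


def check_audio_file(path):
    """Return a list of problems; an empty list means the file looks uploadable."""
    p = Path(path)
    if not p.is_file():
        return [f"{p} does not exist or is not a regular file"]
    problems = []
    # Extension check only; the audio content itself is not inspected.
    if p.suffix.lower() not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported extension {p.suffix!r}")
    size = p.stat().st_size
    if size > MAX_BYTES:
        problems.append(f"file is {size} bytes, over the {MAX_BYTES}-byte limit")
    return problems
```

Calling `check_audio_file("./transcribe-model-sample-derrida-mashup.wav")` returns an empty list when the file exists, has a listed extension, and is within the assumed size limit.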

3. Passing the audio file

Use the following code to pass the audio file to the Audio Transcriptions endpoint.

curl -X POST "https://api.cohere.com/v2/audio/transcriptions" \
  -H "Authorization: Bearer $TRIAL_KEY" \
  -F "model=cohere-transcribe-03-2026" \
  -F "language=en" \
  -F "file=@./transcribe-model-sample-derrida-mashup.wav"
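
For readers who prefer Python, the same multipart upload can be sketched with the standard library alone. This is an illustration under stated assumptions rather than the official Python example: the URL, header, and the `model`, `language`, and `file` form fields are copied from the curl command above, while `build_multipart` and `transcribe` are hypothetical helper names, and error handling is left to `urllib` (which raises on non-2xx responses).

```python
import json
import mimetypes
import os
import urllib.request
import uuid

# Endpoint copied from the curl command above.
API_URL = "https://api.cohere.com/v2/audio/transcriptions"


def build_multipart(fields, file_field, filename, file_bytes):
    """Encode text fields plus one file as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode("utf-8")
        )
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode("utf-8")
    parts.append(file_header + file_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(path, api_key, model="cohere-transcribe-03-2026", language="en"):
    """POST the audio file and return the decoded JSON response (hypothetical helper)."""
    with open(path, "rb") as f:
        audio = f.read()
    body, content_type = build_multipart(
        {"model": model, "language": language}, "file", os.path.basename(path), audio
    )
    request = urllib.request.Request(
        API_URL,
        data=body,
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the key from the prerequisites in place, `transcribe("./transcribe-model-sample-derrida-mashup.wav", os.environ["TRIAL_KEY"])` would send the same request as the curl command. Building the body by hand keeps the sketch dependency-free; an HTTP client library with built-in multipart support would shorten it considerably.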

4. Receiving the response

It should return a response similar to the following.

{"text":"I speak only one language, and it's not my own, but the poet is a man of metaphor."}
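
Once the response arrives, the transcript can be pulled out of the JSON body. The sketch below assumes only what the sample response above shows, namely a JSON object with a `text` field; `extract_transcript` is a hypothetical helper name, and real responses may carry additional fields that this code ignores.

```python
import json

# Sample body copied from the response shown above.
raw = '{"text":"I speak only one language, and it\'s not my own, but the poet is a man of metaphor."}'


def extract_transcript(raw_body):
    """Return the transcript string, or raise ValueError if there is no "text" field."""
    payload = json.loads(raw_body)
    if not isinstance(payload, dict) or "text" not in payload:
        raise ValueError(f"unexpected response shape: {raw_body[:80]!r}")
    return payload["text"]


print(extract_transcript(raw))
```

Raising on an unexpected shape, instead of returning an empty string, makes an error payload harder to mistake for an empty transcript.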

Further Resources

  • Cohere Transcribe model card
  • Audio Transcriptions API reference documentation
  • Jupyter notebook