Transcribe audio files to text using OpenAI Whisper models. Provide a URL to a publicly accessible audio file (mp3, mp4, wav, ogg, webm, or m4a) of up to 25 MB. The response contains the full transcription text, the detected language, the audio duration in seconds, and optional word-level segments. Pricing is based on audio duration: $0.003 per minute.
| Property | Value |
| --- | --- |
| Endpoint | POST /audio/transcribe |
| Price | $0.003 / minute of audio |
| Max file size | 25 MB |
| Provider | OpenAI Whisper (with Groq/Replicate fallback) |
| Auth | Bearer token or x402 micropayment |
| Base URL | https://api.iteratools.com |
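
As a rough illustration of how the endpoint, base URL, Bearer auth, and pricing from the table fit together, here is a minimal Python sketch using only the standard library. The JSON body field name `url`, the JSON response shape, and per-second proration of the price are assumptions made for this example, not confirmed details of the API; the helper names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://api.iteratools.com"
PRICE_PER_MINUTE_USD = 0.003  # price listed in the table


def build_transcribe_request(audio_url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST /audio/transcribe request.

    Assumption: the request body is JSON with a "url" field pointing at the
    publicly accessible audio file. Check the API reference for the real name.
    """
    body = json.dumps({"url": audio_url}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/audio/transcribe",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def estimate_cost_usd(duration_seconds: float) -> float:
    """Estimate the charge for a clip.

    Assumption: billing is prorated linearly by duration; the service may
    round differently.
    """
    return round(duration_seconds / 60 * PRICE_PER_MINUTE_USD, 6)


def transcribe(audio_url: str, api_key: str, timeout: float = 120.0) -> dict:
    """Send the request and parse the response as JSON (assumed format)."""
    request = build_transcribe_request(audio_url, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `transcribe("https://example.com/clip.mp3", api_key)` would perform the network request; `estimate_cost_usd(600)` gives an estimate for a ten-minute file under the proration assumption above.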