A single endpoint for all transcription needs — URL-based (YouTube, X, Twitch, Kick, TikTok, X Spaces) and file uploads (audio & video). Replaces vid2txt, aud2txt, videofile2txt, and audiofile2txt with automatic source detection.
slug, check specific limits and features.

- source_url: a URL to transcribe (YouTube, X/Twitter, Twitch, Kick, TikTok, or X Spaces)
- source_file: an uploaded audio or video file

| Type | Formats |
|---|---|
| Audio | AAC, MPEG, OGG, WAV, WebM, FLAC |
| Video | MP4, MPEG, QuickTime, AVI, WMV, OGG |
Set return_result_in_response: true to receive the transcription text directly in the API response instead of a download URL. This is useful for short content or real-time integrations.
The earlier endpoints (vid2txt, aud2txt, videofile2txt, audiofile2txt) remain fully operational. The unified /transcribe endpoint is the recommended path for new integrations.
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
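The header format described above can be assembled with a few lines of Python. This is a minimal sketch: the helper name `bearer_headers` is hypothetical, and `YOUR_TOKEN` is a placeholder for a real auth token.

```python
def bearer_headers(token: str) -> dict[str, str]:
    """Build the Authorization header in the 'Bearer <token>' form."""
    token = token.strip()
    if not token:
        raise ValueError("auth token must not be empty")
    return {"Authorization": f"Bearer {token}"}


# Placeholder token; substitute your own.
headers = bearer_headers("YOUR_TOKEN")
```

The resulting dictionary can be passed as the headers argument of whichever HTTP client you use.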
application/json
Transcription parameters. Provide exactly one of source_url or source_file.
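The "exactly one of source_url or source_file" rule can be checked on the client before a request is sent. The sketch below is illustrative only: the function name `validate_source` is hypothetical, and it assumes that an empty or missing value counts as "not provided".

```python
def validate_source(params: dict) -> str:
    """Return the name of the single source field present, else raise ValueError."""
    # Assumption: a missing key or an empty value means the field was not provided.
    present = [key for key in ("source_url", "source_file") if params.get(key)]
    if len(present) != 1:
        raise ValueError(
            "provide exactly one of source_url or source_file, got: "
            + (", ".join(present) or "neither")
        )
    return present[0]


params = {
    "source_url": "https://www.youtube.com/watch?v=jNQXAC9IVRw",
    "return_result_in_response": True,
}
print(validate_source(params))  # prints: source_url
```

Failing early on the client gives a clearer error than a rejected request, though the server remains the authority on what it accepts.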
Whether the transcription should include timestamps.
The model to use for transcription. Available models can be retrieved via the GET /api/v1/client/models endpoint.
"WhisperLargeV3"
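A request for the model list might be prepared as in the sketch below, using only the Python standard library. Several things here are assumptions rather than documented facts: `BASE_URL` is a hypothetical placeholder host, the helper name `models_request` is invented for illustration, and the sketch assumes the models endpoint accepts the same Bearer token as this one. The response format is not described here, so the sketch stops at building the request.

```python
import urllib.request

# Hypothetical placeholder; replace with the real API host.
BASE_URL = "https://api.example.com"


def models_request(token: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Prepare (but do not send) a GET request for the available models."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/client/models",
        # Assumption: this endpoint uses the same Bearer authentication.
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


req = models_request("YOUR_TOKEN")
# To send it: urllib.request.urlopen(req)
```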
URL of video/audio to transcribe (YouTube, Twitter/X, Twitch, Kick, TikTok, Twitter Spaces). Mutually exclusive with source_file.
"https://www.youtube.com/watch?v=jNQXAC9IVRw"
Audio or video file to transcribe. Supported audio: aac, mpeg, ogg, wav, webm, flac. Supported video: mp4, mpeg, quicktime, avi, wmv, ogg. Mutually exclusive with source_url.
If true, the result is returned directly in the response instead of only as a download URL.
Optional HTTPS URL to receive webhook notifications for job status changes.
2048
"https://your-server.com/webhooks/deapi"
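A webhook URL can be sanity-checked before it is submitted. The sketch below enforces the HTTPS requirement stated above. It also treats the value 2048 as a maximum URL length, which is an assumption about what that number means. The function name `is_valid_webhook_url` is hypothetical.

```python
from urllib.parse import urlparse

# Assumption: the "2048" shown for this field is a maximum length in characters.
MAX_WEBHOOK_URL_LENGTH = 2048


def is_valid_webhook_url(url: str) -> bool:
    """Return True if the URL uses HTTPS, has a host, and fits the assumed limit."""
    if len(url) > MAX_WEBHOOK_URL_LENGTH:
        return False
    parsed = urlparse(url)
    return parsed.scheme == "https" and bool(parsed.netloc)


print(is_valid_webhook_url("https://your-server.com/webhooks/deapi"))  # prints: True
print(is_valid_webhook_url("http://your-server.com/webhooks/deapi"))   # prints: False
```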
ID of the inference request.
Information from success endpoint