Unified transcription: auto-detects whether the input is a URL or a file upload, for audio and video sources (YouTube, X, Twitch, Kick, TikTok, X/Twitter Spaces).
(vid2txt, aud2txt, videofile2txt, audiofile2txt, transcribe). Accepts URLs or multipart file uploads. Returns a request_id for status polling.
video_url / audio_url, or upload a file via multipart form-data.
slug and check supported languages.
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
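As a minimal sketch of the authentication step, the helper below builds a header of the form described above. It assumes the token is sent in the standard HTTP `Authorization` header; the function name `build_auth_headers` is hypothetical and not part of the API.

```python
def build_auth_headers(token: str) -> dict[str, str]:
    """Return HTTP headers carrying the bearer token.

    Assumption: the token travels in the standard `Authorization` header.
    """
    token = token.strip()
    if not token:
        raise ValueError("auth token must not be empty")
    # Header value has the documented form: Bearer <token>
    return {"Authorization": f"Bearer {token}"}
```

The returned dictionary can be passed as the headers argument of whichever HTTP client is in use.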
application/json Transcription parameters. Provide exactly one of source_url or source_file.
Whether the transcription should include timestamps.
The model to use for transcription. Available models can be retrieved via the GET /api/v1/client/models endpoint.
"WhisperLargeV3"
URL of video/audio to transcribe (YouTube, Twitter/X, Twitch, Kick, TikTok, Twitter Spaces). Mutually exclusive with source_file.
"https://www.youtube.com/watch?v=jNQXAC9IVRw"
Audio or video file to transcribe. Supported audio: aac, mpeg, ogg, wav, webm, flac. Supported video: mp4, mpeg, quicktime, avi, wmv, ogg. Mutually exclusive with source_url.
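A client can check a file's media type before uploading it. The sketch below assumes the names listed above are MIME subtypes (for example `audio/flac` or `video/quicktime`); the server's actual check may differ, and `is_supported_media_type` is a hypothetical helper name.

```python
# Names copied from the supported-format lists above.
SUPPORTED_AUDIO = {"aac", "mpeg", "ogg", "wav", "webm", "flac"}
SUPPORTED_VIDEO = {"mp4", "mpeg", "quicktime", "avi", "wmv", "ogg"}


def is_supported_media_type(content_type: str) -> bool:
    """Check a type such as "audio/flac" against the documented lists.

    Assumption: the documented names are MIME subtypes.
    """
    kind, _, subtype = content_type.strip().lower().partition("/")
    if kind == "audio":
        return subtype in SUPPORTED_AUDIO
    if kind == "video":
        return subtype in SUPPORTED_VIDEO
    # Anything that is neither audio nor video is rejected.
    return False
```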
If true, the result is returned directly in the response instead of only a download URL.
Optional HTTPS URL to receive webhook notifications for job status changes.
Maximum length: 2048
"https://your-server.com/webhooks/deapi"
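The constraints above (exactly one source, an HTTPS webhook) can be checked on the client before sending a request. This is a sketch under stated assumptions: `source_url` and `source_file` are the documented field names, while `webhook_url` is a hypothetical name for the webhook parameter, and 2048 is assumed to be a maximum length for that URL. No request is sent here, because the endpoint path is not given in this section.

```python
from typing import Optional
from urllib.parse import urlparse

MAX_WEBHOOK_URL_LENGTH = 2048  # assumption: 2048 is a length limit


def build_transcription_fields(
    source_url: Optional[str] = None,
    source_file: Optional[str] = None,
    webhook_url: Optional[str] = None,
) -> dict[str, str]:
    """Validate inputs and return the fields for a transcription request.

    `webhook_url` is a hypothetical field name used for illustration.
    """
    # Exactly one source must be provided; the two are mutually exclusive.
    if (source_url is None) == (source_file is None):
        raise ValueError("provide exactly one of source_url or source_file")

    fields: dict[str, str] = {}
    if source_url is not None:
        fields["source_url"] = source_url
    else:
        fields["source_file"] = source_file  # path of the file to upload

    if webhook_url is not None:
        # urlparse lowercases the scheme, so "HTTPS://..." also passes.
        if urlparse(webhook_url).scheme != "https":
            raise ValueError("webhook URL must use HTTPS")
        if len(webhook_url) > MAX_WEBHOOK_URL_LENGTH:
            raise ValueError("webhook URL is too long")
        fields["webhook_url"] = webhook_url
    return fields
```

For example, `build_transcription_fields(source_url="https://www.youtube.com/watch?v=jNQXAC9IVRw")` returns a dictionary containing only the `source_url` field, while passing both sources or neither raises `ValueError`.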
ID of the inference request.
Information from success endpoint