Introduction
Transcribe audio and video programmatically with the VideoToText REST API.
The VideoToText API lets you submit audio or video and retrieve transcripts from your own apps, scripts, agents, or automation tools (n8n, MCP, and more).
Transcription is asynchronous: you create a job, then either poll its status or receive a webhook when it finishes. Large files never flow through your code — you either upload directly to storage or hand us a public URL to fetch.
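The create-then-poll flow described above can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the base URL, the `/transcriptions` endpoints, the request body, and the `pending` and `completed` statuses come from the quickstart on this page, while the helper names (`call_api`, `transcribe`), the polling interval, and the poll limit are assumptions made for illustration. Statuses other than `pending` and `completed` are not shown on this page, so the sketch simply keeps polling until it sees `completed` or runs out of attempts.

```python
import json
import time
import urllib.request

BASE_URL = "https://www.transcribevideototext.com/api/v1"


def call_api(method, path, api_key, body=None):
    """Send one JSON request to the API and return the decoded JSON response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(
        BASE_URL + path,
        data=data,
        method=method,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def transcribe(media_url, api_key, call=call_api, interval=5.0, max_polls=120):
    """Create a job for a public URL, then poll until its status is "completed".

    `call` is injectable so the flow can be exercised without network access.
    `interval` and `max_polls` are illustrative defaults, not documented values.
    """
    job = call("POST", "/transcriptions", api_key,
               {"url": media_url, "language": "auto"})
    job_id = job["id"]  # keep the id from the create response for later polls

    polls = 0
    while job["status"] != "completed":
        if polls >= max_polls:
            raise TimeoutError(f"transcription {job_id} did not complete in time")
        time.sleep(interval)
        job = call("GET", f"/transcriptions/{job_id}", api_key)
        polls += 1
    return job


# Example (performs real HTTP requests, so it is left commented out):
# result = transcribe("https://example.com/audio.mp3", "vtt_your_key")
# print(result["segments"])
```

Polling is the simpler option for scripts and one-off jobs. For long-running or high-volume workloads, a webhook avoids the repeated status requests.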
Base URL
https://www.transcribevideototext.com/api/v1
The machine-readable spec is served by the API at /api/v1/openapi.json; import it into Postman, Insomnia, or a client generator.
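Besides importing the spec into a tool, you can inspect it from a script. The sketch below lists the operations declared in the spec's `paths` object, which is the standard OpenAPI layout. It assumes the spec can be fetched without an API key, which this page does not state; `fetch_spec` and `list_operations` are hypothetical helper names.

```python
import json
import urllib.request

SPEC_URL = "https://www.transcribevideototext.com/api/v1/openapi.json"

# Keys of an OpenAPI path item that denote HTTP operations.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def fetch_spec(url=SPEC_URL):
    """Download and decode the OpenAPI document (assumes no auth is required)."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def list_operations(spec):
    """Return sorted (METHOD, path) pairs found under the spec's "paths" object."""
    operations = []
    for path, item in spec.get("paths", {}).items():
        for key in item:
            # Skip non-operation keys such as "parameters" or "summary".
            if key.lower() in HTTP_METHODS:
                operations.append((key.upper(), path))
    return sorted(operations)


# Example (performs a real HTTP request, so it is left commented out):
# for method, path in list_operations(fetch_spec()):
#     print(method, path)
```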
Quickstart
Create an API key
In the dashboard, go to Settings → API and create a key. Copy it — it's shown once.
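Because the key is shown only once, it helps to store it outside your source code right away. One common approach, sketched here, is to keep it in an environment variable and build the `Authorization` header from it at runtime. The variable name `VTT_API_KEY` and the helper `auth_headers` are assumptions for illustration, not names defined by the API.

```python
import os


def auth_headers(env_var="VTT_API_KEY"):
    """Build the Bearer header from an environment variable (hypothetical name)."""
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set {env_var} to your VideoToText API key")
    return {"Authorization": f"Bearer {api_key}"}
```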
Submit a transcription
curl -X POST https://www.transcribevideototext.com/api/v1/transcriptions \
-H "Authorization: Bearer vtt_your_key" \
-H "Content-Type: application/json" \
-d '{"url":"https://example.com/audio.mp3","language":"auto"}'
# → { "id": "…", "status": "pending" }
Poll until it completes
curl https://www.transcribevideototext.com/api/v1/transcriptions/{id} \
-H "Authorization: Bearer vtt_your_key"
# → { "status": "completed", "segments": [ … ], … }
What's next
- Authentication: API keys and the Authorization header.
- Create a transcription: upload vs. URL ingest, options.
- Retrieve results: status, segments, and listing.
- Webhooks: get pushed events instead of polling.
- Rate limits and Errors.