Speech to Text API

Transcribe audio into text quickly with our latest and most precise speech recognition model.
Our Speech-to-Text (STT) engine uses neural networks to provide high-accuracy transcription across 50+ languages, and it is built to handle varied accents and background noise.

Quick Start

Endpoint

POST https://sonna.web.id/api/v1/audio/transcriptions

Authentication

Authenticate each request by sending your API key as a Bearer token in the Authorization header.
```bash
curl -X POST https://sonna.web.id/api/v1/audio/transcriptions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@audio.mp3" \
  -F "model=whisper-1"
```

Request Parameters

| Parameter | Type   | Required | Description |
|-----------|--------|----------|-------------|
| file      | file   | Yes      | The audio file to transcribe (mp3, wav, m4a, etc.). |
| model     | string | Yes      | The ID of the model to use. Currently available: whisper-1. |
| language  | string | No       | The language of the input audio in ISO 639-1 format (for example, en). |
| prompt    | string | No       | Optional text to guide the model's style or to continue a previous segment. |
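
The parameters above can be combined in a small shell helper. This is a minimal sketch, not an official client: the function name `transcribe` and the variables `SONNA_API_KEY` and `CURL_BIN` are assumptions made for this example, while the endpoint and field names come from this page. Text fields are sent with curl's `--form-string` so that a value starting with `@` or `<` is not read as a file reference.

```shell
#!/bin/sh
# Sketch: send a transcription request with the optional `language` and
# `prompt` fields. SONNA_API_KEY and CURL_BIN are names chosen for this
# example; they are not defined by the API.

API_URL="https://sonna.web.id/api/v1/audio/transcriptions"

# Usage: transcribe FILE [LANGUAGE] [PROMPT]
transcribe() {
  file="$1"
  language="${2:-}"
  prompt="${3:-}"

  # Required fields. -F uploads the file; --form-string sends the value
  # literally instead of interpreting a leading "@" or "<".
  set -- -F "file=@${file}" --form-string "model=whisper-1"

  # Optional fields are added only when a value was supplied.
  if [ -n "$language" ]; then
    set -- "$@" --form-string "language=${language}"
  fi
  if [ -n "$prompt" ]; then
    set -- "$@" --form-string "prompt=${prompt}"
  fi

  # CURL_BIN defaults to curl; point it at another command to preview
  # the arguments without sending a request.
  "${CURL_BIN:-curl}" -sS -X POST "$API_URL" \
    -H "Authorization: Bearer ${SONNA_API_KEY:?set SONNA_API_KEY first}" \
    "$@"
}
```

With `SONNA_API_KEY` exported, a call such as `transcribe audio.mp3 en "Sonna AI"` would send all four fields, and `transcribe audio.mp3` would send only the two required ones.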

Response

The response is a JSON object whose text field contains the transcription.
```json
{
  "text": "Hello, this is a sample transcription from Sonna AI's precision engine."
}
```
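
To use the transcription in a script, the `text` field can be pulled out of the response body. The helper below is a sketch under a few assumptions: the name `extract_text` is made up for this example, the response has the shape shown above, and either `jq` or `python3` is installed locally. Error responses are not described on this page, so the sketch does not handle them.

```shell
#!/bin/sh
# Sketch: print the `text` field of a transcription response read from stdin.
# Assumes jq or python3 is available locally; neither is required by the API.
extract_text() {
  if command -v jq >/dev/null 2>&1; then
    # -r prints the raw string without surrounding JSON quotes.
    jq -r '.text'
  else
    python3 -c 'import json, sys; print(json.load(sys.stdin)["text"])'
  fi
}
```

It reads from standard input, so the output of the curl command from the Authentication section can be piped straight into `extract_text`.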

Any questions? Email us at hello@sonna.ai