
POST https://api.qubittron.ai/v1/audio/transcriptions

OpenAI-compatible speech-to-text. Multipart upload, 25 MB cap.

Authentication

Authorization: Bearer qbt_<key>

Request

Content-Type: multipart/form-data with the following fields:

Field	Type	Required	Notes
`file`	file (`Blob`)	yes	Audio. Max 25 MB
`model`	string	yes	Speech model id
`language`	string	no	ISO-639-1 (e.g. `"en"`)
`response_format`	string	no	`"json"` (default), `"text"`, `"verbose_json"`
`temperature`, `prompt`	various	no	Passed through to the upstream model unchanged
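
Because the 25 MB cap is enforced server-side, it can save a round trip to check the file size before uploading. The following is a minimal sketch, not part of the API: `checkUploadSize` and `MAX_UPLOAD_BYTES` are hypothetical names, and the limit is assumed to mean 25 × 1024 × 1024 bytes (this page does not say whether "MB" is binary or decimal).

```typescript
// Assumption: "25 MB" is interpreted as 25 * 1024 * 1024 bytes.
const MAX_UPLOAD_BYTES = 25 * 1024 * 1024;

type SizeCheck = { ok: true } | { ok: false; reason: string };

// Hypothetical helper: rejects empty or oversized payloads before any network call.
function checkUploadSize(
  byteLength: number,
  limit: number = MAX_UPLOAD_BYTES,
): SizeCheck {
  if (byteLength <= 0) {
    return { ok: false, reason: "file is empty" };
  }
  if (byteLength > limit) {
    return {
      ok: false,
      reason: `file is ${byteLength} bytes, limit is ${limit} bytes`,
    };
  }
  return { ok: true };
}
```

The multipart envelope adds some bytes on top of the file itself, so a file sitting right at the limit may still be rejected; leaving a little headroom is the safer choice.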

Supported models

Model	Notes
`whisper-large-v3`	Highest quality, slower
`whisper-large-v3-turbo`	Faster, near-equivalent quality

Examples

curl https://api.qubittron.ai/v1/audio/transcriptions \
  -H "Authorization: Bearer $QUBITTRON_API_KEY" \
  -F "model=whisper-large-v3-turbo" \
  -F "language=en" \
  -F "file=@/path/to/audio.wav"

// OpenAI SDK for Node.js, pointed at the Qubittron base URL.
import OpenAI from "openai";
import { createReadStream } from "node:fs";

const client = new OpenAI({
  baseURL: "https://api.qubittron.ai/v1",
  apiKey: process.env.QUBITTRON_API_KEY,
});

const res = await client.audio.transcriptions.create({
  model: "whisper-large-v3-turbo",
  file: createReadStream("/path/to/audio.wav"),
  language: "en",
});
console.log(res.text);

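// Plain fetch (Node.js 18+), no SDK required.
import { readFileSync } from "node:fs";

const fileBytes = readFileSync("/path/to/audio.wav");
const fd = new FormData();
fd.append("model", "whisper-large-v3-turbo");
fd.append("language", "en");
fd.append("file", new Blob([fileBytes], { type: "audio/wav" }), "audio.wav");

// Do not set Content-Type by hand: fetch adds the multipart boundary itself.
const res = await fetch("https://api.qubittron.ai/v1/audio/transcriptions", {
  method: "POST",
  headers: { Authorization: `Bearer ${process.env.QUBITTRON_API_KEY}` },
  body: fd,
});
if (!res.ok) {
  // Surface the status and raw body instead of parsing an error as a transcript.
  throw new Error(`Transcription failed: ${res.status} ${await res.text()}`);
}
const json = (await res.json()) as { text: string; language?: string };
console.log(json.text);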

Response

Default (response_format omitted or "json"):

{
  "text": "the quick brown fox jumps over the lazy dog",
  "language": "en"
}

response_format: "verbose_json" adds segments and timing.
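
This page does not spell out the `verbose_json` schema. The sketch below assumes it mirrors OpenAI's format, where each entry in `segments` carries `start` and `end` times in seconds plus a `text` field; verify that against a real response before relying on it. `Segment`, `formatTimestamp`, and `segmentsToLines` are illustrative names, not part of the API.

```typescript
// Assumed segment shape, borrowed from OpenAI's verbose_json output.
interface Segment {
  start: number; // seconds from the beginning of the audio
  end: number;   // seconds from the beginning of the audio
  text: string;
}

// Formats a duration in seconds as HH:MM:SS.mmm.
function formatTimestamp(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const s = Math.floor(totalMs / 1000) % 60;
  const m = Math.floor(totalMs / 60_000) % 60;
  const h = Math.floor(totalMs / 3_600_000);
  const pad = (n: number, width = 2) => String(n).padStart(width, "0");
  return `${pad(h)}:${pad(m)}:${pad(s)}.${pad(ms, 3)}`;
}

// Renders one human-readable line per segment.
function segmentsToLines(segments: Segment[]): string[] {
  return segments.map(
    (seg) =>
      `[${formatTimestamp(seg.start)} -> ${formatTimestamp(seg.end)}] ${seg.text.trim()}`,
  );
}
```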

Errors

Status	Code	When
400	`invalid_request`	Missing fields, non-multipart Content-Type
400	`model_not_found`	Model unknown or doesn't support transcriptions
401	`invalid_api_key`	Missing/invalid Bearer token
402	`insufficient_funds`	Account credit exhausted
413	`request_entity_too_large`	File or body exceeds 25 MB
429	`rate_limit_exceeded`	Rate limit hit
502	`upstream_error`	Upstream returned 5xx or unparseable JSON
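
One way to act on the table above is to retry only the statuses that can plausibly succeed on a second attempt. The sketch below treats 429 and 502 as retryable and everything else as final; that split is an assumption, not documented behavior, and `isRetryable`, `backoffMs`, and `sendWithRetry` are hypothetical helpers.

```typescript
// Assumption: only rate limits (429) and upstream failures (502) are worth retrying.
const RETRYABLE_STATUSES = new Set<number>([429, 502]);

function isRetryable(status: number): boolean {
  return RETRYABLE_STATUSES.has(status);
}

// Exponential backoff: baseMs, 2*baseMs, 4*baseMs, ... capped at capMs.
function backoffMs(attempt: number, baseMs = 500, capMs = 8000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Minimal view of a fetch Response, so the helper is easy to test.
interface StatusLike {
  status: number;
  ok: boolean;
}

// Calls `send` up to `maxAttempts` times, sleeping between retryable failures.
// `send` must build a fresh request on every call.
async function sendWithRetry<R extends StatusLike>(
  send: () => Promise<R>,
  maxAttempts = 3,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<R> {
  let res = await send();
  for (
    let attempt = 0;
    attempt < maxAttempts - 1 && !res.ok && isRetryable(res.status);
    attempt++
  ) {
    await sleep(backoffMs(attempt));
    res = await send();
  }
  return res;
}
```

Statuses such as 400, 401, 402, and 413 indicate a problem with the request or the account, so the sketch returns them to the caller immediately rather than repeating a request that would fail the same way.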

Pricing

Metered per second of audio. Duration is taken from the upstream response's usage.seconds field when present (e.g. with verbose_json); otherwise it is parsed from the file's audio header.
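
To estimate the billable duration client-side, the same fallback order can be mirrored locally. The sketch below handles uncompressed WAV only, reading the byte rate from the `fmt ` chunk and the payload size from the `data` chunk. It illustrates the idea and is not the service's actual parser; `wavDurationSeconds` and `billableSeconds` are hypothetical names, and any rounding the service applies is not modeled.

```typescript
// Returns the duration implied by a RIFF/WAVE header, or null if it cannot be read.
function wavDurationSeconds(bytes: Uint8Array): number | null {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  // Reads a 4-character chunk tag at the given offset.
  const tag = (off: number): string =>
    String.fromCharCode(
      view.getUint8(off),
      view.getUint8(off + 1),
      view.getUint8(off + 2),
      view.getUint8(off + 3),
    );

  if (bytes.byteLength < 12 || tag(0) !== "RIFF" || tag(8) !== "WAVE") {
    return null;
  }

  let byteRate = 0;
  let offset = 12; // first chunk follows the 12-byte RIFF header
  while (offset + 8 <= bytes.byteLength) {
    const id = tag(offset);
    const size = view.getUint32(offset + 4, true); // little-endian chunk size
    if (id === "fmt " && offset + 20 <= bytes.byteLength) {
      // fmt body layout: format(2) channels(2) sampleRate(4) byteRate(4) ...
      byteRate = view.getUint32(offset + 16, true);
    } else if (id === "data") {
      return byteRate > 0 ? size / byteRate : null;
    }
    offset += 8 + size + (size % 2); // chunks are padded to even length
  }
  return null;
}

// Prefers the upstream-reported duration, falling back to the audio header.
function billableSeconds(
  usageSeconds: number | undefined,
  fileBytes: Uint8Array,
): number | null {
  return usageSeconds ?? wavDurationSeconds(fileBytes);
}
```

Only the header is inspected, so passing the first few kilobytes of a large file is enough as long as the `data` chunk header falls within that slice.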

Audio transcriptions