
Sovereign Canadian inference. OpenAI-compatible API for chat, embeddings, images, transcription, and TTS.

Qubi Bastion is a sovereign Canadian inference platform. It exposes one OpenAI-compatible API surface across open-source LLMs (Llama, Mistral, Qwen, GPT-OSS), embeddings, image generation, transcription, and TTS, all hosted on Canadian infrastructure with data residency you can audit.

Keep your existing OpenAI SDK calls and change only the `baseURL` and the API key:

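```typescript
import OpenAI from "openai";

// Point the standard OpenAI client at Bastion instead of api.openai.com.
const client = new OpenAI({
  baseURL: "https://api.qubittron.ai/v1",
  apiKey: process.env.QUBITTRON_API_KEY,
});

const res = await client.chat.completions.create({
  model: "gpt-oss-120b",
  messages: [{ role: "user", content: "Hello!" }],
});

// The reply text is in the first choice, as in OpenAI's response shape.
console.log(res.choices[0].message.content);
```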
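
Where the SDK is not available, the same call can be sketched with plain `fetch`. This is a minimal sketch, assuming the chat endpoint accepts the same JSON body as OpenAI's. `buildChatRequest`, `sendChat`, `ChatMessage`, and `PreparedRequest` are hypothetical names introduced here for illustration, not part of any official client.

```typescript
// Hypothetical helper types for this sketch; not part of an official SDK.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

interface PreparedRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

const BASE_URL = "https://api.qubittron.ai/v1";

// Builds the request without sending it, so it can be inspected or logged first.
function buildChatRequest(
  apiKey: string,
  model: string,
  messages: ChatMessage[],
): PreparedRequest {
  return {
    url: `${BASE_URL}/chat/completions`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ model, messages }),
  };
}

// Sends a prepared request using the global fetch (Node 18+, Bun, edge runtimes).
async function sendChat(req: PreparedRequest): Promise<unknown> {
  const res = await fetch(req.url, {
    method: req.method,
    headers: req.headers,
    body: req.body,
  });
  if (!res.ok) throw new Error(`Request failed with status ${res.status}`);
  return res.json();
}
```

Splitting "build" from "send" keeps the network call in one place and makes the header and body easy to check in a unit test.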

Getting started

Create a key, set the base URL, make your first request.

API reference

All eight endpoints with cURL, SDK, and fetch examples.

TypeScript SDK

Official typed client for Node, Bun, and edge — with streaming, errors, and guides.

Guides

Migrate from OpenAI, build RAG, wire up audio pipelines.

Best practices

Keys, retries, streaming, and a production checklist.

Endpoint surface

| Endpoint | What it does |
| --- | --- |
| `GET /v1/models` | List available models |
| `POST /v1/chat/completions` | Chat completions (streaming optional) |
| `POST /v1/completions` | Legacy text completions |
| `POST /v1/responses` | OpenAI Responses API (`store: false`) |
| `POST /v1/embeddings` | Vector embeddings |
| `POST /v1/images/generations` | Image generation |
| `POST /v1/audio/transcriptions` | Speech-to-text (multipart) |
| `POST /api/v1/tts/text_to_audio` | Text-to-speech (NVIDIA Riva) |
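
Streaming on the chat endpoint is optional. The sketch below shows one way a client might extract text from a streamed reply, assuming the stream uses OpenAI-style server-sent events (`data: <json>` lines ending with `data: [DONE]`). That framing is an assumption here, not something this page confirms, and `parseStreamChunk` and `StreamResult` are hypothetical names.

```typescript
// Result of parsing one chunk of streamed text.
interface StreamResult {
  text: string; // concatenated content deltas found in the chunk
  done: boolean; // true once the [DONE] sentinel was seen
}

// Extracts text deltas from one chunk of a chat-completions event stream.
// Assumes OpenAI-style SSE framing; the chunk must contain whole lines.
function parseStreamChunk(chunk: string): StreamResult {
  let text = "";
  let done = false;
  for (const rawLine of chunk.split("\n")) {
    const line = rawLine.trim();
    if (!line.startsWith("data:")) continue; // skip blank lines and SSE comments
    const payload = line.slice("data:".length).trim();
    if (payload === "[DONE]") {
      done = true;
      break;
    }
    const event = JSON.parse(payload) as {
      choices?: { delta?: { content?: string } }[];
    };
    text += event.choices?.[0]?.delta?.content ?? "";
  }
  return { text, done };
}
```

A real reader would also need to buffer partial lines, since a network read can end in the middle of an event. This sketch leaves that out to stay short.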

Compatibility

Authentication uses a Bearer token (`Authorization: Bearer qbt_...`). Request and response shapes mirror OpenAI's API for `chat.completions`, `completions`, `embeddings`, `images.generations`, and `audio.transcriptions`. The `responses` endpoint requires `store: false`. `text_to_audio` follows NVIDIA Riva's TTS contract.
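
The two rules above (Bearer authentication and `store: false` on the responses endpoint) can be captured in small helpers so they are hard to forget. This is a sketch under stated assumptions: the `model` and `input` fields follow OpenAI's Responses API and are assumed to be accepted unchanged, and `buildResponsesBody`, `authHeaders`, and `ResponsesPayload` are hypothetical names.

```typescript
// Hypothetical payload type covering only the fields used in this sketch.
interface ResponsesPayload {
  model: string;
  input: string; // assumed to match OpenAI's Responses API field name
  store: false; // the responses endpoint requires store: false
}

// Builds a /v1/responses body; the type makes store: true impossible.
function buildResponsesBody(model: string, input: string): ResponsesPayload {
  return { model, input, store: false };
}

// Builds the request headers and rejects an empty key before any network call.
function authHeaders(apiKey: string): Record<string, string> {
  if (apiKey.trim() === "") throw new Error("API key is empty");
  return {
    Authorization: `Bearer ${apiKey}`,
    "Content-Type": "application/json",
  };
}
```

Typing `store` as the literal `false` moves the requirement into the compiler, so a caller cannot send a stored response by accident.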
