Bastion API
Sovereign Canadian inference. OpenAI-compatible API for chat, embeddings, images, transcription, and TTS.
Qubi Bastion is a sovereign Canadian inference platform. It exposes one OpenAI-compatible API surface across open-source LLMs (Llama, Mistral, Qwen, GPT-OSS), embeddings, image generation, transcription, and TTS, all hosted on Canadian infrastructure with data residency you can audit.
Keep your existing OpenAI SDK calls and change only the baseURL and the API key.

    import OpenAI from "openai";

    // Point the OpenAI SDK at the Bastion base URL and use a Bastion key.
    const client = new OpenAI({
      baseURL: "https://api.qubittron.ai/v1",
      apiKey: process.env.QUBITTRON_API_KEY,
    });

    // Send a single-turn chat completion request.
    const res = await client.chat.completions.create({
      model: "gpt-oss-120b",
      messages: [{ role: "user", content: "Hello!" }],
    });

Getting started
Create a key, set the base URL, make your first request.
API reference
All eight endpoints with cURL, SDK, and fetch examples.
TypeScript SDK
Official typed client for Node, Bun, and edge — with streaming, errors, and guides.
Guides
Migrate from OpenAI, build RAG, wire up audio pipelines.
Best practices
Keys, retries, streaming, and a production checklist.
Endpoint surface
| Endpoint | What it does |
|---|---|
| `GET /v1/models` | List available models |
| `POST /v1/chat/completions` | Chat completions (streaming optional) |
| `POST /v1/completions` | Legacy text completions |
| `POST /v1/responses` | OpenAI Responses API (`store: false`) |
| `POST /v1/embeddings` | Vector embeddings |
| `POST /v1/images/generations` | Image generation |
| `POST /v1/audio/transcriptions` | Speech-to-text (multipart) |
| `POST /api/v1/tts/text_to_audio` | Text-to-speech (NVIDIA Riva) |
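
For callers who use plain `fetch` instead of the SDK, the endpoint surface can be sketched as typed data plus a small URL helper. This is an illustrative sketch, not part of any official client: `ENDPOINTS`, `EndpointName`, and `endpointUrl` are hypothetical names, and it assumes every path resolves against the same host as the quick-start base URL. The TTS path in particular uses a different prefix, so treat its host as an assumption.

```typescript
type Method = "GET" | "POST";

interface Endpoint {
  method: Method;
  path: string;
}

// Hypothetical short names for the eight documented endpoints.
type EndpointName =
  | "models"
  | "chatCompletions"
  | "completions"
  | "responses"
  | "embeddings"
  | "imageGenerations"
  | "transcriptions"
  | "textToAudio";

// Assumption: the host is the one used in the quick-start base URL.
const ORIGIN = "https://api.qubittron.ai";

const ENDPOINTS: Record<EndpointName, Endpoint> = {
  models: { method: "GET", path: "/v1/models" },
  chatCompletions: { method: "POST", path: "/v1/chat/completions" },
  completions: { method: "POST", path: "/v1/completions" },
  responses: { method: "POST", path: "/v1/responses" },
  embeddings: { method: "POST", path: "/v1/embeddings" },
  imageGenerations: { method: "POST", path: "/v1/images/generations" },
  transcriptions: { method: "POST", path: "/v1/audio/transcriptions" },
  // Different prefix (/api/v1); resolving it on the same host is an assumption.
  textToAudio: { method: "POST", path: "/api/v1/tts/text_to_audio" },
};

// Join the origin and the documented path for a named endpoint.
function endpointUrl(name: EndpointName, origin: string = ORIGIN): string {
  return `${origin}${ENDPOINTS[name].path}`;
}
```

With this in place, `endpointUrl("embeddings")` yields the full URL to pass to `fetch`, and `ENDPOINTS.embeddings.method` supplies the HTTP method.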
Compatibility
Authentication uses a Bearer token in the `Authorization` header (`Authorization: Bearer qbt_...`). Request and response shapes mirror OpenAI's API for `chat.completions`, `completions`, `embeddings`, `images.generations`, and `audio.transcriptions`. The `responses` endpoint requires `store: false`. `text_to_audio` follows NVIDIA Riva's TTS contract.
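
The two compatibility rules that most often trip up a raw HTTP call, the Bearer header and `store: false` on the Responses endpoint, can be sketched as a request builder. This is a minimal sketch under stated assumptions: `buildResponsesRequest`, `ResponsesBody`, and `PreparedRequest` are hypothetical names, the `model` and `input` fields follow OpenAI's Responses API shape, and the URL assumes the quick-start host.

```typescript
// Hypothetical body type; field names follow OpenAI's Responses API.
interface ResponsesBody {
  model: string;
  input: string;
  store: false; // typed as the literal false, since the endpoint requires it
}

// Hypothetical container for everything fetch needs.
interface PreparedRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildResponsesRequest(
  apiKey: string,
  model: string,
  input: string,
): PreparedRequest {
  if (!apiKey) {
    throw new Error("Missing API key");
  }
  const body: ResponsesBody = { model, input, store: false };
  return {
    // Assumption: same host as the quick-start base URL.
    url: "https://api.qubittron.ai/v1/responses",
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(body),
  };
}
```

A caller could destructure the result as `const { url, ...init } = buildResponsesRequest(key, "gpt-oss-120b", "Hello!")` and pass both to `fetch(url, init)`. Typing `store` as the literal `false` lets the compiler reject a body that tries to set it to `true`.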