Qubittron Bastion

Bastion API

Sovereign Canadian inference. OpenAI-compatible API for chat, embeddings, images, transcription, and TTS.

Qubittron Bastion is a sovereign Canadian inference platform. It exposes one OpenAI-compatible API surface across open-source LLMs (Llama, Mistral, Qwen, GPT-OSS), embeddings, image generation, transcription, and TTS, all hosted on Canadian infrastructure with data residency you can audit.

Existing OpenAI SDK calls work as a drop-in: change only the baseURL and the API key.

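import OpenAI from "openai";

// Point the official OpenAI SDK at Bastion instead of the default OpenAI host.
const client = new OpenAI({
  baseURL: "https://api.qubittron.ai/v1",
  apiKey: process.env.QUBITTRON_API_KEY,
});

const res = await client.chat.completions.create({
  model: "gpt-oss-120b",
  messages: [{ role: "user", content: "Hello!" }],
});

// The response shape mirrors OpenAI's: the reply text is in the first choice.
console.log(res.choices[0].message.content);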

Endpoint surface

| Endpoint | What it does |
| --- | --- |
| GET /v1/models | List available models |
| POST /v1/chat/completions | Chat completions (streaming optional) |
| POST /v1/completions | Legacy text completions |
| POST /v1/responses | OpenAI Responses API (store: false) |
| POST /v1/embeddings | Vector embeddings |
| POST /v1/images/generations | Image generation |
| POST /v1/audio/transcriptions | Speech-to-text (multipart) |
| POST /api/v1/tts/text_to_audio | Text-to-speech (NVIDIA Riva) |
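
The endpoint list above can be captured in a small lookup so that client code does not hard-code paths. The sketch below is illustrative only: the names `ENDPOINTS` and `endpointUrl` are hypothetical, and it assumes every path (including the TTS path under /api/v1) is served from the same origin as the SDK base URL shown earlier, which this page does not state explicitly.

```typescript
// Assumption: all endpoints share the origin of the documented base URL.
const ORIGIN = "https://api.qubittron.ai";

// Methods and paths copied from the endpoint table.
const ENDPOINTS = {
  models: { method: "GET", path: "/v1/models" },
  chatCompletions: { method: "POST", path: "/v1/chat/completions" },
  completions: { method: "POST", path: "/v1/completions" },
  responses: { method: "POST", path: "/v1/responses" },
  embeddings: { method: "POST", path: "/v1/embeddings" },
  imageGenerations: { method: "POST", path: "/v1/images/generations" },
  audioTranscriptions: { method: "POST", path: "/v1/audio/transcriptions" },
  // Note the different prefix: TTS is listed under /api/v1, not /v1.
  textToAudio: { method: "POST", path: "/api/v1/tts/text_to_audio" },
} as const;

type EndpointName = keyof typeof ENDPOINTS;

// Join the origin and the endpoint path, tolerating a trailing slash on the origin.
function endpointUrl(name: EndpointName, origin: string = ORIGIN): string {
  return origin.replace(/\/+$/, "") + ENDPOINTS[name].path;
}
```

Because the TTS route sits outside the /v1 prefix, a client configured with the /v1 base URL would likely need a separate call path for it; keeping the full paths in one table makes that difference visible.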

Compatibility

Authentication uses a Bearer token (Authorization: Bearer qbt_...). Request and response shapes mirror OpenAI's API for chat.completions, completions, embeddings, images.generations, and audio.transcriptions. The responses endpoint requires store: false. text_to_audio follows NVIDIA Riva's TTS contract.
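
For callers not using an SDK, the two rules above (Bearer authentication and store: false on the responses endpoint) can be sketched as plain request builders. This is a minimal sketch under stated assumptions: the helper names `authHeaders`, `chatRequest`, and `responsesRequest` are hypothetical, the base URL is the one from the SDK example, and the `model`/`input`/`store` body fields follow OpenAI's Responses API shape, which Bastion is assumed to accept unchanged.

```typescript
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

interface PreparedRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Base URL taken from the SDK example on this page.
const BASE_URL = "https://api.qubittron.ai/v1";

// Bearer-token header, e.g. "Authorization: Bearer qbt_...".
function authHeaders(apiKey: string): Record<string, string> {
  return {
    Authorization: `Bearer ${apiKey}`,
    "Content-Type": "application/json",
  };
}

// Build a POST /v1/chat/completions request in the OpenAI chat shape.
function chatRequest(apiKey: string, model: string, messages: ChatMessage[]): PreparedRequest {
  return {
    url: `${BASE_URL}/chat/completions`,
    method: "POST",
    headers: authHeaders(apiKey),
    body: JSON.stringify({ model, messages }),
  };
}

// Build a POST /v1/responses request; store is always false, as the endpoint requires.
function responsesRequest(apiKey: string, model: string, input: string): PreparedRequest {
  return {
    url: `${BASE_URL}/responses`,
    method: "POST",
    headers: authHeaders(apiKey),
    body: JSON.stringify({ model, input, store: false }),
  };
}
```

A prepared request can then be sent with `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`. Fixing `store: false` inside the builder, rather than leaving it to each caller, avoids accidentally sending a request the endpoint would reject.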
