
POST https://api.qubittron.ai/v1/chat/completions

The primary entry point for conversational LLMs. The endpoint is OpenAI-compatible: keep your existing SDK code and swap baseURL to point at this API.

Authentication

Authorization: Bearer qbt_<key>
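As a minimal sketch of the header format above, the helper below builds the request headers from a key. The name buildAuthHeaders is hypothetical, and the client-side check for the qbt_ prefix is an assumption based only on the key format shown here; the server remains the authority on what counts as a valid key.

```typescript
// Hypothetical helper: builds headers for a JSON request to the API.
// The "qbt_" prefix check is a client-side convenience, assumed from the
// documented key format; it does not prove the key is valid.
function buildAuthHeaders(apiKey: string | undefined): Record<string, string> {
  if (!apiKey || !apiKey.startsWith("qbt_")) {
    throw new Error("Expected an API key of the form qbt_<key>");
  }
  return {
    Authorization: `Bearer ${apiKey}`,
    "Content-Type": "application/json",
  };
}
```

Failing early on a missing key tends to give a clearer message than waiting for a 401 from the server.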

Request body

Field	Type	Required	Notes
`model`	`string`	yes	Any LLM model id (see below)
`messages`	`Message[]`	yes	OpenAI message format
`stream`	`boolean`	no	When `true`, returns SSE stream
`max_tokens`, `temperature`, `top_p`, `tools`, `tool_choice`, `response_format`, etc.	various	no	Passed through to upstream

Unknown fields pass through unchanged: Bastion does not strip OpenAI fields it doesn't recognize.
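One way to model this body in TypeScript is a type with the two required fields, a few common optional ones, and an index signature for everything that is passed through. This is a sketch, not an official type: ChatCompletionRequest and buildChatRequest are hypothetical names, and Message is simplified to string content even though the OpenAI format also allows structured content.

```typescript
// Simplified message shape; the OpenAI format also allows structured content.
type Role = "system" | "user" | "assistant" | "tool";

interface Message {
  role: Role;
  content: string;
}

// Hypothetical request type: required fields from the table above, plus an
// index signature so pass-through fields are accepted without being listed.
interface ChatCompletionRequest {
  model: string;
  messages: Message[];
  stream?: boolean;
  max_tokens?: number;
  temperature?: number;
  top_p?: number;
  [extra: string]: unknown;
}

// Spreads the extra fields first so that model and messages always win.
function buildChatRequest(
  model: string,
  messages: Message[],
  extra: Record<string, unknown> = {},
): ChatCompletionRequest {
  if (!model) {
    throw new Error("model is required");
  }
  return { ...extra, model, messages };
}
```

A call such as buildChatRequest("gpt-oss-120b", [{ role: "user", content: "hi" }], { max_tokens: 16 }) produces a body that can be passed to JSON.stringify as-is.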

Supported models

Model	Category
`gpt-oss-120b`	LLM
`gpt-oss-20b`	LLM
`Llama-3.1-8B-Instruct`	LLM
`Meta-Llama-3_3-70B-Instruct`	LLM
`Qwen3-32B`	LLM
`Mistral-7B-Instruct-v0.3`	LLM
`Mistral-Small-3.2-24B-Instruct-2506`	LLM
`Mistral-Nemo-Instruct-2407`	LLM
`Qwen3-Coder-30B-A3B-Instruct`	Code
`Qwen2.5-VL-72B-Instruct`	Vision
`Qwen3Guard-Gen-8B`	Safety
`Qwen3Guard-Gen-0.6B`	Safety

Bastion does not enforce tool or structured-output support; requests pass through unchanged. If the upstream model doesn't support tools or response_format, the request may fail with a 502.
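Because a 502 can come from an upstream model that rejects tools or response_format, one possible client-side fallback is to retry once with those fields removed. The sketch below only prepares the fallback body; withoutToolFields is a hypothetical helper, and whether a plain-text retry is acceptable depends on your application.

```typescript
// Fields named in the note above as potentially unsupported upstream.
const TOOL_FIELDS = ["tools", "tool_choice", "response_format"] as const;

// Returns a shallow copy of the request body without tool-related fields.
// The original object is left untouched.
function withoutToolFields(body: Record<string, unknown>): Record<string, unknown> {
  const copy: Record<string, unknown> = { ...body };
  for (const field of TOOL_FIELDS) {
    delete copy[field];
  }
  return copy;
}
```

A 502 can also mean the upstream model is unreachable, so this fallback is a guess about the cause, not a diagnosis.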

Use GET /v1/models for the live list on your account.
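The response shape of GET /v1/models is not documented on this page. The sketch below assumes the OpenAI-style list shape ({ "data": [{ "id": "..." }] }), which is an assumption to verify against a real response; extractModelIds and listModelIds are hypothetical names.

```typescript
// Pulls model ids out of a payload, assuming the OpenAI-style list shape.
// Returns an empty array for anything that does not match that shape.
function extractModelIds(payload: unknown): string[] {
  if (typeof payload !== "object" || payload === null) return [];
  const data = (payload as { data?: unknown }).data;
  if (!Array.isArray(data)) return [];
  return data
    .map((entry) => (entry as { id?: unknown } | null)?.id)
    .filter((id): id is string => typeof id === "string");
}

// Fetches the live model list for the account that owns apiKey.
async function listModelIds(apiKey: string): Promise<string[]> {
  const res = await fetch("https://api.qubittron.ai/v1/models", {
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  if (!res.ok) {
    throw new Error(`GET /v1/models failed with status ${res.status}`);
  }
  return extractModelIds(await res.json());
}
```

Checking a model id against this list before sending a chat request avoids a round trip that would end in model_not_found.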

Examples

curl https://api.qubittron.ai/v1/chat/completions \
  -H "Authorization: Bearer $QUBITTRON_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-oss-120b",
    "messages": [{ "role": "user", "content": "Reply with exactly: ok" }],
    "max_tokens": 16
  }'

Non-streaming:

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.qubittron.ai/v1",
  apiKey: process.env.QUBITTRON_API_KEY,
});

const res = await client.chat.completions.create({
  model: "gpt-oss-120b",
  messages: [{ role: "user", content: "Reply with exactly: ok" }],
  max_tokens: 16,
});
console.log(res.choices[0]?.message.content);

Streaming:

const stream = await client.chat.completions.create({
  model: "gpt-oss-120b",
  messages: [{ role: "user", content: "Count from 1 to 5." }],
  stream: true,
});
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}

Plain fetch, without the SDK:

const res = await fetch("https://api.qubittron.ai/v1/chat/completions", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.QUBITTRON_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "gpt-oss-120b",
    messages: [{ role: "user", content: "Reply with exactly: ok" }],
    max_tokens: 16,
  }),
});
// Fail fast on the statuses listed under Errors instead of parsing an error body as a completion.
if (!res.ok) {
  throw new Error(`Chat completion failed with status ${res.status}`);
}
const json = (await res.json()) as {
  choices: { message: { role: string; content: string } }[];
  usage: { prompt_tokens: number; completion_tokens: number; total_tokens: number };
};
console.log(json.choices[0]?.message.content);

Response

Non-streaming:

{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1735689600,
  "model": "gpt-oss-120b",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "ok" },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 1,
    "total_tokens": 13
  }
}

Streaming (stream: true) returns text/event-stream with OpenAI-format data: {…} chunks ending with data: [DONE].
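If you consume the stream without the SDK, the data: lines have to be parsed by hand. The following is a minimal sketch that works on a complete event-stream body held in a string; collectStreamText is a hypothetical name, and a real streaming consumer would also need to buffer partial lines that are split across network chunks.

```typescript
// Minimal chunk shape: only the fields this sketch reads.
interface StreamChunk {
  choices: { delta?: { content?: string | null } }[];
}

// Concatenates delta.content from every data: line until data: [DONE].
function collectStreamText(sseBody: string): string {
  let text = "";
  for (const rawLine of sseBody.split("\n")) {
    const line = rawLine.trim();
    if (!line.startsWith("data:")) continue; // skip blank separator lines
    const payload = line.slice("data:".length).trim();
    if (payload === "[DONE]") break; // end-of-stream sentinel
    const chunk = JSON.parse(payload) as StreamChunk;
    text += chunk.choices[0]?.delta?.content ?? "";
  }
  return text;
}
```

Anything after the [DONE] sentinel is ignored, and chunks without a content delta (for example a role-only first chunk) contribute nothing.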

Errors

Status	Code	When
400	`invalid_request`	Body failed validation (`messages` missing, etc.)
400	`model_not_found`	Model unknown or doesn't support chat
401	`invalid_api_key`	Missing/invalid Bearer token
402	`insufficient_funds`	Account credit exhausted
429	`rate_limit_exceeded`	Rate limit hit
502	`upstream_error`	Upstream model unreachable or 5xx
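A client can map these statuses to a coarse action. The sketch below is one possible policy, not behavior defined by the API: the action names, the choice to treat 429 and 502 as retryable, and the backoff parameters are all assumptions.

```typescript
type ErrorAction =
  | "fix_request"        // 400: invalid_request or model_not_found
  | "check_credentials"  // 401: invalid_api_key
  | "add_credit"         // 402: insufficient_funds
  | "retry_later"        // 429 or 502
  | "unknown";

// Maps an HTTP status from the table above to a suggested client action.
function actionForStatus(status: number): ErrorAction {
  switch (status) {
    case 400:
      return "fix_request";
    case 401:
      return "check_credentials";
    case 402:
      return "add_credit";
    case 429:
    case 502:
      return "retry_later";
    default:
      return "unknown";
  }
}

// Exponential backoff with a cap; attempt starts at 0.
// The default base and cap values are illustrative only.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 8000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}
```

Retries are only worth making for "retry_later"; the other actions need a change to the request, the key, or the account before the call can succeed.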

Pricing

Metered per token (input + output). Per-model rates are listed on your dashboard.
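The usage object in each non-streaming response gives the token counts needed for a rough cost estimate. The sketch below assumes separate input and output rates quoted per million tokens; that unit, the TokenRate type, and the estimateCost name are assumptions, and the actual rates must come from your dashboard.

```typescript
// Matches the usage object in the non-streaming response.
interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
}

// Hypothetical rate shape: price per one million tokens, in any currency.
// If a model has a single rate, set both fields to the same value.
interface TokenRate {
  inputPerMillion: number;
  outputPerMillion: number;
}

// Estimated cost of one request under the assumed rate shape.
function estimateCost(usage: Usage, rate: TokenRate): number {
  const inputCost = usage.prompt_tokens * rate.inputPerMillion;
  const outputCost = usage.completion_tokens * rate.outputPerMillion;
  return (inputCost + outputCost) / 1_000_000;
}
```

Summing estimateCost over responses gives a running client-side estimate; the dashboard remains the source of truth for billing.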
