Qubittron Bastion
API reference

Completions

POST /v1/completions — legacy text completions. Prefer chat completions for new work.

POST https://api.qubittron.ai/v1/completions

This endpoint is OpenAI-compatible and serves legacy text completions. It is useful for non-chat prompts and for porting older code; new integrations should prefer chat completions.

Authentication

Authorization: Bearer qbt_<key>

Request body

Field                                       Type               Required  Notes
model                                       string             yes       LLM model id
prompt                                      string | string[]  yes       Single string or array
stream                                      boolean            no        When true, returns SSE
max_tokens, temperature, top_p, stop, etc.  various            no        Passthrough
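
The request fields above can be sketched as a TypeScript type. This is an illustration, not an official schema: the names `CompletionRequest` and `buildCompletionRequest` are hypothetical, the types given to the passthrough fields follow the usual OpenAI conventions (an assumption, since the table only says "various"), and the client-side checks are a design choice rather than documented server behavior.

```typescript
// Illustrative type; field names come from the request-body table.
// Types of the passthrough fields are assumed from OpenAI conventions.
interface CompletionRequest {
  model: string;
  prompt: string | string[];
  stream?: boolean;
  max_tokens?: number;
  temperature?: number;
  top_p?: number;
  stop?: string | string[];
}

// Hypothetical helper: checks the two required fields before serializing.
function buildCompletionRequest(req: CompletionRequest): string {
  if (req.model.trim() === "") {
    throw new Error("model is required");
  }
  const prompts = Array.isArray(req.prompt) ? req.prompt : [req.prompt];
  if (prompts.length === 0) {
    throw new Error("prompt must contain at least one string");
  }
  return JSON.stringify(req);
}

// An array prompt sends several prompts in one request.
const requestBody = buildCompletionRequest({
  model: "gpt-oss-120b",
  prompt: ["The capital of Canada is", "The capital of France is"],
  max_tokens: 16,
});
```

The resulting string can be passed as the body of a POST to the endpoint. How the server orders choices for an array prompt is not stated on this page, so match results to prompts by the `index` field rather than by position.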

Supported models

The same LLM, code, vision, and safety models that support chat completions also support completions:

Model                                Category
gpt-oss-120b                         LLM
gpt-oss-20b                          LLM
Llama-3.1-8B-Instruct                LLM
Meta-Llama-3_3-70B-Instruct          LLM
Qwen3-32B                            LLM
Mistral-7B-Instruct-v0.3             LLM
Mistral-Small-3.2-24B-Instruct-2506  LLM
Mistral-Nemo-Instruct-2407           LLM
Qwen3-Coder-30B-A3B-Instruct         Code
Qwen2.5-VL-72B-Instruct              Vision
Qwen3Guard-Gen-8B                    Safety
Qwen3Guard-Gen-0.6B                  Safety

Examples

With curl:

curl https://api.qubittron.ai/v1/completions \
  -H "Authorization: Bearer $QUBITTRON_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-oss-120b",
    "prompt": "The capital of Canada is",
    "max_tokens": 16
  }'

With the OpenAI SDK (TypeScript):

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.qubittron.ai/v1",
  apiKey: process.env.QUBITTRON_API_KEY,
});

const res = await client.completions.create({
  model: "gpt-oss-120b",
  prompt: "The capital of Canada is",
  max_tokens: 16,
});
console.log(res.choices[0]?.text);

With fetch (TypeScript):

const res = await fetch("https://api.qubittron.ai/v1/completions", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.QUBITTRON_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "gpt-oss-120b",
    prompt: "The capital of Canada is",
    max_tokens: 16,
  }),
});
// fetch does not throw on non-2xx statuses, so check explicitly (see Errors).
if (!res.ok) {
  throw new Error(`Completions request failed with HTTP ${res.status}`);
}
const json = (await res.json()) as {
  choices: { text: string; finish_reason: string }[];
  usage: { prompt_tokens: number; completion_tokens: number; total_tokens: number };
};
console.log(json.choices[0]?.text);

Response

{
  "id": "cmpl-...",
  "object": "text_completion",
  "created": 1735689600,
  "model": "gpt-oss-120b",
  "choices": [
    {
      "text": " Ottawa.",
      "index": 0,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 6,
    "completion_tokens": 3,
    "total_tokens": 9
  }
}

When stream is true, the response is sent as text/event-stream (server-sent events) with OpenAI-format chunks.
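
A minimal sketch of consuming such a stream on the client. It assumes the usual OpenAI conventions, which this page does not spell out: each event is a `data: <json>` line terminated by a blank line, line endings are `\n`, and the stream ends with a `data: [DONE]` sentinel. The class name `SseTextAccumulator` and the `CompletionChunk` type are hypothetical.

```typescript
// Assumed chunk shape, mirroring the non-streaming response's choices array.
interface CompletionChunk {
  choices: { text: string; index: number; finish_reason: string | null }[];
}

// Buffers raw SSE text and returns the completion text carried by every
// event that has been fully received so far.
class SseTextAccumulator {
  private buffer = "";
  done = false;

  push(chunk: string): string {
    this.buffer += chunk;
    let out = "";
    let sep: number;
    // A blank line ("\n\n") marks the end of one SSE event.
    while ((sep = this.buffer.indexOf("\n\n")) !== -1) {
      const event = this.buffer.slice(0, sep);
      this.buffer = this.buffer.slice(sep + 2);
      for (const line of event.split("\n")) {
        if (!line.startsWith("data:")) continue; // skip comments and other fields
        const data = line.slice(5).trim();
        if (data === "[DONE]") {
          this.done = true; // assumed end-of-stream sentinel
          continue;
        }
        const parsed = JSON.parse(data) as CompletionChunk;
        out += parsed.choices[0]?.text ?? "";
      }
    }
    return out;
  }
}
```

To use it with fetch, decode each chunk read from the response body to a string (for example with `TextDecoder` in streaming mode) and pass it to `push`. Network reads can split an event at any byte, which is why the partial tail stays in the buffer until its terminating blank line arrives.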

Errors

Status  Code                 When
400     invalid_request      Body failed validation
400     model_not_found      Model unknown or doesn't support completions
401     invalid_api_key      Missing/invalid Bearer token
402     insufficient_funds   Account credit exhausted
429     rate_limit_exceeded  Rate limit hit
502     upstream_error       Upstream model unreachable or 5xx
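
One way a client might react to these statuses is sketched below. The policy is an assumption, not documented behavior: it treats 429 and 502 as transient and worth retrying with exponential backoff, and everything else in the table as needing a change on the caller's side. The names `ErrorAction`, `actionForStatus`, and `retryDelayMs`, and the default delays, are hypothetical.

```typescript
// Hypothetical client-side policy keyed on the HTTP status alone.
type ErrorAction =
  | "fix_request"
  | "check_credentials"
  | "add_credit"
  | "retry"
  | "unknown";

function actionForStatus(status: number): ErrorAction {
  switch (status) {
    case 400: // invalid_request or model_not_found
      return "fix_request";
    case 401: // invalid_api_key
      return "check_credentials";
    case 402: // insufficient_funds
      return "add_credit";
    case 429: // rate_limit_exceeded
    case 502: // upstream_error
      return "retry";
    default:
      return "unknown";
  }
}

// Exponential backoff: baseMs, 2*baseMs, 4*baseMs, ... capped at capMs.
// attempt is zero-based. The defaults are illustrative values only.
function retryDelayMs(attempt: number, baseMs = 500, capMs = 8000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}
```

A caller would wait `retryDelayMs(attempt)` milliseconds before re-sending when `actionForStatus` returns "retry", and surface the error immediately otherwise. Adding random jitter to the delay is a common refinement when many clients may hit the rate limit at once.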

Pricing

Usage is metered per token, counting both input (prompt) and output (completion) tokens.
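
As a rough illustration, a per-request cost can be estimated from the `usage` object in the response. The rates below are placeholders and not real Qubittron prices, and the split into separate input and output rates is an assumption; this page only states that both kinds of token are metered. The names `Usage`, `Rates`, and `estimateCost` are hypothetical.

```typescript
// Shape of the usage object shown in the Response section.
interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
}

// PLACEHOLDER rates, in currency units per million tokens. Not real prices.
interface Rates {
  inputPerMillion: number;
  outputPerMillion: number;
}

function estimateCost(usage: Usage, rates: Rates): number {
  const input = usage.prompt_tokens * rates.inputPerMillion;
  const output = usage.completion_tokens * rates.outputPerMillion;
  return (input + output) / 1e6;
}

// Usage values taken from the sample response on this page.
const usage: Usage = { prompt_tokens: 6, completion_tokens: 3, total_tokens: 9 };
const cost = estimateCost(usage, { inputPerMillion: 1, outputPerMillion: 2 });
```

With these placeholder rates the sample request works out to (6 × 1 + 3 × 2) / 1,000,000 = 0.000012 currency units.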
