API reference
Completions
POST https://api.qubittron.ai/v1/completions

OpenAI-compatible legacy text completions. Useful for non-chat prompts and for porting older code. New integrations should prefer chat completions.
Authentication
Authorization: Bearer qbt_<key>
Request body
| Field | Type | Required | Notes |
|---|---|---|---|
| model | string | yes | Model id (see Supported models) |
| prompt | string \| string[] | yes | A single string or an array of strings |
| stream | boolean | no | When true, the response is returned as SSE |
| max_tokens, temperature, top_p, stop, etc. | various | no | Passthrough |
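The request body table can be mirrored in a small typed helper. This is only a sketch: the names CompletionRequest, RequestInitLike, and buildCompletionInit are hypothetical and not part of any Qubittron SDK, and the types of the optional fields are assumed to follow the OpenAI completions API, which this page does not spell out.

```typescript
// Hypothetical request type mirroring the request body table.
// Optional field types are assumed from the OpenAI completions API.
interface CompletionRequest {
  model: string;
  prompt: string | string[];
  stream?: boolean;
  max_tokens?: number;
  temperature?: number;
  top_p?: number;
  stop?: string | string[];
  // Stands in for the "etc." passthrough parameters in the table.
  [passthrough: string]: unknown;
}

// Minimal structural type for the object handed to fetch().
interface RequestInitLike {
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Checks the two required fields, then builds the auth header and JSON body.
function buildCompletionInit(apiKey: string, req: CompletionRequest): RequestInitLike {
  if (req.model.trim() === "") {
    throw new Error("model is required");
  }
  const prompts = Array.isArray(req.prompt) ? req.prompt : [req.prompt];
  if (prompts.length === 0) {
    throw new Error("prompt must be a string or a non-empty array");
  }
  return {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(req),
  };
}

// Example: two prompts in one request ("qbt_example" is a placeholder key).
const batchInit = buildCompletionInit("qbt_example", {
  model: "gpt-oss-20b",
  prompt: ["The capital of Canada is", "The capital of France is"],
  max_tokens: 16,
});
```

The returned object can be passed as the second argument to fetch("https://api.qubittron.ai/v1/completions", batchInit).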
Supported models
The same LLM, code, vision, and safety models that support chat completions also support completions:
| Model | Category |
|---|---|
| gpt-oss-120b | LLM |
| gpt-oss-20b | LLM |
| Llama-3.1-8B-Instruct | LLM |
| Meta-Llama-3_3-70B-Instruct | LLM |
| Qwen3-32B | LLM |
| Mistral-7B-Instruct-v0.3 | LLM |
| Mistral-Small-3.2-24B-Instruct-2506 | LLM |
| Mistral-Nemo-Instruct-2407 | LLM |
| Qwen3-Coder-30B-A3B-Instruct | Code |
| Qwen2.5-VL-72B-Instruct | Vision |
| Qwen3Guard-Gen-8B | Safety |
| Qwen3Guard-Gen-0.6B | Safety |
Examples
curl

curl https://api.qubittron.ai/v1/completions \
  -H "Authorization: Bearer $QUBITTRON_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-oss-120b",
    "prompt": "The capital of Canada is",
    "max_tokens": 16
  }'

TypeScript (OpenAI SDK)

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.qubittron.ai/v1",
  apiKey: process.env.QUBITTRON_API_KEY,
});

const res = await client.completions.create({
  model: "gpt-oss-120b",
  prompt: "The capital of Canada is",
  max_tokens: 16,
});

console.log(res.choices[0]?.text);

TypeScript (fetch)

const res = await fetch("https://api.qubittron.ai/v1/completions", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.QUBITTRON_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "gpt-oss-120b",
    prompt: "The capital of Canada is",
    max_tokens: 16,
  }),
});

const json = (await res.json()) as {
  choices: { text: string; finish_reason: string }[];
  usage: { prompt_tokens: number; completion_tokens: number; total_tokens: number };
};

console.log(json.choices[0]?.text);

Response
{
  "id": "cmpl-...",
  "object": "text_completion",
  "created": 1735689600,
  "model": "gpt-oss-120b",
  "choices": [
    {
      "text": " Ottawa.",
      "index": 0,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 6,
    "completion_tokens": 3,
    "total_tokens": 9
  }
}

Streaming returns text/event-stream with OpenAI-format chunks.
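For streamed responses, the chunks have to be reassembled on the client. The sketch below assumes the stream follows the usual OpenAI convention: each event is a single `data: <json>` line, each chunk carries a choices array with a text fragment, and the stream ends with `data: [DONE]`. This page does not spell out those details, so treat them as assumptions; collectStreamText and CompletionChunk are hypothetical names.

```typescript
// Assumed shape of one streamed chunk (OpenAI-style text completion chunk).
interface CompletionChunk {
  choices: { text: string; index: number; finish_reason: string | null }[];
}

// Takes an already-buffered SSE payload and concatenates the text
// fragments for choice index 0, stopping at the [DONE] sentinel.
function collectStreamText(sse: string): string {
  let out = "";
  for (const rawLine of sse.split(/\r?\n/)) {
    const line = rawLine.trim();
    // Skip blank separator lines, comments, and non-data fields.
    if (!line.startsWith("data:")) continue;
    const data = line.slice("data:".length).trim();
    if (data === "[DONE]") break;
    const chunk = JSON.parse(data) as CompletionChunk;
    for (const choice of chunk.choices) {
      if (choice.index === 0) out += choice.text;
    }
  }
  return out;
}

// Hand-written sample payload, not captured from the real service.
const sampleStream = [
  'data: {"choices":[{"text":" Ott","index":0,"finish_reason":null}]}',
  "",
  'data: {"choices":[{"text":"awa.","index":0,"finish_reason":"stop"}]}',
  "",
  "data: [DONE]",
  "",
].join("\n");

console.log(collectStreamText(sampleStream)); // prints " Ottawa."
```

When reading from a live connection, network reads can split a line in the middle, so a real client would buffer incoming bytes and only parse complete lines.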
Errors
| Status | Code | When |
|---|---|---|
| 400 | invalid_request | Body failed validation |
| 400 | model_not_found | Model unknown or doesn't support completions |
| 401 | invalid_api_key | Missing/invalid Bearer token |
| 402 | insufficient_funds | Account credit exhausted |
| 429 | rate_limit_exceeded | Rate limit hit |
| 502 | upstream_error | Upstream model unreachable or 5xx |
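One way to act on the table above is to retry only the statuses that look transient. The sketch below treats 429 and 502 as retryable and everything else as final; that split, the backoff values, and the names isRetryable, backoffDelayMs, and withRetry are assumptions of this example, not documented behavior. It keys off the HTTP status only, because the shape of the error body is not described on this page.

```typescript
// Statuses from the errors table that a retry could plausibly fix.
// Treating exactly these two as retryable is an assumption of this sketch.
const RETRYABLE_STATUSES = new Set<number>([429, 502]);

function isRetryable(status: number): boolean {
  return RETRYABLE_STATUSES.has(status);
}

// Exponential backoff with a cap: 500 ms, 1 s, 2 s, 4 s, then maxMs.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 8000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Calls send() up to maxAttempts times, waiting between retryable failures.
// Works with any result that exposes a numeric status, such as a fetch Response.
async function withRetry<T extends { status: number }>(
  send: () => Promise<T>,
  maxAttempts = 4,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let res = await send();
  for (let attempt = 0; attempt < maxAttempts - 1 && isRetryable(res.status); attempt++) {
    await sleep(backoffDelayMs(attempt));
    res = await send();
  }
  return res;
}
```

A caller would wrap the request, for example withRetry(() => fetch(url, init)), and then branch on the final status: 400 and 401 point at the request or key, and 402 at the account, so repeating the same call is unlikely to help.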
Pricing
Usage is metered per token, counting both input (prompt) and output (completion) tokens.
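Because every response carries a usage object, spend can be estimated on the client. The sketch below is illustrative only: this page lists no prices and does not say whether input and output tokens are billed at the same rate, so the TokenRates type, the separate input and output rates, and the numbers used are all placeholders supplied by the caller.

```typescript
// Shape of the usage object shown in the response example.
interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
}

// Hypothetical rate card: price per one million tokens, in any currency.
interface TokenRates {
  inputPerMillion: number;
  outputPerMillion: number;
}

// Multiplies each token count by its rate and sums the two parts.
function estimateCost(usage: Usage, rates: TokenRates): number {
  const input = (usage.prompt_tokens * rates.inputPerMillion) / 1e6;
  const output = (usage.completion_tokens * rates.outputPerMillion) / 1e6;
  return input + output;
}

// Usage values from the response example; the rates are placeholders, not real prices.
const exampleCost = estimateCost(
  { prompt_tokens: 6, completion_tokens: 3, total_tokens: 9 },
  { inputPerMillion: 1, outputPerMillion: 2 },
);
console.log(exampleCost.toFixed(6)); // prints "0.000012"
```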