POST https://api.qubittron.ai/v1/responses

This endpoint exposes OpenAI's Responses API in stateless mode. Always pass store: false, because Bastion's upstream does not retain server-side state.

Authentication

Authorization: Bearer qbt_<key>

Request body

| Field | Type | Required | Notes |
| --- | --- | --- | --- |
| `model` | `string` | yes | One of the nine supported models listed below |
| `input` | `string \| InputItem[]` | yes | Plain text or structured input array |
| `stream` | `boolean` | no | When `true`, returns an SSE event stream |
| `store` | `boolean` | no | The Bastion server does not enforce this, but the upstream rejects stateful mode; send `store: false` or expect a 502 |
| `max_output_tokens`, `temperature`, `tools`, etc. | various | no | Passed through to the upstream |
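
As a rough illustration of the fields above, the following TypeScript sketch assembles a request body and forces store: false regardless of what the caller passes. The helper name buildResponsesBody and the InputItem placeholder type are hypothetical; neither is part of Bastion or the OpenAI SDK.

```typescript
// Hypothetical helper (not part of any SDK): assembles a /v1/responses body.
// InputItem is a placeholder; the structured input shape is not specified here.
type InputItem = Record<string, unknown>;

interface ResponsesBody {
  model: string;
  input: string | InputItem[];
  store: false;
  [passthrough: string]: unknown;
}

function buildResponsesBody(
  model: string,
  input: string | InputItem[],
  extra: Record<string, unknown> = {},
): ResponsesBody {
  if (model.length === 0) {
    throw new Error("model is required");
  }
  if (typeof input !== "string" && !Array.isArray(input)) {
    throw new Error("input must be a string or an array of input items");
  }
  // Spread the passthrough fields first so that model, input, and
  // store: false cannot be overridden by the caller.
  const body: ResponsesBody = { ...extra, model, input, store: false };
  return body;
}

// Usage: passthrough fields such as max_output_tokens are copied as-is.
const exampleBody = buildResponsesBody("gpt-oss-120b", "Reply with: ok", {
  max_output_tokens: 32,
});
console.log(JSON.stringify(exampleBody));
```

Forcing store: false in one place is a design choice for this sketch; it avoids the 502 that the upstream returns for stateful requests.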

Supported models

Nine models support /v1/responses:

gpt-oss-120b
gpt-oss-20b
Llama-3.1-8B-Instruct
Meta-Llama-3_3-70B-Instruct
Qwen3-32B
Mistral-Small-3.2-24B-Instruct-2506
Mistral-Nemo-Instruct-2407
Qwen3-Coder-30B-A3B-Instruct
Qwen2.5-VL-72B-Instruct

Other LLMs (e.g. Mistral-7B-Instruct-v0.3) return model_not_found on this endpoint — use /v1/chat/completions instead.
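
One way to avoid model_not_found is to choose the endpoint on the client before sending. The sketch below hard-codes the nine models listed above; the helper name endpointFor is hypothetical, and exact case-sensitive matching of model names is an assumption. It also assumes that any other model is served by /v1/chat/completions, which may not hold for every model.

```typescript
// The nine models listed above as supporting /v1/responses.
const RESPONSES_MODELS: ReadonlySet<string> = new Set([
  "gpt-oss-120b",
  "gpt-oss-20b",
  "Llama-3.1-8B-Instruct",
  "Meta-Llama-3_3-70B-Instruct",
  "Qwen3-32B",
  "Mistral-Small-3.2-24B-Instruct-2506",
  "Mistral-Nemo-Instruct-2407",
  "Qwen3-Coder-30B-A3B-Instruct",
  "Qwen2.5-VL-72B-Instruct",
]);

type Endpoint = "/v1/responses" | "/v1/chat/completions";

// Hypothetical helper: picks an endpoint path by exact, case-sensitive match
// (an assumption; the matching rules are not documented here).
function endpointFor(model: string): Endpoint {
  return RESPONSES_MODELS.has(model) ? "/v1/responses" : "/v1/chat/completions";
}

console.log(endpointFor("gpt-oss-120b"));
console.log(endpointFor("Mistral-7B-Instruct-v0.3"));
```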

Examples

```bash
curl https://api.qubittron.ai/v1/responses \
  -H "Authorization: Bearer $QUBITTRON_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-oss-120b",
    "input": "Reply with: ok",
    "max_output_tokens": 32,
    "store": false
  }'
```

```typescript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.qubittron.ai/v1",
  apiKey: process.env.QUBITTRON_API_KEY,
});

const res = await client.responses.create({
  model: "gpt-oss-120b",
  input: "Reply with: ok",
  max_output_tokens: 32,
  store: false,
});
console.log(res.output_text);
```

```typescript
const res = await fetch("https://api.qubittron.ai/v1/responses", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.QUBITTRON_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "gpt-oss-120b",
    input: "Reply with: ok",
    max_output_tokens: 32,
    store: false,
  }),
});
// fetch does not reject on HTTP error statuses, so check explicitly.
if (!res.ok) {
  throw new Error(`Request failed with status ${res.status}`);
}
const json = await res.json();
console.log(json);
```

Response

The response follows the OpenAI Responses API format; see OpenAI's Responses reference for the full shape. It contains output[] items, usage (input_tokens, output_tokens), and model.
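
When calling the endpoint without the SDK, the text has to be collected from the output[] items by hand. The sketch below assumes the message and output_text item shapes from OpenAI's Responses format apply unchanged here; the type names and the helpers collectOutputText and totalTokens are hypothetical, and the sample object is illustrative rather than a captured response.

```typescript
// Minimal types covering only the fields used below. The message/output_text
// shape follows OpenAI's Responses format and is assumed to apply here.
interface ContentPart {
  type: string;
  text?: string;
}
interface OutputItem {
  type: string;
  content?: ContentPart[];
}
interface ResponsesResult {
  model: string;
  output: OutputItem[];
  usage: { input_tokens: number; output_tokens: number };
}

// Concatenates the text of every output_text part inside message items.
function collectOutputText(res: ResponsesResult): string {
  let text = "";
  for (const item of res.output) {
    if (item.type !== "message") continue; // skip non-message items
    for (const part of item.content ?? []) {
      if (part.type === "output_text" && typeof part.text === "string") {
        text += part.text;
      }
    }
  }
  return text;
}

// Sums the two usage counters named in this section.
function totalTokens(res: ResponsesResult): number {
  return res.usage.input_tokens + res.usage.output_tokens;
}

// Illustrative sample, not real output from the service.
const sampleResult: ResponsesResult = {
  model: "gpt-oss-120b",
  output: [
    { type: "reasoning" },
    { type: "message", content: [{ type: "output_text", text: "ok" }] },
  ],
  usage: { input_tokens: 12, output_tokens: 3 },
};
console.log(collectOutputText(sampleResult), totalTokens(sampleResult));
```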

Streaming (stream: true) returns the standard Responses event stream (response.created, response.output_item.added, response.completed, etc.).
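
To make the event stream concrete, here is a minimal sketch of parsing SSE text into events and joining the streamed text. It assumes the upstream emits OpenAI's response.output_text.delta events with a string delta field, which this page does not list explicitly. The helpers parseSseChunk and textFromEvents are hypothetical, and the parser expects complete events; a real client must buffer partial chunks across network reads.

```typescript
interface SseEvent {
  event: string;
  data: unknown;
}

// Parses SSE text that contains only complete events (blank-line separated).
function parseSseChunk(text: string): SseEvent[] {
  const events: SseEvent[] = [];
  for (const block of text.split(/\r?\n\r?\n/)) {
    let event = "message"; // SSE default when no event: line is present
    const dataLines: string[] = [];
    for (const line of block.split(/\r?\n/)) {
      if (line.startsWith("event:")) {
        event = line.slice("event:".length).trim();
      } else if (line.startsWith("data:")) {
        dataLines.push(line.slice("data:".length).trim());
      }
    }
    if (dataLines.length === 0) continue; // empty block or comment only
    events.push({ event, data: JSON.parse(dataLines.join("\n")) });
  }
  return events;
}

// Joins the delta strings of response.output_text.delta events (assumed shape).
function textFromEvents(events: SseEvent[]): string {
  let text = "";
  for (const e of events) {
    if (e.event !== "response.output_text.delta") continue;
    if (typeof e.data !== "object" || e.data === null) continue;
    const delta = (e.data as { delta?: unknown }).delta;
    if (typeof delta === "string") text += delta;
  }
  return text;
}

// Illustrative stream text, not captured from the service.
const sampleStream = [
  "event: response.created",
  'data: {"type":"response.created"}',
  "",
  "event: response.output_text.delta",
  'data: {"type":"response.output_text.delta","delta":"o"}',
  "",
  "event: response.output_text.delta",
  'data: {"type":"response.output_text.delta","delta":"k"}',
  "",
  "event: response.completed",
  'data: {"type":"response.completed"}',
  "",
  "",
].join("\n");

console.log(textFromEvents(parseSseChunk(sampleStream)));
```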

Errors

| Status | Code | When |
| --- | --- | --- |
| 400 | `invalid_request` | Body failed validation |
| 400 | `model_not_found` | Model unknown or doesn't support responses |
| 401 | `invalid_api_key` | Missing/invalid Bearer token |
| 402 | `insufficient_funds` | Account credit exhausted |
| 429 | `rate_limit_exceeded` | Rate limit hit |
| 502 | `upstream_error` | Upstream returned 5xx, often when `store: true` is sent |
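
The statuses above suggest a simple client-side policy: retry only 429 and 502, and treat a 502 as a client bug when store: false was not sent. The sketch below is one possible policy, not documented behavior; the helper names decideRetry and backoffMs and the 500 ms base and 8 s cap are arbitrary assumptions.

```typescript
interface RetryDecision {
  retry: boolean;
  hint: string;
}

// Hypothetical policy: maps a status from the table above to a retry decision.
function decideRetry(status: number, sentStoreFalse: boolean): RetryDecision {
  switch (status) {
    case 429:
      return { retry: true, hint: "rate limited; back off before retrying" };
    case 502:
      // A 502 is often caused by store: true, which a retry will not fix.
      return sentStoreFalse
        ? { retry: true, hint: "upstream error; retry with backoff" }
        : { retry: false, hint: "send store: false before retrying" };
    case 400:
      return { retry: false, hint: "fix the request body or model name" };
    case 401:
      return { retry: false, hint: "check the Bearer token" };
    case 402:
      return { retry: false, hint: "add account credit" };
    default:
      return { retry: false, hint: "status not covered by this sketch" };
  }
}

// Exponential backoff with assumed constants: 500 ms base, capped at 8 s.
function backoffMs(attempt: number): number {
  return Math.min(500 * Math.pow(2, attempt), 8000);
}

const decision = decideRetry(502, false);
console.log(decision.retry, decision.hint);
```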

Pricing

Requests are metered per token, counting both input and output tokens.
