Examples
Copy-paste recipes for common SDK integrations — streaming UIs, tool calling, voice pipelines.
Every snippet here is self-contained — drop it into a file, set BASTION_API_KEY, and run.
Streaming token-by-token to stdout
import { Bastion } from "@qubittron/bastion-sdk";

const client = new Bastion();

const stream = await client.chat.completions.create({
  model: "gpt-oss-120b",
  messages: [{ role: "user", content: "Explain TLS in two sentences." }],
  stream: true,
});

// Write each delta as it arrives; a chunk without content contributes "".
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta.content ?? "");
}
process.stdout.write("\n");

Streaming to an HTTP client (Hono)
Re-emit SSE chunks to a browser so the user sees output as it generates:
import { Hono } from "hono";
import { Bastion } from "@qubittron/bastion-sdk";

const app = new Hono();
const client = new Bastion();

app.post("/chat", async (c) => {
  const { messages } = await c.req.json();
  const stream = await client.chat.completions.create({
    model: "gpt-oss-120b",
    messages,
    stream: true,
  });

  return new Response(
    new ReadableStream({
      async start(controller) {
        const enc = new TextEncoder();
        try {
          // Re-frame each SDK chunk as one SSE "data:" event.
          for await (const chunk of stream) {
            controller.enqueue(enc.encode(`data: ${JSON.stringify(chunk)}\n\n`));
          }
          controller.enqueue(enc.encode("data: [DONE]\n\n"));
        } finally {
          controller.close();
        }
      },
    }),
    { headers: { "Content-Type": "text/event-stream" } },
  );
});

export default app;

JSON-mode output
response_format and other OpenAI-compatible fields aren't on the SDK's static type but pass through to the upstream verbatim. Build the params object with an intersection type so TS still checks the rest:
import { Bastion } from "@qubittron/bastion-sdk";
import type { ChatCompletionCreateParamsNonStreaming } from "@qubittron/bastion-sdk";

const client = new Bastion();

// Extend the SDK's param type with the passthrough field.
type WithJsonMode = ChatCompletionCreateParamsNonStreaming & {
  response_format: { type: "json_object" };
};

const params: WithJsonMode = {
  model: "gpt-oss-120b",
  response_format: { type: "json_object" },
  messages: [
    { role: "system", content: "Respond with JSON: { city: string, country: string }." },
    { role: "user", content: "Where is the CN Tower?" },
  ],
};

const res = await client.chat.completions.create(params);
const data = JSON.parse(res.choices[0]?.message.content ?? "{}");
console.log(data.city, data.country);

Tool calling (passthrough)
Bastion forwards tools / tool_choice unchanged. Same pattern — extend the param type rather than casting to object:
import { Bastion } from "@qubittron/bastion-sdk";
import type { ChatCompletionCreateParamsNonStreaming } from "@qubittron/bastion-sdk";

const client = new Bastion();

type ToolDef = {
  type: "function";
  function: {
    name: string;
    parameters: Record<string, unknown>;
  };
};

type WithTools = ChatCompletionCreateParamsNonStreaming & {
  tools: ToolDef[];
  tool_choice: "auto" | "none";
};

const params: WithTools = {
  model: "gpt-oss-120b",
  messages: [{ role: "user", content: "What's the weather in Toronto?" }],
  tools: [
    {
      type: "function",
      function: {
        name: "get_weather",
        parameters: {
          type: "object",
          properties: { city: { type: "string" } },
          required: ["city"],
        },
      },
    },
  ],
  tool_choice: "auto",
};

const res = await client.chat.completions.create(params);
// tool_calls is not on the SDK's static type either; the optional chain
// keeps this from throwing when the response has no choices.
const toolCalls = (
  res.choices[0]?.message as { tool_calls?: unknown[] } | undefined
)?.tool_calls;

Whether tool calls actually fire depends on the upstream model: gpt-oss-120b and Mistral-Small-3.2 support them; safety/guard models do not.
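Once tool calls come back, each one is usually run locally and its result sent back to the model as a "tool" message. The following is a minimal sketch of that step, not part of the SDK. It assumes the passthrough response uses the OpenAI tool_calls shape (an id, a function name, and function.arguments as a JSON string); ToolCall, Handler, handlers, runToolCalls, and the stubbed get_weather handler are hypothetical names introduced for illustration.

```typescript
// Assumed shape of one entry in message.tool_calls (OpenAI-compatible format).
type ToolCall = {
  id: string;
  type: "function";
  function: { name: string; arguments: string }; // arguments is a JSON string
};

// Hypothetical local handler: receives parsed arguments, returns a string result.
type Handler = (args: Record<string, unknown>) => Promise<string> | string;

type ToolMessage = { role: "tool"; tool_call_id: string; content: string };

// Hypothetical handler table; get_weather is a stub, not a real lookup.
const handlers = new Map<string, Handler>([
  [
    "get_weather",
    (args: Record<string, unknown>) =>
      JSON.stringify({ city: args.city, forecast: "unknown" }),
  ],
]);

// Run every tool call and build the "tool" messages for the follow-up request.
// Unknown tools and malformed arguments become error payloads instead of throwing,
// so the model still receives a reply for every tool_call_id.
async function runToolCalls(
  calls: ToolCall[] | undefined,
  table: Map<string, Handler>,
): Promise<ToolMessage[]> {
  const out: ToolMessage[] = [];
  for (const call of calls ?? []) {
    const handler = table.get(call.function.name);
    let content: string;
    if (!handler) {
      content = JSON.stringify({ error: `unknown tool: ${call.function.name}` });
    } else {
      try {
        const args = JSON.parse(call.function.arguments) as Record<string, unknown>;
        content = await handler(args);
      } catch (err) {
        content = JSON.stringify({ error: String(err) });
      }
    }
    out.push({ role: "tool", tool_call_id: call.id, content });
  }
  return out;
}
```

To continue the conversation, append the assistant message that carried the tool calls, then the messages returned by runToolCalls, and call client.chat.completions.create again with the extended messages array.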
Whisper transcription → summarize
A two-step pipeline that's common in voice apps:
import { readFile } from "node:fs/promises";
import { Bastion } from "@qubittron/bastion-sdk";

const client = new Bastion();

// Step 1: transcribe. File is a global in Node 20 and later.
const buf = await readFile("meeting.wav");
const file = new File([buf], "meeting.wav", { type: "audio/wav" });
const transcript = await client.audio.transcriptions.create({
  file,
  model: "whisper-large-v3",
});

// Step 2: summarize the transcript text.
const summary = await client.chat.completions.create({
  model: "gpt-oss-120b",
  messages: [
    { role: "system", content: "Summarize this transcript in five bullets." },
    { role: "user", content: transcript.text },
  ],
});
console.log(summary.choices[0]?.message.content);

Text → speech file
import { writeFile } from "node:fs/promises";
import { Bastion, SpeechEncoding } from "@qubittron/bastion-sdk";

const client = new Bastion();

const { audio, contentType } = await client.audio.speech({
  text: "The quick brown fox jumps over the lazy dog.",
  language_code: "en-US",
  voice_name: "English-US.Female-1",
  encoding: SpeechEncoding.LINEAR_PCM,
  sample_rate_hz: 44100,
});

// Pick the file extension from the returned content type.
const ext = contentType.includes("ogg") ? "ogg" : "wav";
await writeFile(`out.${ext}`, audio);

Embeddings → cosine search
Embeddings are on the Bastion HTTP API but not yet wrapped by the SDK (see roadmap in the SDK README). Until then, call the HTTP endpoint with the OpenAI SDK pointed at Bastion:
import OpenAI from "openai";

const openai = new OpenAI({
  baseURL: "https://api.qubittron.ai/v1",
  apiKey: process.env.QUBITTRON_API_KEY,
});

const res = await openai.embeddings.create({
  model: "bge-large-en-v1.5",
  input: "sovereign Canadian inference",
});
const vec = res.data[0]?.embedding;

See the Bastion API docs for the full HTTP shape.
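The search half of this recipe needs no API support: once the vectors are in memory, ranking is plain arithmetic. The following is a minimal sketch, assuming embeddings arrive as number arrays of equal length, as in the res.data[0].embedding value above. Doc, cosine, and topK are hypothetical names, and a linear scan like this is only reasonable for small collections.

```typescript
// A stored document with its precomputed embedding (hypothetical shape).
type Doc = { id: string; embedding: number[] };

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for a zero vector.
function cosine(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0; // lengths are equal, so the fallback is never used
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Score every document against the query and keep the k best, highest first.
function topK(query: number[], docs: Doc[], k = 3): { id: string; score: number }[] {
  return docs
    .map((d) => ({ id: d.id, score: cosine(query, d.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

In practice, embed each document once, store the vectors alongside their ids, embed the query with the same model, and pass both to topK.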
Retry wrapper
The withRetry helper from the error handling page, wrapped around an SDK call:
import {
  Bastion,
  RateLimitError,
  UpstreamError,
  APIConnectionError,
} from "@qubittron/bastion-sdk";

const client = new Bastion();

async function withRetry<T>(fn: () => Promise<T>, max = 4): Promise<T> {
  let attempt = 0;
  while (true) {
    try {
      return await fn();
    } catch (err) {
      attempt += 1;
      // Retry only transient failures; rethrow anything else immediately.
      const retriable =
        err instanceof RateLimitError ||
        err instanceof UpstreamError ||
        err instanceof APIConnectionError;
      if (!retriable || attempt >= max) throw err;
      // Exponential backoff with up to 250 ms of random jitter.
      await new Promise((r) => setTimeout(r, 2 ** attempt * 250 + Math.random() * 250));
    }
  }
}

const res = await withRetry(() =>
  client.chat.completions.create({
    model: "gpt-oss-120b",
    messages: [{ role: "user", content: "hi" }],
  }),
);
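
To size a caller-side timeout around withRetry, it helps to know how long the wrapper can sleep in the worst case. The sketch below recomputes the base delays from the formula in the helper (2 ** attempt * 250 ms, before jitter); baseDelayMs and backoffSchedule are hypothetical names used only for this illustration.

```typescript
// Base delay before the retry that follows failed attempt number `attempt`
// (1-based), matching the 2 ** attempt * 250 term in withRetry. Jitter excluded.
function baseDelayMs(attempt: number): number {
  return 2 ** attempt * 250;
}

// With `max` total attempts, the helper sleeps after failures 1..max-1;
// the max-th failure is rethrown without sleeping.
function backoffSchedule(max: number): number[] {
  const delays: number[] = [];
  for (let attempt = 1; attempt < max; attempt++) {
    delays.push(baseDelayMs(attempt));
  }
  return delays;
}

const schedule = backoffSchedule(4);
const totalMs = schedule.reduce((sum, d) => sum + d, 0);
console.log(schedule, totalMs); // [ 500, 1000, 2000 ] 3500
```

With the default max of 4, the helper waits 3.5 seconds in total plus up to 250 ms of jitter per sleep (under 0.75 seconds overall), not counting the time spent in the requests themselves.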