TypeScript SDK
Best practices
Production-readiness checklist — secrets, retries, timeouts, observability, cost control.
A short, opinionated list. Treat it as a checklist, not a tutorial.
Secrets
- Never embed
BASTION_API_KEYin client-side bundles. Bundlers will inline it. Proxy through your server. See Environments → Browsers. - Rotate quarterly, and on any suspected leak. The SDK reads the env var on construction — restart the process to pick up a new key.
- Use a key per environment (dev, staging, prod). Mixing them masks misuse and complicates revocation.
- Don't log
err.bodyraw when handling errors — upstreams sometimes echo request fragments that include user content. Logstatus,code,type, and a redacted view ofbody.
Timeouts
fetch has no default timeout. A hung connection on a hot path will pin a worker until the runtime kills it (minutes, sometimes never). Always wire one:
const client = new Bastion({
apiKey: process.env.BASTION_API_KEY,
fetch: (url, init) => {
const ac = new AbortController();
const timer = setTimeout(() => ac.abort(), 60_000);
return globalThis.fetch(url, { ...init, signal: ac.signal })
.finally(() => clearTimeout(timer));
},
});For streaming, the "timeout" you want is usually time-to-first-byte plus an inactivity timer between chunks — not a wall-clock budget — so consumers can complete long generations. Implement that in your iteration loop, not in fetch.
Retries
- Retry:
RateLimitError,UpstreamError,APIConnectionError. - Do not retry:
BadRequestError,AuthenticationError,PermissionDeniedError,NotFoundError. These are deterministic. - Cap retries (3–5 typical). Exponential backoff with jitter.
- For batch / offline jobs, lean on the rate limiter — don't hammer.
A reference implementation lives in Error handling → Retries.
Concurrency
- Bastion enforces a per-key concurrent-request ceiling. Going over it surfaces as
RateLimitError. - Use a semaphore on the client side if your traffic is bursty. A simple
p-limitof 20–50 concurrent calls per worker is a reasonable starting point.
Observability
- Tag each request with a stable id and propagate it via
defaultHeaders(e.g.x-trace-id) so server-side logs join your traces. - Measure latency in your fetch wrapper (see Custom fetch → Tracing).
- On error, capture
{ status, code, type, model, requestId }at minimum. Add an exemplarerr.bodysample at debug level.
Cost / token control
- Set
max_tokensfor every chat completion. Without it, you're paying for whatever the model decides to say. A budget of512–2048covers most chat UIs. - Choose the cheapest model that meets quality:
Llama-3.1-8B-InstructorMistral-7B-Instruct-v0.3for simple tasks; reservegpt-oss-120bandMeta-Llama-3_3-70B-Instructfor heavyweight reasoning. - Cache aggressively for deterministic prompts (
temperature: 0). A simple keyed cache (prompt-hash → response) cuts steady-state costs by 30–60% on most apps. - Stream UIs improve perceived latency but cost the same — don't stream for batch jobs.
Model selection
- Call
models.list()at startup to verify the models you depend on exist for this key. - For multi-tenant apps, let your tenants choose from a curated allowlist — surfacing the raw list invites support load.
- Pin specific model versions where available (
Mistral-Small-3.2-24B-Instruct-2506); behavior of unversioned model ids may shift.
Testing
- Inject a fake
fetch(see Custom fetch → Mocking) rather than hitting the real API in unit tests. - For one or two end-to-end smoke tests, use a separate sandbox key with a low spend cap.
- Pin streaming-test fixtures (raw SSE bytes) — Bastion's parser handles
\r\n\r\nand\n\nevent separators, multi-linedata:, and[DONE]. Cover all three in fixtures.
Versioning
- Pre-1.0 (
v0.x), pin to a minor inpackage.json(e.g."~0.1.0"). The patch range is safe; minors may rename types. - Read the changelog before bumping a minor. Migrations are usually one rename.
Multi-tenancy
- One client per process is fine. Don't construct a new
Bastionper request — there's no connection pool to warm, but you do allocate closures and headers needlessly. - For per-tenant keys, hold a
Map<tenantId, Bastion>and evict on logout / key rotation.