Production checklist

Walk through this list before routing real traffic to the Bastion API.

This is a short, opinionated pre-launch checklist. Most teams already do most of this; the value is in catching the one item you forgot.

Credentials

  • API key lives in a secrets manager, not in source control or in client code.
  • Separate keys per environment (dev, staging, prod) and per service.
  • Key rotation runbook documented and tested at least once.
  • Logs, error reports, and metric tags are redacted so they never contain the raw key.
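
One way to enforce the redaction item is a logging filter that masks anything shaped like a key before a record is written. The sketch below is a minimal Python example; the "bst_" prefix and the key pattern are assumptions made for illustration, not the real Bastion key format, so adjust the pattern to match your own keys.

```python
import logging
import re

# Hypothetical key shape: "bst_" followed by 16 or more URL-safe characters.
# This is an assumption for the sketch, not the documented key format.
KEY_PATTERN = re.compile(r"bst_[A-Za-z0-9_-]{16,}")


def redact(text: str) -> str:
    """Replace anything that looks like an API key with a fixed marker."""
    return KEY_PATTERN.sub("[REDACTED_KEY]", text)


class RedactKeysFilter(logging.Filter):
    """Logging filter that masks key-shaped strings in the final message."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Format the message first so keys passed as %-style args are caught.
        record.msg = redact(record.getMessage())
        record.args = None
        return True  # never drop the record, only rewrite it
```

Attach the filter to each handler (handler.addFilter(RedactKeysFilter())) rather than to a single logger, since logger-level filters are not applied to records that propagate up from child loggers. Error reporters and metrics clients need their own equivalent hook.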

Reliability

  • Retry wrapper in place for 429 and 5xx with exponential backoff and jitter.
  • 429 responses honor the Retry-After header.
  • No retry-on-4xx (other than 429).
  • Streaming requests retry the entire request on drop — no mid-stream resume.
  • Downstream side effects (billing, emails, DB writes) are guarded against duplicate execution if the same Bastion call is retried.
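
The first three items can be sketched as a small retry wrapper. This is a minimal Python illustration, not an official client: the send callable, the (status, headers, body) tuple shape, and the default base, cap, and max_attempts values are all assumptions. It retries only 429 and 5xx, uses full-jitter exponential backoff, and honors a Retry-After value given in seconds on 429.

```python
import random
import time
from typing import Callable, Mapping, Optional, Tuple

# Assumed response shape for this sketch: (status, headers, body).
Response = Tuple[int, Mapping[str, str], bytes]


def is_retryable(status: int) -> bool:
    """Retry only 429 and 5xx; every other 4xx is a caller error."""
    return status == 429 or 500 <= status <= 599


def backoff_delay(attempt: int, retry_after: Optional[str] = None,
                  base: float = 0.5, cap: float = 30.0) -> float:
    """Seconds to wait before the retry that follows attempt `attempt` (0-based)."""
    if retry_after is not None:
        try:
            # Honor the server's hint when it is given in seconds.
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # HTTP-date form is not parsed in this sketch; fall through
    # Full jitter: uniform between 0 and the capped exponential ceiling.
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))


def call_with_retries(send: Callable[[], Response], max_attempts: int = 5,
                      sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call `send` until the status is not retryable or attempts run out."""
    for attempt in range(max_attempts):
        status, headers, body = send()
        if not is_retryable(status) or attempt == max_attempts - 1:
            return status, headers, body
        # Assumes headers are keyed exactly as "Retry-After".
        hint = headers.get("Retry-After") if status == 429 else None
        sleep(backoff_delay(attempt, hint))
    raise ValueError("max_attempts must be at least 1")
```

Because send reissues the whole request, the same wrapper fits the streaming rule of retrying from the start. It does nothing about duplicate side effects: the code that acts on a response still needs its own guard, for example a deduplication key recorded before the side effect runs.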

Observability

  • Latency (p50, p95, p99), error rate, and the 4xx-vs-5xx breakdown are on dashboards you can open in under 30 seconds.
  • Alerts for: sustained 5xx rate above your SLO, sustained 429 rate, 402 insufficient_funds, key revocation.
  • Request/response logging is sampled, redacted, and retained per your policy. Never log full PII-bearing prompts.
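
To make the dashboard numbers concrete, the sketch below computes them for one time window from (latency, status) pairs. It is an illustration in Python using the standard library; in practice your metrics system would normally do this aggregation, and the function name and input shape are assumptions.

```python
import statistics
from typing import Dict, Iterable, List, Tuple


def summarize(samples: Iterable[Tuple[float, int]]) -> Dict[str, float]:
    """Summarize (latency_seconds, http_status) pairs for one time window."""
    rows: List[Tuple[float, int]] = list(samples)
    if len(rows) < 2:
        raise ValueError("need at least two samples to compute percentiles")
    # quantiles(n=100) returns 99 cut points; index k-1 is the k-th percentile.
    cuts = statistics.quantiles([lat for lat, _ in rows], n=100,
                                method="inclusive")
    total = len(rows)
    n4xx = sum(1 for _, status in rows if 400 <= status <= 499)
    n5xx = sum(1 for _, status in rows if 500 <= status <= 599)
    return {
        "p50": cuts[49],
        "p95": cuts[94],
        "p99": cuts[98],
        "rate_4xx": n4xx / total,
        "rate_5xx": n5xx / total,
        "error_rate": (n4xx + n5xx) / total,
    }
```

Keeping the 4xx and 5xx rates separate matters for alerting: a rise in 4xx usually points at your own requests or credentials, while a rise in 5xx points upstream.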

Cost and capacity

  • Per-environment monthly spend cap configured in the dashboard (so a runaway test doesn't drain prod budget).
  • Model selection documented per use-case (don't default everything to the largest model).
  • Auto-recharge or low-balance alerts wired up to billing — 402 should never be the way you learn you ran out of credit.
  • Token budgets per call (max_tokens) set to the smallest reasonable value for each surface.
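
The per-surface token budget can live in one small table so that limits are reviewed in a single place. In this Python sketch the surface names and the numbers are purely hypothetical; only max_tokens comes from the checklist above.

```python
from typing import Dict

# Hypothetical surfaces and limits; replace with values measured for your product.
MAX_TOKENS_BY_SURFACE: Dict[str, int] = {
    "autocomplete": 64,
    "chat_reply": 512,
    "summary": 1024,
}


def max_tokens_for(surface: str) -> int:
    """Return the budget for a surface, failing loudly if none is defined."""
    try:
        return MAX_TOKENS_BY_SURFACE[surface]
    except KeyError:
        raise ValueError(f"no max_tokens budget defined for {surface!r}") from None
```

Raising on an unknown surface is a deliberate choice here: a silent fallback to a large default is how an unbudgeted call path reaches production.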

Streaming-specific

  • Reverse proxies and CDNs do not buffer SSE responses (proxy_buffering off, no caching, no compression on stream routes).
  • Client UI may display partial assistant output while streaming, but commits it only after the stream ends cleanly.
  • AbortController / cancellation tested end-to-end (user clicks "stop" → upstream connection closes promptly).
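
The commit-on-clean-end and cancellation items can be sketched together. The Python example below assumes the stream is exposed as an iterator of text chunks and that a clean end means the iterator finishes without raising; the real API may signal completion differently, for example with a terminal event. The render, commit, should_stop, and close callables are hypothetical hooks into your own UI and HTTP client.

```python
from typing import Callable, Iterable, List, Optional


def consume_stream(chunks: Iterable[str],
                   render: Callable[[str], None],
                   commit: Callable[[str], None],
                   should_stop: Callable[[], bool] = lambda: False,
                   close: Callable[[], None] = lambda: None) -> Optional[str]:
    """Render chunks as they arrive; commit the full text only on a clean end.

    Returns the committed text, or None if the stream was cancelled or dropped.
    """
    parts: List[str] = []
    try:
        for chunk in chunks:
            if should_stop():
                return None         # user pressed "stop": nothing is committed
            parts.append(chunk)
            render("".join(parts))  # show partial output without persisting it
    except (ConnectionError, TimeoutError):
        return None                 # dropped mid-stream: discard the partial text
    finally:
        close()                     # always release the upstream connection
    text = "".join(parts)
    commit(text)
    return text
```

A None result from a dropped stream is the point at which the caller would retry the entire request, since there is no mid-stream resume.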

Data and privacy

  • responses endpoint usage always sets store: false (required by the API).
  • Prompts containing personally identifiable information are redacted or scoped per your data agreement before send.
  • Audit logging for which user prompted which call (your own user ID → Bastion call), separate from request bodies.
  • Data-residency requirements (PIPEDA, provincial, sector-specific) are documented and the Canadian-inference guarantee is referenced in your DPA where relevant.
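
A single request-building helper is one way to make the first two items hard to get wrong. In this Python sketch only the store field comes from the checklist; the "input" field name, the helper names, and the e-mail-only scrubbing rule are assumptions, and real PII handling needs far broader rules than one regular expression.

```python
import re
from typing import Any, Dict

# Illustrative rule only: masks e-mail addresses and nothing else.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def scrub(text: str) -> str:
    """Mask e-mail addresses before the prompt leaves your system."""
    return EMAIL.sub("[EMAIL]", text)


def build_body(prompt: str, **fields: Any) -> Dict[str, Any]:
    """Build a request body that cannot be sent with storage enabled."""
    body = dict(fields)
    body["input"] = scrub(prompt)  # "input" is an assumed field name
    body["store"] = False          # set last so callers cannot override it
    return body
```

Routing every call through one builder also gives the audit log a natural place to record the user ID alongside the call, without storing the request body itself.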

Failure drills

  • You've tested the path where Bastion returns 5xx for sustained periods (chaos test or a fault-injection switch).
  • You've tested the path where the API key is revoked mid-traffic (401 cascade).
  • You've tested the path where the account hits a hard credit limit (402 cascade — your UI should degrade gracefully, not 500).
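
A fault-injection switch for these drills can be as small as a wrapper around the function that sends the request. The Python sketch below is one possible shape: the BASTION_FAULT environment variable is a hypothetical name, and the injected response has an empty body because the real error payloads are not reproduced here.

```python
import os
from typing import Callable, Mapping, Optional, Tuple

# Assumed response shape for this sketch: (status, headers, body).
Response = Tuple[int, Mapping[str, str], bytes]


def injected_status(env: Mapping[str, str] = os.environ) -> Optional[int]:
    """Read the drill switch; accept only 401, 402, or a 5xx code."""
    raw = env.get("BASTION_FAULT", "")  # hypothetical variable name
    if raw.isdecimal() and (raw in ("401", "402") or 500 <= int(raw) <= 599):
        return int(raw)
    return None


def with_fault_injection(send: Callable[[], Response],
                         env: Mapping[str, str] = os.environ
                         ) -> Callable[[], Response]:
    """Wrap `send` so a drill can force a failure without touching upstream."""
    def wrapped() -> Response:
        status = injected_status(env)
        if status is not None:
            return status, {}, b""  # simulated failure, no real call is made
        return send()
    return wrapped
```

Setting the variable to 503, 401, or 402 in a staging environment exercises the three drills in turn. Keep the switch out of production configuration, or gate it behind a separate flag.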

When every box is checked, you're ready to ship. If any box is unchecked, write down why before launching anyway.
