
Errors and retries

A retry decision tree for Bastion API responses — what to retry, what to surface, and how to back off.

Bastion uses OpenAI-compatible error envelopes. The full error code table lives on the Getting started page. This page is about deciding what to do with each response in production.

Decision tree

| Status | Code class | Retry? | Backoff / fix |
| --- | --- | --- | --- |
| 2xx | success | | |
| 400 | client | no | fix the request |
| 401 | client | no | fix the key |
| 402 | insufficient_funds | no | top up first |
| 404 | client | no | fix the path/model |
| 413 | client | no | shrink the upload |
| 429 | rate_limit_exceeded | yes | honor Retry-After |
| 5xx (esp. 502, 503) | transient | yes | exponential backoff |
| network error / timeout | transient | yes | exponential backoff |

The rule of thumb: retry only what's clearly transient. Retrying a 400 just produces 400s faster.
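
The decision tree above can be sketched as a small lookup. This is a minimal illustration, not part of any Bastion SDK: the names `Backoff`, `RetryDecision`, and `decide` are hypothetical, and a `null` status is assumed to stand for a network error or timeout.

```typescript
// Hypothetical names for illustration only.
type Backoff = "none" | "retry-after" | "exponential";

interface RetryDecision {
  retry: boolean;
  backoff: Backoff;
}

// status is the HTTP status code, or null for a network error or timeout.
function decide(status: number | null): RetryDecision {
  if (status === null) return { retry: true, backoff: "exponential" };
  if (status === 429) return { retry: true, backoff: "retry-after" };
  if (status >= 500 && status <= 599) return { retry: true, backoff: "exponential" };
  // 2xx needs no retry; 400, 401, 402, 404, 413 need a fix, not a retry.
  return { retry: false, backoff: "none" };
}
```

Anything not listed as transient falls through to "do not retry", which keeps the default conservative.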

Exponential backoff with jitter

For 429 and 5xx, retry up to 3 times with jittered exponential backoff. Pseudocode:

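// isTransient is assumed to be supplied by your application: it should return
// true for 429, 5xx, and network errors or timeouts, and false otherwise.
async function withRetry<T>(fn: () => Promise<T>, maxRetries = 3): Promise<T> {
  // One initial attempt plus up to maxRetries retries.
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      // Give up on non-transient errors, or once the retry budget is spent.
      if (!isTransient(err) || attempt >= maxRetries) throw err;
      const base = 1000 * 2 ** attempt; // 1s, 2s, 4s
      const jitter = Math.random() * base * 0.25; // up to 25% extra, to spread clients out
      await new Promise((r) => setTimeout(r, base + jitter));
    }
  }
}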
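
The retry helper relies on an `isTransient` predicate that this page does not define. One possible sketch follows, under two loudly stated assumptions: that your HTTP wrapper throws objects carrying a numeric `status` for non-2xx responses, and that network failures carry a Node-style `code` string. Neither shape is specified by Bastion; adapt it to whatever your client actually throws.

```typescript
// Assumption: network failures expose one of these Node.js error codes.
const TRANSIENT_CODES = ["ECONNRESET", "ETIMEDOUT", "ECONNREFUSED", "EAI_AGAIN"];

function isTransient(err: unknown): boolean {
  if (typeof err !== "object" || err === null) return false;
  // Assumption: HTTP failures carry a numeric `status` (hypothetical error shape).
  const { status, code } = err as { status?: unknown; code?: unknown };
  if (typeof status === "number") return status === 429 || status >= 500;
  // No status means no response arrived; retry only recognised network codes.
  return typeof code === "string" && TRANSIENT_CODES.indexOf(code) !== -1;
}
```

Unrecognised errors are treated as non-transient, so a bug in your own code surfaces immediately instead of being retried.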

If the response is 429, prefer the server-supplied delay:

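// headers.get() returns null when the header is absent, and Number(null) is 0,
// so treat a missing or empty header as "no server-supplied delay".
const header = res.headers.get("Retry-After");
const retryAfter = header ? Number(header) : NaN;
const delayMs = Number.isFinite(retryAfter) && retryAfter >= 0 ? retryAfter * 1000 : computedBackoff;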

Retry-After can technically be an HTTP-date instead of a delta-seconds integer. Bastion emits the integer form; the snippet above falls through to computedBackoff if a date ever shows up.
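
If you would rather handle both header forms explicitly, a parser along these lines may help. It is a sketch, not Bastion-provided code: `parseRetryAfter` is a hypothetical helper, and the current time is passed in as `nowMs` so the function stays easy to test.

```typescript
// Returns the delay in milliseconds, or null when the header is absent or unparseable,
// in which case the caller should fall back to its computed backoff.
function parseRetryAfter(header: string | null, nowMs: number): number | null {
  if (header === null) return null;
  const value = header.trim();
  if (value === "") return null;
  // delta-seconds form: a non-negative integer
  if (/^\d+$/.test(value)) return Number(value) * 1000;
  // HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
  const dateMs = Date.parse(value);
  if (isNaN(dateMs)) return null;
  // A date in the past means "retry now", never a negative delay.
  return Math.max(0, dateMs - nowMs);
}
```

A caller would typically write `parseRetryAfter(res.headers.get("Retry-After"), Date.now()) ?? computedBackoff`.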

Stream retries

You cannot resume a stream mid-flight. If a stream connection drops, retry the entire request and discard any partial output. If your UX cannot tolerate "the first three sentences came back twice," buffer the output client-side and display it only once the stream has completed successfully.
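
The buffering approach can be sketched as follows. Here `startStream` is a hypothetical function that opens a fresh streaming request and yields text chunks; it stands in for whatever streaming client you use. Backoff and the transience check are omitted to keep the example short.

```typescript
// Collects a whole stream before returning it, restarting from scratch on failure.
async function collectStream(
  startStream: () => AsyncIterable<string>, // hypothetical: opens a new request each call
  maxAttempts = 3,
): Promise<string> {
  let lastErr: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const buffer: string[] = []; // fresh buffer per attempt, so partial output is discarded
    try {
      for await (const chunk of startStream()) buffer.push(chunk);
      return buffer.join(""); // commit only after the stream has completed
    } catch (err) {
      lastErr = err; // a real client would check transience and back off here
    }
  }
  throw lastErr;
}
```

The trade-off is latency: the user sees nothing until the full response has arrived, in exchange for never seeing duplicated text.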

What to surface to users

  • 4xx (except 402 and 429) → user-facing error. Show the message from error.message if it's safe; otherwise a generic "invalid request."
  • 402 → billing alert. Surface "out of credit" with a link to the billing page.
  • 429 → soft retry behind the scenes. If retries are exhausted, show "we're at capacity, try again in a moment."
  • 5xx → soft retry. If exhausted, log loudly and surface "service temporarily unavailable."
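
The mapping above could be expressed as one function that runs after any retries are exhausted. This is an illustrative sketch: `UserNotice` and `toUserNotice` are hypothetical names, and whether a message is "safe" to show is a judgment your application has to make and pass in.

```typescript
// Hypothetical shape for what the UI layer receives; not part of the Bastion API.
interface UserNotice {
  text: string;
  showBillingLink: boolean;
  logLoudly: boolean;
}

// apiMessage is error.message from the response body.
function toUserNotice(status: number, apiMessage: string, messageIsSafe: boolean): UserNotice {
  if (status === 402) {
    return { text: "Out of credit.", showBillingLink: true, logLoudly: false };
  }
  if (status === 429) {
    return { text: "We're at capacity, try again in a moment.", showBillingLink: false, logLoudly: false };
  }
  if (status >= 500) {
    return { text: "Service temporarily unavailable.", showBillingLink: false, logLoudly: true };
  }
  // Remaining 4xx: show the API's message only when it is safe to expose.
  return { text: messageIsSafe ? apiMessage : "Invalid request.", showBillingLink: false, logLoudly: false };
}
```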

Don't auto-retry idempotency-sensitive calls indefinitely

The Bastion API does not require idempotency keys, and most endpoints are naturally retry-safe (chat completions, embeddings, models, transcriptions). But if your application performs side effects on success (deduct credits, send an email, write to a downstream system), deduplicate that side effect on your side. Retrying the API call within your retry limit is fine; firing the downstream action twice is not.
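
One way to guard a downstream side effect is to key it on an operation ID that your application generates once per logical request. The sketch below is a hypothetical in-memory guard (`makeOnce` is not a Bastion API); a production system would more likely use a durable store such as a database unique constraint.

```typescript
// Returns a function that runs `effect` at most once per operationId.
function makeOnce(): (operationId: string, effect: () => void) => boolean {
  const seen = new Set<string>();
  return (operationId, effect) => {
    if (seen.has(operationId)) return false; // already fired for this operation
    // Marked before running: if the effect throws, it is NOT retried (at-most-once).
    seen.add(operationId);
    effect();
    return true;
  };
}
```

For example, calling the guard twice with the same ID sends the email only once:

```typescript
const once = makeOnce();
once("order-42", () => console.log("send confirmation email"));
once("order-42", () => console.log("send confirmation email")); // skipped
```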
