Errors and retries
A retry decision tree for Bastion API responses — what to retry, what to surface, and how to back off.
Bastion uses OpenAI-compatible error envelopes. The full error code table lives on the Getting started page. This page is about deciding what to do with each response in production.
Decision tree
| Status | Code class | Retry? | Backoff |
|---|---|---|---|
| 2xx | success | n/a | n/a |
| 400 | client | no | fix the request |
| 401 | client | no | fix the key |
| 402 | insufficient_funds | no | top up first |
| 404 | client | no | fix the path/model |
| 413 | client | no | shrink the upload |
| 429 | rate_limit_exceeded | yes | honor Retry-After |
| 5xx (esp. 502, 503) | transient | yes | exponential backoff |
| network error / timeout | transient | yes | exponential backoff |
The rule of thumb: retry only what's clearly transient. Retrying a 400 just produces 400s faster.
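The table above can be encoded as a small classifier. The following is a minimal sketch, not Bastion client code: HttpError, retryDecision, and isTransient are hypothetical names, and treating a TypeError or a TimeoutError as a network failure is an assumption based on how fetch typically reports those failures.

```typescript
// Hypothetical error type: assumes your HTTP wrapper throws this for non-2xx responses.
class HttpError extends Error {
  readonly status: number;
  constructor(status: number, message = `HTTP ${status}`) {
    super(message);
    this.name = "HttpError";
    this.status = status;
  }
}

type RetryDecision = "success" | "retry-after" | "backoff" | "fail";

// Maps a status code to the action in the decision table.
function retryDecision(status: number): RetryDecision {
  if (status >= 200 && status < 300) return "success";
  if (status === 429) return "retry-after"; // honor Retry-After
  if (status >= 500 && status < 600) return "backoff"; // transient server errors
  return "fail"; // 400, 401, 402, 404, 413, ...: fix the cause, do not retry
}

// True for errors worth retrying: 429, 5xx, and network errors or timeouts.
function isTransient(err: unknown): boolean {
  if (err instanceof HttpError) {
    const decision = retryDecision(err.status);
    return decision === "retry-after" || decision === "backoff";
  }
  // Assumption: fetch rejects with TypeError on network failure, and requests
  // cancelled by AbortSignal.timeout() surface as an error named TimeoutError.
  if (err instanceof TypeError) return true;
  return err instanceof Error && err.name === "TimeoutError";
}
```

Keeping the status mapping separate from the error predicate lets the same table drive both retry logic and logging.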
Exponential backoff with jitter
For 429, 5xx, and network errors or timeouts, retry up to 3 times with jittered exponential backoff. A TypeScript sketch:
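
    // isTransient is a predicate you supply: true for 429, 5xx, and network errors or timeouts.
    async function withRetry<T>(fn: () => Promise<T>, maxRetries = 3): Promise<T> {
      let lastErr: unknown;
      // attempt 0 is the initial call; attempts 1..maxRetries are the retries.
      for (let attempt = 0; attempt <= maxRetries; attempt++) {
        try {
          return await fn();
        } catch (err) {
          lastErr = err;
          // Give up on non-transient errors or once the retry budget is spent.
          if (!isTransient(err) || attempt === maxRetries) throw err;
          const base = 1000 * 2 ** attempt; // 1s, 2s, 4s
          const jitter = Math.random() * base * 0.25; // up to +25%
          await new Promise((r) => setTimeout(r, base + jitter));
        }
      }
      throw lastErr; // only reachable if maxRetries is negative
    }

If the response is 429, prefer the server-supplied delay: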
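
    const header = res.headers.get("Retry-After");
    // A missing or empty header must not be read as a zero-second delay: Number(null) is 0.
    const retryAfter = header?.trim() ? Number(header) : NaN;
    const delayMs = Number.isFinite(retryAfter) && retryAfter >= 0 ? retryAfter * 1000 : computedBackoff;

Retry-After can technically be an HTTP-date instead of a delta-seconds integer. Bastion emits the integer form; the snippet above falls through to computedBackoff if a date ever shows up or if the header is missing.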
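If you would rather handle both Retry-After forms than fall back on a date, the delay selection could look roughly like the sketch below. The function names (backoffDelayMs, parseRetryAfterMs, nextDelayMs) are hypothetical, and the injectable random and nowMs parameters exist only to make the logic testable.

```typescript
// Jittered exponential backoff: 1s, 2s, 4s, ... plus up to 25% random jitter.
// `random` is injectable so the function can be tested deterministically.
function backoffDelayMs(attempt: number, random: () => number = Math.random): number {
  const base = 1000 * 2 ** attempt;
  return base + random() * base * 0.25;
}

// Parses Retry-After in either form. Returns milliseconds to wait, or null if
// the header is missing or unparseable.
function parseRetryAfterMs(header: string | null, nowMs: number = Date.now()): number | null {
  const value = header?.trim();
  if (!value) return null;
  if (/^\d+$/.test(value)) return Number(value) * 1000; // delta-seconds form
  const dateMs = Date.parse(value); // HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
  if (Number.isNaN(dateMs)) return null;
  return Math.max(0, dateMs - nowMs); // a date in the past means "retry now"
}

// Prefer the server-supplied delay; fall back to computed backoff.
function nextDelayMs(retryAfterHeader: string | null, attempt: number): number {
  return parseRetryAfterMs(retryAfterHeader) ?? backoffDelayMs(attempt);
}
```

A caller would pass `res.headers.get("Retry-After")` and the current attempt index to nextDelayMs, then sleep for the result before retrying.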
Stream retries
You cannot resume a stream mid-flight. If a stream connection drops, retry the entire request and discard any partial output. If your UX cannot tolerate "the first three sentences came back twice," buffer the output client-side and display it only after the stream completes successfully.
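A minimal sketch of the buffer-then-commit approach follows. It assumes a hypothetical openStream factory that issues a fresh request on every call and yields text chunks; collectStreamWithRetry is likewise an illustrative name. Backoff and transient-error checks are left out to keep the example short.

```typescript
// Runs a streaming request, buffering chunks, and restarts from scratch if the
// stream fails part-way. `openStream` is a hypothetical factory that issues a
// fresh request each time it is called.
async function collectStreamWithRetry(
  openStream: () => AsyncIterable<string>,
  maxAttempts = 3,
): Promise<string> {
  let lastErr: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const buffer: string[] = []; // fresh buffer per attempt: partial output is discarded
    try {
      for await (const chunk of openStream()) buffer.push(chunk);
      return buffer.join(""); // commit only after the stream ends cleanly
    } catch (err) {
      lastErr = err;
      // A real client would also check that the error is transient and back off here.
    }
  }
  throw lastErr;
}
```

The trade-off is latency: nothing is shown until the whole response has arrived, so this suits UIs where duplicated text is worse than a short wait.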
What to surface to users
- 4xx (except 429) → user-facing error. Show the message from error.message if it's safe; otherwise a generic "invalid request."
- 402 → billing alert. Surface "out of credit" with a link to the billing page.
- 429 → soft retry behind the scenes. If retries are exhausted, show "we're at capacity, try again in a moment."
- 5xx → soft retry. If exhausted, log loudly and surface "service temporarily unavailable."
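The mapping above could be sketched as a single function, intended to be called once retries for 429 and 5xx have already been exhausted. UserFacingError, toUserFacingError, and the messageIsSafe flag are hypothetical; deciding whether error.message is safe to show is left to your application.

```typescript
interface UserFacingError {
  message: string;
  billingLink: boolean; // whether to link to the billing page
}

// `messageIsSafe` is a hypothetical flag: your app decides whether the API's
// error.message may be shown to end users.
function toUserFacingError(
  status: number,
  apiMessage: string | undefined,
  messageIsSafe: boolean,
): UserFacingError {
  if (status === 402) return { message: "Out of credit.", billingLink: true };
  if (status === 429) {
    return { message: "We're at capacity, try again in a moment.", billingLink: false };
  }
  if (status >= 500) {
    return { message: "Service temporarily unavailable.", billingLink: false };
  }
  // Remaining 4xx: show the API message only when it is safe to do so.
  const message = messageIsSafe && apiMessage ? apiMessage : "Invalid request.";
  return { message, billingLink: false };
}
```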
Don't auto-retry idempotency-sensitive calls indefinitely
The Bastion API does not require idempotency keys, and most endpoints are naturally retry-safe (chat completions, embeddings, models, transcriptions). But if your application performs side effects on success (deduct credits, send an email, write to a downstream system), deduplicate the downstream side effect on your side. Retrying the API call repeatedly is fine; firing the downstream action twice is not.
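One way to guard a downstream action is to key it on an ID your application controls and run it at most once. The sketch below is illustrative only: runOnce and the in-memory Set are hypothetical, and it assumes a single process. A real deployment would likely need a durable store so the record survives restarts.

```typescript
// In-memory record of side effects already performed, keyed by a caller-chosen ID.
// Assumption: a single process; durable storage is needed across restarts.
const performed = new Set<string>();

// Runs `action` at most once per key. Returns true if it ran, false if skipped.
// The key is recorded before the action runs, so a throwing action is not re-run.
function runOnce(key: string, action: () => void): boolean {
  if (performed.has(key)) return false;
  performed.add(key);
  action();
  return true;
}
```

For example, wrapping an email send in `runOnce("email:" + orderId, ...)` means that however many times the API call is retried, the email goes out once per order.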