Stream

The AsyncIterable returned by streaming SDK calls — parsing semantics, cancellation, and edge cases.

class Stream<T> implements AsyncIterable<T> {
  [Symbol.asyncIterator](): AsyncIterator<T>;
}

Stream<T> is what chat.completions.create({ stream: true }) returns. It wraps a ReadableStream<Uint8Array> and turns server-sent events (SSE) into a typed async iterable.

You typically don't construct Stream yourself, but the class is exported for advanced use (custom parsers, testing).

Iteration

The for await loop yields one parsed event per SSE message:

const stream = await client.chat.completions.create({
  model: "gpt-oss-120b",
  messages: [{ role: "user", content: "hi" }],
  stream: true,
});

for await (const chunk of stream) {
  // chunk is a ChatCompletionChunk
}

Parsing rules

The stream consumes the body chunk-by-chunk and buffers across reads.
Events are split on \n\n or \r\n\r\n (whichever appears first in the buffer).
Only lines starting with data: contribute to a parsed message. The optional leading space after the colon is stripped.
Multi-line data: lines are joined with newlines (SSE spec).
data: [DONE] terminates the iterator cleanly — your for await loop ends after the chunk preceding [DONE].
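data: payloads that are not valid JSON (e.g. heartbeats) are dropped silently. SSE comment lines (those starting with a colon) never contribute either, since only data: lines are read.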
A trailing event without a terminator is flushed at end-of-stream.
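
The rules above can be sketched as a small incremental parser. This is a minimal illustration, not the SDK's actual implementation: createSSEParser, push, flush, and isDone are hypothetical names, and the input is assumed to be already-decoded text rather than raw Uint8Array chunks.

```typescript
// Sentinels, so that valid JSON values (including null) are never mistaken for "no event".
const SKIP = Symbol("skip");
const DONE = Symbol("done");

// Find the earliest event terminator in the buffer: "\n\n" or "\r\n\r\n".
function nextTerminator(buf: string): { index: number; length: number } | null {
  const lf = buf.indexOf("\n\n");
  const crlf = buf.indexOf("\r\n\r\n");
  if (lf === -1 && crlf === -1) return null;
  if (crlf !== -1 && (lf === -1 || crlf < lf)) return { index: crlf, length: 4 };
  return { index: lf, length: 2 };
}

// Parse one raw event (the text between two terminators).
function parseEvent(raw: string): unknown {
  const dataLines: string[] = [];
  for (const line of raw.split(/\r?\n/)) {
    if (!line.startsWith("data:")) continue; // comments and other fields are ignored
    let value = line.slice("data:".length);
    if (value.startsWith(" ")) value = value.slice(1); // strip the optional leading space
    dataLines.push(value);
  }
  if (dataLines.length === 0) return SKIP;
  const data = dataLines.join("\n"); // multi-line data is joined with newlines
  if (data === "[DONE]") return DONE;
  try {
    return JSON.parse(data);
  } catch {
    return SKIP; // non-JSON payloads are dropped silently
  }
}

// Hypothetical incremental parser; the names are illustrative, not SDK exports.
function createSSEParser() {
  let buffer = "";
  let done = false;
  return {
    // Feed decoded text; returns every complete event parsed so far.
    push(text: string): unknown[] {
      const out: unknown[] = [];
      if (done) return out;
      buffer += text;
      for (let t = nextTerminator(buffer); t !== null; t = nextTerminator(buffer)) {
        const raw = buffer.slice(0, t.index);
        buffer = buffer.slice(t.index + t.length);
        const parsed = parseEvent(raw);
        if (parsed === DONE) {
          done = true;
          buffer = "";
          break;
        }
        if (parsed !== SKIP) out.push(parsed);
      }
      return out;
    },
    // Call at end-of-stream to flush a trailing event that has no terminator.
    flush(): unknown[] {
      if (done || buffer.trim() === "") return [];
      const parsed = parseEvent(buffer);
      buffer = "";
      return parsed === DONE || parsed === SKIP ? [] : [parsed];
    },
    isDone(): boolean {
      return done;
    },
  };
}
```

Keeping the buffer inside the parser is what lets an event split across two network reads come out as a single parsed object on the second push.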

Yield shape

For chat.completions.create({ stream: true }), the yielded type is ChatCompletionChunk:

interface ChatCompletionChunk {
  id: string;
  object: "chat.completion.chunk";
  created: number;
  model: string;
  choices: ChatChoiceDelta[];
  usage?: TokenUsage;        // present only on the final chunk for providers that emit it
}
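
As an illustration of consuming this shape, the sketch below concatenates streamed text and keeps the last usage value it sees. The shapes of ChatChoiceDelta and TokenUsage are not given on this page, so ChunkLike and its delta.content field are assumptions modeled on common OpenAI-style APIs; collectText is a hypothetical helper, not an SDK export.

```typescript
// Assumed minimal shape. `delta.content` is NOT confirmed by this page.
interface ChunkLike {
  choices: { index: number; delta: { content?: string | null } }[];
  usage?: unknown; // TokenUsage is treated as opaque here
}

// Hypothetical helper: drain a stream, returning the full text and the usage, if any.
async function collectText(
  stream: AsyncIterable<ChunkLike>,
): Promise<{ text: string; usage?: unknown }> {
  let text = "";
  let usage: unknown;
  for await (const chunk of stream) {
    // A chunk may carry no choices at all (for example a usage-only final chunk).
    const piece = chunk.choices[0]?.delta.content;
    if (piece) text += piece;
    if (chunk.usage !== undefined) usage = chunk.usage;
  }
  return { text, usage };
}
```

Because usage is optional, the helper never assumes the final chunk carries it; callers should treat an undefined usage as "the provider did not report it".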

Cancellation

Breaking out of the loop is enough to cancel. The reader is released in a finally block, so early termination (a break, or a throw inside your loop) closes the underlying connection deterministically:

const stream = await client.chat.completions.create({
  model: "gpt-oss-120b",
  messages,
  stream: true,
});

for await (const chunk of stream) {
  if (shouldStop()) break;     // releases the reader, closes the connection
  emit(chunk);
}
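
Why a plain break is enough can be seen in a minimal sketch of the pattern: an async generator whose finally block runs when the consumer exits early. readText is a hypothetical stand-in, not the SDK's code, and calling reader.cancel() before releaseLock() is an assumption about how the connection gets closed.

```typescript
// Hypothetical stand-in for the SDK's internal read loop.
async function* readText(body: ReadableStream<Uint8Array>): AsyncGenerator<string> {
  const reader = body.getReader();
  const decoder = new TextDecoder();
  try {
    while (true) {
      const { done, value } = await reader.read();
      if (done) return;
      yield decoder.decode(value, { stream: true });
    }
  } finally {
    // Runs on normal completion, on `break`, and on a throw in the consumer's loop body:
    // `for await` calls the generator's return(), which resumes execution here.
    await reader.cancel().catch(() => {}); // assumption: cancel to close the connection
    reader.releaseLock();
  }
}
```

The cleanup is awaited before the for await statement finishes, so by the time the line after the loop runs, the body is already cancelled and unlocked.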

Abort via `AbortSignal`

Every resource method accepts an optional { signal } as a second argument:

const ac = new AbortController();
setTimeout(() => ac.abort(), 30_000);

const stream = await client.chat.completions.create(
  { model: "gpt-oss-120b", messages, stream: true },
  { signal: ac.signal },
);

Behavior depends on when the signal fires:

Before the response opens — the abort surfaces at the initial create() call as APIConnectionError (wrapping the runtime's abort error).
Mid-stream — the reader's next read() rejects with the runtime's native abort error (AbortError in most runtimes). It propagates raw out of the for await loop; the SDK does not re-wrap it.
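
One way to tell the two cases apart without depending on error classes is to check signal.aborted at each phase. The sketch below is an assumption-laden illustration: consumeWithAbort and its open callback are hypothetical, with open standing in for a call such as client.chat.completions.create(..., { signal }).

```typescript
type Outcome = "completed" | "aborted-before-open" | "aborted-mid-stream";

// Hypothetical helper. `open` stands in for the SDK call that returns the stream.
async function consumeWithAbort<T>(
  open: (signal: AbortSignal) => Promise<AsyncIterable<T>>,
  signal: AbortSignal,
  onChunk: (chunk: T) => void,
): Promise<Outcome> {
  let stream: AsyncIterable<T>;
  try {
    stream = await open(signal); // phase 1: the response has not opened yet
  } catch (err) {
    if (signal.aborted) return "aborted-before-open";
    throw err; // unrelated failure (for example a 4xx / 5xx)
  }
  try {
    for await (const chunk of stream) onChunk(chunk); // phase 2: mid-stream
  } catch (err) {
    if (signal.aborted) return "aborted-mid-stream";
    throw err; // unrelated network failure
  }
  return "completed";
}
```

Checking signal.aborted is a heuristic: an unrelated error that happens to coincide with an abort is reported as an abort. If that distinction matters, inspect the error itself as well.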

You can also wire a global signal via a custom fetch (see custom fetch → timeouts) when you want a single budget across every call.
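
A global signal of this kind can be attached with a thin wrapper around fetch. The sketch below is illustrative only: withGlobalSignal and mergeSignals are hypothetical helpers, and how the wrapped function is handed to the client is not shown here, since that option is described on the custom fetch page.

```typescript
// Combine two signals: the result aborts as soon as either input aborts.
// Where the runtime provides AbortSignal.any, it can replace this helper.
function mergeSignals(a: AbortSignal, b: AbortSignal): AbortSignal {
  const ac = new AbortController();
  for (const s of [a, b]) {
    if (s.aborted) {
      ac.abort(s.reason);
      return ac.signal;
    }
    // Note: listeners stay on a long-lived global signal until it aborts.
    s.addEventListener("abort", () => ac.abort(s.reason), { once: true });
  }
  return ac.signal;
}

// Hypothetical wrapper: every request made through the returned function
// is cancelled when `globalSignal` aborts, in addition to any per-call signal.
function withGlobalSignal(baseFetch: typeof fetch, globalSignal: AbortSignal): typeof fetch {
  return (input, init) => {
    const callSignal = init?.signal;
    const signal = callSignal ? mergeSignals(globalSignal, callSignal) : globalSignal;
    return baseFetch(input, { ...init, signal });
  };
}
```

Merging rather than overwriting keeps a per-call { signal } working, so a request stops on whichever fires first.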

Errors during iteration

Network-level failures mid-stream surface as exceptions thrown from the iterator — wrap the loop in a try/catch:

try {
  for await (const chunk of stream) {
    // ...
  }
} catch (err) {
  // err is typically the runtime's AbortError or a TypeError from the reader
}
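
Inside that catch block it often helps to separate deliberate aborts from genuine failures. The helper below is a hedged sketch: classifyStreamError is a hypothetical name, and treating every TypeError as a network failure is a heuristic, since ordinary bugs in the loop body can throw TypeError too.

```typescript
type StreamFailure = "abort" | "network" | "unknown";

// Hypothetical helper for use inside the catch block around a `for await` loop.
function classifyStreamError(err: unknown): StreamFailure {
  // Check by name rather than instanceof, so the test does not depend on
  // which error class the runtime uses for aborts.
  const name =
    typeof err === "object" && err !== null ? (err as { name?: unknown }).name : undefined;
  if (name === "AbortError") return "abort";
  // Heuristic: readers commonly reject with TypeError when the connection drops.
  if (err instanceof TypeError) return "network";
  return "unknown";
}
```

A typical use is to swallow "abort" quietly, consider a retry for "network", and rethrow anything classified as "unknown".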

Note: API-level errors (4xx / 5xx) surface at the initial create() call, not mid-stream. Once the HTTP response opens with 200 OK, the iterator yields chunks until it terminates on [DONE] or throws one of the network-level errors described above. Bastion does not embed error events in the stream body.
