Splitting Text with TransformStream

A TypeScript example that uses ReadableStream and TransformStream to split long text into chunks of a specified size.

I wanted to get better at using the Streams API, so I experimented with string manipulation as an example. This snippet implements a TransformStream that cuts long text into pieces of a maximum length and groups those pieces into arrays of a given size.

Implementation

type Ctrl = TransformStreamDefaultController<string[]>;

class TextArrayTransformStream extends TransformStream<string, string[]> {
  // Pieces collected for the array currently being built.
  #chunk: string[] = [];
  #chunkSize: number;
  #splitReg: RegExp;

  constructor(chunkSize: number, maxTextLength: number) {
    super({
      transform: (chunk, controller) => this.#handle(chunk, controller),
      flush: (controller) => this.#flush(controller),
    });
    this.#chunkSize = chunkSize;
    // Matches runs of 1 to maxTextLength characters. "." does not match
    // line terminators, so newlines are dropped from the output.
    this.#splitReg = new RegExp(`.{1,${maxTextLength}}`, "g");
  }

  #handle(chunk: string, controller: Ctrl): void {
    for (const str of chunk.match(this.#splitReg) ?? []) {
      // Push first, then emit once the array is full, so that no piece is dropped.
      this.#chunk.push(str);
      if (this.#chunk.length >= this.#chunkSize) {
        controller.enqueue(this.#chunk);
        this.#chunk = [];
      }
    }
  }

  #flush(controller: Ctrl): void {
    // Emit whatever is left when the input ends.
    if (this.#chunk.length > 0) {
      controller.enqueue(this.#chunk);
    }
  }
}
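To make the intended grouping concrete, here is a minimal synchronous sketch of the same idea. The name splitIntoGroups is hypothetical and not part of the snippet above, and it uses slice instead of the regular expression, so unlike the stream version it keeps newlines.

```typescript
// Hypothetical synchronous counterpart, for illustration only.
// Cuts `text` into pieces of at most `maxTextLength` characters,
// then groups the pieces into arrays of at most `chunkSize` items.
function splitIntoGroups(
  text: string,
  chunkSize: number,
  maxTextLength: number,
): string[][] {
  if (chunkSize < 1 || maxTextLength < 1) {
    throw new RangeError("chunkSize and maxTextLength must be at least 1");
  }

  const pieces: string[] = [];
  for (let i = 0; i < text.length; i += maxTextLength) {
    pieces.push(text.slice(i, i + maxTextLength));
  }

  const groups: string[][] = [];
  for (let i = 0; i < pieces.length; i += chunkSize) {
    groups.push(pieces.slice(i, i + chunkSize));
  }
  return groups;
}

// 25 characters, pieces of up to 10 characters, 2 pieces per array.
const exampleGroups = splitIntoGroups("abcdefghijklmnopqrstuvwxy", 2, 10);
console.log(exampleGroups);
// → [["abcdefghij", "klmnopqrst"], ["uvwxy"]]
```

A function like this can also serve as a reference when checking the stream version against a fixed input.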

Helper Function

function toReadableStream(text: string): ReadableStream<string> {
  return new ReadableStream({
    start(controller) {
      controller.enqueue(text);
      controller.close();
    },
  });
}

Usage Example

async function main() {
  const text = "Long text...";
  const arrayLength = 5; // Group up to 5 items per array
  const textLength = 10; // Each element is at most 10 characters
  const stream = toReadableStream(text)
    .pipeThrough(new TextArrayTransformStream(arrayLength, textLength));
  const reader = stream.getReader();
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    console.log(value); // Each string[] is logged as it arrives
  }
}
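As an alternative to the manual reader loop, the batches can be piped into a WritableStream, which waits for each write to finish before accepting the next batch. This is only a sketch: createBatchSink, batchesToStream, and the send callback are hypothetical names, and the source stream here is a stand-in for the output of the transform.

```typescript
// Hypothetical sink: forwards each batch to an async `send` callback.
// pipeTo waits for the promise returned by write() before writing the next batch.
function createBatchSink(
  send: (batch: string[]) => Promise<void>,
): WritableStream<string[]> {
  return new WritableStream<string[]>({
    async write(batch) {
      await send(batch);
    },
  });
}

// Stand-in source that emits pre-built batches (assumption for the demo).
function batchesToStream(batches: string[][]): ReadableStream<string[]> {
  return new ReadableStream<string[]>({
    start(controller) {
      for (const batch of batches) controller.enqueue(batch);
      controller.close();
    },
  });
}

async function runPipeExample(): Promise<string[]> {
  const sentPayloads: string[] = [];
  const sink = createBatchSink(async (batch) => {
    // A real caller might issue a network request here.
    sentPayloads.push(batch.join(""));
  });
  await batchesToStream([["abc", "def"], ["gh"]]).pipeTo(sink);
  return sentPayloads;
}
```

The promise returned by pipeTo resolves once the source has closed and every batch has been written, so errors from send surface in one place.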

Use Cases

This pattern is effective in the following scenarios:

  • Batch processing LLM API responses
  • Paginating large text displays
  • Pre-processing text before sending it to character-limited APIs
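One caveat for streamed sources such as LLM responses: the text usually arrives in many small chunks, and the transform above splits each incoming chunk on its own, so pieces at chunk boundaries can be shorter than the maximum length. If that matters, a buffering stage can sit in front of it. The following is a minimal sketch under that assumption, and createFixedLengthSplitter is a hypothetical name.

```typescript
// Hypothetical helper: emits pieces of exactly `size` characters and
// carries any leftover text over to the next incoming chunk.
function createFixedLengthSplitter(size: number): TransformStream<string, string> {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  let buffer = "";
  return new TransformStream<string, string>({
    transform(chunk, controller) {
      buffer += chunk;
      while (buffer.length >= size) {
        controller.enqueue(buffer.slice(0, size));
        buffer = buffer.slice(size);
      }
    },
    flush(controller) {
      // The final piece may be shorter than `size`.
      if (buffer.length > 0) controller.enqueue(buffer);
    },
  });
}
```

For incoming chunks "abc", "defg", "h" and a size of 3, this emits "abc", "def", and finally "gh" when the input ends.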

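Note that the usage example reads with getReader() instead of for await...of. Async iteration over a ReadableStream is not implemented in every runtime, so the reader loop is the more portable choice.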
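Where for await...of is not available on the stream itself, a small async generator can wrap the reader and give the same ergonomics. This is a sketch, and iterateStream is a hypothetical helper name. Unlike built-in stream iteration, it only releases the lock and does not cancel the stream when the loop exits early.

```typescript
// Hypothetical helper: exposes a ReadableStream as an async generator.
async function* iterateStream<T>(
  stream: ReadableStream<T>,
): AsyncGenerator<T, void, undefined> {
  const reader = stream.getReader();
  try {
    while (true) {
      const result = await reader.read();
      if (result.done) return;
      yield result.value;
    }
  } finally {
    // Runs on normal completion, on errors, and when the consumer breaks early.
    reader.releaseLock();
  }
}

async function logAll(stream: ReadableStream<string[]>): Promise<void> {
  for await (const batch of iterateStream(stream)) {
    console.log(batch);
  }
}
```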

Practical Note

This snippet fits well when I do not want to rewrite the same stream-splitting logic in TypeScript or JavaScript over and over. Keeping it as a small helper makes the calling code easier to read because the intent stays in the foreground.

If the branches and preconditions start growing, it is usually better not to force everything into one snippet. Splitting the steps and helper responsibilities is easier to maintain.