Line-Oriented Text Splitting with TransformStream: LineSplitTransform

LineSplitTransform is a TransformStream implementation that converts binary data (Uint8Array) received from a stream into text and splits it by line breaks.

It processes continuous data chunks and emits each line separately, which makes it suitable for incremental processing of log files and large text files.

Implementation

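Use TextDecoder to convert bytes to strings, then split on line breaks with the regex /\r?\n/, which matches both LF and CRLF line endings.

A chunk can end in the middle of a line, so the trailing partial line is kept in a buffer and completed by the next chunk; whatever remains in the buffer is flushed when the stream ends. Passing { stream: true } to decode() does the same for bytes: a multi-byte character split across two chunks is held back until the rest of it arrives.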

export class LineSplitTransform extends TransformStream<Uint8Array, string> {
  constructor() {
    let buffer = "";
    const decoder = new TextDecoder();
    const delimiterRegex = /\r?\n/;

    super({
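      transform(chunk, ctrl) {
        // stream: true keeps a multi-byte character that straddles two chunks intact
        buffer += decoder.decode(chunk, { stream: true });
        const lines = buffer.split(delimiterRegex);
        // The last element is the unterminated tail; hold it for the next chunk
        buffer = lines.pop() ?? "";

        for (const line of lines) {
          ctrl.enqueue(line);
        }
      },
      flush(ctrl) {
        // Drain any bytes the decoder still holds, then emit the final line
        buffer += decoder.decode();
        if (buffer.length > 0) {
          ctrl.enqueue(buffer);
        }
      },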
    });
  }
}
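
The buffering behavior is easiest to check at awkward chunk boundaries. The following is a minimal sketch rather than part of the class: splitChunks is a hypothetical helper that mirrors the transform and flush steps synchronously, so the carry-over logic can be exercised without any stream plumbing. The sample input is cut once between the two UTF-8 bytes of "é" and once between "\r" and "\n".

```typescript
// Hypothetical helper: a synchronous mirror of the transform/flush logic.
function splitChunks(chunks: Uint8Array[]): string[] {
  const decoder = new TextDecoder();
  const out: string[] = [];
  let buffer = "";

  for (const chunk of chunks) {
    // stream: true holds back an incomplete multi-byte sequence
    buffer += decoder.decode(chunk, { stream: true });
    const lines = buffer.split(/\r?\n/);
    // The unterminated tail waits for the next chunk
    buffer = lines.pop() ?? "";
    out.push(...lines);
  }

  // End of input: drain the decoder and emit the last line, if any
  buffer += decoder.decode();
  if (buffer.length > 0) out.push(buffer);
  return out;
}

// "é" is 0xC3 0xA9 in UTF-8, so the bytes are: c a f C3 A9 \r \n l a s t
const bytes = new TextEncoder().encode("café\r\nlast");
const result = splitChunks([
  bytes.slice(0, 4), // "caf" plus the first byte of "é"
  bytes.slice(4, 6), // second byte of "é", then "\r"
  bytes.slice(6),    // "\n" followed by "last"
]);
console.log(result); // ["café", "last"]
```

Neither cut leaks into the output: the "\r" stays in the buffer until the "\n" arrives, and the decoder emits "é" only once both of its bytes have been seen.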

Example Usage

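This Node.js example pipes standard input through LineSplitTransform, splits the input line by line, and outputs each line to the console. Because process.stdin is a Node.js Readable rather than a web ReadableStream, it is converted with Readable.toWeb() before calling pipeThrough().

import { Readable } from 'node:stream';
import { LineSplitTransform } from './LineSplitTransform';

// Convert the Node.js stream to a web ReadableStream, then split it into lines
const lines = Readable.toWeb(process.stdin)
  .pipeThrough(new LineSplitTransform());

// Web streams are async-iterable in Node.js 18+
// (run as an ES module so that top-level await is available)
for await (const line of lines) {
  console.log(`Received line: ${line}`);
}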

Use Cases

Real-time analysis of log files and text files
Incremental line processing for data received over the network
Implementing newline-delimited text protocols
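
As an illustration of the newline-delimited protocol case, the sketch below parses newline-delimited JSON. It is an assumption-laden example, not part of the original implementation: parseJsonLine and jsonLineParser are hypothetical names, and the stage expects its input to be already split into lines.

```typescript
// Hypothetical helper: parse one line of newline-delimited JSON.
// Returns undefined for blank lines so that callers can skip them.
function parseJsonLine(line: string): unknown {
  const trimmed = line.trim();
  return trimmed === "" ? undefined : JSON.parse(trimmed);
}

// Hypothetical downstream stage: string lines in, parsed values out.
function jsonLineParser(): TransformStream<string, unknown> {
  return new TransformStream<string, unknown>({
    transform(line, ctrl) {
      // JSON.parse throws on malformed input, which errors the stream
      const value = parseJsonLine(line);
      if (value !== undefined) ctrl.enqueue(value);
    },
  });
}
```

Chained after the splitter, this would read byteStream.pipeThrough(new LineSplitTransform()).pipeThrough(jsonLineParser()), where byteStream stands for any ReadableStream<Uint8Array>, such as a fetch response body.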

Summary

With TransformStream, it is easy to implement a mechanism that processes stream data line by line.

Because large files can be handled incrementally instead of being loaded into memory all at once, this approach enables memory-efficient text processing.