Text Splitter

Overview

The Text Splitter node splits a long input string into an array of smaller text chunks based on configurable size and overlap parameters. Use it upstream of embedding nodes or AI processing steps that have token or character limits. Chunks are emitted as an array on the output port; if the input is empty or splitting produces no chunks, an empty array is emitted rather than an error.

Configuration

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| strategy | enum (character, token, sentence, recursive) | Yes | Splitting strategy. 'character' splits by raw character count, 'token' splits by estimated token count, 'sentence' splits on sentence boundaries, 'recursive' tries paragraph then sentence then character splits in order. |
| chunkSize | number | Yes | Maximum size of each chunk in the unit defined by the chosen strategy (characters or tokens). Must be greater than 0. |
| chunkOverlap | number | No | Number of characters or tokens to overlap between consecutive chunks. Helps preserve context across chunk boundaries. Defaults to 0. Must be less than chunkSize. |
| separator | string | No | Custom separator string used to split text when strategy is 'character'. Defaults to a newline character. Ignored for other strategies. |
| trimWhitespace | boolean | No | When true, leading and trailing whitespace is trimmed from each chunk before output. Defaults to true. |
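
To make the interaction between chunkSize, chunkOverlap, and trimWhitespace concrete, here is a minimal Python sketch of a sliding-window character split. It is an illustration under stated assumptions, not the node's implementation: the function name `split_by_characters` is hypothetical, the `separator` field is ignored, and the overlap is assumed to count toward chunkSize (the node may instead add the overlap on top of chunkSize).

```python
def split_by_characters(
    text: str,
    chunk_size: int,
    chunk_overlap: int = 0,
    trim_whitespace: bool = True,
) -> list[str]:
    """Hypothetical sliding-window splitter; not the node's actual code."""
    # Mirror the documented constraints on the two size fields.
    if chunk_size <= 0:
        raise ValueError("chunkSize must be greater than 0")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunkOverlap must be >= 0 and less than chunkSize")

    # Each window starts `step` characters after the previous one, so
    # consecutive chunks share `chunk_overlap` characters.
    step = chunk_size - chunk_overlap  # at least 1, because overlap < size
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if trim_whitespace:
            chunk = chunk.strip()
        if chunk:  # skip chunks that are empty after trimming
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


print(split_by_characters("abcdefghij", chunk_size=4, chunk_overlap=1))
```

Requiring chunkOverlap to be less than chunkSize is what keeps `step` positive; with an overlap equal to the chunk size the window would never advance.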

Inputs

| Port | Type | Description |
| --- | --- | --- |
| text | string | The input text to split. Required. If an empty string is received, the node emits an empty array on the output port. |

Outputs

| Port | Type | Description |
| --- | --- | --- |
| chunks | string[] | Ordered array of text chunks produced by the split. Each element is a non-empty string of at most chunkSize units (plus overlap from the preceding chunk). |
| count | number | Total number of chunks produced. Useful for downstream branching or logging without needing to inspect the full array. |
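
The relationship between the two output ports can be sketched in Python as follows. This is only an illustration of the documented contract; `SplitterOutputs`, `build_outputs`, and `route` are hypothetical names, and the "skip"/"embed" branch labels are invented for the example.

```python
from typing import TypedDict


class SplitterOutputs(TypedDict):
    chunks: list[str]  # ordered, non-empty strings
    count: int         # always equal to len(chunks)


def build_outputs(chunks: list[str]) -> SplitterOutputs:
    """Hypothetical helper that packages chunks the way the ports describe."""
    non_empty = [c for c in chunks if c]  # every emitted element is non-empty
    return {"chunks": non_empty, "count": len(non_empty)}


def route(outputs: SplitterOutputs) -> str:
    """Branch on count alone, without inspecting the chunks array."""
    return "skip" if outputs["count"] == 0 else "embed"


print(route(build_outputs([])))            # empty input yields an empty array
print(route(build_outputs(["a", "b"])))
```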

Example

```json
{
  "nodeType": "text_splitter",
  "config": {
    "strategy": "recursive",
    "chunkSize": 512,
    "chunkOverlap": 64,
    "trimWhitespace": true
  }
}
```
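
The 'recursive' strategy used in this configuration tries paragraph, then sentence, then character splits in order. A rough Python sketch of that fallback order is shown here. It rests on several assumptions: the function name `recursive_split` is hypothetical, paragraphs are assumed to be separated by blank lines, sentences are assumed to end in `.`, `!`, or `?`, sizes are measured in characters, and neither chunkOverlap nor merging of small adjacent pieces is modeled. The node's actual boundaries and packing behavior may differ.

```python
import re


def recursive_split(text: str, chunk_size: int) -> list[str]:
    """Hypothetical paragraph -> sentence -> character fallback splitter."""
    if chunk_size <= 0:
        raise ValueError("chunkSize must be greater than 0")

    def split_level(piece: str, level: int) -> list[str]:
        piece = piece.strip()
        if not piece:
            return []
        if len(piece) <= chunk_size:
            return [piece]  # small enough, no finer split needed
        if level == 0:
            # Assumed paragraph boundary: one or more blank lines.
            parts = re.split(r"\n\s*\n", piece)
        elif level == 1:
            # Assumed sentence boundary: whitespace after ., ! or ?
            parts = re.split(r"(?<=[.!?])\s+", piece)
        else:
            # Last resort: fixed-width character windows.
            windows = (
                piece[i:i + chunk_size].strip()
                for i in range(0, len(piece), chunk_size)
            )
            return [w for w in windows if w]
        result: list[str] = []
        for part in parts:
            result.extend(split_level(part, level + 1))
        return result

    return split_level(text, 0)


sample = "First paragraph. It has two sentences.\n\nSecond."
print(recursive_split(sample, chunk_size=25))
```

Falling back level by level keeps chunks aligned with natural boundaries where possible and only cuts mid-sentence when a single sentence exceeds the size limit.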