Markdown-aware chunking strategy.

Splits markdown documents while respecting document structure:

  • Tries to keep sections under the same header together
  • Preserves code blocks
  • Respects list structures
  • Maintains document hierarchy

The chunker splits on Markdown headers (#, ##, ###, and so on) and tries to keep the content under each header together as a single coherent unit.
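As a rough illustration of header-based splitting, the sketch below groups lines under the nearest preceding header and ignores `#` lines that appear inside fenced code blocks. `splitByHeaders` and `Section` are hypothetical names, not part of the documented API, and the library's actual algorithm may differ.

```typescript
// Hypothetical helper for illustration only; not the library's implementation.
interface Section {
  heading: string | null; // null for text that precedes the first header
  level: number;          // 0 for the preamble, otherwise 1-6
  body: string;
}

const FENCE = /^\s*(`{3,}|~{3,})/;   // opening or closing code fence
const HEADER = /^(#{1,6})\s+(.*)$/;  // ATX-style header line

function splitByHeaders(markdown: string): Section[] {
  const sections: Section[] = [];
  let current: Section = { heading: null, level: 0, body: "" };
  let openFence: string | null = null; // marker of the fence we are inside, if any

  const flush = (): void => {
    // Skip an empty preamble, keep everything else.
    if (current.heading !== null || current.body.trim() !== "") {
      sections.push(current);
    }
  };

  for (const line of markdown.split("\n")) {
    const fence = FENCE.exec(line);
    if (fence) {
      if (openFence === null) {
        openFence = fence[1];
      } else if (fence[1][0] === openFence[0] && fence[1].length >= openFence.length) {
        openFence = null;
      }
    }

    // Header syntax only counts outside code fences.
    const header = openFence === null ? HEADER.exec(line) : null;
    if (header) {
      flush();
      current = { heading: header[2].trim(), level: header[1].length, body: "" };
    } else {
      current.body += line + "\n";
    }
  }
  flush();
  return sections;
}
```

Each returned section could then be emitted as one chunk, or split further if it exceeds the configured size.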

const chunker = new MarkdownChunker({
  chunkSize: 1000,
  chunkOverlap: 100
});

const result = await chunker.chunk({
  id: 'readme',
  content: markdownContent,
  metadata: { source: 'README.md' }
});
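This page does not define the units of `chunkSize` and `chunkOverlap`. One common reading, assumed here, is a sliding window measured in characters, where each window repeats the tail of the previous one. The sketch below shows that interpretation for a section too long to fit in one chunk; `windowText` is a hypothetical helper, and the real chunker may count tokens or break at different boundaries.

```typescript
// Assumption: sizes are measured in characters. Hypothetical helper, not the library's API.
function windowText(text: string, chunkSize: number, chunkOverlap: number): string[] {
  if (chunkSize <= 0) {
    throw new RangeError("chunkSize must be positive");
  }
  if (chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new RangeError("chunkOverlap must be in [0, chunkSize)");
  }

  const step = chunkSize - chunkOverlap; // how far each new window advances
  const windows: string[] = [];
  for (let start = 0; start < text.length; start += step) {
    windows.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) {
      break; // this window already reached the end of the text
    }
  }
  return windows;
}
```

With `chunkSize: 1000` and `chunkOverlap: 100`, each window advances 900 characters, so consecutive windows share 100 characters of context.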

Implements

  • DocumentChunker

Constructors

Methods

  • Split a document into chunks

    Parameters

    • document: Document

      Document to chunk

    Returns Promise<ChunkResult>

    Chunked documents with metadata

  • Split multiple documents into chunks

    Parameters

    • documents: Document[]

      Documents to chunk

    Returns Promise<ChunkResult[]>

    Array of chunk results

  • Estimate the number of chunks that will be created

    Parameters

    • document: Document

      Document to estimate

    Returns number

    Estimated number of chunks
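The page does not say how the estimate is computed. A plausible back-of-the-envelope version, assuming character-based sizes and fixed-size windows with overlap, is sketched below. `estimateChunkCount` is a hypothetical name, and because a structure-aware chunker breaks at headers, its real output may contain more chunks than this figure.

```typescript
// Assumption: character-based windows with a fixed overlap. Hypothetical helper.
function estimateChunkCount(
  contentLength: number,
  chunkSize: number,
  chunkOverlap: number
): number {
  if (chunkSize <= 0 || chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new RangeError("expected chunkSize > 0 and 0 <= chunkOverlap < chunkSize");
  }
  if (contentLength <= 0) {
    return 0; // nothing to chunk
  }
  if (contentLength <= chunkSize) {
    return 1; // fits in a single chunk
  }
  // The first chunk covers chunkSize characters; every later chunk adds
  // (chunkSize - chunkOverlap) new characters.
  const step = chunkSize - chunkOverlap;
  return 1 + Math.ceil((contentLength - chunkSize) / step);
}
```

Because this is plain arithmetic on the content length, it can return a number synchronously, which matches the non-Promise return type documented above.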

Properties

config: ChunkerConfig

Configuration for this chunker