Interface for document chunking strategies.

Chunkers split large documents into smaller pieces suitable for embedding and retrieval. Different strategies work better for different content types.

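// Configure the chunker with a chunk size and an overlap between adjacent chunks
const chunker = new RecursiveTextChunker({
    chunkSize: 1000,
    chunkOverlap: 200
});

// Split a single document into chunks
const result = await chunker.chunk({
    id: 'doc1',
    content: largeDocument,
    metadata: { source: 'manual.pdf' }
});

// Report each chunk's index and the total chunk count from its metadata
console.log(`Split into ${result.chunks.length} chunks`);
result.chunks.forEach((chunk, i) => {
    console.log(`Chunk ${i}: ${chunk.metadata?.chunkIndex}/${chunk.metadata?.totalChunks}`);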
});

interface DocumentChunker {
    config: ChunkerConfig;
    chunk(document: Document): Promise<ChunkResult>;
    chunkMany(documents: Document[]): Promise<ChunkResult[]>;
    estimateChunks(document: Document): number;
}
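
To make the contract concrete, here is a minimal sketch of one way the interface could be implemented. Everything beyond the interface itself is an assumption: the source names Document, ChunkResult and ChunkerConfig but does not define them, so their shapes below are guessed from the usage example (Document is called TextDocument here to avoid clashing with the DOM type of the same name). FixedSizeChunker is a hypothetical strategy that cuts fixed-size character windows with overlap; it is not RecursiveTextChunker and not necessarily how the library chunks. The interface is repeated only so the sketch stands alone.

```typescript
// Assumed shapes: the source names these types but does not define them.
interface TextDocument { // stands in for the source's `Document`
    id: string;
    content: string;
    metadata?: Record<string, unknown>;
}

interface ChunkerConfig {
    chunkSize: number;    // assumed to be measured in characters
    chunkOverlap: number; // characters shared by adjacent chunks
}

interface Chunk {
    content: string;
    metadata?: Record<string, unknown>;
}

interface ChunkResult {
    chunks: Chunk[];
}

interface DocumentChunker {
    config: ChunkerConfig;
    chunk(document: TextDocument): Promise<ChunkResult>;
    chunkMany(documents: TextDocument[]): Promise<ChunkResult[]>;
    estimateChunks(document: TextDocument): number;
}

// Hypothetical strategy: fixed-size character windows with overlap.
class FixedSizeChunker implements DocumentChunker {
    config: ChunkerConfig;

    constructor(config: ChunkerConfig) {
        if (config.chunkSize <= 0 || config.chunkOverlap < 0 ||
            config.chunkOverlap >= config.chunkSize) {
            throw new Error('Require chunkSize > 0 and 0 <= chunkOverlap < chunkSize');
        }
        this.config = config;
    }

    // Count the windows needed to cover the content without building them.
    estimateChunks(document: TextDocument): number {
        const { chunkSize, chunkOverlap } = this.config;
        const length = document.content.length;
        if (length === 0) return 0;
        if (length <= chunkSize) return 1;
        // Each window after the first advances by (chunkSize - chunkOverlap).
        return Math.ceil((length - chunkOverlap) / (chunkSize - chunkOverlap));
    }

    async chunk(document: TextDocument): Promise<ChunkResult> {
        const { chunkSize, chunkOverlap } = this.config;
        const stride = chunkSize - chunkOverlap;
        const totalChunks = this.estimateChunks(document);
        const chunks: Chunk[] = [];
        for (let i = 0; i < totalChunks; i++) {
            const start = i * stride;
            chunks.push({
                content: document.content.slice(start, start + chunkSize),
                // Carry the document metadata forward and record the position.
                metadata: { ...document.metadata, chunkIndex: i, totalChunks },
            });
        }
        return { chunks };
    }

    // Chunk each document independently; results keep the input order.
    chunkMany(documents: TextDocument[]): Promise<ChunkResult[]> {
        return Promise.all(documents.map((d) => this.chunk(d)));
    }
}
```

With chunkSize 1000 and chunkOverlap 200, each window starts 800 characters after the previous one, so in this sketch a 2,500-character document yields three chunks covering [0, 1000), [800, 1800) and [1600, 2500). For this simple strategy estimateChunks is exact; a strategy that splits on separators could only approximate the count up front.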

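Methods

chunk(document: Document): Promise<ChunkResult>
chunkMany(documents: Document[]): Promise<ChunkResult[]>
estimateChunks(document: Document): number

Properties

config: ChunkerConfig
Configuration for this chunker.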