• Chunks a file based on a delimiter regex pattern. Returns chunks with their original positions in the file.

    If maxTokensPerChunk is specified, chunks that exceed it are split automatically so that every returned chunk is small enough to embed.

    Parameters

    • filePath: string

      Absolute path to the file to chunk

    • delimiter: RegExp = ...

      Regex pattern to split on (default: two or more newlines). Must have the 'g' flag.

    • Optional maxTokensPerChunk: number

      Maximum tokens per chunk. Chunks that exceed it are split. If omitted, chunks are not split by size.

    • charsPerToken: number = 2.5

      Characters per token estimate (default: 2.5 for code)

    Returns FileChunk[]

    Array of chunks with content and position information
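
    To make "content and position information" concrete, here is a minimal sketch of delimiter-based splitting that tracks offsets. It is not the library's implementation. The SketchChunk shape, its start/end field names, and the splitWithPositions helper are assumptions made for illustration, since the real FileChunk fields are not shown here. It works on an in-memory string rather than a file path.

```typescript
// Hypothetical chunk shape; the real FileChunk fields may differ.
interface SketchChunk {
  content: string;
  start: number; // offset of the first character (inclusive)
  end: number;   // offset just past the last character (exclusive)
}

// Split `text` on a global regex and record where each piece came from.
function splitWithPositions(text: string, delimiter: RegExp = /\n{2,}/g): SketchChunk[] {
  // Without the 'g' flag, exec() never advances lastIndex and the loop
  // below would keep returning the first match forever.
  if (!delimiter.global) {
    throw new Error("delimiter must have the 'g' flag");
  }
  delimiter.lastIndex = 0; // start scanning from the beginning

  const chunks: SketchChunk[] = [];
  let cursor = 0; // start offset of the piece currently being collected
  let match: RegExpExecArray | null;

  while ((match = delimiter.exec(text)) !== null) {
    if (match[0].length === 0) {
      delimiter.lastIndex += 1; // step past empty matches to avoid an infinite loop
      continue;
    }
    if (match.index > cursor) {
      chunks.push({ content: text.slice(cursor, match.index), start: cursor, end: match.index });
    }
    cursor = match.index + match[0].length; // resume after the delimiter
  }

  // Whatever follows the last delimiter is the final chunk.
  if (cursor < text.length) {
    chunks.push({ content: text.slice(cursor), start: cursor, end: text.length });
  }
  return chunks;
}
```

    With offsets recorded this way, text.slice(chunk.start, chunk.end) reproduces chunk.content, so a chunk can be mapped back to its place in the source.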

    // Split on double newlines
    const chunks = chunkFile('/path/to/file.txt');
    // Split on a custom pattern with a size limit
    const limitedChunks = chunkFile('/path/to/file.txt', /---+/g, 6000, 2.5);
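
    In the second call above, a limit of 6000 tokens at an estimated 2.5 characters per token works out to a budget of roughly 15,000 characters per chunk. The sketch below shows one way such a budget could be enforced. The splitToBudget helper is hypothetical, and slicing at a fixed width is an assumption. The actual function may choose split points differently, for example at line boundaries.

```typescript
// Hypothetical helper: cut one oversized piece into slices that fit the
// estimated character budget. Not the library's actual splitting strategy.
function splitToBudget(
  content: string,
  maxTokensPerChunk: number,
  charsPerToken: number = 2.5
): string[] {
  // Estimated budget in characters, e.g. 6000 tokens * 2.5 chars/token = 15000.
  // Clamp to at least 1 so the loop below always makes progress.
  const maxChars = Math.max(1, Math.floor(maxTokensPerChunk * charsPerToken));

  const pieces: string[] = [];
  for (let offset = 0; offset < content.length; offset += maxChars) {
    pieces.push(content.slice(offset, offset + maxChars));
  }
  return pieces;
}
```

    A fixed-width cut like this can land in the middle of a line or an identifier, so a production splitter would likely back up to the nearest newline before cutting.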