Splits a chunk that exceeds a token limit into smaller sub-chunks.
Uses a simple recursive strategy of splitting in half until all
pieces are under the token limit. This prevents chunks from being
rejected by the embedding API due to size constraints.
// Split a chunk that's too large
const oversized = { content: "very long content...", ... };
const subChunks = split_chunk(oversized, 6000, 2.5);
// Returns array of smaller chunks, each under 6000 tokens
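The recursive halving described above might look roughly like the sketch below. This is not the actual implementation of split_chunk. The `Chunk` shape, the `estimateTokens` helper, and the name `splitChunkSketch` are hypothetical, and the third argument is assumed to be an approximate characters-per-token ratio (the `2.5` in the example call), which the source does not confirm.

```typescript
// Hypothetical chunk shape; the real type likely carries more fields.
interface Chunk {
  content: string;
}

// Assumed token estimate: character count divided by a chars-per-token ratio.
// charsPerToken is expected to be a positive number.
function estimateTokens(text: string, charsPerToken: number): number {
  return Math.ceil(text.length / charsPerToken);
}

// Recursively halve a chunk until every piece is estimated to fit the limit.
function splitChunkSketch(
  chunk: Chunk,
  maxTokens: number,
  charsPerToken: number
): Chunk[] {
  const text = chunk.content;

  // Base case: the chunk fits, or it is too short to split any further.
  if (estimateTokens(text, charsPerToken) <= maxTokens || text.length <= 1) {
    return [chunk];
  }

  // Cut at the character midpoint and copy any other fields onto both halves.
  const mid = Math.floor(text.length / 2);
  const left: Chunk = { ...chunk, content: text.slice(0, mid) };
  const right: Chunk = { ...chunk, content: text.slice(mid) };

  return [
    ...splitChunkSketch(left, maxTokens, charsPerToken),
    ...splitChunkSketch(right, maxTokens, charsPerToken),
  ];
}
```

Cutting at the character midpoint keeps the logic simple, but it ignores word and sentence boundaries. A version that looks for the nearest whitespace around the midpoint would produce cleaner sub-chunks at the cost of slightly uneven halves.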