Options for calculating chunk statistics

interface ChunkStatsOptions {
    charsPerToken?: number;
    costPerToken?: number;
}

Properties

charsPerToken?: number

Average number of characters per token (default: 4). This is a rough estimate; actual tokenization varies by content.
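As a rough illustration of how this option might be applied, the sketch below estimates a token count from a string's length. The `estimateTokens` helper is hypothetical and not part of the documented API, and rounding up with `Math.ceil` is an assumption; the library may round differently.

```typescript
// Mirrors the interface documented above.
interface ChunkStatsOptions {
    charsPerToken?: number;
    costPerToken?: number;
}

// Hypothetical helper (an assumption, not the library's implementation):
// estimates tokens as text length divided by charsPerToken, rounded up.
function estimateTokens(text: string, options: ChunkStatsOptions = {}): number {
    const charsPerToken = options.charsPerToken ?? 4; // documented default
    if (charsPerToken <= 0) {
        throw new RangeError("charsPerToken must be positive");
    }
    return Math.ceil(text.length / charsPerToken);
}

// A 9-character string at the default of 4 characters per token.
const approxTokens = estimateTokens("abcdefghi");
console.log(approxTokens);
```

A lower `charsPerToken` gives a higher, more conservative estimate, which may suit content such as source code or non-English text that tends to tokenize less compactly.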

costPerToken?: number

Cost per token in dollars (default: 0.00000002). The default is based on OpenAI text-embedding-3-small pricing ($0.02 per 1M tokens). Adjust it to match your embedding provider's pricing.
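Providers usually quote prices per million tokens, so the value passed here is that price divided by 1,000,000. The sketch below shows the conversion and a cost estimate; `perTokenFromPerMillion` and `estimateCost` are hypothetical helpers introduced for illustration, not functions of the documented API.

```typescript
// Mirrors the interface documented above.
interface ChunkStatsOptions {
    charsPerToken?: number;
    costPerToken?: number;
}

// Hypothetical helper: converts a price quoted per 1M tokens
// into the per-token value that costPerToken expects.
function perTokenFromPerMillion(dollarsPerMillionTokens: number): number {
    return dollarsPerMillionTokens / 1000000;
}

// Hypothetical helper (an assumption, not the library's implementation):
// estimated cost in dollars is token count times cost per token.
function estimateCost(tokenCount: number, options: ChunkStatsOptions = {}): number {
    const costPerToken = options.costPerToken ?? 0.00000002; // documented default
    return tokenCount * costPerToken;
}

// Using the documented default price of $0.02 per 1M tokens.
const options: ChunkStatsOptions = { costPerToken: perTokenFromPerMillion(0.02) };
const cost = estimateCost(1000000, options);
console.log(cost.toFixed(4)); // formatted to avoid floating-point noise
```

Because these values are very small floating-point numbers, formatting or rounding the result for display (as with `toFixed` above) is usually safer than comparing it for exact equality.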