addSourceWithChunking method

Future<SourceAddResult> addSourceWithChunking(
  String content, {
  String? metadata,
  String? name,
  String? filePath,
  ChunkingStrategy? strategy,
  Duration? chunkDelay,
  void onProgress(int done, int total)?,
})

Adds a source document with automatic chunking and embedding.

The document is processed in three steps:

  1. It is split into chunks using a strategy chosen from the file type (auto-detected from filePath).
  2. The chunks are embedded in streaming micro-batches of kIngestionBatchSize chunks at a time.
  3. The source and its chunks are stored in the DB incrementally, batch by batch.

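If strategy is not passed, the chunking strategy is auto-detected from filePath:

  • .md, .markdown → Markdown-aware chunking (preserves headers, code blocks)
  • Any other extension, or no filePath → Default recursive chunking

An explicit strategy always takes precedence. The extension match is case-sensitive, so a path ending in .MD falls back to recursive chunking. If name is omitted, filePath is used as the source name.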
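The selection logic can be tried in isolation. The sketch below mirrors the expression in the Implementation listing; the local two-value enum and the helper name resolveStrategy are stand-ins introduced for illustration and are not part of the package.

```dart
// Local stand-in for the package's ChunkingStrategy enum (assumption: only
// the two values used by the auto-detection are modelled here).
enum ChunkingStrategy { markdown, recursive }

/// Hypothetical helper: an explicit [strategy] wins, then the file
/// extension, then the recursive default.
ChunkingStrategy resolveStrategy({
  String? filePath,
  ChunkingStrategy? strategy,
}) {
  if (strategy != null) return strategy;
  final isMarkdown = filePath != null &&
      (filePath.endsWith('.md') || filePath.endsWith('.markdown'));
  return isMarkdown ? ChunkingStrategy.markdown : ChunkingStrategy.recursive;
}

void main() {
  print(resolveStrategy(filePath: 'notes/readme.md')); // ChunkingStrategy.markdown
  print(resolveStrategy(filePath: 'README.MD')); // ChunkingStrategy.recursive
  print(resolveStrategy(
    filePath: 'a.txt',
    strategy: ChunkingStrategy.markdown,
  )); // ChunkingStrategy.markdown
  print(resolveStrategy()); // ChunkingStrategy.recursive
}
```

The second call shows that endsWith is case-sensitive: callers with upper-case extensions may want to pass strategy explicitly.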

Memory safety: the method uses a streaming micro-batch pipeline instead of loading all embeddings into memory. Each batch of kIngestionBatchSize chunks is embedded, saved to the DB, and then released, which keeps memory usage flat.

chunkDelay controls how long the pipeline yields between batches (default: 10ms). The pause gives the GC a chance to run and helps prevent thermal throttling on mobile devices.
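To make the micro-batch loop concrete, here is a minimal sketch of the pattern, not the library's actual code. The batch size of 4, the helper name ingestInBatches, and the embed and save callbacks are all assumptions for illustration. Calling onProgress once per batch and skipping the delay after the last batch are also assumptions about behaviour the description does not specify.

```dart
import 'dart:math' show min;

// Hypothetical value; the real kIngestionBatchSize is not stated here.
const int kIngestionBatchSize = 4;

/// Embeds and stores [chunks] one micro-batch at a time.
/// [embed] and [save] are stand-ins for the real embedding and DB calls.
Future<int> ingestInBatches(
  List<String> chunks, {
  required Future<List<List<double>>> Function(List<String> batch) embed,
  required Future<void> Function(
    List<String> batch,
    List<List<double>> vectors,
  ) save,
  Duration? chunkDelay,
  void Function(int done, int total)? onProgress,
}) async {
  final delay = chunkDelay ?? const Duration(milliseconds: 10);
  var done = 0;
  for (var start = 0; start < chunks.length; start += kIngestionBatchSize) {
    final end = min(start + kIngestionBatchSize, chunks.length);
    final batch = chunks.sublist(start, end);

    // Only this batch's vectors are alive at any point in the loop.
    final vectors = await embed(batch);
    await save(batch, vectors);

    done += batch.length;
    onProgress?.call(done, chunks.length);

    // Yield between batches so other work (and the GC) can run.
    if (done < chunks.length) await Future<void>.delayed(delay);
  }
  return done;
}

Future<void> main() async {
  final chunks = List.generate(10, (i) => 'chunk $i');
  final progress = <int>[];
  await ingestInBatches(
    chunks,
    embed: (batch) async => [
      for (final c in batch) [c.length.toDouble()],
    ],
    save: (batch, vectors) async {},
    onProgress: (done, total) => progress.add(done),
  );
  print(progress); // [4, 8, 10]
}
```

Because each batch's vectors go out of scope before the next batch is embedded, peak memory in this sketch depends on the batch size rather than on the number of chunks.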

Implementation

Future<SourceAddResult> addSourceWithChunking(
  String content, {
  String? metadata,
  String? name,
  String? filePath,
  ChunkingStrategy? strategy,
  Duration? chunkDelay,
  void Function(int done, int total)? onProgress,
}) async {
  final effectiveStrategy = strategy ??
      (filePath != null &&
              (filePath.endsWith('.md') || filePath.endsWith('.markdown'))
          ? ChunkingStrategy.markdown
          : ChunkingStrategy.recursive);

  // Hand the document body to Rust exactly once. `prepareSourceIngestion`
  // performs the source-row INSERT, claim, duplicate decision, chunker run,
  // and stages chunk content in a Rust-resident IngestSession — so neither
  // the chunk content nor the full document round-trips back to Dart.
  final prepared = await rust_ingest.prepareSourceIngestion(
    collectionId: collectionId,
    content: content,
    metadata: metadata,
    name: name ?? filePath,
    strategy: _toIngestStrategy(effectiveStrategy),
    maxChars: maxChunkChars,
    overlapChars: overlapChars,
  );
  return _runPreparedIngestion(
    prepared,
    chunkDelay: chunkDelay,
    onProgress: onProgress,
  );
}