MCP Ingest

A document ingestion pipeline for MakeMind. It converts supported input formats into a standardized NormalizedDocument, with chunking, OCR/ASR support, and fragment extraction.

Features

Universal input — files, raw bytes, URLs, streams.
Format coverage — HTML, XML, YAML, JSON, Markdown, archives, plus pluggable handlers for PDF, audio, image (via OCR/ASR ports).
Chunking — configurable strategies for downstream embedding / retrieval.
Fragment extraction — semantic fragment splitting.
Pluggable OCR / ASR ports — bridge to your provider of choice.
Pipeline composition — IngestPipeline.defaults() for common cases, fully customizable for advanced use.
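
To make the chunking idea concrete, here is a minimal, self-contained sketch of one common strategy: fixed-size chunks with overlap. It is an illustration only and does not use or describe the mcp_ingest API; the function name chunkText and its size and overlap parameters are hypothetical.

```dart
// Hypothetical sketch of fixed-size chunking with overlap.
// None of these names come from mcp_ingest.
List<String> chunkText(String text, {int size = 200, int overlap = 50}) {
  if (size <= 0) {
    throw ArgumentError.value(size, 'size', 'must be positive');
  }
  if (overlap < 0 || overlap >= size) {
    throw ArgumentError.value(overlap, 'overlap', 'must be in [0, size)');
  }

  final chunks = <String>[];
  // Each chunk starts `step` characters after the previous one,
  // so consecutive chunks share `overlap` characters.
  final step = size - overlap;
  for (var start = 0; start < text.length; start += step) {
    final end = start + size < text.length ? start + size : text.length;
    chunks.add(text.substring(start, end));
    // Stop once a chunk reaches the end of the text.
    if (end == text.length) break;
  }
  return chunks;
}

void main() {
  final chunks = chunkText('abcdefghij', size: 4, overlap: 1);
  print(chunks); // [abcd, defg, ghij]
}
```

Overlap keeps some shared context between neighbouring chunks, which can help retrieval when a relevant passage straddles a chunk boundary. The trade-off is more chunks to embed and store.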

Quick Start

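import 'package:mcp_ingest/mcp_ingest.dart';

Future<void> main() async {
  // Build a pipeline with the package's default configuration.
  final pipeline = IngestPipeline.defaults();

  // Ingest a local file using the default options.
  final result = await pipeline.ingest(
    IngestInput.fromFile('document.pdf'),
    IngestOptions.defaults,
  );

  print('Extracted ${result.chunks.length} chunks');
}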

With OCR/ASR plugins:

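// tesseractOcrPlugin and whisperAsrPlugin are placeholders for your own
// OCR and ASR port implementations; they are not defined in this snippet.
final pipeline = IngestPipeline(
  ocrPort: tesseractOcrPlugin,
  asrPort: whisperAsrPlugin,
);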

Support

License

MIT — see LICENSE.

mcp_ingest 0.1.0