mcp_llm 2.1.0
Dart package for Large Language Model integration with Model Context Protocol (MCP). Multi-provider LLM access (Claude / OpenAI / Gemini / Vertex / Bedrock / Cohere / Mistral / Groq / Together / Custo [...]
2.1.0 - 2026-05-03 - Prompt caching across all providers #
Added #
- `CacheHints` (`core/models.dart`) — provider-agnostic intent for prompt caching (`system`, `tools`, `messages`, `ttl`). Attached to `LlmRequest.cacheHints`.
- `LlmCacheMetadataKeys` — canonical metadata keys (`cache_creation_tokens` / `cache_read_tokens`) so callers can compute savings without provider-specific branches.
- `LlmProvider.supportsPromptCaching` — getter on the base interface so callers can branch on capability.
- Anthropic Claude / Bedrock-on-Anthropic — `cache_control: ephemeral` markers on system, the last tool, and the last 2 messages by default. A length guard (Sonnet/Opus 1024-token minimum, Haiku 2048-token minimum) skips the marker when content is too small to be cacheable.
- OpenAI — `prompt_cache_key` forwarded from `parameters` for explicit cache partitioning. Server-side automatic caching surfaces under `prompt_tokens_details.cached_tokens` → `cache_read_tokens` on the response metadata.
- Gemini / Vertex AI — `cachedContent` resource reference forwarded from `parameters['cached_content']` (caller manages lifecycle). Default OFF because of the per-minute storage charge and the 32K-token minimum on Pro models — small or one-shot prompts would cost more than they save.
Notes #
- The per-provider default policy when `cacheHints` is `null` is documented in `README.md` § Prompt Caching.
- The API is fully additive: existing callers continue to work unchanged.
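A minimal sketch of opting in from the caller side. Only `CacheHints`, `LlmRequest.cacheHints`, `LlmProvider.supportsPromptCaching`, and the two metadata keys come from this release; the `copyWith` helper, named parameters, and `complete` signature shown here are assumptions — check `core/models.dart` for the real shapes:

```dart
import 'package:mcp_llm/mcp_llm.dart';

Future<void> cachedCall(LlmProvider provider, LlmRequest request) async {
  // Attach cache hints only when the provider can honor them.
  if (provider.supportsPromptCaching) {
    // Hypothetical constructor shape — illustrative only.
    request = request.copyWith(
      cacheHints: CacheHints(system: true, tools: true, messages: true),
    );
  }

  final response = await provider.complete(request);

  // The canonical metadata keys work the same for every provider.
  final created = response.metadata['cache_creation_tokens'];
  final read = response.metadata['cache_read_tokens'];
  print('cache tokens: created=$created, read=$read');
}
```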
2.0.1 - 2026-05-02 - temperature is now opt-in across all providers #
Fixed #
- Claude Opus 4.x compatibility. `claude-opus-4-7` (and other models in the same family) rejects requests that carry `temperature` with `400 invalid_request_error: "temperature is deprecated for this model"`. The provider was always injecting `temperature: 0.7` even when the caller didn't ask for one, making mcp_llm unusable on those models.
- All seven providers that hardcoded `'temperature': … ?? 0.7` — `ClaudeProvider`, `OpenAiProvider`, `MistralProvider`, `GroqProvider`, `TogetherProvider`, `CustomProvider`, and the Llama / Titan paths in `BedrockProvider` — now forward `temperature` only when the caller explicitly provides one. Without it, the key is omitted from the request body and the backend's own default applies. `GeminiProvider`, `VertexAiProvider`, and `CohereProvider` already used the conditional pattern and are unchanged.
Behavior #
- Backwards-compatible for callers that do pass `temperature`: the value is forwarded as before.
- Callers that relied on the implicit `0.7` default now get the provider's own default (typically `1.0` for Claude / OpenAI). Pass `parameters: {'temperature': 0.7}` explicitly to keep the previous behavior.
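To pin the old behavior, make the value explicit. A sketch — the `chat` method and `parameters` map follow this changelog's naming; other details are illustrative:

```dart
// Keep the pre-2.0.1 implicit default by passing it explicitly.
final response = await llmClient.chat(
  'Summarize this document.',
  parameters: {'temperature': 0.7}, // omit this key to use the backend default
);
```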
2.0.0 - 2026-04-30 - MCP spec compliance + 2025-11-25 alignment #
Big-Bang spec normalization. Pairs with mcp_server 2.0 / mcp_client 2.0.
Breaking #
- JSON-RPC batching helpers removed. The MCP spec dropped JSON-RPC batching in 2025-06-18 (PR #416). The 1.x `BatchRequestManager`, `BatchConfig`, `LlmClient.executeBatchTools` / `getBatchToolsByClient` / `executeBatchPrompts` / `readBatchResources` / `getBatchStatistics` / `flushBatchRequests` / `hasBatchProcessing`, `McpClientManager.enableBatchProcessing` / `addBatchRequest`, and the `mcp_llm.batch.*` exports are all deleted. For multi-target fan-out without wire batching, iterate over `callTool` / `getPrompt` / `readResource` directly, or use `ParallelExecutor` / `MultiLlm` for LLM-level concurrency.
- `LlmClient` constructor: the `batchConfig` parameter is removed.
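A possible replacement for the deleted batch helpers is a plain loop over the per-call methods. A sketch, assuming `callTool(name, args)` exists with roughly this shape:

```dart
// Fan out one tool call per tool name, sequentially (no wire batching).
Future<Map<String, dynamic>> fanOutTools(
  LlmClient client,
  List<String> toolNames,
  Map<String, dynamic> args,
) async {
  final results = <String, dynamic>{};
  for (final name in toolNames) {
    results[name] = await client.callTool(name, args);
  }
  return results;
}
```

For concurrent fan-out, wrap the calls in `Future.wait`, or use `ParallelExecutor` as noted above.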
Notes #
- The `pubspec.yaml` description no longer claims "JSON-RPC 2.0 batch processing".
- `LlmClient.featureStatus` no longer reports `batch_processing`.
1.2.0 - 2026-04-28 - Contract Layer & Cloud Providers #
Added #
- Contract Layer adapters for `mcp_bundle` ports — `LlmPortAdapter`, `AsrPortAdapter`, `OcrPortAdapter`, `VisionPortAdapter`, `StoragePortAdapter`. Bridge `mcp_llm` providers to the Contract Layer used by knowledge / skill / profile packages.
- Six new LLM providers — Bedrock, Cohere, Gemini, Groq, Mistral, Together (in addition to the existing Claude / OpenAI / Vertex AI / Custom).
- Cloud Vision providers — Google Cloud Vision, OpenAI GPT-4 Vision.
- ASR providers — OpenAI Whisper, Google Cloud Speech-to-Text.
- OCR providers — Google Cloud Vision OCR, AWS Textract.
- Binary storage providers — AWS S3, Google Cloud Storage.
- Cloud provider registry for centralized capability discovery.
Changed #
- New dependency: `mcp_bundle ^0.3.0` for Contract Layer types.
1.1.0 Breaking Changes & New Features #
⚠️ Breaking Changes #
Tool Message API Changes
- `LlmMessage.tool()` role changed: from the deprecated `'function'` to `'tool'`, aligning with current OpenAI API standards
- `LlmMessage.tool()` signature extended: added optional `toolCallId` and `arguments` parameters for proper tool call tracking

```dart
// Before (v1.0.x)
LlmMessage.tool(toolName, result, metadata: metadata)

// After (v1.1.0)
LlmMessage.tool(
  toolName,
  result,
  metadata: metadata,
  toolCallId: 'call_123',
  arguments: {'param': 'value'},
)
```
Migration Required
If your code creates tool messages manually, update the role handling:
```dart
// If you check for tool messages by role
if (message.role == 'function') // OLD - will no longer match
if (message.role == 'tool')     // NEW - use this instead
```
🐛 Fixed #
SSE Buffering Bug (Critical)
- Fixed "Unterminated string" JSON parse errors in all streaming providers
- TCP chunks can split JSON data mid-line causing incomplete JSON parsing
- Implemented StringBuffer pattern to accumulate complete lines before parsing
- Affects: OpenAI, Claude, and Together providers
- This fix resolves random streaming failures in production environments
OpenAI Tool Calls Structure
- Fixed tool_calls format to match OpenAI API specification
- Added required `type: 'function'` wrapper in tool calls

```json
// Before (incorrect)
{"id": "...", "name": "...", "arguments": {...}}

// After (correct OpenAI format)
{"id": "...", "type": "function", "function": {"name": "...", "arguments": "..."}}
```
✨ Added #
Deferred Tool Loading (Token Optimization)
- 60-80% token reduction for tool definitions in LLM context
- Enable with `useDeferredLoading: true` in the `LlmClient` constructor
- Sends only tool metadata initially, full schema on-demand when needed
- Zero overhead when disabled (opt-in feature)
- New `DeferredToolManager` class for managing deferred tool schemas
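Enabling it is a single constructor flag. A sketch — `useDeferredLoading` and `llmProvider` follow this changelog's naming; the surrounding setup is illustrative:

```dart
// Opt in: tool metadata is sent up front, full schemas fetched on demand.
final client = LlmClient(
  llmProvider: provider,
  useDeferredLoading: true,
);
```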
Multi-Round Tool Calling
- Sequential tool execution with `maxToolRounds` parameter (1-10)
- Allows LLM to chain multiple tool calls in a single conversation turn
- Default: 1 (single round, same as before)
```dart
LlmClient(
  llmProvider: provider,
  maxToolRounds: 3, // Allow up to 3 rounds of tool calls
)
```
Resource Tool Bridge
- Synthetic tools for MCP resource access
  - `mcp_read_resource`: Read content from MCP resources
  - `mcp_list_resources`: List all available MCP resources
- Automatically added when resources are available
- Enables LLM to access MCP resources through standard tool calling
Chat Session Enhancements
- `addToolResult()` method now properly tracks the `arguments` parameter
- Structured tool messages with `tool_call_id` for proper conversation flow
📦 New Exports #
- `src/deferred/deferred_tool_manager.dart` - Deferred tool loading manager
1.0.3 #
Fixed #
- OpenAI Provider: Fixed baseUrl handling inconsistency in the `complete` and `getEmbeddings` methods
  - Now correctly appends the `/v1/chat/completions` and `/v1/embeddings` paths when a custom baseUrl is provided
  - Consistent behavior with the `streamComplete` method
1.0.2 #
Fixed #
- Web Platform Compatibility: Fixed web platform compatibility issues for Flutter web applications
  - Replaced `dart:io` `HttpClient` with `package:http` in all LLM providers (Claude, OpenAI, Together)
  - Added conditional imports for platform-specific storage implementations
  - Created web-compatible storage using localStorage for browser environments
  - Implemented platform-agnostic compression with conditional imports
  - All LLM providers now work seamlessly on web, mobile, and desktop platforms
- Storage System: Refactored storage to use interface pattern with platform-specific implementations
  - Created `StorageInterface` for a consistent API across platforms
  - Implemented `IoStorage` for native platforms using the file system
  - Implemented `WebStorage` for web browsers using localStorage
  - Added `ChatHistory.fromJson()` factory constructor for proper deserialization
- Compression Utilities: Made compression platform-independent
  - Created `CompressionInterface` for platform abstraction
  - Native platforms use `dart:io` gzip compression
  - Web platform returns uncompressed data (with a TODO for future JS interop)
Note #
- Vector stores (Pinecone, Weaviate, Qdrant) still require web compatibility updates in a future release
1.0.1 #
Changed #
- Removed unnecessary `mcp_server` and `mcp_client` dependencies from production dependencies
- Moved `mcp_server` and `mcp_client` to dev_dependencies for testing purposes only
- Fixed test failures in `multi_client_test.dart`
- The package now allows users to provide their own MCP client/server instances without forcing dependency installation
1.0.0 - 2025-03-26 🚀 #
🎉 Major Release: Full 2025-03-26 MCP Specification Support #
This is a major milestone release with comprehensive 2025-03-26 Model Context Protocol specification support, delivering significant performance improvements, enhanced security, and production-ready features.
✨ Added #
🔐 Phase 1: OAuth 2.1 Authentication Integration
- OAuth 2.1 Security Framework
  - Complete OAuth 2.1 implementation with PKCE (Proof Key for Code Exchange) support
  - Advanced token validation and refresh mechanisms
  - Secure authentication context management with auto-refresh capabilities
  - `McpAuthAdapter` class for comprehensive OAuth 2.1 authentication
  - `TokenValidator` interface with an `ApiKeyValidator` implementation
  - `AuthContextManager` for authentication lifecycle management
- MCP Client Integration
  - OAuth 2.1 authentication enforcement in `LlmClientAdapter`
  - Authentication status reporting and compliance checking
  - Multi-client OAuth management in `McpClientManager`
  - Automatic token refresh and error recovery
⚡ Phase 2: JSON-RPC 2.0 Batch Processing Optimization
- Performance Enhancement (40-60% improvement)
  - `BatchRequestManager` for intelligent JSON-RPC 2.0 batch processing
  - Configurable batch sizes, timeouts, and optimization strategies
  - Smart request batching with automatic fallback mechanisms
  - Parallel and sequential execution modes with order preservation
- LlmClient Batch Methods
  - `executeBatchTools()` - Execute multiple tools efficiently in batch
  - `getBatchToolsByClient()` - Get tools from multiple clients simultaneously
  - `executeBatchPrompts()` - Batch prompt execution with optimization
  - `readBatchResources()` - Efficient batch resource reading
  - `getBatchStatistics()` - Comprehensive performance metrics
  - `flushBatchRequests()` - Manual batch control for optimal timing
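For historical reference, 1.x-era usage looked roughly like this (these methods were removed again in 2.0.0; the argument shapes below are assumptions, not the documented signatures):

```dart
// Queue several tool calls into one JSON-RPC 2.0 batch (1.x API).
final results = await llmClient.executeBatchTools([
  {'name': 'search', 'arguments': {'query': 'mcp'}},
  {'name': 'weather', 'arguments': {'city': 'Seoul'}},
]);

// Inspect batching efficiency, then force any queued requests out.
print(llmClient.getBatchStatistics());
await llmClient.flushBatchRequests();
```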
🏥 Phase 3: Enhanced 2025-03-26 Methods
Health Monitoring (health/check methods)
- `McpHealthMonitor` - Comprehensive health monitoring system
  - Real-time health checks with configurable timeouts and retries
  - `HealthCheckResult` and `HealthReport` with detailed status information
  - System-wide health aggregation and trending analysis
  - Auto-recovery mechanisms for unhealthy components
  - Health history tracking for performance analysis
- LlmClient Health Integration
  - `performHealthCheck()` - Execute comprehensive health checks
  - `getClientHealth()` - Get specific client health status
  - `getHealthStatistics()` - Health metrics and statistics
  - `allClientsHealthy` and `unhealthyClients` properties for quick status
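A sketch of how these pieces compose (the return types and exact property shapes are assumptions):

```dart
// Run a system-wide check, then drill into any unhealthy clients.
final report = await llmClient.performHealthCheck();
print('overall healthy: ${llmClient.allClientsHealthy}');

for (final clientId in llmClient.unhealthyClients) {
  final status = await llmClient.getClientHealth(clientId);
  print('$clientId -> $status');
}
```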
Capability Management (capabilities/update methods)
- `McpCapabilityManager` - Dynamic capability management
  - Real-time capability discovery and updates
  - `CapabilityUpdateRequest` / `Response` for structured capability management
  - Event-driven capability notifications with `CapabilityEvent`
  - Version compatibility checking and validation
  - Capability statistics and reporting
- LlmClient Capability Integration
  - `updateClientCapabilities()` - Dynamic capability updates
  - `getClientCapabilities()` / `getAllCapabilities()` - Capability inspection
  - `enableClientCapability()` / `disableClientCapability()` - Runtime control
  - `refreshAllCapabilities()` - Bulk capability refresh
  - `generateCapabilityRequestId()` - Unique request ID generation
Server Lifecycle Management
- `ServerLifecycleManager` - Complete server lifecycle control
  - Full state management (initializing, starting, running, pausing, stopping, etc.)
  - `ServerInfo` with comprehensive server status and metadata
  - Auto-restart capabilities with configurable retry limits
  - Lifecycle event tracking with `LifecycleEvent`
  - Integration with health monitoring and capability management
- LlmClient Lifecycle Integration
  - `startServer()` / `stopServer()` - Basic lifecycle control
  - `pauseServer()` / `resumeServer()` - Advanced lifecycle operations
  - `restartServer()` - Intelligent restart with state preservation
  - `getServerInfo()` / `getAllServersInfo()` - Server status inspection
  - `setServerAutoRestart()` - Auto-restart configuration
  - `getLifecycleStatistics()` - Lifecycle metrics and reporting
Enhanced Error Handling
- `EnhancedErrorHandler` - Production-grade error handling
  - `McpEnhancedError` with detailed error categorization and metadata
  - Circuit breaker pattern implementation with configurable thresholds
  - Intelligent retry logic with exponential backoff
  - Auto-recovery mechanisms with customizable strategies
  - Error history tracking and trend analysis
- LlmClient Error Integration
  - `executeWithErrorHandling()` - Intelligent error handling wrapper
  - `getErrorStatistics()` - Comprehensive error metrics
  - `getClientErrorHistory()` / `getAllErrorHistory()` - Error tracking
  - `clearErrorHistory()` - Error history management
  - `errorEvents` stream for real-time error monitoring
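A sketch combining the wrapper with the event stream (the callback signature and event shape here are assumptions):

```dart
// Surface error events while running a call through the handler.
llmClient.errorEvents.listen((event) => print('error event: $event'));

final result = await llmClient.executeWithErrorHandling(
  () => llmClient.performHealthCheck(), // any fallible operation
);
```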
📡 Event-Driven Architecture
- Real-time Event Streams
  - `capabilityEvents` - Real-time capability change notifications
  - `lifecycleEvents` - Server lifecycle state change events
  - `errorEvents` - Enhanced error event stream with recovery suggestions
  - Comprehensive event metadata and timestamps
🎯 Integration and Management
- Feature Status Management
  - `featureStatus` property for 2025-03-26 feature availability
  - Comprehensive system status reporting
  - Unified configuration management for all features
- Enhanced Client Management
  - Automatic registration of clients with all 2025-03-26 managers
  - Intelligent client health awareness in routing decisions
  - Multi-manager coordination and state synchronization
🔧 Changed #
LlmClient Enhancements
- Constructor Parameters - Added optional 2025-03-26 feature configurations:
  - `batchConfig` - Batch processing configuration
  - `healthConfig` - Health monitoring configuration
  - `errorConfig` - Error handling configuration
  - Feature enable flags for granular control
- Client Management - Enhanced MCP client lifecycle:
  - Automatic registration with health, capability, and lifecycle managers
  - Coordinated cleanup and disposal across all managers
  - Improved error handling and recovery mechanisms
Performance Optimizations
- Batch Processing - Significant performance improvements:
  - 40-60% faster execution for multiple operations
  - Intelligent request optimization and batching
  - Reduced network overhead and latency
- Memory Management - Enhanced resource management:
  - Proper disposal of all 2025-03-26 managers
  - Memory leak prevention with comprehensive cleanup
  - Optimized event stream management
🛡️ Security #
OAuth 2.1 Implementation
- PKCE Support - Proof Key for Code Exchange implementation
- Token Security - Secure token validation and refresh
- Scope Management - Fine-grained permission control
- Compliance Checking - 2025-03-26 OAuth compliance validation
Enhanced Authentication
- Multi-client Authentication - OAuth support across multiple MCP clients
- Authentication Context - Secure context management and lifecycle
- Token Refresh - Automatic token refresh with fallback mechanisms
📊 Monitoring and Observability #
Comprehensive Statistics
- Batch Processing - Request batching efficiency and performance metrics
- Health Monitoring - System-wide health status and trends
- Capability Management - Capability usage and update statistics
- Lifecycle Management - Server state changes and uptime tracking
- Error Handling - Error rates, recovery success, and circuit breaker status
Real-time Monitoring
- Event Streams - Live monitoring of system events
- Health Checks - Continuous health monitoring with alerting
- Performance Tracking - Real-time performance metrics
🔄 Backward Compatibility #
Zero Breaking Changes
- 100% Backward Compatible - All existing v0.x code works unchanged
- Opt-in Features - 2025-03-26 features are optional and configurable
- Migration Path - Gradual feature adoption without code changes
Legacy Support
- Existing APIs - All v0.x APIs remain fully functional
- Default Behavior - Unchanged default behavior for existing functionality
- Deprecation Policy - No deprecations in this release
📁 Examples and Documentation #
New Examples
- `example/mcp_2025_complete_example.dart` - Comprehensive demonstration of all v1.0.0 features
- `example/batch_processing_2025_example.dart` - Batch processing optimization showcase
- Performance Comparisons - Before/after performance demonstrations
Enhanced Documentation
- README.md - Complete rewrite with v1.0.0 feature coverage
- API Documentation - Comprehensive documentation for all new features
- Migration Guide - Step-by-step migration from v0.x to v1.0.0
- Best Practices - Production-ready configuration examples
🧪 Testing #
Comprehensive Test Suite
- OAuth 2.1 Tests - Complete authentication flow testing
- Batch Processing Tests - Performance and functionality validation
- Health Monitoring Tests - Health check and recovery testing
- Integration Tests - End-to-end feature integration validation
- Error Handling Tests - Circuit breaker and recovery mechanism testing
Test Coverage
- New Features - 100% test coverage for all 2025-03-26 features
- Integration Testing - Cross-feature integration validation
- Performance Testing - Batch processing performance validation
🏗️ Development #
Code Organization
- Modular Architecture - Clean separation of 2025-03-26 features
- Manager Pattern - Consistent manager interfaces across features
- Event-Driven Design - Unified event system for all features
Dependencies
- Core Dart - No additional external dependencies required
- Faker - Added for enhanced test data generation
- Development Tools - Enhanced linting and testing setup
0.2.3 Previous Release #
Added #
- Enhanced plugin system improvements
- Performance optimizations for existing features
- Bug fixes and stability improvements
0.2.2 Previous Release #
Added #
- Additional multi-client management features
- Enhanced error handling for existing functionality
0.2.0 Multi-Client Support #
Added #
- Multi-MCP Client Support (New Feature)
  - `McpClientManager` class to manage multiple MCP clients within a single LLM client
  - Enhanced LLM clients to work with multiple MCP clients identified by string IDs
  - Intelligent routing of tool calls to the most appropriate MCP client
  - Schema matching algorithm to select the best client for each tool
- Multi-LLM Client Support (Existing Feature)
  - Maintained `MultiClientManager` for managing multiple LLM clients
  - Preserved query routing, load balancing, and fan-out capabilities
  - Kept the ability to manage multiple LLM clients from a single `McpLlm` instance
- LLM Provider Multi-Language Support
API Additions #
- Added `mcpClients` parameter to the `createClient` method for initializing with multiple MCP clients
- New MCP client management methods:
  - `addMcpClient`, `removeMcpClient`, `setDefaultMcpClient`
  - `getMcpClientIds`, `findMcpClientsWithTool`, `getToolsByClient`
- New tool execution methods:
  - `executeToolWithSpecificClient`: Execute a tool on a specific MCP client
  - `executeToolOnAllMcpClients`: Execute a tool on all MCP clients and collect results
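A sketch of targeted versus fan-out execution (the method names come from the list above; the positional argument order is an assumption):

```dart
// Run a tool on one named MCP client…
final single = await llmClient.executeToolWithSpecificClient(
  'search_client',
  'web_search',
  {'query': 'dart mcp'},
);

// …or on every registered client, collecting per-client results.
final all = await llmClient.executeToolOnAllMcpClients(
  'web_search',
  {'query': 'dart mcp'},
);
```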
Compatibility Notes #
- All changes maintain backward compatibility with existing code
- Single MCP client approach continues to be supported
- New multi-client functionality is available as an opt-in feature
0.1.0 Initial Release #
Added #
- Initial release
- Features:
- Multiple LLM provider support (Claude, OpenAI, Together AI)
- Multi-client management with routing and load balancing
- Parallel processing across multiple LLM providers
- Plugin system for custom tools and templates
- Document storage and RAG capabilities
- Performance monitoring and task scheduling
Known Limitations #
- API still subject to significant changes
- Limited test coverage
- Some providers may have incomplete implementations
- Documentation is preliminary
Support and Contributing #
🚀 Upgrade to v1.0.0 today and experience the full power of 2025-03-26 MCP specification!