dart_agent_core 2.0.3
dart_agent_core: ^2.0.3
A mobile-first, local-first Dart library for building and evaluating stateful, tool-using AI agents with multi-provider LLM support.
2.0.3 #
- BREAKING: replace systemCallback, legacy pre/post tool hooks, turn-completion hooks, and controller request/response loop controls with the unified AgentHook pipeline. AgentController is now used for observation events; flow control belongs in hooks.
- Add typed hook phases for beforeRun, beforeModelCall, onModelChunk, afterModelCall, beforeToolCall, afterToolCall, onTurnCompletion, beforePersistState, afterPersistState, and afterRun.
- Hooks can rewrite current model requests without persistence, or directly mutate context.state when data should survive later turns or resume. ModelCallHookResult.messagesToPersist was removed so hook outcomes only describe flow control.
- Add hook-focused tests and runnable examples covering model input rewriting, synthetic model responses, streaming chunk transforms, model retries, tool deny/defer/rewrite/stop, final-turn continuation, abort handling, and state persistence hooks.
- resume() / resumeStream() now accept cancelToken, useStream, and maxTurns, matching run() / runStream() controls.
2.0.2 #
- Security hardening: comprehensive audit and fix of credential handling, error exposure, and logging:
  - Gemini: move API key from URL query string to x-goog-api-key header (prevents exposure in server/proxy logs).
  - All clients: keep provider error response bodies in thrown exceptions so callers can debug 4xx request issues.
  - Claude / Bedrock: avoid logging raw provider error bodies; log status context while preserving error details in exceptions.
  - All clients: make apiKey fields private (_apiKey); Bedrock accessKeyId / secretAccessKey also privatized.
  - Gemini: fix maxRetryDelayMs default (was 3000, now 30000; the previous value was less than initialRetryDelayMs, defeating the cap).
  - FileStateStorage / RecordingStore: truncate session/hash IDs in error log messages to prevent enumeration.
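The maxRetryDelayMs fix is easier to see with numbers. The sketch below assumes a plain doubling backoff clamped to a maximum. The helper name backoffDelayMs and the 5000 ms initial delay are illustrative assumptions, not the package's actual schedule or defaults.

```dart
import 'dart:math' as math;

/// Hypothetical helper: delay before retry number [attempt] (0-based),
/// doubling from [initialMs] and clamped to [maxMs].
int backoffDelayMs(int attempt, {required int initialMs, required int maxMs}) {
  final uncapped = initialMs * (1 << attempt);
  return math.min(uncapped, maxMs);
}

void main() {
  const initialMs = 5000; // assumed value, for illustration only
  for (final maxMs in [3000, 30000]) {
    final delays = [
      for (var attempt = 0; attempt < 4; attempt++)
        backoffDelayMs(attempt, initialMs: initialMs, maxMs: maxMs),
    ];
    print('maxMs=$maxMs -> $delays');
  }
}
```

Under these assumptions, a 3000 ms cap flattens every delay to 3000 ms, so the schedule never grows, while a 30000 ms cap lets the delay double until it reaches the ceiling.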
2.0.1 #
- Fix: StepStatus serialization. Convert to enhanced enum with stable name field (pending / in_progress / completed / cancelled) to ensure backward-compatible JSON persistence regardless of Dart identifier naming.
- Simplify planner code with Dart 3.0+ .indexed + pattern destructuring for indexed iteration, and firstWhere + orElse for status parsing.
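The three language features named above can be sketched together. This is a stand-in, not the package's declaration: the field is called wireName here, and falling back to pending for unknown input is an assumption made for the example.

```dart
/// Stand-in for the planner's status enum; the real declaration may differ.
enum StepStatus {
  pending('pending'),
  inProgress('in_progress'),
  completed('completed'),
  cancelled('cancelled');

  const StepStatus(this.wireName);

  /// Stable string written to JSON, independent of the Dart identifier.
  final String wireName;

  /// Parses a persisted value; unknown input falls back to [pending]
  /// (an assumed policy, chosen only for this sketch).
  static StepStatus parse(String raw) => StepStatus.values.firstWhere(
        (status) => status.wireName == raw,
        orElse: () => StepStatus.pending,
      );
}

void main() {
  final persisted = ['in_progress', 'completed', 'not_a_status'];
  // Dart 3 `.indexed` yields (index, value) records that destructure in place.
  for (final (i, raw) in persisted.indexed) {
    print('$i: $raw -> ${StepStatus.parse(raw).wireName}');
  }
}
```

Because the serialized string lives in a field rather than being derived from the identifier, renaming inProgress in Dart would not change what is written to disk.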
2.0.0 #
- Web & WASM support: the package now supports all 6 platforms (iOS, Android, Web, Windows, macOS, Linux) and is WebAssembly-compatible. Platform-specific concerns (file-system state storage, HTTP adapters, JavaScript runtime) are resolved at compile time via conditional exports, so the public API is identical across native and web.
- BREAKING: FileStateStorage now takes a String directoryPath instead of a Directory. Migrate FileStateStorage(dir) to FileStateStorage(dir.path). On web, prefer an in-memory or localStorage-backed StateStorage rather than FileStateStorage.
- Network errors from the OpenAI, Gemini, and Responses clients are now surfaced via DioException instead of dart:io SocketException, removing dart:io from the public reachable surface.
- Add a Platform Support section (with a web-safe API-key snippet) to both READMEs.
- Internal style cleanup: lowerCamelCase identifiers and braced flow-control statements; pana static-analysis now scores 50/50 (160/160 overall).
1.0.14 #
- Add agent run lifecycle hooks.
- Fix OpenAI duplicate tool call ids.
- Align agent run state and tool call ids.
- Prevent failed retries from bloating agent history.
1.0.13 #
- Fix EventBus: ignore late events after close() to prevent errors when events are published during teardown.
- Add unit tests for LoopDetector, Planner, and EventBus.
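The late-event guard can be illustrated with a plain StreamController, which throws a StateError if add() is called after close(). SketchEventBus below is a hypothetical minimal class written for this example; it is not the package's EventBus and its API is assumed.

```dart
import 'dart:async';

/// Hypothetical minimal bus, not the package's EventBus.
class SketchEventBus {
  final _controller = StreamController<Object>.broadcast();

  Stream<Object> get events => _controller.stream;

  /// Returns false, instead of throwing, when the bus is already closed.
  bool publish(Object event) {
    if (_controller.isClosed) return false; // late event: drop it
    _controller.add(event);
    return true;
  }

  Future<void> close() => _controller.close();
}

Future<void> main() async {
  final bus = SketchEventBus();
  bus.events.listen((event) => print('got $event'));
  bus.publish('started');
  await bus.close();
  // Without the isClosed guard this call would throw a StateError.
  final delivered = bus.publish('late');
  print('late event delivered: $delivered');
}
```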
1.0.12 #
- Add EvalTranscriptRecorder: automatically captures execution traces (messages, tool calls, reasoning steps, token/turn metrics) from AgentController during eval runs, so harnesses no longer need to manually build transcripts.
- EvalRunner now integrates EvalTranscriptRecorder per trial, producing richer Transcript data with timing metrics (time-to-first-token, time-to-last-token).
- Simplify AgentHarnessFactory interface: harnesses focus on running the agent and returning Outcome; transcript recording is handled by the framework.
- Update eval guide docs and examples to reflect the new recorder-based workflow.
1.0.11 #
- Add evaluation subsystem under package:dart_agent_core/eval.dart (separate entry point; no impact on existing dart_agent_core.dart consumers).
  - Core: EvalTask / EvalSuite / EvalRunner / Trial / Outcome / Transcript / EvalEnvironment / AgentHarnessFactory.
  - Graders: CodeGrader (rule / script / classification), ModelGrader (LLM-as-judge with required Unknown escape hatch), HumanGrader.
  - LLM record / replay: hash-based RecordingLLMClient / ReplayLLMClient, FileRecordingStore, per-trial cacheSalt so trialsPerRun > 1 does not collapse into a single cache entry.
  - Rate limiting: RpmRateLimitGate, TpmRateLimitGate, NoopRateLimitGate. Decoupled from concurrency; replay hits skip the gate.
  - Metrics: pass@k (Codex unbiased estimator), pass^k (empirical), ClassificationMetrics, bucket pass rates, grader means.
  - Reporting: FileReportStore, diffRunReports, markdown + JSON output.
  - Suite health: SuiteHealthAnalyzer, for graduation candidates and broken-task detection across runs.
  - Calibration: JudgeCalibrator, reporting Spearman / Pearson / MAE against a human-labeled golden set.
  - Loaders: code-defined suites and JSON file-tree-defined suites via loadEvalSuiteFromDir + GraderRegistry.
  - Observability: JsonlTraceExporter, CompositeTraceExporter, and built-in LangfuseTraceExporter (background batched ingestion to Langfuse cloud or self-hosted, exponential-backoff retries, schema aligned with langfuse v4).
- Add bin/transcripts.dart CLI: list / show / diff / export for persisted run reports, runnable via dart run dart_agent_core:transcripts.
- Add three runnable end-to-end demos under example/eval_demo/: calculator/ and card_agent/ (code-defined suites), pkm_agent/ (file-defined suite with fixtures); plus a 130-line self-contained example/min_eval/main.dart.
- Add doc/eval-guide.md (English) and doc/eval-guide.zh-CN.md covering core concepts, components, two suite definition styles, three grader styles, metrics, record/replay, Langfuse export, cross-run health, judge calibration, CLI, and an API cheat sheet. Linked from each README.
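For readers unfamiliar with the pass@k metric listed above, the unbiased estimator from the Codex paper is 1 - C(n - c, k) / C(n, k), where n is the number of samples and c the number that passed. The function below is a standalone sketch of that formula; the name passAtK is an assumption, and the changelog does not say how the package computes it internally.

```dart
/// Unbiased pass@k estimate from [n] samples of which [c] passed.
/// Equals 1 - C(n - c, k) / C(n, k), evaluated as a running product
/// to avoid computing large binomial coefficients.
double passAtK(int n, int c, int k) {
  if (k < 1 || k > n || c < 0 || c > n) {
    throw ArgumentError('need 0 <= c <= n and 1 <= k <= n');
  }
  if (n - c < k) return 1.0; // every size-k subset contains a pass
  var allFail = 1.0;
  for (var i = n - c + 1; i <= n; i++) {
    allFail *= 1.0 - k / i;
  }
  return 1.0 - allFail;
}

void main() {
  // 10 samples, 3 passed: estimate for k = 1 and k = 5.
  print(passAtK(10, 3, 1).toStringAsFixed(3));
  print(passAtK(10, 3, 5).toStringAsFixed(3));
}
```

With k = 1 the estimate reduces to the plain pass rate c / n; larger k raises it, since only one of the k draws needs to pass.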
1.0.10 #
- Update README and documentation.
1.0.9 #
- BREAKING: Change the SystemCallback return type from a Dart record (SystemMessage?, List<Tool>, List<LLMMessage>) to a SystemCallbackResult class for broader SDK compatibility and clearer semantics. Callers using systemCallback must now read the .systemMessage, .tools, and .requestMessages properties instead of using positional destructuring.
1.0.8 #
- Lower Dart SDK minimum constraint from ^3.10.4 to ^3.9.2 to support Flutter 3.35 and HarmonyOS ecosystem.
- Add ToolParameterMode.object for tools: receive all arguments as a single Map<String, dynamic> instead of positional/named parameter mapping via Function.apply.
- Update tool documentation in README, README.zh-CN, and doc/tools_and_planning.md with object mode usage and examples.
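The difference between the two calling styles can be shown with dart:core alone. The sketch below does not use the package's Tool API; greet and greetFromMap are hypothetical functions that only contrast splitting a map for Function.apply with passing the map through whole.

```dart
/// A function written for positional/named mapping.
String greet(String name, {String greeting = 'Hello'}) => '$greeting, $name!';

/// The same logic written object-style: one map carries every argument.
String greetFromMap(Map<String, dynamic> args) {
  final name = args['name'] as String;
  final greeting = args['greeting'] as String? ?? 'Hello';
  return '$greeting, $name!';
}

void main() {
  final args = <String, dynamic>{'name': 'Ada', 'greeting': 'Hi'};

  // Positional/named style: the caller must split the map before invoking.
  final viaApply = Function.apply(
    greet,
    [args['name']],
    {#greeting: args['greeting']},
  );

  // Object style: the map is passed through untouched.
  final viaObject = greetFromMap(args);

  print(viaApply);
  print(viaObject);
}
```

Both calls produce the same string. The object style moves argument validation and defaults into the function body, which can be simpler when a tool has many optional fields.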
1.0.7 #
- Add baseUrl / region to request log messages across all LLM clients for easier debugging.
- Add comprehensive provider documentation for Chinese README (Kimi, Qwen, GLM, Doubao, MiniMax, Ollama, OpenRouter, Claude direct).
- Add OpenAI-compatible and Anthropic-compatible provider sections to doc/providers.md.
- Document thinking/reasoning model support (reasoning_content handling).
- Update examples list in both READMEs with all provider examples.
- Fix typo "Sendings" → "Sending" in OpenAIClient streaming log.
- Fix broken docs/ links to doc/ in Chinese README.
1.0.6 #
- Fix OpenAIClient not handling reasoning_content for thinking/reasoning models (e.g. kimi-k2-thinking, o1, deepseek-r1).
- Parse reasoning_content from both non-streaming and streaming responses into ModelMessage.thought.
- Re-send reasoning_content in assistant messages during multi-turn conversations to satisfy API validation.
- Add simple_agent_with_kimi_vision_example.dart for image analysis with Kimi.
1.0.5 #
- Add examples for MiniMax, Kimi, Volcengine Seed, Zhipu GLM, and Qwen via OpenAI-compatible API.
- Fix OpenAIResponseTransformer not extracting finish_reason when provider sends it in the same chunk as usage (e.g. GLM).
- Fix double JSON encoding of FunctionCall.arguments in OpenAIClient and ResponsesClient request body.
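The general shape of a double-encoding bug can be reproduced with dart:convert. The changelog does not show the package's original code, so the toolCall helper and the get_weather payload below are hypothetical; only the encode-twice pattern is the point.

```dart
import 'dart:convert';

/// Builds a tool-call payload where `arguments` is already a JSON string.
Map<String, dynamic> toolCall(String name, String argumentsJson) => {
      'name': name,
      'arguments': argumentsJson,
    };

void main() {
  final argumentsJson = jsonEncode({'city': 'Paris'}); // encoded once

  final correct = jsonEncode(toolCall('get_weather', argumentsJson));
  // Bug pattern: encoding the already-encoded string a second time.
  final doubled =
      jsonEncode(toolCall('get_weather', jsonEncode(argumentsJson)));

  final ok = jsonDecode(jsonDecode(correct)['arguments'] as String);
  final bad = jsonDecode(jsonDecode(doubled)['arguments'] as String);
  print(ok is Map); // arguments decode to an object
  print(bad is Map); // arguments decode to a string instead
}
```

After one decode, the correctly encoded arguments yield a map, while the double-encoded arguments yield another string that still needs decoding, which a receiving API would typically reject or misread.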
1.0.4 #
- Add DirectorySkill support: load skills from SKILL.md files in a directory tree with automatic discovery and system prompt injection.
- Add JavaScriptRuntime and NodeJavaScriptRuntime for executing JavaScript scripts with bidirectional Dart↔JS bridge communication.
- Integrate directory skills and JavaScript execution into StatefulAgent.
- Add simple_agent_with_directory_skills_example.dart example.
- Update README documentation.
1.0.3 #
- Add maxTurns protection to StatefulAgent to prevent potential infinite loops.
- Add internal retry limit for empty model responses/stop reasons in runStream.
1.0.2 #
- Add standard entry-point example/main.dart to fix pub.dev example discovery.
- Add comprehensive API documentation comments (///) to core library members.
- Fix library-level documentation in lib/dart_agent_core.dart.
1.0.1 #
- Add ClaudeClient for direct Anthropic Messages API support (no AWS Bedrock required).
- Add examples for Ollama and OpenRouter usage via OpenAIClient.
- Add Claude example with ClaudeClient.
- Rename docs/ to doc/ and examples/ to example/ to follow pub.dev conventions.
1.0.0 #
- Initial release.
- Multi-provider LLM support: OpenAI (Chat Completions & Responses API), Google Gemini, AWS Bedrock (Claude).
- StatefulAgent with autonomous tool-calling loop.
- Multimodal message support (text, image, audio, video, document).
- Streaming via runStream() with fine-grained StreamingEvents.
- Dynamic Skill system with runtime activation/deactivation.
- Sub-agent delegation with clone and named sub-agents.
- Planning via write_todos tool with PlanMode.
- Context compression with LLMBasedContextCompressor and episodic memory.
- Loop detection (tool signature tracking + LLM-based diagnosis).
- AgentController with Pub/Sub and Request/Response lifecycle hooks.
- systemCallback for per-call request modification.
- FileStateStorage for JSON-based state persistence.