eval 0.0.5
eval: ^0.0.5

Published 2 months ago

SDK: Dart, Flutter

Platform: Android, iOS, Linux, macOS, web, Windows

Pure Dart LLM evaluation helpers for tests, including judge-based matchers, RAG scoring, and statistics.

0.0.5

Increase the timeout for evals to 10 minutes and expose a timeout attribute.
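
As a rough illustration of what a configurable timeout with a 10-minute default can look like in plain Dart, here is a minimal sketch. The helper name `runWithTimeout` and its `timeout` parameter are hypothetical and are not taken from the package. Only `Future.timeout` and `TimeoutException` from `dart:async` are real APIs here.

```dart
import 'dart:async';

// Hypothetical helper, not the package's API: runs [body] and fails with a
// TimeoutException if it does not complete within [timeout].
Future<T> runWithTimeout<T>(
  Future<T> Function() body, {
  Duration timeout = const Duration(minutes: 10), // assumed default
}) {
  return body().timeout(timeout);
}

Future<void> main() async {
  try {
    // A slow task with a deliberately short override to trigger the timeout.
    await runWithTimeout(
      () => Future.delayed(const Duration(milliseconds: 200), () => 'done'),
      timeout: const Duration(milliseconds: 50),
    );
  } on TimeoutException catch (e) {
    print('timed out after ${e.duration}');
  }
}
```

Check the package's Dartdoc for the actual name and placement of the timeout attribute before relying on it.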

0.0.4

Fix parseMarkdownBody so nested frontmatter collections are mutable. Lists and maps nested inside the parsed frontmatter previously came back as immutable YamlList/YamlMap instances and threw UnsupportedError on assignment.
Export deepConvertYaml, a recursive helper that converts YamlMap/YamlList trees to mutable Map<String, dynamic> and List structures.
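
The kind of recursive conversion described above can be sketched in plain Dart. This is not the package's source for `deepConvertYaml`. The function name `deepConvert` is hypothetical, and unmodifiable `Map`/`List` instances stand in for `YamlMap`/`YamlList` so the example needs no dependencies.

```dart
// Hypothetical stand-in for a deep YAML-to-mutable conversion.
// Maps become Map<String, dynamic>, lists become growable List<dynamic>,
// and scalars are returned unchanged.
dynamic deepConvert(dynamic node) {
  if (node is Map) {
    return <String, dynamic>{
      for (final entry in node.entries)
        entry.key.toString(): deepConvert(entry.value),
    };
  }
  if (node is List) {
    return <dynamic>[for (final item in node) deepConvert(item)];
  }
  return node;
}

void main() {
  // Unmodifiable collections mimic the immutable parsed frontmatter.
  final parsed = Map<String, dynamic>.unmodifiable({
    'tags': List<String>.unmodifiable(['a', 'b']),
    'meta': Map<String, dynamic>.unmodifiable({'draft': true}),
  });

  final mutable = deepConvert(parsed) as Map<String, dynamic>;
  (mutable['tags'] as List).add('c'); // would throw on the original
  (mutable['meta'] as Map)['draft'] = false;
  print(mutable);
}
```

Copying at every level is what makes nested assignment safe. A shallow copy of the top-level map would still leave the inner collections immutable.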

0.0.3

Overhaul the README and public Dartdoc so eval(...) is documented as the primary workflow instead of raw test(...).
Document the full exported matcher surface, including the previously omitted JSON array, schema-path, frontmatter schema, and RAG matchers.
Refresh the bundled example to show an end-to-end eval(...) run with sync and async assertions.
Reset internal eval run state after each run so expect(...) cleanly falls back to normal test behavior outside an active eval.

0.0.2

Align package metadata and documentation with the published API.
Fix async LLM and RAG matcher behavior so sync expect(...) usage fails with clear guidance instead of silently succeeding.
Fix APICallQueue recovery so one failed request does not poison later queued calls.
Preserve detailed evaluateRag() metadata including relevant context indices, unsupported claims, and joined metric reasons.
Treat empty frontmatter as valid frontmatter and reject malformed YAML with closing delimiters.
Distinguish missing paths from explicit null values in schema-based path matchers.
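
To illustrate the last item, the difference between a missing path and an explicit null can be sketched in plain Dart. The `Lookup` class and `lookupPath` function are hypothetical names for this example and are not part of the package's API.

```dart
// Hypothetical result type: `found` is false only when a path segment is
// absent; a present key whose value is null yields found == true.
class Lookup {
  final bool found;
  final Object? value;
  const Lookup.missing()
      : found = false,
        value = null;
  const Lookup.present(this.value) : found = true;
}

// Walks [path] through nested maps, using containsKey so that an explicit
// null value is not mistaken for an absent key.
Lookup lookupPath(Map<String, dynamic> json, List<String> path) {
  Object? current = json;
  for (final key in path) {
    if (current is! Map || !current.containsKey(key)) {
      return const Lookup.missing();
    }
    current = current[key];
  }
  return Lookup.present(current);
}

void main() {
  final doc = <String, dynamic>{
    'user': {'name': 'Ada', 'nickname': null},
  };
  final explicitNull = lookupPath(doc, ['user', 'nickname']);
  final missing = lookupPath(doc, ['user', 'email']);
  print('nickname: found=${explicitNull.found}, value=${explicitNull.value}');
  print('email: found=${missing.found}');
}
```

Indexing with `map[key]` alone returns null in both cases, which is why the sketch checks `containsKey` before reading the value.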

0.0.1

Initial public release of the eval package.
Added string, JSON, schema, frontmatter, distance, LLM-judge, and RAG matchers.
Added aggregate statistics and prompt comparison helpers.
Added the APICallService abstraction and the bundled Claude example service.

Topics

#llm #evaluation #testing #ai #rag

Dependencies

http, matcher, test, yaml
