eval 0.0.2

Pure Dart LLM evaluation helpers for tests, including judge-based matchers, RAG scoring, and statistics.

0.0.2

  • Align package metadata and documentation with the published API.
  • Fix async LLM and RAG matcher behavior so sync expect(...) usage fails with clear guidance instead of silently succeeding.
  • Fix APICallQueue recovery so one failed request does not poison later queued calls.
  • Preserve detailed evaluateRag() metadata including relevant context indices, unsupported claims, and joined metric reasons.
  • Treat empty frontmatter as valid frontmatter and reject malformed YAML with closing delimiters.
  • Distinguish missing paths from explicit null values in schema-based path matchers.
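The APICallQueue recovery fix above can be illustrated with a minimal sketch. This is not the package's implementation: SerialQueue and its add method are hypothetical names, and the sketch assumes only that queued calls run one at a time. The idea is that the internal chain absorbs a failure so later calls still start, while the caller of the failed call still receives the error.

```dart
import 'dart:async';

/// Hypothetical serial queue for illustration only; the real APICallQueue
/// in the eval package may be structured differently.
class SerialQueue {
  // Completes when the most recently queued task has settled.
  Future<void> _tail = Future<void>.value();

  /// Runs [task] after all previously queued tasks have settled.
  Future<T> add<T>(Future<T> Function() task) {
    // Start this task only once the previous one is done.
    final Future<T> result = _tail.then<T>((_) => task());
    // Absorb any error on the internal chain so that later tasks still run.
    // The caller still observes the error through `result`.
    _tail = result.then<void>(
      (_) {},
      onError: (Object error, StackTrace stackTrace) {},
    );
    return result;
  }
}

Future<void> main() async {
  final queue = SerialQueue();
  final first = queue.add<String>(() async => throw StateError('boom'));
  final second = queue.add<String>(() async => 'ok');

  try {
    await first;
  } on StateError catch (e) {
    print('first failed: ${e.message}');
  }
  // The second call completes normally despite the earlier failure.
  print('second: ${await second}');
}
```

Keeping the error out of the internal chain, rather than catching it inside the task, means each caller sees exactly the outcome of its own call.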

0.0.1

  • Initial public release of the eval package.
  • Added string, JSON, schema, frontmatter, distance, LLM-judge, and RAG matchers.
  • Added aggregate statistics and prompt comparison helpers.
  • Added the APICallService abstraction and the bundled Claude example service.
2 likes, 0 points, 217 downloads

Publisher

scalabs.de (verified publisher)

Topics

#llm #evaluation #testing #ai #rag

License

unknown

Dependencies

http, matcher, test, yaml