eval 0.0.5

Pure Dart LLM evaluation helpers for tests, including judge-based matchers, RAG scoring, and statistics.

example/eval_example.dart

import 'dart:io';

import 'package:eval/eval.dart';

Future<void> main() async {
  final apiKey = Platform.environment['ANTHROPIC_API_KEY'] ?? '';
  if (apiKey.isEmpty) {
    throw StateError('Set ANTHROPIC_API_KEY before running this example.');
  }

  await eval(
    'answers geography questions',
    (apiService) async {
      final answer = await apiService.sendRequest(
        'Answer in one short sentence: What is the capital of France?',
      );

      // Deterministic checks on the raw answer text.
      expect(answer, containsIgnoreCase('paris'));
      expect(answer, sentenceCountBetween(1, 2));

      // Asynchronous checks that take the apiService as an argument.
      await expectAsync(
        answer,
        answersQuestion(
          'What is the capital of France?',
          apiService: apiService,
        ),
      );

      await expectAsync(answer, isNotToxic(apiService: apiService));
    },
    apiServices: [
      ExampleClaudeService(
        defaultModel: ExampleClaudeModel.haiku45,
        apiKey: apiKey,
      ),
      ExampleClaudeService(
        defaultModel: ExampleClaudeModel.sonnet45,
        apiKey: apiKey,
      ),
    ],
    numberOfRunsPerLLM: 2,
    verbose: true,
  );
}
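
The two synchronous checks in the example above (`containsIgnoreCase` and `sentenceCountBetween`) can be pictured as plain string predicates. The following is a minimal sketch of that idea, not the package's implementation: the function names `containsIgnoringCase`, `countSentences`, and `sentenceCountWithin` are hypothetical stand-ins, and the sentence splitter is deliberately naive.

```dart
// Hypothetical stand-ins for the deterministic matchers used in the example.
// The real implementations in package:eval may work differently.

/// Returns true when [text] contains [needle], ignoring letter case.
bool containsIgnoringCase(String text, String needle) =>
    text.toLowerCase().contains(needle.toLowerCase());

/// Counts sentences by splitting on '.', '!' or '?' followed by whitespace
/// or the end of the input. Abbreviations such as "e.g." are miscounted.
int countSentences(String text) {
  final trimmed = text.trim();
  if (trimmed.isEmpty) return 0;
  return trimmed
      .split(RegExp(r'[.!?]+(?:\s+|$)'))
      .where((part) => part.trim().isNotEmpty)
      .length;
}

/// Returns true when the sentence count of [text] lies in [min]..[max].
bool sentenceCountWithin(String text, int min, int max) {
  final count = countSentences(text);
  return count >= min && count <= max;
}

void main() {
  const answer = 'The capital of France is Paris.';
  print(containsIgnoringCase(answer, 'paris')); // prints true
  print(sentenceCountWithin(answer, 1, 2)); // prints true
}
```

Checks like these need no model call, which is why the example runs them with plain `expect` before the `expectAsync` checks that receive the `apiService`.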
2 likes, 160 points, 217 downloads

Publisher

scalabs.de (verified publisher)

Topics

#llm #evaluation #testing #ai #rag

License

MIT

Dependencies

http, matcher, test, yaml
