EvalRunner class

Runs evaluation suites with bounded concurrency and optional rate limiting. See RFC §6.8 and §6.15.
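
The bounded concurrency mentioned above can be pictured with a small worker-pool sketch. This is not the runner's actual implementation, which this page does not show. The helper name `runBounded` and its parameters are hypothetical; the default of 8 mirrors the `concurrency` default documented for runSuite.

```dart
import 'dart:async';

/// Runs [jobs] with at most [concurrency] of them in flight at once and
/// returns the results in the original job order.
/// Hypothetical helper for illustration; not part of the documented API.
Future<List<T>> runBounded<T>(
  List<Future<T> Function()> jobs, {
  int concurrency = 8,
}) async {
  if (concurrency < 1) {
    throw ArgumentError.value(concurrency, 'concurrency', 'must be >= 1');
  }
  final results = List<T?>.filled(jobs.length, null);
  var next = 0; // Index of the next unclaimed job.

  // Each worker claims one job at a time until none are left. Code between
  // awaits runs without interruption inside an isolate, so `next++` is safe.
  Future<void> worker() async {
    while (next < jobs.length) {
      final index = next++;
      results[index] = await jobs[index]();
    }
  }

  final workerCount = concurrency < jobs.length ? concurrency : jobs.length;
  await Future.wait([for (var i = 0; i < workerCount; i++) worker()]);
  return [for (final r in results) r as T];
}

Future<void> main() async {
  var inFlight = 0;
  var peak = 0;
  final jobs = [
    for (var i = 0; i < 10; i++)
      () async {
        inFlight++;
        if (inFlight > peak) peak = inFlight;
        await Future<void>.delayed(const Duration(milliseconds: 10));
        inFlight--;
        return i * i;
      },
  ];
  final squares = await runBounded(jobs, concurrency: 3);
  print('peak in flight: $peak');
  print(squares);
}
```

In this sketch a failing job surfaces as an error from `Future.wait` while the other workers keep draining the queue. A real runner would more likely record the failure as a trial result and continue.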

Available extensions

EvalRunnerOps

Constructors

EvalRunner({required EvalEnvironment environment, required AgentHarnessFactory harnessFactory, List<TraceExporter> exporters = const [], RecordingStore? recordingStore, ReportStore? reportStore, RateLimitGate? rateLimitGate, Duration defaultTimeout = const Duration(minutes: 5)})

Properties

defaultTimeout Duration
Default per-trial timeout. Tasks may override via EvalTask.timeout.
final
environment EvalEnvironment
final
exporter TraceExporter
final
harnessFactory AgentHarnessFactory
final
hashCode int
The hash code for this object.
no setter, inherited
rateLimitGate RateLimitGate
Optional rate-limit gate. The harness can also obtain it from the EvalContext.
final
recordingStore RecordingStore?
Optional. If set, the runner exposes the store to the harness via the EvalContext; the harness decides whether to wrap its LLMClient.
final
reportStore ReportStore?
Optional persistent store for run reports. When set, runSuite automatically saves the final EvalRunReport for cross-run analysis (saturation, graduation, diff).
final
runtimeType Type
A representation of the runtime type of the object.
no setter, inherited
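
The relationship between defaultTimeout and EvalTask.timeout can be sketched as a simple fallback. The code below is an assumption-laden illustration: `TaskStub` is a hypothetical stand-in for EvalTask, the nullable `timeout` field is assumed rather than documented here, and `effectiveTimeout` and `runTrialWithTimeout` are invented names.

```dart
import 'dart:async';

/// Hypothetical stand-in for EvalTask, reduced to the fields used here.
class TaskStub {
  const TaskStub(this.id, {this.timeout});
  final String id;
  final Duration? timeout; // Assumed: null means "use the runner default".
}

/// Picks the task's own timeout when present, else the runner default.
Duration effectiveTimeout(TaskStub task, Duration defaultTimeout) =>
    task.timeout ?? defaultTimeout;

/// Runs one trial under the effective timeout. Future.timeout completes
/// with a TimeoutException if the trial has not finished in time.
Future<T> runTrialWithTimeout<T>(
  TaskStub task,
  Future<T> Function() trial, {
  Duration defaultTimeout = const Duration(minutes: 5),
}) =>
    trial().timeout(effectiveTimeout(task, defaultTimeout));

Future<void> main() async {
  const slow = TaskStub('slow', timeout: Duration(milliseconds: 20));
  try {
    await runTrialWithTimeout(slow, () async {
      await Future<void>.delayed(const Duration(milliseconds: 200));
      return 'done';
    });
  } on TimeoutException {
    print('trial timed out');
  }
}
```

Note that `Future.timeout` only stops waiting; it does not cancel the underlying work, so a real harness would need its own cancellation mechanism.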

Methods

noSuchMethod(Invocation invocation) → dynamic
Invoked when a nonexistent method or property is accessed.
inherited
runSuite({required String runName, required EvalSuite suite, int concurrency = 8, int? trialsOverride, bool filter(EvalTask)?}) → Future<EvalRunReport>

Available on EvalRunner, provided by the EvalRunnerOps extension

Run all tasks in suite, honoring concurrency and per-task trialsPerRun. Returns the aggregated report.
runTask({required String runName, required EvalTask task, required String agentName, int? trialsOverride}) → Future<List<TrialResult>>

Available on EvalRunner, provided by the EvalRunnerOps extension

Convenience: run a single task. Useful for ad-hoc debugging or for rerunning a flaky task with extra trials.
toString() → String
A string representation of this object.
inherited
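
As a rough illustration of how runSuite's `filter` and `trialsOverride` parameters might combine with per-task trialsPerRun, the sketch below plans trial counts for a list of tasks. It rests on two assumptions this page does not confirm: that `filter` keeps the tasks for which it returns true, and that a non-null `trialsOverride` replaces each task's trialsPerRun. `TaskStub` and `planTrials` are hypothetical names.

```dart
/// Hypothetical stand-in for EvalTask, reduced to the fields used here.
class TaskStub {
  const TaskStub(this.id, {this.trialsPerRun = 1});
  final String id;
  final int trialsPerRun;
}

/// Maps each selected task id to the number of trials to run.
/// Assumes [filter] keeps tasks for which it returns true and that a
/// non-null [trialsOverride] replaces every task's trialsPerRun.
Map<String, int> planTrials(
  Iterable<TaskStub> tasks, {
  int? trialsOverride,
  bool Function(TaskStub task)? filter,
}) {
  return {
    for (final task in tasks)
      if (filter == null || filter(task))
        task.id: trialsOverride ?? task.trialsPerRun,
  };
}

void main() {
  const tasks = [
    TaskStub('smoke/login', trialsPerRun: 3),
    TaskStub('smoke/search'),
    TaskStub('regression/checkout', trialsPerRun: 5),
  ];

  // No override, no filter: every task keeps its own trial count.
  print(planTrials(tasks));

  // Rerun only the smoke tasks with extra trials, as one might for a
  // flaky task.
  print(planTrials(
    tasks,
    trialsOverride: 10,
    filter: (t) => t.id.startsWith('smoke/'),
  ));
}
```

With the real API the same intent would be passed as named arguments, for example `runSuite(runName: 'nightly', suite: suite, trialsOverride: 10, filter: ...)`, where the run name and values are placeholders.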

Operators

operator ==(Object other) → bool
The equality operator.
inherited