EvalTask class abstract
Anthropic: a task is a single test with defined inputs and success criteria. Implementations are pure data — they do not run the agent.
Note on what's intentionally not here:
- There is no successCriteria map. The success contract lives entirely inside graders; a description-only mirror of it on the task would drift from the actual graders. If you want a human-readable summary of what a task tests, use description plus metadata.
- There is no expectedBehavior enum (positive/negative). Use 'failure_bucket' or a similar tag to mark positive vs negative tasks if you need to filter by it.
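As an illustration of tag-based filtering, the sketch below selects tasks by a metadata label. It is not the library's code: the `EvalTask` declaration is a cut-down stand-in so the snippet compiles on its own, and `SimpleTask`, `whereTagged`, and the tag values `'positive'` and `'negative'` are hypothetical names chosen for this example.

```dart
// Stand-in for the documented EvalTask surface, limited to the members used
// here. The real class comes from the library this page documents.
abstract class EvalTask {
  String get id;
  Map<String, String> get metadata;
}

// Hypothetical concrete task, used only for this illustration.
class SimpleTask implements EvalTask {
  const SimpleTask(this.id, this.metadata);

  @override
  final String id;

  @override
  final Map<String, String> metadata;
}

/// Returns the tasks whose metadata carries [key] with exactly [value].
List<EvalTask> whereTagged(
        Iterable<EvalTask> tasks, String key, String value) =>
    tasks.where((task) => task.metadata[key] == value).toList();

void main() {
  const tasks = <EvalTask>[
    // Tag values are assumptions; the page only names the conventional keys.
    SimpleTask('refund-ok', {'expected': 'positive'}),
    SimpleTask('refund-denied',
        {'expected': 'negative', 'failure_bucket': 'policy'}),
  ];

  final negatives = whereTagged(tasks, 'expected', 'negative');
  print(negatives.map((task) => task.id).toList());
}
```

Keeping the positive/negative marker in metadata means the filter is an ordinary map lookup, so adding a new bucket requires no change to the task type.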
Constructors
- EvalTask()
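A concrete task might look like the sketch below. It is only an assumption about how an implementer could satisfy the interface: the `EvalTask`, `Grader`, and `ReferenceSolution` declarations are stand-ins that mirror the members listed on this page so the snippet compiles alone, and `StaticTask` and `AlwaysPassGrader` are hypothetical names. The class holds data only and contains no logic for running an agent.

```dart
// Stand-ins mirroring the documented members. Grader and ReferenceSolution
// are left opaque because their real shapes are not shown on this page.
abstract class Grader {}

abstract class ReferenceSolution {}

abstract class EvalTask {
  String get id;
  String get description;
  Map<String, dynamic> get input;
  List<Grader> get graders;
  Map<String, String> get metadata;
  ReferenceSolution? get referenceSolution;
  Duration? get timeout;
  int get trialsPerRun;
}

// Hypothetical grader, present only so the task has one attached.
class AlwaysPassGrader implements Grader {
  const AlwaysPassGrader();
}

/// Hypothetical immutable task: fields only, no agent-running logic.
class StaticTask extends EvalTask {
  StaticTask({
    required this.id,
    required this.description,
    required this.input,
    required List<Grader> graders,
    this.metadata = const {},
    this.referenceSolution,
    this.timeout,
    this.trialsPerRun = 1,
  }) : graders = List.unmodifiable(graders) {
    // The page states that at least one grader is required.
    if (graders.isEmpty) {
      throw ArgumentError.value(
          graders, 'graders', 'At least one grader is required');
    }
  }

  @override
  final String id;
  @override
  final String description;
  @override
  final Map<String, dynamic> input;
  @override
  final List<Grader> graders;
  @override
  final Map<String, String> metadata;
  @override
  final ReferenceSolution? referenceSolution;
  @override
  final Duration? timeout;
  @override
  final int trialsPerRun;
}

void main() {
  final task = StaticTask(
    id: 'sum-two-numbers',
    description: 'Agent adds two integers.',
    input: {'a': 2, 'b': 3},
    graders: const [AlwaysPassGrader()],
    metadata: {'difficulty': 'easy'},
  );
  print('${task.id}: ${task.graders.length} grader(s), '
      '${task.trialsPerRun} trial(s)');
}
```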
Properties
- description → String
-
One-line human description.
no setter
- graders → List<Grader>
-
Graders attached to this task. At least one is required.
no setter
- hashCode → int
-
The hash code for this object.
no setter, inherited
- id → String
-
Stable id. Immutable after creation.
no setter
- input → Map<String, dynamic>
-
Input handed to the agent harness. Schema is application-defined.
no setter
- metadata → Map<String, String>
-
Free-form labels for filtering and bucketing. Conventional keys:
failure_bucket, fixture, difficulty, language, expected.
no setter
- referenceSolution → ReferenceSolution?
-
Anthropic Step 2: a known working solution that passes all graders.
Strongly recommended — proves the task is solvable and graders are
configured correctly. May be required by the parent suite.
no setter
- runtimeType → Type
-
A representation of the runtime type of the object.
no setter, inherited
- timeout → Duration?
-
Optional per-task timeout. Falls back to the runner default if null.
no setter
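The null fallback described for timeout can be sketched with Dart's `??` operator. This is an assumption about how a runner might resolve the value, not the runner's actual code; `effectiveTimeout` and the 30-second default are hypothetical.

```dart
/// Resolves the timeout for one task: the task's own value when it is set,
/// otherwise the runner-wide default.
Duration effectiveTimeout(Duration? taskTimeout, Duration runnerDefault) =>
    taskTimeout ?? runnerDefault;

void main() {
  // Assumed runner default; the real value is not stated on this page.
  const runnerDefault = Duration(seconds: 30);

  // A task with no timeout uses the default.
  print(effectiveTimeout(null, runnerDefault));

  // A task with its own timeout overrides the default.
  print(effectiveTimeout(const Duration(minutes: 2), runnerDefault));
}
```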
- trialsPerRun → int
-
Anthropic non-determinism: how many trials to run per dataset run.
Defaults to 1. Set ≥3 for tasks where stability matters (pass^k).
no setter
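To show why several trials matter, the sketch below treats pass^k as "all k trials pass" and, assuming independent trials with per-trial pass rate p, estimates it as p^k. The function names are hypothetical, and how the library itself aggregates trials is not shown on this page.

```dart
import 'dart:math' as math;

/// True only when every trial passed: the strict per-task pass^k outcome.
/// An empty result list counts as not passing.
bool passesAllTrials(List<bool> trialResults) =>
    trialResults.isNotEmpty && trialResults.every((passed) => passed);

/// Chance that k independent trials all pass, given per-trial pass rate p.
double passHatK(double p, int k) => math.pow(p, k).toDouble();

void main() {
  // One failed trial out of three fails the task under pass^k.
  print(passesAllTrials([true, true, false])); // false

  // A 90% per-trial pass rate gives roughly a 73% chance of three clean runs.
  print(passHatK(0.9, 3).toStringAsFixed(3)); // 0.729
}
```

With trialsPerRun left at 1, a task that passes 90% of the time looks stable in most runs; raising it to 3 or more exposes the flakiness.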
Methods
- noSuchMethod(Invocation invocation) → dynamic
-
Invoked when a nonexistent method or property is accessed.
inherited
- toString() → String
-
A string representation of this object.
inherited
Operators
- operator ==(Object other) → bool
-
The equality operator.
inherited