buildEvaluationInstructions method

@override
List<ChatMessage>? buildEvaluationInstructions(
  List<ChatMessage> messages,
  ChatResponse modelResponse,
  List<EvaluationContext> additionalContext
)

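Builds the evaluation instructions as a system message followed by a user message.

Returns null to signal that a required context is missing. In this implementation, the required context is a CompletenessEvaluatorContext in additionalContext.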

Implementation

@override
List<ChatMessage>? buildEvaluationInstructions(
  List<ChatMessage> messages,
  ChatResponse modelResponse,
  List<EvaluationContext> additionalContext,
) {
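  // Completeness is scored against a ground truth, so a
  // CompletenessEvaluatorContext is required; return null when none is supplied.
  final ctx = additionalContext.whereType<CompletenessEvaluatorContext>().firstOrNull;
  if (ctx == null) return null;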

  final response = modelResponse.text;
  final prompt = '''
# Definition
**Completeness** measures whether the RESPONSE includes all key information, claims, and statements from the GROUND TRUTH.

# Ratings
## [Completeness: 1] Very incomplete — most key points are missing.
## [Completeness: 2] Mostly incomplete — several key points are missing.
## [Completeness: 3] Partially complete — some key points are covered.
## [Completeness: 4] Mostly complete — most key points are covered with minor omissions.
## [Completeness: 5] Fully complete — all key points from the GROUND TRUTH are covered.

# Data
GROUND TRUTH: ${ctx.groundTruth}
RESPONSE: $response

# Tasks
## Score the RESPONSE's completeness relative to the GROUND TRUTH.
- **ThoughtChain**: Think step by step. Start with "Let's think step by step:".
- **Explanation**: A very short explanation of why you think the input Data should get that Score.
- **Score**: An integer score (1–5) based on the definitions.

## Please provide your answers between the tags: <S0>your chain of thoughts</S0>, <S1>your explanation</S1>, <S2>your Score</S2>.
# Output
''';
  return [
    ChatMessage.fromText(ChatRole.system, _systemPrompt),
    ChatMessage.fromText(ChatRole.user, prompt),
  ];
}
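
The prompt above asks the judge model to wrap its answer in <S0>, <S1>, and <S2> tags. As a rough illustration of how such a reply could be read back, here is a minimal sketch. The names ParsedRating, extractTag, and parseRating are hypothetical and are not part of this library; the evaluator's actual parsing logic is not shown on this page and may differ.

```dart
/// Hypothetical holder for the three tagged fields the prompt requests.
class ParsedRating {
  ParsedRating(this.thoughtChain, this.explanation, this.score);
  final String? thoughtChain;
  final String? explanation;
  final int? score;
}

/// Returns the trimmed text between <tag> and </tag>, or null if absent.
String? extractTag(String text, String tag) {
  // dotAll lets '.' match newlines, since the chain of thought may span lines.
  final match = RegExp('<$tag>(.*?)</$tag>', dotAll: true).firstMatch(text);
  return match?.group(1)?.trim();
}

/// Parses a judge reply. The score is null when the <S2> tag is missing,
/// is not an integer, or falls outside the 1-5 range the prompt defines.
ParsedRating parseRating(String reply) {
  final raw = extractTag(reply, 'S2');
  final score = raw == null ? null : int.tryParse(raw);
  return ParsedRating(
    extractTag(reply, 'S0'),
    extractTag(reply, 'S1'),
    (score != null && score >= 1 && score <= 5) ? score : null,
  );
}

void main() {
  // An assumed, well-formed reply used only for illustration.
  const reply = '<S0>Let\'s think step by step: ...</S0>'
      '<S1>Two of three key points are covered.</S1><S2>3</S2>';
  final rating = parseRating(reply);
  print('${rating.score}: ${rating.explanation}');
  // prints "3: Two of three key points are covered."
}
```

Treating an out-of-range or missing score as null, rather than throwing, mirrors how the method itself uses null to signal a missing context; whether the real evaluator does the same is an assumption.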
