CalibrationReport class

Correlation between LLM judge scores and human ratings.

Constructors

CalibrationReport({required double spearmanCorrelation, required double pearsonCorrelation, required double agreementRate, required double meanAbsoluteError, required int samples, required int agreementCount, required int disagreementCount, required List<TrialDisagreement> disagreements})
const

Properties

agreementCount int
final
agreementRate double
Fraction of samples where the judge score and the human score differ by at most tolerance; the value lies in [0, 1].
final
disagreementCount int
final
disagreements List<TrialDisagreement>
The trials with the largest disagreement: the top N, sorted in descending order of the absolute difference |judge - human|.
final
hashCode int
The hash code for this object.
no setter, inherited
meanAbsoluteError double
Mean absolute error (MAE).
final
meetsAnthropicBar bool
The go-live bar from Anthropic Step 5: Spearman ≥ 0.7.
no setter
pearsonCorrelation double
Pearson linear correlation coefficient, in [-1, 1].
final
runtimeType Type
A representation of the runtime type of the object.
no setter, inherited
samples int
final
spearmanCorrelation double
Spearman rank correlation coefficient, in [-1, 1]. Anthropic recommends a value of at least 0.7 before going live.
final
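
The properties above can be illustrated with a minimal sketch of how such statistics might be derived from paired judge and human scores. This is not the library's implementation, which this page does not show. The helper names (mean, pearson, averageRanks, spearman, summarize), the map keys, the sample scores, the tolerance of 0.2 and the top-N of 2 are all assumptions made for illustration. Only the 0.7 Spearman threshold comes from the descriptions above.

```dart
import 'dart:math' as math;

/// Arithmetic mean of a non-empty list.
double mean(List<double> xs) => xs.reduce((a, b) => a + b) / xs.length;

/// Pearson linear correlation; returns 0.0 if either input is constant.
double pearson(List<double> x, List<double> y) {
  final mx = mean(x), my = mean(y);
  var cov = 0.0, vx = 0.0, vy = 0.0;
  for (var i = 0; i < x.length; i++) {
    final dx = x[i] - mx, dy = y[i] - my;
    cov += dx * dy;
    vx += dx * dx;
    vy += dy * dy;
  }
  final denom = math.sqrt(vx * vy);
  return denom == 0 ? 0.0 : cov / denom;
}

/// 1-based ranks; tied values share the average of the ranks they span.
List<double> averageRanks(List<double> xs) {
  final order = List<int>.generate(xs.length, (i) => i)
    ..sort((a, b) => xs[a].compareTo(xs[b]));
  final ranks = List<double>.filled(xs.length, 0.0);
  var i = 0;
  while (i < order.length) {
    var j = i;
    while (j + 1 < order.length && xs[order[j + 1]] == xs[order[i]]) {
      j++;
    }
    final avg = (i + j) / 2 + 1; // mean of 1-based ranks i+1 .. j+1
    for (var k = i; k <= j; k++) {
      ranks[order[k]] = avg;
    }
    i = j + 1;
  }
  return ranks;
}

/// Spearman rank correlation, computed as Pearson on the ranks.
double spearman(List<double> x, List<double> y) =>
    pearson(averageRanks(x), averageRanks(y));

/// Hypothetical helper: derives report-like statistics from paired scores.
Map<String, Object> summarize(
    List<double> judge, List<double> human, double tolerance, int topN) {
  final diffs = [
    for (var i = 0; i < judge.length; i++) (judge[i] - human[i]).abs()
  ];
  final agreementCount = diffs.where((d) => d <= tolerance).length;
  // Trial indices ordered by absolute difference, largest first.
  final worstFirst = List<int>.generate(diffs.length, (i) => i)
    ..sort((a, b) => diffs[b].compareTo(diffs[a]));
  final rho = spearman(judge, human);
  return {
    'samples': diffs.length,
    'agreementCount': agreementCount,
    'disagreementCount': diffs.length - agreementCount,
    'agreementRate': agreementCount / diffs.length,
    'meanAbsoluteError': mean(diffs),
    'pearsonCorrelation': pearson(judge, human),
    'spearmanCorrelation': rho,
    'meetsAnthropicBar': rho >= 0.7, // threshold from the text above
    'worstTrialIndices': worstFirst.take(topN).toList(),
  };
}

void main() {
  // Assumed sample data: five trials scored on a 0..1 scale.
  final report = summarize(
      [0.9, 0.4, 0.7, 0.2, 0.6], [1.0, 0.5, 0.6, 0.1, 0.9], 0.2, 2);
  report.forEach((key, value) => print('$key: $value'));
}
```

Computing Spearman as Pearson on average ranks keeps tied scores well defined, which matters when judge scores are coarse (for example, integer grades). Returning 0.0 for constant inputs is a choice made for this sketch; a real implementation might prefer to signal that case explicitly.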

Methods

noSuchMethod(Invocation invocation) → dynamic
Invoked when a nonexistent method or property is accessed.
inherited
toJson() → Map<String, dynamic>
toString() → String
A string representation of this object.
inherited

Operators

operator ==(Object other) → bool
The equality operator.
inherited