CalibrationReport class
Correlation between LLM judge scores and human scores.
Constructors
- CalibrationReport({required double spearmanCorrelation, required double pearsonCorrelation, required double agreementRate, required double meanAbsoluteError, required int samples, required int agreementCount, required int disagreementCount, required List<TrialDisagreement> disagreements})
-
const
Properties
- agreementCount → int
-
final
- agreementRate → double
-
Fraction of samples where the judge and human scores differ by at most tolerance, in [0, 1].
final
- disagreementCount → int
-
final
- disagreements → List<TrialDisagreement>
-
The N trials with the largest disagreement, sorted in descending order of the absolute difference |judge - human|.
final
- hashCode → int
-
The hash code for this object.
no setter, inherited
- meanAbsoluteError → double
-
Mean absolute error (MAE).
final
- meetsAnthropicBar → bool
-
The "go-live bar" from Anthropic's Step 5: Spearman ≥ 0.7.
no setter
- pearsonCorrelation → double
-
Pearson linear correlation coefficient, in [-1, 1].
final
- runtimeType → Type
-
A representation of the runtime type of the object.
no setter, inherited
- samples → int
-
final
- spearmanCorrelation → double
-
Spearman rank correlation coefficient, in [-1, 1]. Anthropic recommends ≥ 0.7 before going live.
final
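The properties above describe standard statistics, so a small self-contained sketch may help show how such values could be derived from paired judge and human scores. This is not the library's implementation: the helper names (`pearson`, `averageRanks`, `spearman`), the tie handling (average ranks), the zero-variance fallback, the sample scores, and the tolerance value are all assumptions made for illustration. Inputs are assumed to be non-empty and of equal length.

```dart
import 'dart:math' as math;

/// Pearson linear correlation of two equally long, non-empty lists.
/// Returns 0.0 when either list has zero variance (a choice made for this sketch).
double pearson(List<double> x, List<double> y) {
  final n = x.length;
  final meanX = x.reduce((a, b) => a + b) / n;
  final meanY = y.reduce((a, b) => a + b) / n;
  var cov = 0.0, varX = 0.0, varY = 0.0;
  for (var i = 0; i < n; i++) {
    final dx = x[i] - meanX;
    final dy = y[i] - meanY;
    cov += dx * dy;
    varX += dx * dx;
    varY += dy * dy;
  }
  if (varX == 0 || varY == 0) return 0.0;
  return cov / math.sqrt(varX * varY);
}

/// 1-based ranks; tied values share the mean of the positions they occupy.
List<double> averageRanks(List<double> values) {
  final order = List<int>.generate(values.length, (i) => i)
    ..sort((a, b) => values[a].compareTo(values[b]));
  final ranks = List<double>.filled(values.length, 0.0);
  var i = 0;
  while (i < order.length) {
    var j = i;
    while (j + 1 < order.length && values[order[j + 1]] == values[order[i]]) {
      j++;
    }
    final rank = (i + j) / 2 + 1; // mean of 1-based positions i+1 .. j+1
    for (var k = i; k <= j; k++) {
      ranks[order[k]] = rank;
    }
    i = j + 1;
  }
  return ranks;
}

/// Spearman rank correlation: Pearson applied to the average ranks.
double spearman(List<double> x, List<double> y) =>
    pearson(averageRanks(x), averageRanks(y));

void main() {
  // Made-up scores, for illustration only.
  final judge = [1.0, 2.0, 3.0, 4.0, 5.0];
  final human = [1.0, 3.0, 2.0, 4.0, 5.0];
  const tolerance = 0.5; // hypothetical tolerance

  var agreementCount = 0;
  var absErrorSum = 0.0;
  for (var i = 0; i < judge.length; i++) {
    final diff = (judge[i] - human[i]).abs();
    absErrorSum += diff;
    if (diff <= tolerance) agreementCount++;
  }

  final samples = judge.length;
  final rho = spearman(judge, human);
  print('spearmanCorrelation: $rho');
  print('pearsonCorrelation: ${pearson(judge, human)}');
  print('agreementRate: ${agreementCount / samples}');
  print('meanAbsoluteError: ${absErrorSum / samples}');
  print('disagreementCount: ${samples - agreementCount}');
  print('meets 0.7 bar: ${rho >= 0.7}'); // mirrors the documented Spearman threshold
}
```

Spearman is computed here as Pearson over ranks because that form stays correct when scores are tied, which is common with coarse rating scales.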
Methods
- noSuchMethod(Invocation invocation) → dynamic
-
Invoked when a nonexistent method or property is accessed.
inherited
- toJson() → Map<String, dynamic>
-
- toString() → String
-
A string representation of this object.
inherited
Operators
- operator ==(Object other) → bool
-
The equality operator.
inherited