post_process_numeric library

Numeric normalization passes for OCR post-processing.

Handles digit segment corrections, numeric gap repair, and date separator normalization.

Constants

digitConfusionMap → const Map<String, String>
Map of letters commonly confused with digits by OCR.
digitNonAlnumMap → const Map<String, String>
Map of non-alphanumeric characters commonly confused with digits.
highConfidenceDigitLookalikes → const Set<String>
Letters that are high-confidence digit lookalikes: safe to convert to digits even when only one neighbor is digit-dominant.
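
How these constants might work together can be sketched as follows. This is a minimal illustration, not the library's implementation. The map contents, the names sketchConfusionMap, sketchHighConfidence and fixToken, and the reading of "neighbor" as an adjacent character are all assumptions made for the example.

```dart
// Hypothetical stand-ins: the library's real map and set contents are not
// shown on this page.
const Map<String, String> sketchConfusionMap = {
  'O': '0',
  'l': '1',
  'S': '5',
  'B': '8',
};

// Assumed subset that may be converted with a single digit neighbor.
const Set<String> sketchHighConfidence = {'O', 'l'};

bool _isDigit(String ch) {
  if (ch.length != 1) return false;
  final c = ch.codeUnitAt(0);
  return c >= 0x30 && c <= 0x39;
}

// Replaces lookalike letters with digits when enough adjacent characters
// of the original token are digits: one for high-confidence letters,
// two for every other mapped letter.
String fixToken(String token) {
  final out = StringBuffer();
  for (var i = 0; i < token.length; i++) {
    final ch = token[i];
    final replacement = sketchConfusionMap[ch];
    if (replacement == null) {
      out.write(ch);
      continue;
    }
    var digitNeighbors = 0;
    if (i > 0 && _isDigit(token[i - 1])) digitNeighbors++;
    if (i + 1 < token.length && _isDigit(token[i + 1])) digitNeighbors++;
    final needed = sketchHighConfidence.contains(ch) ? 1 : 2;
    out.write(digitNeighbors >= needed ? replacement : ch);
  }
  return out.toString();
}

void main() {
  print(fixToken('2O24')); // 2024
  print(fixToken('1S'));   // 1S  (only one digit neighbor, not high-confidence)
  print(fixToken('1S3'));  // 153
  print(fixToken('Ol'));   // Ol  (no digit neighbors at all)
}
```

Separating the high-confidence set from the general map lets a pass stay conservative for ambiguous letters such as S or B while still fixing the most common confusions at token edges.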

Functions

normalizeDateSeparators(String line) → String
Removes OCR-introduced spaces around date separators and digit clusters.
normalizeDigitSegments(String line) → String
Corrects letter-like confusions inside digit-dominant token segments.
normalizeNumericGaps(String line) → String
Repairs noisy separators and spacing in numeric expressions.
normalizeStandaloneDecimalLikeToken(String line) → String
Normalizes standalone decimal-like tokens made only of digit lookalikes.
normalizeStructuredNumericFieldValue(String line) → String
Normalizes numeric-like values in simple structured field lines.
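
As a rough illustration of the kind of pass listed above, the sketch below removes whitespace around a separator that sits between two digits. It is not the library's normalizeDateSeparators. The function name tightenDateSeparators, the separator set (/, . and -) and the regular expression are assumptions chosen for the example.

```dart
// Hypothetical sketch: collapses whitespace around '/', '.' or '-' when the
// separator sits between two digits. Text without digits on both sides of
// the separator is left untouched.
String tightenDateSeparators(String line) {
  // Group 1: the digit before the separator. Group 2: the separator.
  // The lookahead requires a digit after it without consuming that digit,
  // so chained separators such as "1 / 2 / 3" are all handled.
  final pattern = RegExp(r'(\d)\s*([/.\-])\s*(?=\d)');
  return line.replaceAllMapped(pattern, (m) => '${m[1]}${m[2]}');
}

void main() {
  print(tightenDateSeparators('12 / 03 / 2024'));     // 12/03/2024
  print(tightenDateSeparators('Due 2024 - 01 - 15')); // Due 2024-01-15
  print(tightenDateSeparators('a - b'));              // a - b
}
```

A pattern this simple also tightens arithmetic such as "3 - 2", so a real pass would likely need extra context checks before treating a separator as part of a date.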