ocr_stabilizer 0.1.0

Real-time OCR overlay stabilization engine — drift correction, spatial indexing, block tracking. Built for Flutter.

ocr_stabilizer #

A real-time stabilization engine for live OCR overlays. Tracks text block identity across noisy captures, corrects positional drift, and provides spatial indexing for deduplication.

Built for Flutter. Designed for OCR pipelines where screenshots are captured at 1-2 Hz and translated overlays must remain stable as the user scrolls.

The Problem #

Live OCR on scrollable content produces a stream of noisy, jittery observations. The same paragraph appears at slightly different positions each capture. Without a stabilization layer, overlays flicker, duplicate, and drift.

This is the same problem visual SLAM (Simultaneous Localization and Mapping) solves in robotics: associate noisy sensor observations to persistent landmarks, correct accumulated drift, and maintain a consistent map. ocr_stabilizer adapts SLAM techniques to the OCR domain.

Installation #

dependencies:
  ocr_stabilizer: ^0.1.0

Core Components #

TrackedBlock<T> #

The engine's central interface. Every block the engine processes implements this.

class MyBlock implements TrackedBlock<MyPayload> {
  @override final AbsoluteRect absoluteRect;
  @override final String originalText;
  @override final ContainerId? containerId;
  @override final bool isViewportRelative;
  @override final bool isInnerScrollerChild;
  @override final bool isHorizontalScrollChild;
  @override final ScrollContext scrollContext;
  @override final bool isFromStickyElement;
  @override final StickyFallback stickyFallback;
  // ... other required getters
  @override final MyPayload payload;  // opaque — engine carries but never reads
}

The generic T carries app-specific data (translations, styles) without coupling the engine to your domain types.

DriftTracker #

Tracks positional drift per coordinate-space region. OCR positions jitter between captures due to scroll timing, viewport changes, and sensor noise. DriftTracker accumulates observations and computes a robust median correction per region.

final drift = DriftTracker();

// Record a drift observation
drift.addObservation(block, measuredDrift);

// Query the correction for a region
final correction = drift.medianDriftForKey(spaceKey);

// Apply correction to a fresh observation
final corrected = DriftTracker.applyCorrectedPosition(rect, correction);

Key properties:

  • Bounded corrections: Drift is clamped to the median block height per region — the engine can never shift a block farther than a typical line of text.
  • Rolling window: Keeps the last 20 observations per region, so drift adapts to changing conditions.
  • Submap isolation: Normal page-scroll and inner-scroller containers track drift independently via SpaceKey.
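
The properties above can be illustrated with a small, self-contained sketch. This is not the package's implementation: `RegionDrift`, its `add` method, and the `correction` getter are hypothetical names, and the sketch assumes drift is a single vertical offset, which the package does not confirm. It shows a rolling window of 20 observations, a median that shrugs off outliers, and a clamp to the median block height.

```dart
import 'dart:collection';

/// Hypothetical stand-in for one region's drift history.
/// Not the package's DriftTracker; names and shapes are assumptions.
class RegionDrift {
  RegionDrift({this.windowSize = 20});

  final int windowSize;
  final Queue<double> _drifts = Queue<double>();
  final Queue<double> _heights = Queue<double>();

  /// Records one vertical drift measurement together with the height
  /// of the block it was measured on.
  void add(double drift, double blockHeight) {
    _drifts.addLast(drift);
    _heights.addLast(blockHeight);
    // Rolling window: drop the oldest observation once the window is full.
    if (_drifts.length > windowSize) {
      _drifts.removeFirst();
      _heights.removeFirst();
    }
  }

  /// Median drift, clamped to plus or minus the median block height.
  double get correction {
    if (_drifts.isEmpty) return 0.0;
    final limit = _median(_heights);
    return _median(_drifts).clamp(-limit, limit).toDouble();
  }

  static double _median(Iterable<double> values) {
    final sorted = values.toList()..sort();
    final mid = sorted.length ~/ 2;
    return sorted.length.isOdd
        ? sorted[mid]
        : (sorted[mid - 1] + sorted[mid]) / 2;
  }
}

void main() {
  final region = RegionDrift();
  // One outlier (120) among small jitters barely moves the median.
  for (final d in [3.0, 4.0, 120.0, 5.0, 4.0]) {
    region.add(d, 18.0);
  }
  print(region.correction); // 4.0
}
```

A mean over the same five observations would be 27.2, well above the 18-pixel block height; the median stays at 4.0, which is why a robust statistic suits noisy captures.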

SpatialBlockIndex #

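Grid-cell spatial index for finding overlap candidates during deduplication. A lookup costs O(cells), proportional to the number of grid cells inspected rather than to the total number of blocks. Blocks are indexed by their center position into grid cells whose size adapts to the viewport (see updateBucketSizes below).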

final index = SpatialBlockIndex();
index.updateBucketSizes(viewportWidth: 1000, viewportHeight: 800);

index.add(block);
final nearby = index.candidates(queryBlock);
index.remove(block);

Three coordinate-space namespaces prevent cross-space false matches:

  • Normal page-absolute blocks
  • Viewport-relative (fixed/sticky) blocks (vr: prefix)
  • Inner-scroller relative blocks (ic: prefix) — dual-indexed for both IC-to-normal and IC-to-IC comparisons.
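
To make the idea concrete, here is a minimal grid-cell index with namespaced cell keys. It is a sketch under stated assumptions, not the package's SpatialBlockIndex: `Block`, `GridIndex`, the 3x3 neighborhood search, and the key format are all hypothetical, and the dual-indexing of inner-scroller blocks is left out.

```dart
/// Hypothetical minimal block: a center point plus a coordinate-space tag.
class Block {
  const Block(this.id, this.cx, this.cy, {this.space = ''});
  final String id;
  final double cx, cy;
  final String space; // '' for normal, 'vr:' or 'ic:' as in the list above
}

/// Hypothetical grid index; cell sizes are fixed at construction here.
class GridIndex {
  GridIndex(this.cellWidth, this.cellHeight);
  final double cellWidth, cellHeight;
  final Map<String, Set<Block>> _cells = {};

  // The namespace prefix keeps blocks from different spaces in different cells.
  String _key(String space, int col, int row) => '$space$col,$row';
  int _col(Block b) => (b.cx / cellWidth).floor();
  int _row(Block b) => (b.cy / cellHeight).floor();

  void add(Block b) => _cells
      .putIfAbsent(_key(b.space, _col(b), _row(b)), () => <Block>{})
      .add(b);

  void remove(Block b) => _cells[_key(b.space, _col(b), _row(b))]?.remove(b);

  /// Blocks in the 3x3 cell neighborhood of [query], same namespace only.
  List<Block> candidates(Block query) {
    final out = <Block>[];
    for (var dc = -1; dc <= 1; dc++) {
      for (var dr = -1; dr <= 1; dr++) {
        final key = _key(query.space, _col(query) + dc, _row(query) + dr);
        final cell = _cells[key];
        if (cell != null) out.addAll(cell);
      }
    }
    return out;
  }
}

void main() {
  final index = GridIndex(100, 50);
  final a = Block('a', 120, 60);
  final b = Block('b', 130, 70); // same cell as a
  final sticky = Block('s', 125, 65, space: 'vr:'); // same spot, other namespace
  index
    ..add(a)
    ..add(b)
    ..add(sticky);
  print(index.candidates(a).map((x) => x.id).toList()); // [a, b]
}
```

The sticky block sits at almost the same coordinates as `a` and `b` but never shows up as a candidate, because its cell key carries a different prefix. Note that in this sketch a query block that is already indexed is returned among its own candidates.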

HierarchyWeightX #

An extension on TrackedBlock that computes a hierarchy weight from the block's coordinate-space flags. A higher weight means a more constrained coordinate space:

| Tier | Weight | Meaning |
| --- | --- | --- |
| Viewport-relative | 40 | Fixed/sticky; no scroll drift |
| Nested IC+carousel | 30 | Compound coordinate space |
| IC or carousel | 20 | Single-axis constraint |
| Normal | 10 | Unrestricted page scroll |
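
The tiers can be read as a simple decision ladder. The sketch below is an illustration only: `Flags` and `HierarchyWeightSketch` are hypothetical names, and both the precedence order and the mapping of "IC" and "carousel" to the two booleans are assumptions, not the package's documented behavior.

```dart
/// Hypothetical flag holder; the real TrackedBlock exposes more getters.
class Flags {
  const Flags({
    this.viewportRelative = false,
    this.innerScroller = false,
    this.carousel = false,
  });
  final bool viewportRelative, innerScroller, carousel;
}

extension HierarchyWeightSketch on Flags {
  /// Weight per the tier table; viewport-relative is assumed to win
  /// over every other flag.
  int get hierarchyWeight {
    if (viewportRelative) return 40;
    if (innerScroller && carousel) return 30;
    if (innerScroller || carousel) return 20;
    return 10;
  }
}

void main() {
  print(const Flags().hierarchyWeight); // 10
  print(const Flags(carousel: true).hierarchyWeight); // 20
  print(const Flags(innerScroller: true, carousel: true).hierarchyWeight); // 30
  print(const Flags(viewportRelative: true).hierarchyWeight); // 40
}
```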

Extension Types #

Zero-cost compile-time wrappers for coordinate safety:

  • AbsoluteRect — wraps Rect for world-space coordinates. Spatial operations (overlaps, expandToInclude) only accept other AbsoluteRect values, preventing accidental coordinate-space mixing.
  • ContainerId — wraps String for stable container identity hashes.
  • SpaceKey — wraps String with typed constructors (normal, ic, unknown) for drift observation coordinate spaces.
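
For readers new to Dart extension types (available from Dart 3.3), the following sketch shows how a String wrapper with typed constructors can work. `SpaceKeySketch`, its private constructor, and the string formats it produces are hypothetical; only the constructor names (normal, ic, unknown) come from the description above.

```dart
// Requires Dart 3.3 or later (extension types).

/// Hypothetical wrapper: a distinct static type over String that is
/// erased to a plain String at run time.
extension type SpaceKeySketch._(String value) {
  SpaceKeySketch.normal() : value = 'normal';
  // The 'ic:<id>' format is an assumption made for this sketch.
  SpaceKeySketch.ic(String containerId) : value = 'ic:$containerId';
  SpaceKeySketch.unknown() : value = 'unknown';
}

/// Accepts only SpaceKeySketch; passing a raw String is a compile-time error.
double lookup(Map<SpaceKeySketch, double> corrections, SpaceKeySketch key) =>
    corrections[key] ?? 0.0;

void main() {
  final corrections = <SpaceKeySketch, double>{
    SpaceKeySketch.normal(): 4.0,
    SpaceKeySketch.ic('sidebar'): -2.5,
  };
  print(lookup(corrections, SpaceKeySketch.ic('sidebar'))); // -2.5
  print(lookup(corrections, SpaceKeySketch.unknown())); // 0.0
  // lookup(corrections, 'ic:sidebar'); // would not compile
}
```

Because the wrapper is erased at run time, two keys built from the same container id compare equal and hash identically, so they work as map keys without any extra equality code.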

Six-Dimension Block Identity #

A block's identity is a six-dimensional signature:

| Dimension | What It Answers | Package Support |
| --- | --- | --- |
| Textual | What does this text say? | originalText on TrackedBlock |
| Spatial | Where is it in the page? | absoluteRect, confidence scores |
| Relative | Which coordinate space? | SpaceKey, ContainerId |
| Semantic | What kind of element? | hierarchyWeight (extension) |
| Temporal | How much evidence? | observationCount (ObservableBlock) |
| Contextual | What context was it in? | ContextualInvalidationCheck (callback) |

API Reference #

Interfaces #

| Type | Purpose |
| --- | --- |
| TrackedBlock<T> | Core block contract (13 getters including payload) |
| ObservableBlock | Observation history (counts, votes, provisional state) |
| ClassificationInput | Platform-agnostic viewport geometry |
| CarouselInput | Carousel-specific geometry |
| SubmapMembership | Strategy for coordinate-space partitioning |
| ContextualInvalidationCheck | Callback for context-change detection |

Components #

| Type | Purpose |
| --- | --- |
| DriftTracker | Regional drift correction with submap isolation |
| SpatialBlockIndex | Grid-cell spatial index for overlap queries |
| CssSubmapMembership | Default WebView submap partitioning |
| RobustStats | Robust statistics (median, MAD, IQR) |

Value Types #

| Type | Purpose |
| --- | --- |
| ScrollContext | Scroll offsets and carousel identity at capture time |
| StickyFallback | Fallback coordinate context for demoted sticky elements |

Extension Types #

| Type | Wraps | Purpose |
| --- | --- | --- |
| AbsoluteRect | Rect | World-space coordinate safety |
| ContainerId | String | Stable container identity |
| SpaceKey | String | Typed drift observation keys |

Platform Support #

The package depends on dart:ui (for Rect, Offset) and therefore requires the Flutter SDK. It has no platform-specific code — it works on Android, iOS, macOS, Windows, Linux, and Web.

The SubmapMembership and ClassificationInput interfaces allow the engine to support different input sources:

| Platform | SubmapMembership | ClassificationInput |
| --- | --- | --- |
| WebView | CssSubmapMembership (default) | CaptureSnapshotAdapter (app-side) |
| PDF | Custom (page-based submaps) | Custom (page geometry) |
| Camera | Custom (frame regions) | Custom (camera frame) |

Roadmap #

  • ✅ Extract DriftTracker and SpatialBlockIndex (#516)
  • ✅ Specify TrackedBlock public contract (#518)
  • ✅ ClassificationInput abstraction (#519)
  • ✅ SubmapMembership interface (#520)
  • ✅ Contextual invalidation callback (#521)
  • ✅ Package README and API docs (#522)
  • ✅ Extract OverlayCacheService (SAR merge, dedup pipeline)
  • ✅ Extract BlockClassifierService
  • ✅ Graduate to standalone repository