agent_wires_probe 0.1.9
Runtime probe that exposes a Flutter app's widget tree to QA agents over the Dart VM service. Pairs with agent_wires_mcp.
Changelog
0.1.9
Fixes screenshot returning a stale frame — most visibly "one navigation
behind": a freshly-navigated, fully idle screen captured the previous route,
byte-identical across repeated calls (#13).
- Root cause: the wrong RepaintBoundary was captured. Flutter wraps every Navigator route in its own RepaintBoundary (_ModalScope). The capture path selected the first boundary in pre-order traversal, which is the bottom (oldest) route in the stack and is covered by the current route. Its retained layer still holds whatever it last painted (the previous screen, or a mid-transition frame), and the Overlay never repaints it. That also made '!debugNeedsPaint' assert on that covered route even when the binding was idle.
- Fix: capture the visible route. A covered route's layer is detached from the live scene while the visible route's is attached, so the capture now picks the largest attached boundary. debugNeedsPaint is deliberately not part of the selection: the visible route can be transiently dirty (e.g. a blinking text cursor), and that case is still handled by the existing settle-and-retry.
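The selection rule described above can be sketched in plain Dart. This is only an illustration, not the probe's code: BoundaryCandidate and pickCaptureBoundary are hypothetical names, and the attached flag stands in for "the boundary's layer is part of the live scene".

```dart
// Hypothetical stand-in for a RepaintBoundary candidate; not the probe's type.
class BoundaryCandidate {
  const BoundaryCandidate(this.name, this.width, this.height,
      {required this.attached});

  final String name;
  final double width;
  final double height;
  final bool attached; // true when the layer is part of the live scene

  double get area => width * height;
}

/// Returns the largest attached candidate, or null when none is attached.
/// Dirtiness (debugNeedsPaint) is intentionally not consulted here.
BoundaryCandidate? pickCaptureBoundary(Iterable<BoundaryCandidate> candidates) {
  BoundaryCandidate? best;
  for (final c in candidates) {
    if (!c.attached) continue; // covered routes are detached from the scene
    if (best == null || c.area > best.area) best = c;
  }
  return best;
}

void main() {
  final picked = pickCaptureBoundary(const [
    BoundaryCandidate('HomeRoute', 390, 844, attached: false),
    BoundaryCandidate('DetailsRoute', 390, 844, attached: true),
    BoundaryCandidate('Avatar', 48, 48, attached: true),
  ]);
  print(picked?.name); // DetailsRoute
}
```

Taking the first boundary in traversal order would return HomeRoute here, which is the stale-frame behaviour this release fixes.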
0.1.8
Fixes scroll on screens with an offstage / not-yet-laid-out Scrollable
ahead of the content list — most commonly an app wrapped in DevicePreview,
but also lazy IndexedStack tab shells and collapsed bottom sheets (#12).
- No more "Null check operator used on a null value". The driver picked the first Scrollable in pre-order traversal even when its ScrollPosition was unattached, then dereferenced _position!/_pixels!. Selection now skips scrollables without an attached, dimensioned position, and _drive bails out gracefully instead of throwing.
- The real content list is no longer shadowed. Selection filters to the requested axis and prefers the largest viewport, so a horizontal chrome strip (e.g. DevicePreview's device picker) no longer wins over the vertical list.
- scroll with an element_id of a list row now works. scrollIn walks up to the nearest enclosing Scrollable of the requested axis when the target has no scrollable descendant, since the list is an ancestor of the row.
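A minimal sketch of the selection rules above, in plain Dart. ScrollableInfo, ScrollAxis, and selectScrollable are hypothetical names, and measuring the "largest viewport" by its extent along the scroll axis is an assumption; the probe's actual criterion is not stated here.

```dart
enum ScrollAxis { horizontal, vertical }

// Hypothetical description of one Scrollable found during traversal.
class ScrollableInfo {
  const ScrollableInfo(this.label, this.axis,
      {required this.hasDimensions, required this.viewportExtent});

  final String label;
  final ScrollAxis axis;
  final bool hasDimensions; // position is attached and laid out
  final double viewportExtent; // assumed measure of viewport size
}

/// Picks the scrollable to drive, or returns null so the caller can report
/// an error instead of dereferencing a missing position.
ScrollableInfo? selectScrollable(
    Iterable<ScrollableInfo> found, ScrollAxis requested) {
  ScrollableInfo? best;
  for (final s in found) {
    // Skip the wrong axis and anything offstage or not yet laid out.
    if (s.axis != requested || !s.hasDimensions) continue;
    if (best == null || s.viewportExtent > best.viewportExtent) best = s;
  }
  return best;
}

void main() {
  const found = [
    ScrollableInfo('offstage tab', ScrollAxis.vertical,
        hasDimensions: false, viewportExtent: 0),
    ScrollableInfo('device picker', ScrollAxis.horizontal,
        hasDimensions: true, viewportExtent: 390),
    ScrollableInfo('content list', ScrollAxis.vertical,
        hasDimensions: true, viewportExtent: 700),
  ];
  print(selectScrollable(found, ScrollAxis.vertical)?.label); // content list
}
```

Returning null rather than throwing mirrors the "bails out gracefully" behaviour: the caller decides how to report that nothing scrollable was found.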
0.1.7
Adds an on-screen action overlay — a narration layer for a human watching
the agent drive the app. Debug-only (like the rest of the probe) and on by
default; it never appears in the agent's screenshot or snapshot.
- Draws a ripple where the agent taps/long-presses, a trail for swipes/scrolls and back, a highlight box (with caption) for enter_text/clear_text and inspect/point-at, and a brief border flash + badge for screenshot and snapshot.
- Rendered via a single IgnorePointer overlay inserted into the app's topmost Overlay; effects auto-expire and are capped so a burst can't accumulate. The host is never installed while the overlay is disabled, so it can't perturb the app's element layout when off.
- Kept out of the agent's perception: the snapshot walker skips the overlay subtree (and the snapshot occlusion pass never treats it as a covering page), and the screenshot path suppresses the overlay for the captured frame.
- Toggle at runtime with the new ext.qa.set_overlay extension (paired with the MCP set_action_overlay tool); compile-time opt-out via AgentWiresProbe.install(actionOverlay: false).
0.1.6
Docs-only release. No code or API changes — refreshes the README so the
pub.dev page reflects the 0.1.5 screenshot and ping version-reporting
behaviour and bumps the install snippet to ^0.1.6.
0.1.5
Screenshot reliability + version reporting from a real LLM-agent driving
session. No breaking API changes; AgentWiresProbe.install() is unchanged.
screenshot: capture during repaint and before first frame
- Captures while a TextField is focused. RenderRepaintBoundary.toImage asserts !debugNeedsPaint in debug builds, so a focused text field, whose blinking cursor repaints every frame, kept the root boundary perpetually dirty and screenshot always failed mid-edit ('!debugNeedsPaint': is not true). Capture is now optimistic with a settle-and-retry: on failure it commits a frame (which also defeats stale captures) and retries across the windows between cursor blinks where the boundary is clean.
- Pre-first-frame no longer hangs. The "no frame yet" path previously awaited endOfFrame unbounded, which could block the call indefinitely when no frame is being produced. The frame wait is now bounded (a live app lands one in ~16ms; the timeout is only a safety net), and the "no RepaintBoundary found" error explains the cause and points at wait_for_idle.
ping: reports the probe version
ext.qa.ping now returns {"ok": true, "probe_version": "0.1.5"}. The MCP server reads this to warn on probe/server version skew, since protocol drift between the two has been observed to cause odd hangs. probeVersion is exported from the package and pinned to the pubspec version.
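As an illustration of how a client might use the ping payload, here is a small pure-Dart sketch. versionSkewWarning is a hypothetical helper, and treating any non-identical version string as skew is an assumption; the MCP server's actual comparison rule is not described in this changelog.

```dart
import 'dart:convert';

/// Returns a warning when the probe and server versions differ, else null.
/// Exact string equality is an assumed policy, used only for illustration.
String? versionSkewWarning(String pingResponseJson, String serverVersion) {
  final decoded = jsonDecode(pingResponseJson) as Map<String, dynamic>;
  final probeVersion = decoded['probe_version'] as String?;
  if (probeVersion == null) {
    // Probes older than 0.1.5 do not include the field.
    return 'probe did not report a version';
  }
  if (probeVersion != serverVersion) {
    return 'probe $probeVersion differs from server $serverVersion';
  }
  return null;
}

void main() {
  const response = '{"ok": true, "probe_version": "0.1.5"}';
  print(versionSkewWarning(response, '0.1.5')); // null
  print(versionSkewWarning(response, '0.1.6'));
  // probe 0.1.5 differs from server 0.1.6
}
```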
0.1.4
Perception-accuracy fixes from real LLM-agent driving sessions. No API
changes; AgentWiresProbe.install() is unchanged.
Snapshot: only report what the user can actually see and touch
- Transparent text-editing overlays no longer drop the page. When a text field is focused, Flutter inserts text-editing / selection / autocomplete overlay entries wrapped by InheritedTheme.captureAll (_CaptureAll). They fill the viewport geometrically but paint nothing, so the containment cover test treated them as a covering page and occluded the real route beneath: every text-field screen (DomainRegisterRoute, etc.) came back with an empty snapshot. The cover test now ignores transient overlays (a _CaptureAll with no _ModalScope/ModalBarrier in its subtree); only opaque routes and modal barriers occlude.
- Pointer-blocked elements are filtered ("phantom FAB"). An expandable FAB keeps its collapsed sub-items mounted and laid out but wraps them in IgnorePointer/AbsorbPointer, so a tap can't reach them. Elements under an ignoring/absorbing wrapper are no longer surfaced as actionable. Opacity is deliberately not used as the signal: a transparent widget still receives taps in Flutter, so hiding by opacity would diverge from real tap behaviour.
0.1.3
Post-launch iteration driven by real LLM-agent driving sessions. No
breaking API changes; the public AgentWiresProbe.install() /
AgentWiresProbe.routeTracker.createObserver() shape is unchanged.
Snapshot: route-aware and state-aware
- Route scoping (the big one). Pushed routes don't unmount what's beneath them; Flutter keeps lower pages alive for the back-swipe parallax. The walker previously enumerated every layout-active element, so a snapshot of DomainDetailsRoute would include the home screen's FAB, stats cards, and bottom nav (shifted left ~146px by parallax but still in the tree). Now per-Navigator overlay introspection drops any entry buried under a topmost viewport-covering page. Covering is judged by containment of the owning theater's box (not the global window), so device_preview and nested navigators are scoped against their real viewport.
- Widget state in the snapshot. Each element gains an optional state field: "on"/"off" for Switch / SwitchListTile / CupertinoSwitch; "checked"/"unchecked"/"indeterminate" for Checkbox / CheckboxListTile; "selected"/"unselected" for Radio / RadioListTile; the value for Slider / RangeSlider. A post-pass hoists state from a contained child to the smallest containing labelled parent, so a labelled SwitchListTile reports state: "on" directly without the agent inspecting the inner Switch.
- Compact by default. unresolved[] is now omitted and replaced by unresolved_count unless the MCP caller passes include_unresolved: true, which cuts snapshot size by ~60% on real screens.
- Diagnostics. Snapshot output gains an optional _debug.occlusion block: {theaters_found, entries_processed, entries_dropped, viewport_found, theaters: [...]}. Lets the agent verify the route-scoping pass ran and (when it didn't drop anything) see per-theater entry descriptions so the failure mode is debuggable without instrumenting the probe.
inspect: drill into custom widgets
- Descendants subtree. ext.qa.inspect now returns a descendants[] array with {depth, widget_type, visible_text?} for every element up to descendant_depth (default 3, capped at 500 entries). Lets the agent answer "what's inside this Card?" without re-snapshotting.
- CustomPaint metadata. Descendant entries for CustomPaint surface painter (the painter's runtime type, e.g. "PrecisionReactiveSliderPainter"), foreground_painter if set, and rendered size. Tells the agent "this region is drawn pixels, not addressable widgets"; the integrator can wrap with Semantics(button: true, label: '…') or attach a Key to make it targetable. README has an integrator note covering this.
Sync: diagnose stuck idles
- ext.qa.wait_for_idle returns a structured IdleStatus JSON: {idle, blocked_by, in_flight_http, has_scheduled_frame, in_transient_callback}. On timeout, blocked_by lists what's still active (scheduled_frame, transient_callback, in_flight_http:N) so the agent knows whether to wait, retry, or proceed.
- New ignore_animations: true param drops the scheduled-frame and transient-callback checks; waits only for HTTP. Use on screens with continuous spring animations (custom sliders, looping animations) that never visually settle.
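One way an agent-side client might act on the IdleStatus payload is sketched below in plain Dart. NextStep and decide are hypothetical, the policy is illustrative rather than prescribed by the probe, and blocked_by is assumed to arrive as a JSON array of strings.

```dart
import 'dart:convert';

enum NextStep { proceed, waitLonger, retryIgnoringAnimations }

/// Maps an IdleStatus JSON string to a next step (illustrative policy only).
NextStep decide(String idleStatusJson) {
  final status = jsonDecode(idleStatusJson) as Map<String, dynamic>;
  if (status['idle'] == true) return NextStep.proceed;

  // Assumption: blocked_by is a list of strings such as "in_flight_http:2".
  final blockedBy =
      (status['blocked_by'] as List<dynamic>? ?? const []).cast<String>();
  final httpPending = blockedBy.any((b) => b.startsWith('in_flight_http'));

  // Pending HTTP usually finishes on its own, so waiting is reasonable.
  if (httpPending) return NextStep.waitLonger;
  // Only frames or transient callbacks remain: likely a continuous animation.
  return NextStep.retryIgnoringAnimations;
}

void main() {
  const timedOut = '{"idle": false, '
      '"blocked_by": ["scheduled_frame", "transient_callback"], '
      '"in_flight_http": 0, "has_scheduled_frame": true, '
      '"in_transient_callback": true}';
  print(decide(timedOut).name); // retryIgnoringAnimations
}
```

A retry in this sketch would mean calling wait_for_idle again with ignore_animations: true, so that only HTTP is awaited.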
route_stack: works without integrator wiring
- Multi-navigator route tracking now reads Navigator.pages from the live Element tree at snapshot time. AutoRoute / GoRouter / any page-based router gets full route_stack coverage with zero observer wiring; createObserver() becomes optional (still available as a fallback for imperative Navigator.pushNamed apps).
- Returns the full back-stack of each navigator, deepest-first: ["UserProfileRoute", "AccountRoute", "MainRoute"] instead of just the leaf.
screenshot: first-frame race fixed
- If no RepaintBoundary is found on the first call (cold start, no frame has rasterized yet), awaits one endOfFrame and retries before failing. Eliminates the spurious "no RepaintBoundary found" the agent used to hit on the first screenshot of a session.
0.1.0
Initial public release.
- VM-service extensions for an LLM agent to introspect and drive a running Flutter app: ext.qa.snapshot, inspect, screenshot, tap, long_press, swipe, enter_text, clear_text, scroll, press_back, wait_for_idle, wait_for_route, wait_for_element, get_logs, get_network, ping.
- Denoised semantic tree: a typical 800-node tree collapses to ~15–30 agent-targetable elements via a three-pass classify / dedup / label pipeline.
- Multi-Text label inference: invoice cards, list rows, and other multi-text containers get a label that distinguishes one card from the next (e.g. "Sub Total · 9,709.50 LYD · Unpaid · 342844").
- Multi-navigator route tracking: routeStack surfaces every observed navigator's top route so tab apps (AutoRoute, nested Navigators) can be told apart.
- HTTP capture: get_network returns method / url / status / duration per exchange, drained incrementally via a since cursor.
- Log capture: get_logs tees debugPrint, FlutterError.onError, and PlatformDispatcher.onError into a 500-entry ring buffer.
- No-op in kReleaseMode; the probe never ships to your users.