agent_wires_probe 0.1.4
Runtime probe that exposes a Flutter app's widget tree to QA agents over the Dart VM service. Pairs with agent_wires_mcp.
Changelog #
0.1.4 #
Perception-accuracy fixes from real LLM-agent driving sessions. No API
changes; AgentWiresProbe.install() is unchanged.
Snapshot — only report what the user can actually see and touch #
- Transparent text-editing overlays no longer drop the page. When a
  text field is focused, Flutter inserts text-editing / selection /
  autocomplete overlay entries wrapped by InheritedTheme.captureAll
  (_CaptureAll). They fill the viewport geometrically but paint nothing,
  so the containment cover test treated them as a covering page and
  occluded the real route beneath: every text-field screen
  (DomainRegisterRoute, etc.) came back with an empty snapshot. The cover
  test now ignores transient overlays (a _CaptureAll with no
  _ModalScope / ModalBarrier in its subtree); only opaque routes and
  modal barriers occlude.
- Pointer-blocked elements are filtered ("phantom FAB"). An
  expandable FAB keeps its collapsed sub-items mounted and laid out but
  wraps them in IgnorePointer / AbsorbPointer, so a tap can't reach them.
  Elements under an ignoring / absorbing wrapper are no longer surfaced
  as actionable. Opacity is deliberately not used as the signal: a
  transparent widget still receives taps in Flutter, so hiding by opacity
  would diverge from real tap behaviour.
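The cover-test rule described above can be illustrated with a small pure-Dart model. This is only a sketch under stated assumptions: `Box`, `Entry`, `occludes`, and `visibleEntries` are hypothetical names invented for illustration, not the probe's real types, and the single `hasModalScopeOrBarrier` flag stands in for whatever subtree inspection the probe actually performs.

```dart
// Hypothetical geometry model; the probe's internal types are not shown in
// this changelog.
class Box {
  const Box(this.left, this.top, this.right, this.bottom);
  final double left, top, right, bottom;

  /// True when [other] lies entirely inside this box.
  bool contains(Box other) =>
      left <= other.left &&
      top <= other.top &&
      right >= other.right &&
      bottom >= other.bottom;
}

// Hypothetical description of one overlay entry in a navigator's theater.
class Entry {
  const Entry(this.name, this.box, {this.hasModalScopeOrBarrier = false});
  final String name;
  final Box box;

  /// Assumed stand-in for "has a _ModalScope / ModalBarrier in its subtree".
  final bool hasModalScopeOrBarrier;
}

/// An entry occludes what is beneath it only if it covers the theater's
/// viewport AND is a real route or modal barrier, not a transient overlay.
bool occludes(Entry entry, Box viewport) =>
    entry.hasModalScopeOrBarrier && entry.box.contains(viewport);

/// Keeps the topmost occluding entry and everything above it.
/// [entries] are ordered bottom-to-top.
List<Entry> visibleEntries(List<Entry> entries, Box viewport) {
  final top = entries.lastIndexWhere((e) => occludes(e, viewport));
  return top <= 0 ? entries : entries.sublist(top);
}

void main() {
  const viewport = Box(0, 0, 400, 800);
  final entries = [
    const Entry('HomeRoute', viewport, hasModalScopeOrBarrier: true),
    const Entry('DomainRegisterRoute', viewport, hasModalScopeOrBarrier: true),
    const Entry('text-selection overlay', viewport), // transient, paints nothing
  ];
  // prints [DomainRegisterRoute, text-selection overlay]
  print(visibleEntries(entries, viewport).map((e) => e.name).toList());
}
```

In this model the viewport-sized selection overlay no longer hides DomainRegisterRoute, because it fails the "real route or barrier" half of the test even though it passes the containment half.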
0.1.3 #
Post-launch iteration driven by real LLM-agent driving sessions. No
breaking API changes; the public AgentWiresProbe.install() /
AgentWiresProbe.routeTracker.createObserver() shape is unchanged.
Snapshot — route-aware and state-aware #
- Route scoping (the big one). Pushed routes don't unmount what's
  beneath them; Flutter keeps lower pages alive for the back-swipe
  parallax. The walker previously enumerated every layout-active
  element, so a snapshot of DomainDetailsRoute would include the home
  screen's FAB, stats cards, and bottom nav (shifted left ~146px by
  parallax but still in the tree). Now per-Navigator overlay
  introspection drops any entry buried under a topmost viewport-covering
  page. Covering is judged by containment of the owning theater's box
  (not the global window), so device_preview and nested navigators are
  scoped against their real viewport.
- Widget state in the snapshot. Each element gains an optional state
  field: "on" / "off" for Switch / SwitchListTile / CupertinoSwitch;
  "checked" / "unchecked" / "indeterminate" for Checkbox /
  CheckboxListTile; "selected" / "unselected" for Radio / RadioListTile;
  the value for Slider / RangeSlider. A post-pass hoists state from a
  contained child to the smallest containing labelled parent, so a
  labelled SwitchListTile reports state: "on" directly without the agent
  inspecting the inner Switch.
- Compact by default. unresolved[] is now omitted and replaced by
  unresolved_count unless the MCP caller passes include_unresolved: true,
  which cuts snapshot size by ~60% on real screens.
- Diagnostics. Snapshot output gains an optional _debug.occlusion block:
  {theaters_found, entries_processed, entries_dropped, viewport_found, theaters: [...]}.
  Lets the agent verify the route-scoping pass ran and (when it didn't
  drop anything) see per-theater entry descriptions so the failure mode
  is debuggable without instrumenting the probe.
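The state-hoisting post-pass can be pictured with the following pure-Dart sketch. It is an illustration under assumptions, not the probe's code: `Bounds`, `Node`, and `hoistState` are hypothetical names, and "smallest containing labelled parent" is interpreted here as the labelled, stateless node with the smallest area whose bounds contain the child.

```dart
// Hypothetical element model for illustration only.
class Bounds {
  const Bounds(this.left, this.top, this.right, this.bottom);
  final double left, top, right, bottom;

  double get area => (right - left) * (bottom - top);

  bool contains(Bounds o) =>
      left <= o.left && top <= o.top && right >= o.right && bottom >= o.bottom;
}

class Node {
  Node(this.type, this.bounds, {this.label, this.state});
  final String type;
  final Bounds bounds;
  final String? label;
  String? state;
}

/// Copies each child's state to the smallest labelled node that contains it.
/// Assignments are collected first so a hoisted state is not hoisted again.
void hoistState(List<Node> nodes) {
  final hoisted = <Node, String>{};
  for (final child in nodes) {
    final state = child.state;
    if (state == null) continue;
    Node? best;
    for (final parent in nodes) {
      if (identical(parent, child)) continue;
      if (parent.label == null || parent.state != null) continue;
      if (!parent.bounds.contains(child.bounds)) continue;
      if (best == null || parent.bounds.area < best.bounds.area) best = parent;
    }
    if (best != null) hoisted[best] = state;
  }
  hoisted.forEach((node, state) => node.state = state);
}

void main() {
  final page = Node('Scaffold', const Bounds(0, 0, 400, 800), label: 'Settings');
  final tile =
      Node('SwitchListTile', const Bounds(0, 0, 400, 56), label: 'Dark mode');
  final toggle = Node('Switch', const Bounds(340, 12, 390, 44), state: 'on');

  hoistState([page, tile, toggle]);
  print(tile.state); // on
  print(page.state); // null
}
```

The tile wins over the page because it is the smaller of the two labelled containers, which matches the changelog's example of a labelled SwitchListTile reporting state: "on" directly.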
inspect — drill into custom widgets #
- Descendants subtree. ext.qa.inspect now returns a descendants[] array
  with {depth, widget_type, visible_text?} for every element up to
  descendant_depth (default 3, capped at 500 entries). Lets the agent
  answer "what's inside this Card?" without re-snapshotting.
- CustomPaint metadata. Descendant entries for CustomPaint surface
  painter (the painter's runtime type, e.g.
  "PrecisionReactiveSliderPainter"), foreground_painter if set, and
  rendered size. Tells the agent "this region is drawn pixels, not
  addressable widgets"; the integrator can wrap with
  Semantics(button: true, label: '…') or attach a Key to make it
  targetable. README has an integrator note covering this.
Sync — diagnose stuck idles #
- ext.qa.wait_for_idle returns a structured IdleStatus JSON:
  {idle, blocked_by, in_flight_http, has_scheduled_frame, in_transient_callback}.
  On timeout, blocked_by lists what's still active (scheduled_frame,
  transient_callback, in_flight_http:N) so the agent knows whether to
  wait, retry, or proceed.
- New ignore_animations: true param drops the scheduled-frame and
  transient-callback checks; waits only for HTTP. Use on screens with
  continuous spring animations (custom sliders, looping animations) that
  never visually settle.
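As a rough illustration of how an agent-side client might act on an IdleStatus payload, here is a pure-Dart sketch. The field names come from the entry above; the value types (a boolean `idle`, a list of strings in `blocked_by`) are assumptions, and `NextStep` and `decide` are hypothetical helpers rather than part of the probe or of agent_wires_mcp.

```dart
import 'dart:convert';

// Hypothetical decision outcomes for an agent driving the app.
enum NextStep { proceed, waitForHttp, retryIgnoringAnimations }

/// Picks a next step from a decoded IdleStatus map.
/// Assumes `idle` is a bool and `blocked_by` is a list of strings such as
/// "scheduled_frame", "transient_callback", or "in_flight_http:2".
NextStep decide(Map<String, dynamic> status) {
  if (status['idle'] == true) return NextStep.proceed;

  final raw = status['blocked_by'];
  final blockedBy = raw is List ? raw.cast<String>() : const <String>[];

  // Outstanding HTTP usually means real work is pending: keep waiting.
  if (blockedBy.any((b) => b.startsWith('in_flight_http'))) {
    return NextStep.waitForHttp;
  }
  // Only frames or transient callbacks remain, which suggests a continuous
  // animation; retrying with ignore_animations: true is the likely fix.
  return NextStep.retryIgnoringAnimations;
}

void main() {
  const timedOut = '{"idle": false, '
      '"blocked_by": ["scheduled_frame", "transient_callback"], '
      '"in_flight_http": 0, '
      '"has_scheduled_frame": true, '
      '"in_transient_callback": true}';
  final status = jsonDecode(timedOut) as Map<String, dynamic>;
  print(decide(status)); // NextStep.retryIgnoringAnimations
}
```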
route_stack — works without integrator wiring #
- Multi-navigator route tracking now reads Navigator.pages from the live
  Element tree at snapshot time. AutoRoute / GoRouter / any page-based
  router gets full route_stack coverage with zero observer wiring;
  createObserver() becomes optional (still available as a fallback for
  imperative Navigator.pushNamed apps).
- Returns the full back-stack of each navigator, deepest-first:
  ["UserProfileRoute", "AccountRoute", "MainRoute"] instead of just the
  leaf.
screenshot — first-frame race fixed #
- If no RepaintBoundary is found on the first call (cold start, no frame
  has rasterized yet), the probe awaits one endOfFrame and retries before
  failing. Eliminates the spurious "no RepaintBoundary found" the agent
  used to hit on the first screenshot of a session.
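The retry-once pattern described above can be sketched in pure Dart as follows. This is not the probe's implementation: `findWithOneRetry` is a hypothetical helper, and the frame wait is injected as a callback so the example runs without Flutter. In a Flutter app the callback would presumably be `() => SchedulerBinding.instance.endOfFrame`.

```dart
/// Calls [find]; if it returns null, awaits [waitForFrame] once and tries
/// again. Throws only if the second attempt also finds nothing.
Future<T> findWithOneRetry<T extends Object>(
  T? Function() find,
  Future<void> Function() waitForFrame,
) async {
  final first = find();
  if (first != null) return first;

  await waitForFrame(); // give the engine one frame to rasterize

  final second = find();
  if (second != null) return second;
  throw StateError('no RepaintBoundary found');
}

Future<void> main() async {
  var frames = 0;

  // Simulates a cold start: nothing is found until one frame has completed.
  final result = await findWithOneRetry<String>(
    () => frames == 0 ? null : 'boundary',
    () async {
      frames++;
    },
  );
  print(result); // boundary
}
```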
0.1.0 #
Initial public release.
- VM-service extensions for an LLM agent to introspect and drive a running
  Flutter app: ext.qa.snapshot, inspect, screenshot, tap, long_press,
  swipe, enter_text, clear_text, scroll, press_back, wait_for_idle,
  wait_for_route, wait_for_element, get_logs, get_network, ping.
- Denoised semantic tree: a typical 800-node tree collapses to ~15–30
  agent-targetable elements via a three-pass classify / dedup / label
  pipeline.
- Multi-Text label inference: invoice cards, list rows, and other
  multi-text containers get a label that distinguishes one card from
  the next (e.g. "Sub Total · 9,709.50 LYD · Unpaid · 342844").
- Multi-navigator route tracking: routeStack surfaces every observed
  navigator's top route so tab apps (AutoRoute, nested Navigators) can be
  told apart.
- HTTP capture: get_network returns method / url / status / duration per
  exchange, drained incrementally via a since cursor.
- Log capture: get_logs tees debugPrint, FlutterError.onError, and
  PlatformDispatcher.onError into a 500-entry ring buffer.
- No-op in kReleaseMode; the probe never ships to your users.
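To make the "ring buffer" and "since cursor" ideas concrete, here is a minimal pure-Dart sketch. It is an assumption-laden illustration: `RingLog`, `LogEntry`, and the integer sequence-number cursor are hypothetical, and the changelog does not say whether the real cursor is a sequence number, a timestamp, or something else.

```dart
import 'dart:collection';

// Hypothetical log record with a monotonically increasing sequence number.
class LogEntry {
  const LogEntry(this.seq, this.message);
  final int seq;
  final String message;
}

/// Bounded buffer that evicts the oldest entry once [capacity] is exceeded.
class RingLog {
  RingLog({this.capacity = 500});
  final int capacity;
  final Queue<LogEntry> _entries = Queue<LogEntry>();
  int _nextSeq = 1;

  void add(String message) {
    _entries.addLast(LogEntry(_nextSeq++, message));
    if (_entries.length > capacity) _entries.removeFirst();
  }

  /// Returns entries newer than [since], so a caller can poll incrementally
  /// by passing the last sequence number it has already seen.
  List<LogEntry> drain({int since = 0}) =>
      _entries.where((e) => e.seq > since).toList();
}

void main() {
  final log = RingLog(capacity: 3); // small capacity to show eviction
  for (var i = 1; i <= 5; i++) {
    log.add('line $i');
  }
  // prints [line 3, line 4, line 5]
  print(log.drain().map((e) => e.message).toList());
  // prints [5]
  print(log.drain(since: 4).map((e) => e.seq).toList());
}
```

A bounded buffer keeps memory flat during long sessions, and a cursor lets the agent fetch only what is new on each poll instead of re-reading the whole buffer.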