fluttersdk_dusk 0.0.1
Flutter E2E driver for LLM agents and CI. 32 CLI commands and 31 MCP tools drive a running app over VM Service extensions; no flutter_test harness needed.
Changelog
All notable changes to this project will be documented in this file.
This project follows Semantic Versioning 2.0.0. Entries follow the Keep a Changelog shape.
[Unreleased]
No unreleased changes yet.
0.0.1 - 2026-05-23
Initial public release of fluttersdk_dusk, an E2E driver for Flutter apps. Snapshot, tap, type, drag, scroll, screenshot, wait, and find run over VM Service extensions (ext.dusk.*). Framework-agnostic (vanilla Flutter friendly); Magic / Wind integrations ship inside those packages via the DuskPlugin.enrichers extension point. Plugin of fluttersdk_artisan ^0.0.5 (hosted-only; no path overrides). Wind diagnostics flow through the neutral fluttersdk_wind_diagnostics_contracts bridge (WindDebugRegistry) rather than through the enricher list, so wind alpha-10 needs no dusk-side install wiring.
Added
- 32 CLI commands via DuskArtisanProvider.commands() (live count from ls lib/src/commands/*_command.dart): dusk:install, dusk:snap, dusk:tap, dusk:screenshot, dusk:type, dusk:scroll, dusk:wait, dusk:wait_for_network_idle, dusk:hover, dusk:drag, dusk:modal, dusk:doctor, dusk:navigate, dusk:navigate_back, dusk:get_routes, dusk:press_key, dusk:select_option, dusk:close_app, dusk:find, dusk:focus, dusk:blur, dusk:clear, dusk:right_click, dusk:dblclick, dusk:triple_click, dusk:set_checkbox, dusk:console, dusk:exceptions, dusk:observe, dusk:resize, dusk:device, dusk:hot_reload_and_snap. dusk:install is the one-shot bootstrap; the rest wrap a matching VM Service extension or substrate-routed action.
- 31 MCP tool descriptors via DuskArtisanProvider.mcpTools() (live count from grep "name: 'dusk_" lib/src/dusk_artisan_provider.dart | sort -u): dusk_blur, dusk_clear, dusk_close_app, dusk_console, dusk_dblclick, dusk_device_profile, dusk_dismiss_modals, dusk_drag, dusk_evaluate, dusk_exceptions, dusk_find, dusk_focus, dusk_get_routes, dusk_hot_reload_and_snap, dusk_hover, dusk_navigate, dusk_navigate_back, dusk_observe, dusk_press_key, dusk_resize_viewport, dusk_right_click, dusk_screenshot, dusk_scroll, dusk_select_option, dusk_set_checkbox, dusk_snap, dusk_tap, dusk_triple_click, dusk_type, dusk_wait_for, dusk_wait_for_network_idle. All McpToolDescriptor const instances with Claude Code canonical descriptions (imperative opener + context paragraph + Usage: bullets).
- 28 ext.dusk.* VM Service extensions + 3 artisan:dusk:* substrate-routed tools (live count from grep "extensionMethod:" lib/src/dusk_artisan_provider.dart | sort -u). Direct ext.dusk.*: snap, screenshot, tap, hover, drag, type, scroll, wait_for, wait_for_network_idle, dismiss_modals, press_key, select_option, navigate, navigate_back, get_routes, evaluate, close_app, find, focus, blur, clear, right_click, dblclick, triple_click, set_checkbox, console, exceptions, observe. Substrate-routed via artisan:dusk:*: resize, device, hot_reload_and_snap (in-isolate hot-reload deadlock avoidance). All ext.dusk.* extensions register through registerExtensionIdempotent for hot-restart safety.
- DuskPlugin.install(): idempotent host-side install entry. Wraps the app widget root in a RepaintBoundary (no GlobalKey) so ext.dusk.screenshot can find it via render-tree walk. Hot-restart safe via static _installCount guard. Honors DUSK_DISABLE env var (1 / true / yes, case-insensitive) as kill switch.
- DuskSnapshotEnricher typedef: snapshot-enricher extension point. String? Function(Element, RefRegistry). Magic ships its enrichers via MagicDuskIntegration. Wind no longer ships an enricher as of wind alpha-10: wind state is read through the neutral fluttersdk_wind_diagnostics_contracts.WindDebugRegistry.current?.resolve(element) bridge inside ext_snapshot.dart and ext_observe.dart ahead of the enricher loop, so the 6 core wind fields (breakpoint, brightness, platform, states, bgColor, textColor) survive without an enricher registration. Contract: synchronous, stateless w.r.t. call ordering, may return null to skip, multi-line fragments split + indented under the ref entry by the dispatcher.
- fluttersdk_wind_diagnostics_contracts integration: new production dep fluttersdk_wind_diagnostics_contracts: ^1.0.0. ext.dusk.snap and ext.dusk.observe read wind state via WindDebugRegistry.current?.resolve(element) in addition to the existing enricher list dispatch; the wind: block (filtered by _kDefaultWindKeys in defaults mode) is emitted directly by dusk.
Magic enricher contract UNCHANGED.
- RefRegistry: stable e<N> (snapshot-frozen) and q<N> (re-resolvable Playwright-Locator) token systems. e<N> refs are minted at dusk_snap time and consumed by every action tool; q<N> refs are minted by dusk:find and re-execute their stored predicates against the live tree on every action call (resilient to widget rebuild + route push).
- Actionability gate (lib/src/utils/actionability_gate.dart): tap / hover / drag / type resolve through a single gate that verifies the target's enabled flag (Tristate.isFalse fails; Tristate.none and Tristate.isTrue pass), zero-area rect, and viewport overlap BEFORE synthesising the pointer / key event. Failures surface ServiceExtensionResponse.error(extensionError, "Widget ref=$ref is not actionable: $reason") with $reason ∈ {"not enabled", "zero rect", "off-viewport (rect=..., viewport=...)"}. scroll, select_option, and press_key intentionally skip the gate (see Known gaps).
- dusk:install one-shot bootstrap: minimal install. Edits the consumer's lib/main.dart only (no bin/artisan.dart or lib/app/ scaffolding for vanilla Flutter apps). Detects Magic-stack apps via the await Magic.init( anchor and injects DuskPlugin.install() BEFORE Magic.init (then MagicDuskIntegration.install() AFTER), falling back to the runApp( anchor for vanilla Flutter apps. Wind alpha-10 needs no install-time wiring from dusk: the consumer calls Wind.installDebugResolver() directly, and dusk reads wind state through WindDebugRegistry at snap time. Vanilla consumers access dusk via dart run fluttersdk_dusk <cmd>. Idempotent; safe to re-run.
- Flutter-free CLI wrapper: bin/fluttersdk_dusk.dart + executables: fluttersdk_dusk pubspec entry. dart run fluttersdk_dusk <cmd> proxies the full artisan CLI surface and exposes the dusk commands without dragging dart:ui into pure-Dart contexts.
- install.yaml plugin manifest: V1 manifest at the package root makes plugin:install fluttersdk_dusk work end-to-end via the artisan PluginInstaller.
- lib/cli.dart codegen barrel: Flutter-free typedef alias FluttersdkDuskArtisanProvider. Consumed by consumer-side lib/app/_plugins.g.dart auto-discovery without pulling Flutter symbols into the pure-Dart artisan codegen path.
- dusk:find Playwright-Locator pattern: mints q<N> query handles backed by text / semanticsLabel / key predicates. Unlike e<N> refs (frozen at snap time), q-handles re-execute the Semantics + Element walk on every action call, so they survive widget rebuilds and route pushes as long as the predicates still match. Stale match returns an explicit stale-handle error; the agent re-finds, never silently retries.
- dusk:doctor: diagnostic command that checks ~/.artisan/state.json Chrome PID staleness, DUSK_DISABLE env-var value, registered enricher count, Semantics-tree-forced flag, and Magic-init wiring in one pass. Emits a categorised report (OK / WARN / ERROR per check); exit code 0 when every check passes.
- Chrome reaper (lib/src/utils/chrome_reaper.dart): graceful Chromium subprocess teardown between dusk:* runs so leftover headless tabs no longer accumulate. Detects orphans by VM Service URI, exits cleanly via SystemNavigator.pop first, falls back to SIGTERM.
- Example apps: example/ (vanilla Flutter, 7 scenario screens: home menu + buttons / inputs / scroll / modals / drawer / forms) for live e2e validation against the 31 MCP tools + 32 CLI commands.
- CDP driver (lib/src/cdp/): CdpClient, DevicePresets (8 curated device presets with explicit DPR values: iphone-x, iphone-13, iphone-15-pro, pixel-5, pixel-8, ipad-pro-12.9, desktop-1440, desktop-1920), ChromeFinder. Minimal in-house Chrome DevTools Protocol client (~110 LoC, dart:io WebSocket + dart:convert; no pub.dev deps).
- dusk:resize CLI (lib/src/commands/dusk_resize_command.dart): dart run fluttersdk_artisan dusk:resize --width=375 --height=812 [--dpr=3] [--mobile] [--touch]. Reads cdpPort from state.json, opens CdpClient, sends Emulation.setDeviceMetricsOverride (+ optional setTouchEmulationEnabled). --reset sends 3-call clear chain. Fails loudly when CDP not enabled.
- dusk:device CLI (lib/src/commands/dusk_device_command.dart): dart run fluttersdk_artisan dusk:device --preset=iphone-x. Applies the full emulation chain (metrics + conditional touch + UA) from the curated preset database. --list prints all 8 preset entries; --reset mirrors dusk:resize --reset.
- 2 CDP MCP tools (dusk_resize_viewport + dusk_device_profile): both dispatch via the existing artisan: substrate prefix (no mcp_server.dart changes).
- FakeCdpServer test harness (test/src/cdp/fake_cdp_server.dart): dart:io HttpServer + WebSocketTransformer.upgrade on an ephemeral loopback port. Configurable failure modes (failOnJsonVersion, dropWebSocket, delayResponseMs). Used by cdp_client_test.dart, dusk_resize_command_test.dart, dusk_device_command_test.dart.
- Integration smoke test (test/integration/cdp_smoke_test.dart): tagged @Skip so default flutter test skips it; run manually via flutter test test/integration --tags integration to validate dart-lang/webdev#2642 regression status.
- dusk:install magic-detect branch: now injects import 'package:magic/dusk_integration.dart'; instead of import 'package:magic/magic.dart';. Pairs with magic 1.0.0-alpha.15 which extracts the integration class into a dedicated sub-barrel.
- 6-step actionability gate (Wave 3): Step 0 defunct preflight + Stable + Receives-Events gates round out ensureActionable (now async). Total preconditions in evaluation order: defunct (preflight), enabled, zero-rect, off-viewport, stable (rect unchanged across 2 consecutive frames; Playwright auto-waiting), receives-events (hit-test confirms ref is the front-most pointer target). Opt-out via checkStable=false / checkReceivesEvents=false (both default true). Failure-reason substrings extended: "defunct", "not stable", "obscured by" join the existing agent branch surface.
- Snapshot-in-action-response (Wave 3, Playwright setIncludeSnapshot pattern): 8 action handlers (tap, hover, drag, type, press_key, scroll, navigate, navigate_back) accept includeSnapshot=true and append the post-action snapshot YAML to the success response. The agent no longer needs a mandatory follow-up dusk_snap call. duskSnapBuild widened from @visibleForTesting to public (legitimate production reuse). press_key handler endOfFrame omission fixed in passing.
- Structured error envelope + fuzzy-match suggestions (Wave 3): lib/src/utils/error_envelope.dart with DuskErrorEnvelope carrying type + widget_path + suggestions[]. 10 type values: timeout, not_found, obscured, disabled, stale, zero_rect, off_viewport, not_stable, missing_param, unexpected. 6 factories. Dual-write into errorDetail (JSON envelope alongside the free-form message) preserves backward compat for substring-matching agents. Levenshtein with prefix-bonus drives the suggestions list for not_found. RefRegistry.activeRefs() added to support candidate collection.
- ext.dusk.wait_for_network_idle (Wave 3): polls TelescopeStore.pendingHttpCount until the count hits zero for a configurable idleMs window. Params timeoutMs (5000), idleMs (500), pollIntervalMs (200). Function-pointer indirection (pendingHttpCountReader exported from dusk.dart) keeps dusk free of a hard telescope dependency; magic-side wires the real reader at install time. New CLI command dusk:wait_for_network_idle.
- 4 utility tools (Wave 3): dusk_console (telescope log reader, function-pointer indirection via recentLogsReader), dusk_exceptions (telescope exception reader via recentExceptionsReader), dusk_dblclick (two synthesised taps with 100ms inter-tap delay, shared 6-step actionability gate + snapshot embed), dusk_set_checkbox (idempotent Checkbox / Switch toggle via element walk; no-op when current value matches target).
- ext.dusk.observe (Wave 4): Stagehand-style observe-once-act-many pattern. Walks every active PipelineOwner semantics tree, filters interactive nodes (buttons / textfields / links / checkboxes / dropdowns via _roleFor / _isInteractive), mints a re-resolvable q<N> ref per candidate (Playwright Locator pattern; never e<N>), and returns a structured JSON list {candidates: [...], count: N}. Each candidate carries ref, role, label, value, bounds, isEnabled, isVisible, plus enricher-projected fields. Params: intent (caller hint, echoed only), limit (default 50), roles (comma-separated filter), includeEnrichers.
- dusk:hot_reload_and_snap (Wave 4): CLI-side orchestration via VmServiceClient.reloadSources (in-isolate handler cannot reload its own isolate; deadlock avoidance). Sequence: reload -> wait -> snap -> screenshot -> exceptions -> bundle. Success envelope {reloaded, durationMs, snapshot, screenshot, recentExceptions}; compile-error envelope skips snap/screenshot but still gathers exceptions. Screenshot failure surfaces as partial-result screenshotError rather than aborting the round-trip. MCP descriptor uses the artisan: substrate routing prefix (extensionMethod: 'artisan:dusk:hot_reload_and_snap').
- dusk:install is now self-sufficient (Wave 5 pre-publish). Phase 1 patches lib/main.dart (unchanged contract). Phase 2 chains dart run fluttersdk_artisan install (scaffolds bin/dispatcher.dart + ./bin/fsa AOT wrapper) followed by dart run fluttersdk_artisan plugin:install fluttersdk_dusk (registers DuskArtisanProvider; artisan 0.0.5 auto-purges the AOT bundle cache).
Both Phase 2 sub-process calls are file-marker-guarded (bin/dispatcher.dart, .artisan/installed/fluttersdk_dusk.json) so re-runs are fast no-ops; failures are swallowed with a warning so Phase 1's lib/main.dart inject remains the guaranteed contract regardless of the consumer's dart PATH / sandbox state. Net effect: a fresh consumer needs only flutter pub add fluttersdk_dusk + dart run fluttersdk_dusk dusk:install to reach a working ./bin/fsa list + MCP tools/list surface.
- ext.dusk.find substring predicate + dusk:find --contains=<substring> CLI flag (Wave 5; pre-publish E2E pass). Existing --text=<exact> semantics unchanged; agents now have a fallback for brittle or dynamic labels. DuskQuery.containsText field is the carrier; matching walks Semantics labels first, then Text.data, mirroring the text path.
- dusk:drag --fromRef=<eN> --toRef=<eN> flag aliases parallel to the --ref shape used by dusk:tap / dusk:hover (Wave 5). Legacy --startRef / --endRef flags retained for back-compat.
- dusk:scroll --direction=<up|down|left|right> --pixels=<N> convenience flags that translate to signed --dy / --dx (Wave 5). Explicit --dy / --dx still win when both forms are supplied.
- Surface deltas (live counts): CLI commands: 32 (lib/src/commands/*_command.dart); MCP tool descriptors: 31 (dusk_artisan_provider.dart); VM Service extensions: 28 ext.dusk.* + 3 artisan:dusk:* substrate-routed.
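The 6-step gate order listed above can be illustrated with a small pure-Dart model. This is a sketch only, not the code in actionability_gate.dart: the Target class, its fields, and firstFailure are hypothetical names, the real gate is async and works on live Flutter elements, and the reason strings here simply reuse the substrings the changelog documents.

```dart
import 'dart:math';

/// Hypothetical snapshot of what a gate needs to know about one target.
class Target {
  Target({
    required this.mounted,
    required this.enabled,
    required this.rect,
    required this.previousRect,
    required this.frontMost,
  });
  final bool mounted;
  final bool? enabled; // null models "no enabled flag", which passes
  final Rectangle<double> rect;
  final Rectangle<double> previousRect; // rect one frame earlier
  final bool frontMost; // outcome of a hit test on the target
}

/// Returns null when the target is actionable, else the first failing reason,
/// checked in the documented order.
String? firstFailure(
  Target t,
  Rectangle<double> viewport, {
  bool checkStable = true,
  bool checkReceivesEvents = true,
}) {
  if (!t.mounted) return 'defunct';
  if (t.enabled == false) return 'not enabled';
  if (t.rect.width <= 0 || t.rect.height <= 0) return 'zero rect';
  if (!t.rect.intersects(viewport)) return 'off-viewport';
  if (checkStable && t.rect != t.previousRect) return 'not stable';
  if (checkReceivesEvents && !t.frontMost) return 'obscured by other widget';
  return null;
}

void main() {
  final viewport = Rectangle<double>(0, 0, 800, 600);
  final button = Rectangle<double>(10, 10, 100, 40);
  final covered = Target(
    mounted: true,
    enabled: true,
    rect: button,
    previousRect: button,
    frontMost: false,
  );
  print(firstFailure(covered, viewport)); // obscured by other widget
  print(firstFailure(covered, viewport, checkReceivesEvents: false)); // null
}
```

Returning the first failure keeps each reason string unambiguous for agents that branch on substrings.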
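The wait_for_network_idle entry above describes a poll loop with an idle window. A minimal sketch of that idea follows, using the documented defaults (timeoutMs 5000, idleMs 500, pollIntervalMs 200). The PendingCountReader typedef and waitForNetworkIdle function are hypothetical; the changelog does not show the signature of pendingHttpCountReader.

```dart
import 'dart:async';

/// Hypothetical reader type standing in for the function-pointer indirection.
typedef PendingCountReader = int Function();

/// Polls [readPending] until it has stayed at zero for [idleMs], or gives up
/// after [timeoutMs]. Returns true when the idle window was reached.
Future<bool> waitForNetworkIdle(
  PendingCountReader readPending, {
  int timeoutMs = 5000,
  int idleMs = 500,
  int pollIntervalMs = 200,
}) async {
  final total = Stopwatch()..start();
  final idle = Stopwatch();
  while (total.elapsedMilliseconds < timeoutMs) {
    if (readPending() == 0) {
      if (!idle.isRunning) idle.start();
      if (idle.elapsedMilliseconds >= idleMs) return true;
    } else {
      // Traffic resumed: restart the idle window from scratch.
      idle
        ..stop()
        ..reset();
    }
    await Future<void>.delayed(Duration(milliseconds: pollIntervalMs));
  }
  return false;
}

Future<void> main() async {
  var pending = 2;
  // Simulate two in-flight requests completing shortly after start.
  Timer(const Duration(milliseconds: 300), () => pending = 0);
  final reachedIdle = await waitForNetworkIdle(() => pending);
  print(reachedIdle ? 'idle' : 'timed out');
}
```

Passing the reader as a plain function is what lets the caller avoid a compile-time dependency on whatever store tracks the requests.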
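The not_found suggestions above are described as Levenshtein with a prefix bonus, but the changelog does not give the weighting. The sketch below is one plausible reading: the bonus of 2 and the function names levenshtein and suggest are assumptions for illustration, not the contents of error_envelope.dart.

```dart
import 'dart:math';

/// Classic two-row Levenshtein edit distance.
int levenshtein(String a, String b) {
  var prev = List<int>.generate(b.length + 1, (j) => j);
  for (var i = 1; i <= a.length; i++) {
    final curr = List<int>.filled(b.length + 1, 0);
    curr[0] = i;
    for (var j = 1; j <= b.length; j++) {
      final cost = a[i - 1] == b[j - 1] ? 0 : 1;
      curr[j] = min(min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

/// Ranks [candidates] by edit distance to [query], favouring shared prefixes.
List<String> suggest(String query, Iterable<String> candidates,
    {int limit = 3}) {
  final q = query.toLowerCase();
  int score(String candidate) {
    final c = candidate.toLowerCase();
    // Assumed weighting: a shared prefix is worth two edits.
    final bonus = (c.startsWith(q) || q.startsWith(c)) ? 2 : 0;
    return levenshtein(q, c) - bonus;
  }

  final ranked = candidates.toList()
    ..sort((x, y) => score(x).compareTo(score(y)));
  return ranked.take(limit).toList();
}

void main() {
  final labels = ['Submit', 'Cancel', 'Subscribe'];
  print(suggest('Sumbit', labels).first); // Submit
}
```

In practice the candidate labels would come from whatever refs are live, which is the role the changelog gives RefRegistry.activeRefs().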
Fixed (pre-publish macOS + web E2E pass, Wave 5)
- dusk_resize_viewport MCP arg parsing (GAP I): handler cast ctx.input.option('width') as String? which failed when MCP tools/call delivers {"width":390} as a native JSON int rather than a stringified arg. Resize command now defensively reads int / double / bool from either type via _readInt / _readDouble / _readBool helpers. CLI invocations still work unchanged (ArgParser-emitted strings).
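The defensive read described above can be sketched as follows. The readInt function is a hypothetical stand-in for the private _readInt helper, whose real body is not shown in this changelog; the point is only that one accessor accepts both a native JSON number and a stringified CLI arg.

```dart
import 'dart:convert';

/// Hypothetical stand-in for _readInt: accepts a native JSON number or a
/// stringified arg, and returns null for anything else.
int? readInt(Object? value) {
  if (value is int) return value;
  if (value is double) return value.toInt();
  if (value is String) return int.tryParse(value.trim());
  return null;
}

void main() {
  // MCP tools/call delivers a native JSON int.
  final fromMcp = jsonDecode('{"width":390}') as Map<String, dynamic>;
  // A CLI arg parser delivers a string.
  final fromCli = <String, Object?>{'width': '390'};

  print(readInt(fromMcp['width'])); // 390
  print(readInt(fromCli['width'])); // 390
  print(readInt(fromCli['height'])); // null
}
```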
Fixed
- ext.dusk.focus on TextField + EditableText (GAP C): handler walked UP from the snap-captured Semantics element looking for a Focus ancestor; for TextField the FocusNode sits BELOW the captured element (inside EditableText / FocusableActionDetector). Now falls back to a descendant walk that picks the first EditableText.focusNode or Focus.focusNode it finds. Reproducer: dusk:focus --ref=<textbox-eN> previously returned no Focus ancestor; now returns focused: true.
- ext.dusk.scroll with ref pointing at the Scrollable itself (GAP D): Scrollable.maybeOf(context) walks UP, so passing the ListView's own ref (e.g. from dusk:find --key=my-list) returned null. Handler now resolves in three stages: (1) target element IS a Scrollable, use its state; (2) Scrollable ancestor (legacy); (3) descendant Scrollable walk (when ref is a parent like a Scaffold wrapping a list).
- dusk:press_key --key= case-sensitivity (NIT 5): agents calling --key=TAB or --key=enter hit unknown key even though the supported set covered the intent. Lookup now does a case-insensitive fallback over _kKeyMap.keys when the direct hit misses; canonical PascalCase keys (Tab, Enter, ArrowUp) remain documented.
- dusk:screenshot success message now reports decoded byte count + KB + format, e.g. Wrote 239456 bytes (233.8 KB, jpeg) to ./shot.jpg (NIT 1). Previously the line referenced the base64 character count which misled agents parsing for byte size.
- dusk:screenshot missing-output error now suggests the canonical invocation dusk:screenshot --output=./shot.jpg --format=jpeg (NIT 8).
- README + installation.md document the full 3-step install flow: flutter pub add fluttersdk_dusk + dart run fluttersdk_dusk dusk:install + dart run fluttersdk_artisan install && dart run fluttersdk_artisan plugin:install fluttersdk_dusk (GAP B). Previously the plugin:install step was missing, leaving consumers with ./bin/fsa list showing 0 dusk:* commands. installation.md carries a new ## Register with artisan section explaining the fastcli scaffold + plugin registration.
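The press_key fix above is a direct lookup followed by a case-insensitive fallback. A minimal sketch, assuming a small set of canonical names; canonicalKeys and resolveKey are hypothetical, and the real _kKeyMap contents are not listed in this changelog.

```dart
/// Hypothetical subset of canonical PascalCase key names.
const canonicalKeys = <String>{'Tab', 'Enter', 'ArrowUp'};

/// Returns the canonical key for [requested], or null when unknown.
String? resolveKey(String requested) {
  // Fast path: the caller already used the canonical spelling.
  if (canonicalKeys.contains(requested)) return requested;
  // Fallback: compare case-insensitively against every canonical key.
  final lower = requested.toLowerCase();
  for (final key in canonicalKeys) {
    if (key.toLowerCase() == lower) return key;
  }
  return null;
}

void main() {
  print(resolveKey('Tab')); // Tab
  print(resolveKey('TAB')); // Tab
  print(resolveKey('enter')); // Enter
  print(resolveKey('Hyper')); // null
}
```

Keeping the direct hit first means canonical callers pay no extra cost, and the fallback only runs on a miss.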
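The screenshot message fix above hinges on counting decoded bytes rather than base64 characters. The sketch below reproduces the documented message shape; screenshotSummary is a hypothetical name, and the payload is a zero-filled buffer sized to match the changelog's example, not a real image.

```dart
import 'dart:convert';

/// Builds the success line from the decoded payload size, not the base64
/// string length.
String screenshotSummary(String base64Image, String format, String path) {
  final byteCount = base64Decode(base64Image).length;
  final kb = (byteCount / 1024).toStringAsFixed(1);
  return 'Wrote $byteCount bytes ($kb KB, $format) to $path';
}

void main() {
  // Placeholder payload with the same size as the changelog example.
  final payload = base64Encode(List<int>.filled(239456, 0));
  // The base64 string is roughly a third longer than the decoded bytes.
  print(payload.length > 239456); // true
  print(screenshotSummary(payload, 'jpeg', './shot.jpg'));
  // Wrote 239456 bytes (233.8 KB, jpeg) to ./shot.jpg
}
```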
Test coverage
- 678 tests passing (2026-05-23 pre-publish, flutter test --exclude-tags=integration --timeout=30s). Scope covers handler entry points (params + error paths + happy paths where reachable under flutter_test), 32 CLI commands (name / boot / description / configure / handle / missing-arg validation), DuskArtisanProvider.commands() / mcpTools() shape, DuskPlugin.install() idempotency + DUSK_DISABLE env-var kill switch, RefRegistry mint / lookup / disposeGroup / disposeAll / refsForGroup / registerQuery / lookupQuery, actionability gate (6-step: defunct / enabled / zero-rect / off-viewport / not-stable / obscured), encodeToJpeg PNG-to-JPEG roundtrip + quality boundaries (1, 100, error), modal-route classification, dispatcher contract, CDP client + device presets + resize/device commands, Wave 3 structured error envelopes, Wave 4 observe + hot-reload-and-snap, Wave 5 find-contains substring + descendant focus walk + Scrollable-own-ref scroll. Pre-publish E2E pass against a fresh vanilla Flutter consumer (/tmp/dusk_e2e) verified 27 of 32 CLI commands + MCP initialize + tools/list (41 tools = 31 dusk_* + 10 artisan_*) + tools/call dusk_snap (identical to CLI) + tools/call dusk_evaluate (actual evaluation via artisan 0.0.5 substrate routing).
- Coverage: dusk ~79% line coverage via flutter test --coverage. The remaining gap covers engine-dependent paths that hang the flutter_test fake-clock harness: handler endOfFrame waits, Future.delayed poll loops in wait_for, real toImage() rasterisation in screenshot success paths, and private _defaultProcessStartTime / _parsePsLstart doctor seam defaults. End-to-end coverage for those paths is captured by the example/ playground sweep.
Known gaps
- dusk:doctor runs in pure-Dart CLI context and cannot import package:flutter/rendering.dart without dragging dart:ui (breaks dart run invocation). Two checks degrade gracefully as a result: semanticsEnabledProbe defaults to true (the only ERROR-class check, so doctor cannot ERROR from CLI) and enrichersProbe defaults to 0 (always WARNs on Check 3). The real probes belong to a future VM-Service-attached doctor invocation that calls into the running app.
- scroll, select_option, and press_key intentionally skip the actionability gate: scroll targets the parent scrollable not the ref, select_option dispatches through Material/Cupertino popup machinery that owns its own enabled check, and press_key targets the focused widget rather than a ref. Adding the gate to these three handlers is V1.x candidate work.
- RefRegistry._queries (q-handle store) is monotonically growing within a debug session; only RefRegistry.disposeAll() clears it. Worst-case memory is bounded by debug-session lifetime; per-handle eviction is V1.x candidate work.
Risks accepted
- dart-lang/webdev#2642 live regression: "Hot restart broken when running DWDS without Chrome Debug Port". Integration smoke test (test/integration/cdp_smoke_test.dart) surfaces this if active. Mitigation lives in the user's pinned Flutter SDK; plan does not block on regression resolution.
- Flutter SDK >= 3.30.0 required for --cdp-port (per flutter/flutter#170612). Lower versions get an actionable error from both artisan doctor (advisory) and artisan start --cdp-port (fail-fast).
- GAP E (drag synthesis vs Flutter Draggable): dusk:drag returns success but Flutter's DragTarget.onAcceptWithDetails does not fire on synthesised events in some configurations (Pointer Down + 5x Move + Up sequence may not match Draggable's gesture recognizer expectations on certain platforms / dwell times). Verified via E2E showroom (2026-05-23). Tracked for a 0.0.2 follow-up; agents needing drag should fall back to a pair of dusk:tap + manual scroll for now.
- GAP G (advisory): receives-events check + q-refs on widgets with deep render subtrees: when dusk:find --key=<name-field> resolves to a TextField (or any widget whose findRenderObject() returns a top-level RenderObject), the actionability gate's receives-events check sees a hit-test path topped by a deeper descendant (e.g. RenderEditable) and trips obscured by other widget. The _isDescendantOf walk does not catch this case consistently. Workarounds: (1) use the e<N> ref from a prior dusk:snap rather than a q<N> from --key; (2) pass --no-checkReceivesEvents on the action. Tracked for a 0.0.2 follow-up; deeper investigation needed in the gate's hit-test path traversal.
- GAP H (web): dusk:screenshot + dusk:close_app time out on Chrome (DWDS) at the 10s timeout. macOS desktop works fine. The web path likely needs special handling for RepaintBoundary.toImage() under the DWDS pixel pipeline + the platform-close semantics of SystemNavigator.pop() (which closes the tab, so the response can't return). Workaround for close: rely on ./bin/fsa stop SIGTERM (works). Workaround for screenshot on web: use the browser DevTools snapshot. Tracked for a 0.0.2 follow-up.
Backward compat
DuskSnapshotEnricher typedef, DuskPlugin.install / DuskPlugin.enrichers / DuskPlugin.registerNavigateAdapter, RefRegistry public methods (register, lookup, registerQuery, lookupQuery, disposeAll, resetForTesting), and every MCP tool name / ext.dusk.* extension name are part of the public 0.0.1 contract. Future releases keep these stable across the 0.x line; any change requires a coordinated bump with magic + wind.