dartrics 0.2.2
Citation-anchored Dart code-quality metrics (CK, Halstead, McCabe, Martin, Cognitive) plus Periphery-style unused public-API detection, shaped for AI refactor loops.
Changelog #
0.2.2 #
Bugfixes #
- `maximum-nesting-level` no longer counts named-argument closures (Widget builders, event handlers) as a nesting level. `ListView.builder(itemBuilder: (...) {})` and `ElevatedButton(onPressed: () {})` were reporting `1` even when no `if` / `for` was involved, directly contradicting the contract that `widget_tree_depth.dart` and `flutter_aware.dart` both document ("a healthy Widget tree produces a nesting score of 0"). Closures still increment when passed positionally (`xs.forEach((x) {})`, `xs.fold(0, (a, x) => …)`); those are higher-order calls, not declarative configuration. Inner `if` / `for` inside a named-argument closure still counts at the right depth.
- The summary table now surfaces snapshot mode and the diff filter so the cache-mode default doesn't render as a regression. Without these rows, the second run with no source changes filtered every violation through `_filterUnused` and rendered as `unused declarations: 0`, indistinguishable from "really nothing fired". Adds `snapshot mode: cache` / `files changed: 0 of 3 (no new findings)` to the md summary, a top-level `snapshot:` block to the AI reporter, a `[snapshot cache: 0 of 3 changed]` tail tag to the console line, and `snapshotMode` + `changedFileCount` fields at the JSON report root. Field additions only: `# dartrics ai-report v1` and the JSON `1.0` header stay valid.
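The nesting fix can be illustrated with a minimal pure-Dart sketch. `configure` is a hypothetical stand-in for a widget-style constructor (it is not part of Flutter or dartrics), and the comments reflect a reading of the note above rather than captured tool output.

```dart
/// Hypothetical stand-in for a widget-style API that takes a named callback.
void configure({required void Function() onPressed}) => onPressed();

void main() {
  final xs = [1, 2, 3];

  // Named-argument closure: declarative configuration, so per the 0.2.2
  // note it no longer adds a nesting level on its own.
  configure(onPressed: () {
    print('pressed');
  });

  // Positional closure: a higher-order call, so it still adds one level.
  xs.forEach((x) {
    print(x);
  });

  // Control flow inside a named-argument closure still counts; only the
  // closure itself is skipped.
  configure(onPressed: () {
    if (xs.isNotEmpty) {
      print('non-empty');
    }
  });
}
```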
0.2.1 #
Bugfix #
`.pubignore`'s `coverage/` pattern (no leading slash) matched at any depth, so the published `0.1.0` and `0.2.0` archives shipped without `lib/src/coverage/coverage_loader.dart` and `lib/src/coverage/lcov_reader.dart` even though both are imported by `lib/src/cli/analyze_command.dart` and `lib/src/metrics/metric_engine.dart`. Anyone who installed from pub.dev hit unresolved-import errors. The pattern is now `/coverage/`, matching `.gitignore`. No source-code changes; reinstall to recover a working package.
0.2.0 #
Public-API unused-code detection — element-resolution mode #
- The CLI's `dartrics analyze` and `dartrics unused` paths now run the public-API reachability analysis over the analyzer's resolved element graph instead of the simple-name reference graph that shipped in 0.1.0. The detector keys reachability on canonical `Element.id`s of project-local declarations, so homonym methods on different classes are independent nodes (calling `Foo.bar()` no longer accidentally keeps `Baz.bar()` alive), prefixed imports keep distinct identities, and SDK / dependency symbols never pull through to project declarations they happen to share a name with.
- Reachability is now tracked at member granularity. The detector reports unused instance methods, fields, getters, setters, and enum values in addition to the top-level kinds, using the same `UnusedKind` enum as before, just with the per-class entries populated. Existing 0.1.0 callers will see new entries with `kind: method | field | enumValue` in the unused list once a class is reachable; pass `--filter class,function,extension,typedef` (or set `unused: { filter: [...] }` in `analysis_options.yaml`) to restore the top-level-only shape.
- New `--filter <kinds>` CLI flag (and matching `unused: { filter: [...] }` YAML key) narrows the report to a subset of declaration kinds. Accepted names: `function`, `method`, `class`, `field`, `typedef`, `enum`, `extension`. `enum` targets individual enum constants; enum types are filtered with `class`. Comma-separate or repeat the flag (`--filter method,field`). Unknown names exit `ExitCode.usage` with a did-you-mean style error.
- Auto-rooting rules added to keep the per-member reports clean:
  - Members marked `@override` are rooted (covers interface / superclass overrides without us walking the supertype hierarchy).
  - Object dunder names (`toString`, `hashCode`, `==`, `noSuchMethod`, `runtimeType`) are rooted; the language runtime calls them, not user source.
  - When a class carries any keep-alive annotation (`@JsonSerializable`, `@reflectiveTest`, every codegen preset, …) every public member of that class is rooted too; these annotations signal generator / reflective consumers that read members by name.
- New `reflectiveTest` keep-alive preset added to `keep_alive_presets.dart` so `@reflectiveTest` classes from `package:test_reflective_loader` keep their `test_*` members alive.
- `LibraryElement.exportNamespace` now drives the `excludeExported` root set, so re-exported `lib/src/` types (and every public method / field / getter / setter on them) survive without relying on textual `show` matching.
- The parse-only `UnusedDetector.detect` entry point stays as a fallback for tests / embedders that don't want a real `AnalysisContextCollection`. `dartrics analyze` / `dartrics unused` route through the new `UnusedDetector.detectResolved` path automatically; no caller changes required.
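The homonym case from the first bullet can be shown with a tiny program. The class names come from the note itself; the comments describe what the 0.2.0 notes imply the detector should conclude and are not captured tool output.

```dart
class Foo {
  void bar() => print('Foo.bar');
}

class Baz {
  // Same simple name as Foo.bar, but never called anywhere.
  void bar() => print('Baz.bar');
}

void main() {
  Foo().bar(); // The only `bar` reference in the program.
  Baz(); // Baz is constructed, so the class itself stays reachable.
}
```

Under a simple-name graph, the single `bar` reference would keep both methods alive. With element resolution and member-level tracking, `Baz.bar` would be expected to surface as an unused `method` entry while `Foo.bar` stays live.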
0.1.0 #
First public release. The CLI, the analyzer plugin, and the embeddable Dart API ship from a single package.
Design philosophy #
dartrics is built on the wager that the AI coding loop changes which software metrics are practically usable. The academic catalogue (McCabe 1976, CK 1994, LCOM4 1995, Martin 1994, Cognitive Complexity 2018) has long been underused in everyday workflows because each step (calculating the number, interpreting it, and acting on it) was individually expensive for a human reviewer. An AI loop absorbs all three: the CLI computes in milliseconds, `--auto-explain` delivers the rationale, and the agent acts on it.
Each metric is treated as a single lens — one dimension of "hard to read." Lenses are independent and stackable: an agent can iterate through dozens in a session, refactor under each, then re-evaluate. dartrics surfaces what each lens reads; the accept / refactor / dismiss decision is first-class and stays in the loop.
The metric set, the thresholds, and the Flutter / test relaxations below are calibrated for Dart. The lens framing and the AI-loop contract are language-agnostic.
Metrics #
- Function / method: cyclomatic complexity (McCabe 1976), cognitive complexity (Sonar 2018), maximum nesting level (control-flow only: `if` / `for` / `while` / `switch` / `try` / closure; widget-literal chains do not count), number of parameters (Fowler 1999; positional-only; named parameters are weight-zero because the call site `foo(a: …, b: …)` carries each argument's name on the spot, dissolving the position-counting load Fowler's lens targets; same rule as `boolean-trap`), boolean-trap (McConnell Code Complete 2004; Bloch Effective Java item 36; count of `bool`-typed parameters, warning ≥ 2), source lines of code. Halstead Volume (Halstead 1977), method length, and `widget-tree-depth` (deepest chain of nested constructor calls; Flutter community ~5–7 threshold) ship off-by-default; opt in with `dartrics: { metrics: { <id>: { enabled: true } } }`. Method length is default-off because its correlation with SLOC in production code is high enough (often > 0.95) that emitting both is redundant noise; opt in when you specifically want the "screen real estate" reading (counts blank lines + comments) on top of SLOC's "actual code volume" reading. Halstead Difficulty / Effort and the Maintainability Index (Oman 1992) were dropped because they are pure derivations of the underlying token counts and `CC + V + LOC` respectively.
- Class: number of methods, weighted methods per class (CK 1994), LCOM4 (Hitz & Montazeri 1995, connected-component variant), CBO and RFC (CK 1994), class length. CK's DIT and NOC are intentionally not provided: Dart's mixin / composition-over-inheritance culture keeps inheritance chains shallow, so they rarely produce signal.
- Library / file: efferent / afferent coupling, instability (Martin 1994). Abstractness and distance from main sequence ship default-off because Martin's framing assumes "package = release unit" and Dart's 1-file-1-library granularity makes the per-file values brittle (a single `abstract class Foo` in its own file scores A=1.0 without saying anything about the design layer it participates in). Opt-in until the directory-level aggregation lands.
- Each metric exposes `rationale` (one-paragraph explanation anchored to the original paper), `refactorHints` (concrete moves), and `polarity` (`down` / `up` / `neutral`) so AI loops know which direction is "healthier" for the regression diff.
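The positional-only rule for number-of-parameters and boolean-trap is easiest to see side by side. This is a minimal sketch; `drawRect` and `drawRectNamed` are hypothetical functions, and the scores in the comments follow from the rule as described above rather than from a dartrics run.

```dart
// Four positional parameters, two of them bool: by the rule above this
// would read as number-of-parameters = 4 and boolean-trap = 2.
void drawRect(int width, int height, bool filled, bool dashed) {
  print('$width x $height filled=$filled dashed=$dashed');
}

// The same data as named parameters: every argument is labelled at the
// call site, so both lenses would weigh these at zero.
void drawRectNamed({
  required int width,
  required int height,
  bool filled = false,
  bool dashed = false,
}) {
  print('$width x $height filled=$filled dashed=$dashed');
}

void main() {
  drawRect(4, 3, true, false); // Which bool is which?
  drawRectNamed(width: 4, height: 3, filled: true);
}
```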
Subcommands #
- `dartrics analyze` runs every metric and the public-API unused detector over the analysis root.
- `dartrics unused` runs only the public-API reachability detector (fast path).
- `dartrics report <input.json>` re-emits a previously saved JSON report in a different format.
- `dartrics rules` catalogues every metric with its rationale + refactor hints in `--reporter ai|md|json|console`.
- `dartrics regression [--before <ref>] [--after <ref>]` compares metrics between two git states (default: `HEAD~1` vs the working tree). Uses git worktrees for the historical side. Diff entries are classified as `improved` / `regressed` / `unchanged` / `added` / `removed` per `MetricPolarity`. A built-in cosmetic-split heuristic flags refactors that look like AI just shuffled complexity into one-line helpers without actually reducing it.
- `dartrics manual` prints the AI-facing operator's manual to stdout. The content is a mirror of `doc/manual.md` embedded as a const string in the executable, so it travels with `dart pub global activate dartrics` and is reachable from any agent loop without a separate doc download. A parity test enforces byte-equality with the markdown source so the two cannot drift.
- `dartrics doctor` validates the `dartrics:` block in `analysis_options.yaml`. Surfaces unknown metric ids (with did-you-mean suggestions via Levenshtein distance ≤ 2), unknown unused presets, and threshold orderings inconsistent with each metric's polarity (e.g. `cyclomatic-complexity: { warning: 20, error: 10 }` is flagged because lower-is-better metrics need `error ≥ warning`). Read-only; never edits the config. Exit codes: 0 clean, 1 warnings, 78 invalid YAML / `ConfigException`.
- `dartrics explain <id>` reverse-looks-up a violation by its stable 16-hex-char id and prints the matching entry plus the metric's rationale + refactor hints. Reads a JSON report (the format produced by `dartrics analyze --reporter json`) from stdin or `--input <path>`. AI agents that see the same id reappear across runs ("my fix didn't take") can retrieve full context for that id without re-reading the entire report.
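The polarity-driven classification used by `dartrics regression` might look roughly like the sketch below. The enum and function names are illustrative, not the package's API, and the handling of `neutral` is an assumption since the text does not specify it.

```dart
/// Mirrors the three polarity values; names here are illustrative only.
enum Polarity { down, up, neutral }

enum DiffClass { improved, regressed, unchanged, added, removed }

/// Classifies a metric change between two runs. A null value means the
/// scope did not exist on that side of the comparison.
DiffClass classify(num? before, num? after, Polarity polarity) {
  if (before == null && after == null) {
    throw ArgumentError('at least one side must be present');
  }
  if (before == null) return DiffClass.added;
  if (after == null) return DiffClass.removed;
  if (after == before) return DiffClass.unchanged;
  switch (polarity) {
    case Polarity.down: // Lower is healthier.
      return after < before ? DiffClass.improved : DiffClass.regressed;
    case Polarity.up: // Higher is healthier.
      return after > before ? DiffClass.improved : DiffClass.regressed;
    case Polarity.neutral:
      // Assumption: a neutral metric has no healthier direction, so this
      // sketch reports any change as unchanged.
      return DiffClass.unchanged;
  }
}

void main() {
  print(classify(12, 8, Polarity.down)); // DiffClass.improved
  print(classify(12, 8, Polarity.up)); // DiffClass.regressed
  print(classify(null, 5, Polarity.down)); // DiffClass.added
}
```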
AI integration (--reporter ai) #
- Token-efficient YAML-ish output starting with `# dartrics ai-report v1`. The header is contractual; field renames or removals trigger a new header (`v2`).
- Auto-explain (default on; `--no-auto-explain` opts out) attaches each fired metric's rationale + refactor hints to the report's `explain:` block, so AI loops no longer need to know to pass `--explain <id>` for every threshold they care about.
- `--explain <metric-id>` (repeatable) is still honoured and unions with auto-explain; explicit ids stay first in the order so authored prompts remain deterministic.
- Stable violation `id`: every violation carries a 16-hex-char `sha256("<file>|<scope>|<metric>")` so AI loops can correlate runs ("`a3f1c4e9…` showed up again ⇒ my fix didn't take"). Surfaces in the JSON / AI / md reporters and as `partialFingerprints.dartrics/v1` in SARIF. Exported as `computeViolationId(file, scope, metricId)` for embedders.
- `--limit <n>` caps the violations + unused entries shown by the AI / md reporters after the priority sort. AI report records the dropped count in a `truncated:` block; md report appends `_+ N more violation(s) hidden by --limit_`. JSON / SARIF / console stay unlimited.
- `--coverage <path>` (auto-detects `coverage/lcov.info`) attaches per-scope line and branch coverage to every emitted violation. The reporter sorts by a priority key that puts low-coverage entries first and `complexityJustified` ones last so token budget lands on the most actionable items.
- `complexityJustified: true` flags CC / Cognitive violations whose scope has branch coverage ≥ 0.8 (or line ≥ 0.95 when no `BRDA:` records are present): earned complexity AI loops should leave alone. Two sibling fields surface the engine's decision: `complexityJustifiedBy` (`branch` or `line`, whichever rule won) and `complexityJustifiedThreshold` (the literal cutoff that rule used). Reporters pass the trio through verbatim: JSON, AI / YAML, MD, SARIF, `dartrics explain`.
- Deliberate dismissal lets agents triage a specific `(file, scope, metric)` triple via `// dartrics:dismiss <metric> reason="…"` comments or a `dartrics-dismissals.yaml` sidecar. Both channels are opt-in through `dartrics: { dismissals: … }` in `analysis_options.yaml`. Validated entries decorate the violation with `dismissed: true` + carried `reason` / `by` / `at`; entries that fail `requireReason` / `minReasonLength` / `requireAuthor` / `requireTimestamp` keep the violation live and stamp it with `dismissalRejected: <why>` plus a stderr WARNING. `--strict-dismiss` ignores every dismissal for the run. Stale-entry detection (default-on, `warnStale: true`): dismissals that never matched a live violation in the analyzed file set surface as a stderr WARNING and as a `staleDismissals:` block on the AI / JSON reports, so AI loops can prune dead entries when scopes are renamed / deleted or metrics drop below threshold. Skipped for files filtered out by `--since` / snapshot.
- `--since <git-ref>` filters output to declarations whose owning `.dart` file changed between `<ref>` and `HEAD`. Cross-file analysis still resolves the full project so LCOM4 / library coupling / public-API reachability stay accurate; only the emitted records are filtered.
- End-to-end loop walkthrough (setup → propose → apply → verify, with sample prompts and troubleshooting) lives in `doc/ai-loop.md`.
- AI-facing operator's manual (each metric framed as a lens on "hard to read" with the accept / refactor / dismiss decision step made first-class) lives in `doc/manual.md`.
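The `complexityJustified` rule reduces to a small decision function. The sketch below uses the thresholds stated above; the `Justification` class and `justify` function are hypothetical, and leaving the two sibling fields null when the flag is false is an assumption.

```dart
/// Hypothetical container for the trio of fields described above.
class Justification {
  const Justification(this.justified, this.by, this.threshold);

  final bool justified;
  final String? by; // 'branch' or 'line'
  final double? threshold;
}

/// [branchCoverage] is null when the coverage data has no BRDA: records
/// for the scope, in which case the line-coverage rule applies instead.
Justification justify({double? branchCoverage, required double lineCoverage}) {
  const notJustified = Justification(false, null, null);
  if (branchCoverage != null) {
    return branchCoverage >= 0.8
        ? const Justification(true, 'branch', 0.8)
        : notJustified;
  }
  return lineCoverage >= 0.95
      ? const Justification(true, 'line', 0.95)
      : notJustified;
}

void main() {
  final a = justify(branchCoverage: 0.85, lineCoverage: 0.9);
  print('${a.justified} ${a.by} ${a.threshold}'); // true branch 0.8

  final b = justify(lineCoverage: 0.9);
  print('${b.justified} ${b.by} ${b.threshold}'); // false null null
}
```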
Reporters #
- `console`: human-readable summary line + per-violation entries.
- `json`: stable schema for `jq` pipelines and SARIF transformation; carries `analyzedFiles` (sha256 list) when snapshot mode is engaged.
- `md`: Markdown for PR comments and issue bodies, formatted via `package:dapper.formatMarkdown`.
- `ai`: described above.
- `sarif` 2.1.0: GitHub Code Scanning / GitLab ingestion. `tool.driver.rules` is populated for every metric that fired in the run, carrying the rationale in `fullDescription`, the refactor hints in `help.markdown`, and `helpUri` deep-linking back to the README anchor, so the platform surfaces the full lens explanation inline next to each result instead of an opaque rule id.
Public-API unused-code detection #
- Periphery-style BFS reachability over a name-based reference graph rooted at `main`, declarations annotated with `@pragma('vm:entry-point')`, and (when `excludeExported` is enabled) `lib/` exports outside `lib/src/`. Follows `export ... show ...` clauses so re-exported `lib/src/` symbols stay reachable. Reports unused public functions, classes, mixins, extensions, typedefs, enums, and top-level fields.
- Code-gen keep-alive annotations are always on: `freezed`, `json_serializable`, `dart_mappable`, `go_router_builder`, `auto_route`, `riverpod_generator`, `injectable`, `hive`, `drift`. Listing `presets:` in `analysis_options.yaml` is no longer required (the field is still parsed for backward compatibility with older configs but no longer narrows the keep-alive set). The simple-name match means an annotation from a package you don't use simply never fires, so there's no per-project cost to leaving every preset on.
- Generated Dart files (`*.g.dart`, `*.freezed.dart`, `*.gr.dart`, `*.config.dart`, `*.mocks.dart`, `*.pb*.dart`, `*.gen.dart`) are skipped during file collection. Override with `AnalyzerRunner(includeGenerated: true)` if you really want them.
- Private (underscore-prefixed) names are intentionally skipped; `dart analyze`'s `dead_code` lint already covers them.
- `dartrics unused --apply` deletes detected top-level declarations from disk (analogous to `dart fix --apply`). Refuses to run on a dirty git tree without `--force`. Files under `test/` or `integration_test/` are excluded by default; pass `--include-tests` to include them. Supports function / class / typedef / extension deletion; method / field / enum-value deletion is reported as `unsupported` because the range computation needs containing-declaration awareness that is deferred. Imports left unused after deletion can be cleaned up with `dart fix --apply`.
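The BFS reachability idea can be sketched over a plain name-to-references map. This is an illustration of the technique only: the `reachable` function and the declaration names in `main` are hypothetical, and the real detector builds its graph from the analyzer rather than from a hand-written map.

```dart
import 'dart:collection';

/// Breadth-first walk from [roots] over [refs], where each key is a
/// declaration name and each value is the set of names it references.
Set<String> reachable(Map<String, Set<String>> refs, Iterable<String> roots) {
  final seen = <String>{};
  final queue = Queue<String>();
  for (final root in roots) {
    if (seen.add(root)) queue.add(root);
  }
  while (queue.isNotEmpty) {
    final name = queue.removeFirst();
    for (final next in refs[name] ?? const <String>{}) {
      if (seen.add(next)) queue.add(next);
    }
  }
  return seen;
}

void main() {
  final refs = {
    'main': {'runApp'},
    'runApp': {'Config'},
    'Config': <String>{},
    'LegacyHelper': {'Config'}, // References others, but nothing reaches it.
  };
  final live = reachable(refs, ['main']);
  final unused = refs.keys.where((d) => !live.contains(d)).toList();
  print(unused); // [LegacyHelper]
}
```

Note that `LegacyHelper` referencing `Config` does not keep `LegacyHelper` alive; only edges that start from a reachable node count.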
Analyzer plugin #
- `plugins: dartrics` in `analysis_options.yaml` enables five function-level rules (`dartrics_cyclomatic_complexity` / `_cognitive_complexity` / `_maximum_nesting_level` / `_number_of_parameters` / `_boolean_trap`) inline in `dart analyze` and the IDE.
- Rule thresholds are configurable through the same `dartrics:` section the CLI uses (long form `{ warning: <n>, error: <n> }` or bare-integer short form). The plugin honours `flutter: true` for the same skip rules as the CLI.
- Heavier metrics (LCOM4, CBO, RFC, library coupling) and the unused detector stay CLI-only; they need a project-wide index that an analysis-server plugin can't maintain efficiently per file.
- Diagnostics surface at INFO severity due to an upstream `analysis_server_plugin` 0.3.x constraint (non-INFO `LintCode` crashes the plugin isolate).
Flutter-aware mode #
- `dartrics: { flutter: true }` is the default. Its only effect in 0.1.0 is to skip `number-of-parameters` on widget constructors, which stays as a cushion for the rare positional-style widget constructor; in practice an idiomatic `MyWidget({super.key, required this.title, ...})` already scores 0 from NOP's positional-only semantic, so the skip is a no-op for typical Flutter code.
- `Widget.build()` is measured normally: `maximum-nesting-level` only counts control-flow constructs (`if` / `for` / `while` / `switch` / `try` / closure), so a healthy declarative tree produces a depth of 0 without any special-casing, and `method-length` is informative even on declarative trees.
- Visual depth from chained Widget literals (`Container(child: Container(...))`) is the responsibility of the `widget-tree-depth` lens: opt-in for Flutter authors that want this signal, default warning 7 (matching Flutter community practice of ~5–7 before extracting a sub-widget).
- Detection is AST-only across `StatelessWidget`, `StatefulWidget`, `State`, `ConsumerWidget`, `ConsumerStatefulWidget`, `HookWidget`, `HookConsumerWidget`. Set `flutter: false` to force `number-of-parameters` on widget constructors too.
Test-aware mode #
- `dartrics: { test: true }` is the default. When the file under analysis sits under `test/` or `integration_test/` and its basename ends in `_test.dart` (the conventional `dart test` discovery marker), `method-length` / `source-lines-of-code` / `maximum-nesting-level` are skipped at function level and `class-length` / `number-of-methods` are skipped at class level. AAA blocks and nested `group` / `setUp` / `test` scaffolding don't dominate the diagnostic stream that way. Helpers under `test/` whose basename does not end in `_test.dart` (e.g. `test/helpers.dart`) stay under the strict thresholds. Set `test: false` to apply the production-grade thresholds to test files too.
- Cyclomatic complexity, cognitive complexity, number-of-parameters, boolean-trap, LCOM4 / CBO / RFC, and the library-level lenses still apply; branchy or tangled tests are still hard to read.
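The file-matching rule for the relaxations can be written as a short predicate. This is a sketch under two assumptions that the text does not settle: paths are root-relative with forward slashes, and a `test` or `integration_test` directory at any depth qualifies. `isRelaxedTestFile` is a hypothetical name.

```dart
/// Returns true when [relativePath] would get the test-mode relaxations:
/// it sits under a test/ or integration_test/ directory AND its basename
/// ends in _test.dart.
bool isRelaxedTestFile(String relativePath) {
  final segments = relativePath.split('/');
  final directories = segments.sublist(0, segments.length - 1);
  final underTestDir =
      directories.any((d) => d == 'test' || d == 'integration_test');
  return underTestDir && segments.last.endsWith('_test.dart');
}

void main() {
  print(isRelaxedTestFile('test/widget_test.dart')); // true
  print(isRelaxedTestFile('test/helpers.dart')); // false
  print(isRelaxedTestFile('lib/src/foo_test.dart')); // false
}
```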
CLI surface #
- Common options: `--config`, `--reporter`, `--output`, `--root`, `--since`, `--explain`, `--snapshot`, `--coverage`, `--strict-dismiss`, `--concurrency`, `--limit`, `--auto-explain` / `--no-auto-explain`, `--fatal-warnings`, `--fatal-style`, `-v`.
- `dartrics --version` prints the build's version. The same string is exported as `dartricsVersion` from `package:dartrics/dartrics.dart`.
- Exit codes are sysexits-aligned: 0 success, 1 violations (with `--fatal-warnings`), 64 usage, 65 data, 70 internal, 78 config.
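A wrapper script or agent harness could translate these exit codes into readable outcomes. The mapping below restates the list above; `describeExit` is a hypothetical helper, not something the package exports.

```dart
/// Maps a dartrics CLI exit code to a short description, following the
/// sysexits-aligned list above.
String describeExit(int code) {
  switch (code) {
    case 0:
      return 'success';
    case 1:
      return 'violations (with --fatal-warnings)';
    case 64:
      return 'usage error';
    case 65:
      return 'data error';
    case 70:
      return 'internal error';
    case 78:
      return 'config error';
    default:
      return 'unknown exit code $code';
  }
}

void main() {
  for (final code in [0, 1, 64, 78, 2]) {
    print('$code: ${describeExit(code)}');
  }
}
```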
Embedding #
- `lib/dartrics.dart` is a deliberately tight Dart API: the eight function-level metric calculators (`CyclomaticComplexity`, `CognitiveComplexity`, `MaxNestingLevel`, `NumberOfParameters`, `BooleanTrap`, `MethodLength`, `SourceLinesOfCode`, `HalsteadVolume`), the calculator interface (`FunctionMetric`, `FunctionMetricInput`, `MetricPolarity`), and `dartricsVersion`. That's it. Report assembly, regression diff, coverage attachment, dismissal validation, snapshot persistence, the unused detector, and class- / library-level metrics are CLI-only; the JSON reporter is the supported on-wire format for those scopes. Keeping the public Dart surface small means internal evolution doesn't break consumers we don't yet have; if a Dart-level handle is missing for a real use case, please file an issue rather than reaching into `package:dartrics/src/`.
- `example/main.dart` shows a 30-line standalone embedding.
Performance #
- File resolution in `AnalyzerRunner.resolveAll` uses `package:pool` to run up to `--concurrency` resolves in flight at once. Default mirrors the host CPU count (clamped to 16). Output ordering remains alphabetical so reports stay deterministic across runs. The win on smaller trees (≈50 files) is ≈10 % wall-time; larger codebases get more.
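The bounded-concurrency pattern can be sketched with `package:pool` as follows. This is not the body of `AnalyzerRunner.resolveAll`; `resolveAllSketch` and its `resolve` callback are hypothetical, and the lower clamp bound of 1 is an added safety assumption.

```dart
import 'dart:io';

import 'package:pool/pool.dart';

/// Runs [resolve] over [paths] with at most [concurrency] calls in flight,
/// returning results in alphabetical path order regardless of which call
/// finishes first.
Future<List<String>> resolveAllSketch(
  List<String> paths,
  Future<String> Function(String path) resolve, {
  int? concurrency,
}) async {
  // Default: host CPU count, clamped to 16 as described above.
  final limit =
      concurrency ?? Platform.numberOfProcessors.clamp(1, 16).toInt();
  final pool = Pool(limit);
  final sorted = [...paths]..sort();
  try {
    // Future.wait returns results in input order, so sorting the inputs
    // keeps the output deterministic.
    return await Future.wait(
      sorted.map((p) => pool.withResource(() => resolve(p))),
    );
  } finally {
    await pool.close();
  }
}

Future<void> main() async {
  final out = await resolveAllSketch(
    ['b.dart', 'a.dart'],
    (p) async => 'resolved $p',
    concurrency: 2,
  );
  print(out); // [resolved a.dart, resolved b.dart]
}
```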
Tooling #
- `.github/workflows/analyze.yaml` runs format / analyze / test on Ubuntu for every push and PR, then uploads `coverage:test_with_coverage` output to Codecov where per-PR coverage gating happens.
- 100% line coverage on `lib/` is treated as a correctness signal: uncovered lines are read as evidence of dead code, not as a coverage gap.