lambe 0.7.0
Query JSON, YAML, TOML, HCL, and Markdown files with a composable pipeline DSL. Like jq but multi-format, with cleaner syntax. CLI tool + Dart library + MCP server for AI agents.
0.7.0 #
Shape-gated tab completion, single-source-of-truth pipe-op metadata,
and inferShape correctness fixes. Builds on the 0.6.0 shape work:
the completer now uses the same shape machinery that powers
--explain and as(fmt) to hide candidates that would throw at
runtime.
Added #
- Shape-gated pipe-op completion. .x | <TAB> filters the candidate list by the inferred input shape. A map input hides list-only ops (flatten, sort, sum, first); a list input hides map-only ops (filter_keys, has, map_values, to_entries). Ops that accept any input (as, type) are offered everywhere. When the shape inference is SAny, every op is offered; rejection only happens when the op can be proven incompatible.
- Single source of truth for pipe-op metadata. lib/src/shape/pipe_ops.dart owns, for each of the 27 pipe ops: canonical name, input-shape acceptance predicate, output-shape inference rule, and parse metadata. The parser builds its zeroArg and oneArg alternatives from this table (custom grammar like as(fmt) is still hand-written); the completer consults it for candidate filtering; inferShape dispatches pipe-op cases through it. Adding a new op with standard grammar is a single spec entry plus an AST case (compile-enforced via sealed LamExpr) plus an evaluator case (compile-enforced).
- PipeOpInfo, PipeOpParseKind, pipeOpSpecs, pipeOpInfoFor, pipeOpInfoForName, acceptsInputShape, inferPipeOpShape. Exported from package:lambe/lambe.dart so tools can reason about op metadata without parsing a query. pipeOpSpecs is the iteration-friendly view, pipeOpInfoFor(astNode) resolves by AST type, pipeOpInfoForName(str) resolves by name. The PipeOpInfo record shape may gain additional fields in future minor releases as the shape machinery evolves (e.g. richer element-level predicates, documentation strings). Callers that only need stable access should prefer the helper functions (acceptsInputShape, inferPipeOpShape, pipeOpInfoForName) over destructuring PipeOpInfo records directly.
- Consistency test matrix. test/pipe_ops_consistency_test.dart runs every pipe op against a representative value of every concrete shape kind and cross-checks the spec's accepts predicate with the evaluator's actual runtime behavior. Drift between spec and evaluator fails loudly instead of silently.
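The shape-gated filtering described above can be sketched roughly as follows. This is a minimal illustration, not lambe's code: ShapeKind, OpSpec, specs, and candidatesFor are hypothetical stand-ins for the real Shape and PipeOpInfo machinery, and only five of the ops are listed.

```dart
// Hypothetical sketch: these types are stand-ins, not lambe's real Shape or PipeOpInfo.
enum ShapeKind { any, list, map, scalar }

class OpSpec {
  const OpSpec(this.name, this.accepts);
  final String name;
  final bool Function(ShapeKind) accepts; // input-shape acceptance predicate
}

// An "any" input is never rejected: incompatibility cannot be proven.
bool listOnly(ShapeKind s) => s == ShapeKind.any || s == ShapeKind.list;
bool mapOnly(ShapeKind s) => s == ShapeKind.any || s == ShapeKind.map;
bool anyInput(ShapeKind s) => true;

const specs = <OpSpec>[
  OpSpec('flatten', listOnly),
  OpSpec('sort', listOnly),
  OpSpec('has', mapOnly),
  OpSpec('to_entries', mapOnly),
  OpSpec('type', anyInput),
];

/// Returns the op names worth offering for the inferred input shape.
List<String> candidatesFor(ShapeKind input) =>
    [for (final s in specs) if (s.accepts(input)) s.name];

void main() {
  print(candidatesFor(ShapeKind.map)); // [has, to_entries, type]
  print(candidatesFor(ShapeKind.list)); // [flatten, sort, type]
  print(candidatesFor(ShapeKind.any).length); // 5
}
```

Keeping the predicate next to the name in one table is what lets a parser, a completer, and a consistency test all read the same data.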
Fixed #
- inferShape no longer lies on structurally incompatible input. flatten, sort, reverse, unique, filter_values, length previously returned the input shape unchanged when given something the runtime evaluator would reject (e.g. flatten on a map). They now widen to SAny, so --explain reports the truth and downstream inference doesn't propagate impossible shapes.
- Re-assertion filter. Candidates whose text exactly matches what's already typed in [start, end) are filtered out before returning. Accepting such a candidate is a no-op on the text but moves the cursor backward, which users read as "Tab erased what I typed." Tab on fully-typed tokens is now a silent no-op.
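A minimal sketch of the re-assertion filter, assuming candidates are plain strings. The function name dropReassertions and the key .users_archive are hypothetical and used only for illustration.

```dart
/// Drops candidates identical to the text already occupying [start, end).
List<String> dropReassertions(
    String text, int start, int end, List<String> candidates) {
  final typed = text.substring(start, end);
  return candidates.where((c) => c != typed).toList();
}

void main() {
  const text = '.users ';
  // '.users' is fully typed in [0, 6); '.users_archive' is an invented sibling key.
  print(dropReassertions(text, 0, 6, ['.users', '.users_archive']));
  // [.users_archive]

  // With only the already-typed token on offer, nothing is left to apply.
  print(dropReassertions(text, 0, 6, ['.users'])); // []
}
```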
Changed #
- Parser pipe-op rules generated from the spec table. lib/src/parser.dart's _pipeOp is built by iterating pipeOpSpecs longest-name-first and dispatching on PipeOpParseKind. The hand-written alternation for the 26 non-custom ops is gone. as(fmt) remains hand-written because its grammar takes a closed keyword set.
- pipeOpNames re-exported from shape/pipe_ops.dart. The parser, the completer, and the misspelling-suggestion logic all read from the same derived list.
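Longest-name-first ordering presumably matters because an ordered alternation commits to the first alternative that matches. The sketch below shows the effect with plain prefix matching; it is an assumption-laden illustration (firstMatch is a hypothetical helper) and does not use Rumil or lambe's parser.

```dart
/// Tries each name in order and returns the first that prefixes [input].
String? firstMatch(List<String> names, String input) {
  for (final n in names) {
    if (input.startsWith(n)) return n;
  }
  return null;
}

void main() {
  final names = ['sort', 'sort_by', 'unique', 'unique_by'];

  // Declaration order: the shorter name wins and strands "_by(.age)".
  print(firstMatch(names, 'sort_by(.age)')); // sort

  // Longest-name-first: order by descending length before building the alternation.
  final ordered = [...names]..sort((a, b) => b.length.compareTo(a.length));
  print(firstMatch(ordered, 'sort_by(.age)')); // sort_by
}
```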
Breaking #
- Completions typedef now carries an end field. Callers that destructured as (:start, :candidates) must destructure as (:start, :end, :candidates) and splice with text.replaceRange(start, end, candidate) instead of text.replaceRange(start, cursor, candidate). The new field lets callers splice [start, end) and preserve any trailing whitespace the user typed after a complete token, which the previous start..cursor splice consumed.
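The difference between the two splices can be seen with String.replaceRange alone. In this sketch the record is built by hand to mimic a completer result; its values and the key .users_archive are hypothetical, and the real Completions typedef may carry different field types.

```dart
/// Splices [candidate] over the half-open range [start, end) of [text].
String splice(String text, int start, int end, String candidate) =>
    text.replaceRange(start, end, candidate);

void main() {
  const text = '.users '; // complete token plus one trailing space
  final cursor = text.length; // 7

  // Hand-built stand-in for a completer result (assumed values).
  final (:start, :end, :candidates) =
      (start: 0, end: 6, candidates: ['.users_archive']);

  // Old splice, start..cursor: the trailing space is consumed.
  print('[${splice(text, start, cursor, candidates.first)}]'); // [.users_archive]

  // New splice, [start, end): the trailing space survives.
  print('[${splice(text, start, end, candidates.first)}]'); // [.users_archive ]
}
```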
Docs #
- ROADMAP.md. Publishes the 0.7.0 / 0.8.0 / 0.9.0 plan plus explicit non-goals (no Turing-completeness, no streaming, no jq feature parity).
- Removed PLAN_COMPLETER_WHITESPACE_FIX.md (shipped) and ISSUES.md (items resolved or tracked on GitHub).
0.6.1 #
Tab completion fix: trailing whitespace in the REPL query no longer
corrupts the replacement offset. Typing .dependencies, a space, then
Tab now completes against .dependencies instead of producing
..dependencies.
Fixed #
- Completer: the replacement start offset is now correct when the query has trailing whitespace (space, tab, CR, LF, or any mixture). Previously .users + Tab returned start: 1 instead of start: 0, which caused the REPL and the arda-web playground to splice the candidate in the wrong position.
- Completer: ??, ?., and ??= were previously split across multiple tokens in the unparsed-remainder classifier. They now match as single operators before falling through.
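One way to picture the corrected offset is to locate the last token while skipping any mixture of trailing whitespace. The sketch below is a hand-rolled scan under that assumption; lastTokenSpan is a hypothetical name, and lambe's completer derives its offsets through Rumil parsers instead.

```dart
bool isWs(int c) => c == 0x20 || c == 0x09 || c == 0x0D || c == 0x0A;

/// Returns the [start, end) span of the last token, ignoring trailing
/// space, tab, CR, and LF in any mixture.
({int start, int end}) lastTokenSpan(String text) {
  var end = text.length;
  while (end > 0 && isWs(text.codeUnitAt(end - 1))) {
    end--;
  }
  var start = end;
  while (start > 0 && !isWs(text.codeUnitAt(start - 1))) {
    start--;
  }
  return (start: start, end: end);
}

void main() {
  print(lastTokenSpan('.users \t\r\n')); // (end: 6, start: 0)
  print(lastTokenSpan('.a | so ').start); // 5
}
```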
Changed #
- Completer: unparsed-remainder classification no longer uses regex. Two small Rumil parsers (_pipeCtx, _fieldTailCtx) handle pipe-op and field-tail contexts, with position() for offset tracking. Whitespace handling is uniform across space, tab, CR, and LF.
- Dependencies: rumil, rumil_parsers, rumil_expressions bumped to ^0.6.0. Rumil 0.6.0 adds the position() primitive used by the completer fix.
0.6.0 #
Shape-aware output with interactive bridging. Lambe now infers the structural shape of query results, reports incompatibilities with target output formats as structured errors, and can bridge common mismatches through a new language combinator or through interactive prompts.
Added #
- Shape ADT. A sealed hierarchy (SAny, SNull, SBool, SNum, SString, SList, SMap) describing the structural kind of a value. shapeOf(value) infers the shape of any JSON-shaped value in time proportional to structure depth, using bounded sampling on lists. renderShape(shape) produces the canonical human-readable form (list<map<a: number, b: string>>).
- canWriteAs(value, format) and canWriteShapeAs(shape, format). Return a ShapeReport (Writable or NotWritable). The NotWritable case carries the mismatched shape, the format's requirement, and a list of Remediation records describing curated query-fragments that bridge the mismatch.
- inferShape(ast, inputShape). A structural interpreter over LamExpr. Given the shape of the value . refers to, returns the shape the query would produce. Every pipeline operator has a rule; the interpreter falls back to SAny where output cannot be determined without runtime values.
- synthesize(from, target) and synthesizeWithLabels(from, target). Produce AST fragments (or full Remediation records) that bridge from to target's shape requirement. applyBridge(user, bridge) composes a user query with a bridge fragment into a single AST via Pipe, avoiding string manipulation.
- as(format) combinator. A new pipeline operator written directly in the query language: .users | as(toml) produces a TOML-compatible value if exactly one curated bridge applies, and throws with the candidate list otherwise. Accepts json, yaml, toml, csv, tsv, hcl.
- --explain CLI flag. Prints the inferred shape at each pipe stage of a query, plus the set of output formats the final shape can be serialized as. Performs static analysis only; does not execute the query. Works with or without input data.
- Interactive suggestion prompts. When lam --to <fmt> would produce an OutputShapeError on an interactive terminal, the CLI now lists the available remediations and applies the chosen one. The REPL shows the same prompt inline and retries the query with the selected bridge.
- Structured MCP error payload. The lambe_query MCP tool now returns shape-mismatch errors as a JSON object with error, message, format, got_shape, original_expression, and a suggestions array (each entry with id, label, template_text, apply_as, explanation). Agents can respond by calling the tool again with an apply_as query verbatim.
- parseAst(expression) and evaluateAst(ast, data) library entry points. The existing query(expression, data) is now defined as evaluateAst(parseAst(expression), data). Callers that parse once and evaluate against multiple inputs, or that compose a parsed AST with a remediation via applyBridge, should use these directly.
- OutputShapeError subclass of QueryError. Carries the structured NotWritable report with getters for format, got, required, and suggestions. Existing catch (QueryError) handlers continue to work; the new subclass is available for code that wants to render suggestions programmatically.
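To make the Shape ADT concrete, here is a self-contained sketch of a sealed hierarchy with shape inference and rendering. The class names mirror the list above, but every body is an assumption: the real shapeOf uses bounded sampling (this one looks only at the first list element), and the rendered names any, null, and boolean are guesses, since the changelog only shows list, map, number, and string.

```dart
// Sketch only: class names follow the changelog, implementations are assumed.
sealed class Shape {}

class SAny extends Shape {}

class SNull extends Shape {}

class SBool extends Shape {}

class SNum extends Shape {}

class SString extends Shape {}

class SList extends Shape {
  SList(this.element);
  final Shape element;
}

class SMap extends Shape {
  SMap(this.fields);
  final Map<String, Shape> fields;
}

/// Infers a shape, sampling only the first element of each list.
Shape shapeOf(Object? v) => switch (v) {
      null => SNull(),
      bool() => SBool(),
      num() => SNum(),
      String() => SString(),
      List<Object?> l => SList(l.isEmpty ? SAny() : shapeOf(l.first)),
      Map<Object?, Object?> m =>
        SMap({for (final e in m.entries) '${e.key}': shapeOf(e.value)}),
      _ => SAny(),
    };

String _fields(Map<String, Shape> f) =>
    f.entries.map((e) => '${e.key}: ${renderShape(e.value)}').join(', ');

/// Renders a shape; the sealed hierarchy makes this switch exhaustive.
String renderShape(Shape s) => switch (s) {
      SAny() => 'any',
      SNull() => 'null',
      SBool() => 'boolean',
      SNum() => 'number',
      SString() => 'string',
      SList(:final element) => 'list<${renderShape(element)}>',
      SMap(:final fields) => 'map<${_fields(fields)}>',
    };

void main() {
  final data = [
    {'a': 1, 'b': 'x'},
    {'a': 2, 'b': 'y'},
  ];
  print(renderShape(shapeOf(data))); // list<map<a: number, b: string>>
}
```

Because the hierarchy is sealed, adding a new shape class makes the compiler flag every switch that has not handled it.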
Changed #
- Completer migrated to shape-based inference. The REPL's tab completer now walks the parsed AST over a single inferred Shape tree rather than over a reduced value. Behaviour is unchanged (the same candidates are returned for every case). Benchmark medians are within run-to-run noise of the previous release.
- CLI error messages for unwritable output. lam --to <fmt> now reports shape mismatches with a short teaching message and a list of candidate bridges appended with |, rather than a raw runtime exception.
Fixed #
- AOT benchmark harness. tool/bench/run.dart gained --aot and --runs N flags. The AOT path removes JIT warmup from the measurement; the multi-run median of medians suppresses per-process noise so smaller regressions are visible.
0.5.0 #
Added #
- to_number pipeline op. Parses a string as a number; pass-through for existing numbers. Matches CSV and TSV cells, which are strings by default: . | map(.price | to_number) | sum. Throws QueryError on strings that do not parse.
- type pipeline op. Returns the runtime type of the input as a string: "null", "boolean", "number", "string", "array", or "object". Example: . | filter((. | type) == "number").
- query() and eval() normalize input data. Maps and lists with non-canonical static types (e.g. Map<dynamic, dynamic> from some third-party decoders, or typed literals like <int>[1, 2, 3]) are recursively rebuilt as Map<String, Object?> and List<Object?> before evaluation. Previously these caused cryptic type-cast errors inside the evaluator. queryString skips this step since parseInput already produces canonical trees. Maps with non-string keys throw QueryError with a clear message.
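The normalization step can be approximated with a short recursive rebuild. This is a sketch under assumptions: normalize is a hypothetical name, and FormatException stands in for lambe's QueryError, which is not imported here.

```dart
/// Recursively rebuilds maps and lists as Map<String, Object?> and List<Object?>.
Object? normalize(Object? v) {
  if (v is Map) {
    final out = <String, Object?>{};
    for (final e in v.entries) {
      final k = e.key;
      if (k is! String) {
        // Stand-in for QueryError (assumption).
        throw FormatException('Map key is not a string: $k');
      }
      out[k] = normalize(e.value);
    }
    return out;
  }
  if (v is List) return <Object?>[for (final x in v) normalize(x)];
  return v;
}

void main() {
  final Map<dynamic, dynamic> loose = {'ids': <int>[1, 2, 3]};
  print(loose is Map<String, Object?>); // false
  print(normalize(loose) is Map<String, Object?>); // true
}
```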
Performance #
- REPL tab completion is now independent of dataset size. The completer reduces the data to a shape representative (one sample per list, all map keys preserved) before walking the partial AST, so operations like sort_by, group_by, and unique no longer execute against the full data. Median completion latency at 1M records drops from ~380ms–1.2s (depending on pipeline ops) to ~1–2ms. Peak resident set during a completion drops from hundreds of MB to the cost of the shape tree. Completion semantics are unchanged: the candidate lists are identical. Benchmark harness under tool/bench/.
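A rough sketch of the reduction to a shape representative: keep the first element of every list and every key of every map. The function name representative is hypothetical and the real completer may sample differently.

```dart
/// Keeps one sample per list and every map key, recursively.
Object? representative(Object? v) => switch (v) {
      List<Object?> l =>
        l.isEmpty ? <Object?>[] : <Object?>[representative(l.first)],
      Map<Object?, Object?> m => {
          for (final e in m.entries) e.key: representative(e.value),
        },
      _ => v,
    };

void main() {
  final data = {
    'users': [
      for (var i = 0; i < 1000; i++)
        {'id': i, 'tags': ['a', 'b']},
    ],
  };
  // 1000 records collapse to one sample; all keys are still visible.
  print(representative(data)); // {users: [{id: 0, tags: [a]}]}
}
```

Field-name completion only needs the keys, so walking this reduced value costs the same whether the input holds ten records or a million.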
Fixed #
- unique, unique_by, and group_by now use structural equality on collection-valued keys. Previously these operations relied on Dart's native == for List and Map, which is reference equality, so [{"a":1}, {"a":1}] | unique returned both entries instead of one. The evaluator now canonicalizes keys via JSON with sorted map keys before insertion into the hash set/map. Scalar keys (num, bool, String, null) still deduplicate by value as before. Key order in maps no longer affects equality: {"a":1, "b":2} and {"b":2, "a":1} are treated as equal.
- EvalException from rumil_expressions is now wrapped as QueryError at the public API boundary (query() and eval()). Previously, type errors in the evaluator (e.g., .x > 5 where .x is a string, or null + 1) would leak the underlying EvalException with a full Dart stack trace, crashing bin/lam.dart with exit code 255 instead of reporting a clean error with exit code 1. The REPL was not affected because it already had a catch-all handler. The docstring for query() already advertised QueryError as the evaluation error type; this brings the implementation in line with the contract.
- REPL banner now uses the actual lambeVersion from _version.dart instead of a hardcoded v0.1.0 string.
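The canonicalization idea behind the unique fix can be sketched with dart:convert. The helper names are hypothetical, string map keys are assumed, and for brevity scalars are also routed through JSON here, which differs from the evaluator's by-value handling of scalar keys.

```dart
import 'dart:convert';

/// Rebuilds maps with sorted keys so the encoding ignores key order.
/// Assumes string keys, as in JSON-shaped data.
Object? sortKeys(Object? v) {
  if (v is Map) {
    final keys = v.keys.cast<String>().toList()..sort();
    return {for (final k in keys) k: sortKeys(v[k])};
  }
  if (v is List) return [for (final x in v) sortKeys(x)];
  return v;
}

/// Canonical string used as the hash-set key.
String canonicalKey(Object? v) => jsonEncode(sortKeys(v));

/// Keeps the first occurrence of each structurally distinct item.
List<Object?> uniqueStructural(List<Object?> items) {
  final seen = <String>{};
  return [for (final x in items) if (seen.add(canonicalKey(x))) x];
}

void main() {
  // Dart's == on non-const maps is reference equality.
  print(<String, int>{'a': 1} == <String, int>{'a': 1}); // false

  print(uniqueStructural([
    {'a': 1, 'b': 2},
    {'b': 2, 'a': 1}, // same content, different key order
    {'a': 1},
  ])); // [{a: 1, b: 2}, {a: 1}]
}
```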
Docs #
- Tagline in the library doc comment and MCP server instructions changed from "universal" to "multi-format" — accurate given the specific format set (JSON, YAML, TOML, HCL, CSV, TSV, Markdown).
- AGENTS.md no longer references the unimplemented .. (recursive descent) operator in Markdown query examples. The 0.4.0 changelog noted this was removed from AI.md but AGENTS.md was missed.
- AI.md and AGENTS.md pipeline operation lists now include to_number and type.
Release infrastructure #
- Release matrix now builds Linux ARM64 and macOS ARM64 (Apple Silicon) in addition to x64 and Windows. The MCP registry manifest covers all five platforms.
- GitHub Actions bumped: upload-artifact v4→v7, download-artifact v4→v8, action-gh-release v2→v3.
0.4.0 #
Added #
- Pipeline ops are now valid bare expressions with implicit . input. has("k"), length, keys, sum, filter(...), map(...) and every other pipe op can appear as standalone expressions; has("k") parses as sugar for . | has("k"). This also unblocks common shapes like map(has("email")), filter(has("k")), and filter(length > 0). Bare ops are only consulted after the other _atom alternatives fail, so existing forms like {length} object shorthand, .length field access, and "\(length)" string interpolation keep their prior meaning.
Breaking #
- XML input/output support removed. Format.xml, OutputFormat.xml, and XML extension detection (.xml, .pom, .csproj, .svg) are gone. The XML→native projection was lossy in ways that silently produced wrong query results (repeated sibling elements collapsed under last-wins map semantics; attributes were dropped entirely). Rather than ship a footgun, XML is dropped for now. The underlying XML parser in rumil_parsers is unchanged and remains spec-compliant; a future lambe release may reintroduce XML with a proper projection (array-preserved siblings, attribute preservation) once the design is settled.
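The last-wins problem is easy to reproduce with plain maps. The sketch below is an illustration only: the (tag, text) pairs are invented data, and neither function reflects the removed projection's code or any settled future design.

```dart
/// Last-wins projection: a repeated tag overwrites the earlier value.
Map<String, String> lastWins(List<(String, String)> children) =>
    {for (final (tag, text) in children) tag: text};

/// Sibling-preserving projection: repeated tags collect into a list.
Map<String, List<String>> keepSiblings(List<(String, String)> children) {
  final out = <String, List<String>>{};
  for (final (tag, text) in children) {
    out.putIfAbsent(tag, () => []).add(text);
  }
  return out;
}

void main() {
  // Invented child elements of one parent: two <item> siblings and a <name>.
  final children = [('item', 'a'), ('item', 'b'), ('name', 'n')];

  print(lastWins(children)); // {item: b, name: n}
  print(keepSiblings(children)); // {item: [a, b], name: [n]}
}
```

A query over the first result sees one item and no error, which is the silent wrong answer the removal avoids.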
MCP surface #
- output_format parameter on the lambe_query MCP tool. AI agents can now request yaml/toml/csv/tsv/hcl output directly, matching the CLI's --to flag. Defaults to json.
- CSV and TSV exposed through the MCP surface. The library always supported them; the MCP format enum was missing them.
- MCP tool descriptions now document common pitfalls: &&/|| for boolean logic (not and/or), bracket syntax for hyphenated keys, has() and other pipeline ops requiring a leading |, and the [{key, values}] shape of group_by output.
- Build-time version generation. tool/gen_version.dart reads pubspec.yaml and writes lib/src/_version.dart, which the MCP server uses to report its version. Run after bumping the pubspec; the release workflow also runs it automatically.
- test/doc_examples_test.dart: AI-doc and MCP-instruction examples are now test-gated. Every lam '...' in AI.md and every embedded query in the MCP server's tool descriptions/instructions is parsed and evaluated against a fixture. Prevents future phantom-feature drift (e.g., LLM-drafted examples that advertise syntax the parser doesn't implement).
Fixed #
- MCP server now reports its actual version. bin/mcp_server.dart had hardcoded 0.1.0 since that release and was never bumped.
- Removed phantom .. (recursive descent) references from docs. The operator was advertised in AI.md and the MCP server instructions as a Markdown query pattern but was never implemented. Callers who saw it would have hit parse errors.
- Fixed broken example in AI.md: filter(has("resources") == false) → filter((. | has("resources")) == false). has is a pipeline op and cannot appear as a bare expression.
0.3.0 #
Added #
- Markdown support. CommonMark Markdown (.md, .markdown) is now a queryable input format. Parsed into a typed AST with node types like heading, paragraph, link, code_block, list, image, emphasis, etc.
- mdToNative public API for converting MdDocument to queryable Dart types
- Markdown query examples in MCP server instructions, AI.md, and AGENTS.md
Changed #
- Bumped rumil, rumil_parsers, rumil_expressions to ^0.5.0
- Rewrote tool/manpage.dart to use parseMarkdown + parseYaml from rumil_parsers instead of handrolled parser
- 491 tests (was 465)
0.2.0 #
Breaking #
- | is expression composition. PipeOp sealed class removed. Pipeline operations are now LamExpr subtypes. Any expression can appear after |: .users[0] | {name, age}, . | if .active then "yes" else "no".
Improved #
- Parser error messages show position pointers and contextual descriptions
- "Did you mean?" suggestions for misspelled pipeline operations
- MCP tool descriptions expanded with syntax reference and common patterns
- Expanded recipes: object projection, string interpolation, chaining patterns
Added #
- doc/jq-to-lambe.md migration guide
- test/syntax_examples_test.dart backing every example in doc/syntax.md
- 465 tests (was 369)
0.1.1 #
- Added .mcp.json for automatic MCP server discovery in AI coding assistants
- Documented MCP server setup in README
- Added query syntax guide, REPL guide, recipes, and man page to doc/
0.1.0 #
Core #
- Query AST: sealed LamExpr hierarchy (16 subtypes) + sealed PipeOp (24 subtypes)
- Left-recursive parser via Rumil's rule() + Warth seed-growth
- Operator precedence via layered chainl1 calls
- Null propagation: navigation propagates null, computation throws on type errors
- Tolerant parsing via .recover() for REPL completion and multi-line detection
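The null-propagation rule can be illustrated with two small helpers. This is a sketch, not the evaluator: field and add are hypothetical names, StateError stands in for the library's error type, and throwing on field access against a non-map, non-null receiver is an assumption.

```dart
/// Navigation: reading a field from null yields null instead of throwing.
Object? field(Object? receiver, String name) => switch (receiver) {
      null => null,
      Map<Object?, Object?> m => m[name],
      // Assumed behaviour for non-map receivers.
      _ => throw StateError('cannot read .$name from ${receiver.runtimeType}'),
    };

/// Computation: operands of the wrong type are an error, including null.
num add(Object? a, Object? b) {
  if (a is num && b is num) return a + b;
  throw StateError('cannot add $a and $b');
}

void main() {
  final data = {'user': null};

  // .user.name on a null user propagates null.
  print(field(field(data, 'user'), 'name')); // null

  // null + 1 is a type error.
  try {
    add(field(data, 'user'), 1);
  } on StateError catch (e) {
    print(e.message); // cannot add null and 1
  }
}
```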
Query Language #
- Property access chains: .users[0].address.city
- Negative indexing: .items[-1]
- String key indexing: .data["key"]
- Slicing: .[1:3], .[:3], .[2:], .[:-1]
- Arithmetic: +, -, *, /, %
- Comparison: <, <=, >, >=, ==, !=
- Boolean logic: &&, ||, !
- Object construction with shorthand: {name, total: .price * .qty}
- Conditionals: if .age > 65 then "senior" else "active"
- String interpolation: "\(.name) is \(.age) years old"
Pipeline Operations (24) #
- Filter and transform: filter, map
- Ordering: sort, sort_by, reverse
- Grouping: group_by (returns {key, values} structure)
- Deduplication: unique, unique_by
- Structure: flatten, keys, values, length, first, last
- Aggregation: sum, avg, min, max
- Map operations: filter_values, map_values, filter_keys
- Existence: has
- Entry conversion: to_entries, from_entries
Multi-format I/O #
- Input: JSON, YAML, TOML, HCL, XML, CSV, TSV with auto-detection
- Output: --to json/yaml/toml/xml/csv for format conversion
- --schema for data structure inference
- --assert for CI/CD validation (exit 0 if true, 1 if false)
Interactive REPL (lam -i) #
- Parser-driven tab completion on field names, pipeline operations, and inner fields
- Syntax highlighting and colorized JSON output
- Persistent history (~/.lambe_history) with Ctrl+R reverse search
- Multi-line input with \ continuation and parser-driven bracket detection
- Ctrl+Left/Right word movement, Ctrl+A/E/K/U editing shortcuts
- REPL commands: :schema, :to, :raw, :pretty, :load, :history, :help, :quit
API #
- Library: query(), queryJson(), queryString(), parse(), eval()
- Output: formatOutput(), inferSchema()
- CLI: lam '<expression>' [file] with all flags
- MCP server: lambe_query, lambe_schema, lambe_assert tools
Ecosystem #
- lambe_test package with matchers: lamWhere, lamEquals, lamMatches, lamHas
- MCP server installable via dart pub global activate lambe → lam-mcp