LLM Tag Parser

The streaming tag parser for AI applications

Parse and isolate tagged blocks reactively as LLM responses stream in. Subscribe to blocks and receive values chunk-by-chunk as they are generated: no waiting for the complete response.

API Docs · GitHub

The Problem
The Solution
Quick Start
How It Works
Feature Highlights
Complete Example
API Reference
Robustness
LLM Provider Setup
Contributing
License

The Problem

LLMs stream responses token-by-token. Often, they generate mixed responses containing both conversational text and specialized blocks like thoughts, tool calls, or code blocks. Traditional string-searching or regex-based approaches fail because:

Approach	Problem
Wait for complete response	Introduces high latency, defeats the purpose of streaming
Substring search on raw stream	Fails on partial tags split across chunks, high complexity
Custom state-machine parser	Hard to implement, error-prone, handles boundaries poorly

The Solution

LLM Tag Parser processes streams token-by-token as they arrive, allowing you to subscribe to content inside tags, content outside tags, and even nested tags the moment they begin streaming.

Instead of waiting for the entire response to finish, you can:

Render thought blocks in a collapsible UI element in real-time
Stream tool parameters progressively
Keep conversational text completely separated from structured code blocks

Quick Start

# pubspec.yaml
dependencies:
  llm_tag_parser: ^0.2.1

import 'package:llm_tag_parser/llm_tag_parser.dart';

final parser = LlmTagParser(
  stream: llmResponseStream,
  tags: [
    LlmTag(open: '<thinking>', close: '</thinking>'),
  ],
);

// Stream thought content chunk-by-chunk
parser.within('<thinking>').stream.listen((chunk) {
  print('Thought: $chunk');
});

// Stream conversational text outside thinking blocks
parser.outside('<thinking>').stream.listen((chunk) {
  print('Chat: $chunk');
});

How It Works

Two APIs for Every Match

Every isolated block (within or outside) provides both a stream for real-time updates and a future for the complete value:

final content = parser.within('<thinking>');

content.stream.listen((chunk) => ...); // Incremental chunks as they arrive
final complete = await content.future;  // The fully accumulated string

Use Case	API
Smooth UI typing effects	`.stream`
Accumulating tool calls, JSON, or processing	`.future`

Feature Highlights

Streaming Inner Content

Isolate and render block contents as they stream in:

parser.within('<thinking>').stream.listen((chunk) {
  updateThoughtUI(chunk);
});

Streaming Outer Content

Capture only the main conversational text, omitting thoughts or tool invocations completely:

parser.outside('<thinking>').stream.listen((chunk) {
  updateMainChatUI(chunk);
});

Hierarchical Nesting

Tag parsing supports nesting. You can drill down into tag hierarchies:

// Extract tool_use that is inside a thinking block
parser.within('<thinking>').within('<tool_use>').stream.listen((chunk) {
  print('Nested tool chunk: $chunk');
});

Attribute Extraction

XML-style tags often pack metadata:

final parser = LlmTagParser(
  stream: stream,
  tags: [
    LlmTag(open: '<interface {attrs}>', close: '</interface>'),
  ],
);

// Get a Map of the parsed attributes
final attrs = await parser.within('<interface {attrs}>').attributes;
print('ID: ${attrs['id']}');

// Or listen to an attribute value as it streams
parser.within('<interface {attrs}>').attribute('id').listen((idValue) {
  print('ID updated: $idValue');
});

// Convenience helpers (equivalent to the above)
final id = await parser.within('<interface {attrs}>').getAttributeFuture('id');
parser.within('<interface {attrs}>').getAttributeStream('id').listen(print);

Parallel Tag Instances

When a response contains multiple occurrences of the same tag (e.g. several parallel <tool_use> blocks), each occurrence is a fully isolated TagNode with its own independent stream and future. Use .instances to route each one as it opens:

parser.within('<tool_use>').instances.listen((tagNode) {
  // tagNode is a distinct TagNode for this specific occurrence
  tagNode.stream.listen((chunk) {
    print('Tool chunk: $chunk');
  });

  tagNode.future.then((full) {
    print('Tool complete: $full');
  });

  // Access parsed attributes directly on the TagNode
  print('Tool name: ${tagNode.attributes['name']}');
  tagNode.getAttributeFuture('name').then(print);
});

Chronological Node Stream

The parser exposes a unified, chronological nodes stream of LlmNode objects, preserving the exact timeline order of the response. This is the low-level primitive that backs all higher-level APIs:

parser.nodes.listen((node) {
  if (node is TextNode) {
    print('Text at depths ${node.depths}: ${node.text}');
  } else if (node is TagNode) {
    print('Tag opened: ${node.tag}, attributes: ${node.attributes}');
    node.stream.listen((chunk) => print('  chunk: $chunk'));
  }
});

Node Type	Description
`TextNode`	A chunk of plain text, tagged with current nesting `depths`
`TagNode`	A tag opening event, carrying `attributes`, `stream`, and `future`

Both node types carry a depths map (Map<String, int>) indicating how deep inside each registered tag the content was emitted at.

Custom Delimiters

Not every model produces XML. You can register any pair of custom tags:

final parser = LlmTagParser(
  stream: stream,
  tags: [
    LlmTag(open: '[thinking]', close: '[/thinking]'),
    LlmTag(open: r'$think$', close: r'$end$'),
  ],
);

Robust Stream Buffering

To prevent race conditions where a subscriber listens to the stream after the initial tokens have already passed, the parser automatically buffers. A late subscriber will always receive all prior emitted chunks:

final content = parser.within('<thinking>');

// Delay subscription by some time
await Future.delayed(const Duration(milliseconds: 100));

// Still receives all chunks from the beginning of the block
content.stream.listen((chunk) => print(chunk));

Complete Example

import 'package:llm_tag_parser/llm_tag_parser.dart';

void main() async {
  final stream = llm.streamChat('Explain quantum physics');

  final parser = LlmTagParser(
    stream: stream,
    tags: [
      LlmTag(open: '<thinking>', close: '</thinking>'),
      LlmTag(open: '<interface {attrs}>', close: '</interface>'),
    ],
  );

  // Render thought block live
  parser.within('<thinking>').stream.listen((chunk) {
    print('Thought chunk: $chunk');
  });

  // Extract attributes and content from interface block
  final interface = parser.within('<interface {attrs}>');
  
  interface.attributes.then((attrs) {
    print('Render UI Panel ID: ${attrs['id']}');
  });

  interface.stream.listen((chunk) {
    print('UI Code chunk: $chunk');
  });

  // Stream conversational response
  parser.outside('<thinking>').outside('<interface {attrs}>').stream.listen((chunk) {
    print('Chat chunk: $chunk');
  });

  // Handle multiple parallel tool_use blocks via instances
  parser.within('<tool_use>').instances.listen((tagNode) {
    print('Tool opened: ${tagNode.attributes['name']}');
    tagNode.future.then((result) => print('Tool result: $result'));
  });
}

API Reference

LlmTagParser

Member	Type	Description
`.within(tag)`	`LlmTagContent`	Isolate the inner content of a tag.
`.outside(tag)`	`LlmTagContent`	Isolate the outer content (conversational text) around a tag.
`.nodes`	`Stream<LlmNode>`	Unified chronological stream of all `TextNode`s and `TagNode`s.

LlmTagContent

.stream                    // Stream<String>       — buffered, replays past chunks to late subscribers
.future                    // Future<String>       — resolves with the complete accumulated text
.attributes                // Future<Map<String, String>> — resolves with parsed attributes map
.attribute(name)           // Stream<String?>      — streams the individual attribute value
.getAttributeStream(name)  // Stream<String?>      — convenience alias for .attribute(name)
.getAttributeFuture(name)  // Future<String?>      — convenience alias for awaiting .attributes[name]
.instances                 // Stream<TagNode>      — emits a TagNode for each new tag occurrence
.within(tag)               // LlmTagContent        — chains nested tag lookups
.outside(tag)              // LlmTagContent        — chains nested sibling filters

LlmNode Types

// Base class
abstract class LlmNode {
  final Map<String, int> depths; // nesting depth per registered tag
}

// Plain text emitted between (or inside) tags
class TextNode extends LlmNode {
  final String text;
}

// A tag opening event
class TagNode extends LlmNode {
  final String tag;
  final Map<String, String> attributes;
  Stream<String> get stream;                        // content stream for this instance
  Future<String> get future;                        // complete content future
  Future<String?> getAttributeFuture(String name);  // attribute by name (future)
  Stream<String?> getAttributeStream(String name);  // attribute by name (stream)
}

Robustness

Battle-tested resilience handling the realities of streaming LLM outputs:

Category	What is Covered
Backtracking	False alarm tag beginnings (like `x < thinking`) are gracefully returned to conversational text instead of being swallowed.
Ambiguity	Handles overlapping tag prefixes (like `<think>` and `<thinking>`) using longest-match win resolution.
Self-Closing Tags	Automatically recognizes `<tag />` forms, closing the content stream immediately and extracting attributes.
Instance Isolation	Each tag occurrence (e.g. parallel `<tool_use>` blocks) is a fully isolated `TagNode` — zero content bleeding between sibling instances.
Attribute Keys	Full support for namespaces, hyphens, periods, and numbers in keys (e.g., `data-id`, `xml:lang`, `ns:a.b-c_d`).
Unquoted Values	Handles forgiving unquoted value assignments gracefully (e.g., `id=main`).
Escaped Quotes	Parses escaped quotation characters (e.g., `\"`, `\'`) inside values without data truncation.
Boolean Flags	Automatically identifies boolean/key-only attributes (e.g., `disabled` or `checked`) and maps them as flags.
Delimiter Collisions	Quote-aware tag boundaries prevent parsing errors when mathematical operators (like `age > 21`), nested brackets, or tag closing sequences occur inside attribute values.
Chunk Boundaries	Token and attribute detection remains fully invariant whether keywords and values arrive as a single chunk or are split character-by-character across streaming boundaries.

LLM Provider Setup

OpenAI

final response = await openai.chat.completions.create(
  model: 'gpt-4',
  messages: messages,
  stream: true,
);

final contentStream = response.map((chunk) => 
  chunk.choices.first.delta.content ?? ''
);

final parser = LlmTagParser(
  stream: contentStream,
  tags: [LlmTag(open: '<thinking>', close: '</thinking>')],
);

Anthropic Claude

final stream = anthropic.messages.stream(
  model: 'claude-3-5-sonnet',
  messages: messages,
);

final contentStream = stream.map((event) => event.delta?.text ?? '');

final parser = LlmTagParser(
  stream: contentStream,
  tags: [LlmTag(open: '<thinking>', close: '</thinking>')],
);

Google Gemini

final response = model.generateContentStream(prompt);
final contentStream = response.map((chunk) => chunk.text ?? '');

final parser = LlmTagParser(
  stream: contentStream,
  tags: [LlmTag(open: '<thinking>', close: '</thinking>')],
);

Utilities

XML Tag Utilities (`XmlTagUtilities`)

When parsing structured blocks that utilize standard XML/HTML format (including dot-notation, namespaces, and self-closing tags), you can cleanly extract the tag names or query properties:

XmlTagUtilities.getTagName(String openKey): A static helper that parses an XML-like tag definition (e.g., <Material.Card {attrs}> or <ui:button>) and extracts the pure tag name (Material.Card or ui:button). Returns null if the tag definition does not follow the <...> XML format.
TagNode.tagName: A convenient getter on TagNode that returns the clean tag name (using XmlTagUtilities.getTagName) directly during streaming.

final parser = LlmTagParser(
  stream: stream,
  tags: [LlmTag(open: '<Material.Card {attrs}>', close: '</Material.Card>')],
);

parser.within('<Material.Card {attrs}>').instances.listen((instance) {
  print(instance.tagName); // Prints: "Material.Card"
});

Contributing

Contributions welcome!

Check open issues on GitHub
Open an issue before making major changes
Run dart test before submitting
Match existing codebase code style

License

MIT - see LICENSE

Made for Flutter developers building the next generation of AI-powered apps

GitHub · pub.dev · Issues

LLM Tag Parser

Table of Contents

The Problem

The Solution

Quick Start

How It Works

Two APIs for Every Match

Feature Highlights

Streaming Inner Content

Streaming Outer Content

Hierarchical Nesting

Attribute Extraction

Parallel Tag Instances

Chronological Node Stream

Custom Delimiters

Robust Stream Buffering

Complete Example

API Reference

LlmTagParser

LlmTagContent

LlmNode Types

Robustness

LLM Provider Setup

Utilities

XML Tag Utilities (`XmlTagUtilities`)

Contributing

License

Libraries

llm_tag_parser package

LLM Tag Parser

Table of Contents

The Problem

The Solution

Quick Start

How It Works

Two APIs for Every Match

Feature Highlights

Streaming Inner Content

Streaming Outer Content

Hierarchical Nesting

Attribute Extraction

Parallel Tag Instances

Chronological Node Stream

Custom Delimiters

Robust Stream Buffering

Complete Example

API Reference

LlmTagParser

LlmTagContent

LlmNode Types

Robustness

LLM Provider Setup

Utilities

XML Tag Utilities (XmlTagUtilities)

Contributing

License

Libraries

llm_tag_parser package

XML Tag Utilities (`XmlTagUtilities`)