html5lib 0.0.3

Discontinued. Outdated: Dart 1 only.

library for working with HTML documents

html5lib in Pure Dart

This is a pure Dart HTML5 parser. It's a port of html5lib from Python. Since it's 100% Dart, you can use it safely from a script or server-side app.

Eventually the parse tree API will be compatible with dart:html, so the same code will work on the client or the server.

Installation

Add this to your pubspec.yaml (or create it):

dependencies:
  html5lib:
    git: https://github.com/dart-lang/html5lib.git

Then run the Pub Package Manager (comes with the Dart SDK):

pub install

Usage

Parsing HTML is easy!

#import('package:html5lib/html5parser.dart', prefix: 'html5parser');

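main() {
  // The markup leaves <a> and <body> unclosed; an HTML5 parser recovers
  // from this and still builds a complete tree.
  var document = html5parser.parse(
    '<body>Hello world! <a href="http://www.html5rocks.com">HTML5 rocks!');
  // Print the serialized document.
  print(document.outerHTML);
}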

You can pass a String, RandomAccessFile, or list of bytes to parse. There's also parseFragment for parsing a document fragment, and HTMLParser if you want lower-level control. Finally, you can import the simple DOM tree types like this:

#import('package:html5lib/treebuilders/simpletree.dart');
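
The other inputs mentioned above can be sketched roughly as follows. This is an untested illustration, not a statement of the library's exact API: it assumes that parse and parseFragment each accept the input as their only required argument, and that the values they return can be printed directly. It uses the same Dart 1 era import syntax as the example above.

```dart
#import('package:html5lib/html5parser.dart', prefix: 'html5parser');

main() {
  // A list of bytes instead of a String: the ASCII codes for "<p>Hi</p>".
  var bytes = [60, 112, 62, 72, 105, 60, 47, 112, 62];
  var document = html5parser.parse(bytes);
  print(document.outerHTML);

  // parseFragment is meant for markup that is not a whole document.
  // Assumption: optional parameters, if any, can be left at their defaults.
  var fragment = html5parser.parseFragment('<li>one</li><li>two</li>');
  // Relies only on toString(); the exact output format is not assumed here.
  print(fragment);
}
```

If the signatures differ in your checkout, the source of html5parser.dart is the place to confirm them.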

Updating

You can upgrade the library with:

pub update

Disclaimer: the APIs are not finished. Updating may break your code. If that happens, you can check the commit log to figure out what changed.

Implementation Status

Right now the tokenizer, html5parser, and simpletree are working.

These files from the Python html5lib directory still need to be ported:

  • ihatexml.py
  • sanitizer.py
  • filters/*
  • serializer/*
  • most of treebuilders/*
  • treewalkers/*
  • most of tests

Running Tests

All tests should be passing.

# Make sure dependencies are installed
pub install

# Run command line tests
#export DART_SDK=path/to/dart/sdk
tests/run.sh

Publisher

labs.dart.dev (verified publisher)

License

unknown

Dependencies

args, logging, unittest
