html5lib 0.0.8

discontinued
outdated
Dart 1 only

library for working with HTML documents

html5lib in Pure Dart

This is a pure Dart HTML5 parser, ported from the Python html5lib. Because it is 100% Dart, you can use it safely from a script or a server-side app.

Eventually the parse tree API will be compatible with dart:html, so the same code will work on the client or the server.

Installation

Add this to your pubspec.yaml (or create it):

dependencies:
  html5lib: any

Then run the Pub Package Manager (comes with the Dart SDK):

pub install

Usage

Parsing HTML is easy!

import 'package:html5lib/html5parser.dart' as html5parser;
import 'package:html5lib/dom.dart';

main() {
  // The <body> and <a> tags are left unclosed; an HTML5 parser closes them implicitly.
  var document = html5parser.parse(
    '<body>Hello world! <a href="www.html5rocks.com">HTML5 rocks!');
  print(document.outerHTML);
}

You can pass a String, a RandomAccessFile, or a list of bytes to parse. There's also parseFragment for parsing a document fragment, and HTMLParser if you want more low-level control.

Updating

You can upgrade the library with:

pub update

Disclaimer: the APIs are not finished, so updating may break your code. If that happens, check the commit log to figure out what changed.

If you want to avoid breakage, you can put a version constraint in your pubspec.yaml in place of the word any.
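
As a rough sketch of what such a constraint might look like, the fragment below swaps any for a range that accepts 0.0.8 and later 0.0.x releases but excludes 0.1.0. The bounds are illustrative assumptions, not a recommendation from the package authors; choose the versions you have actually tested against.

```yaml
dependencies:
  # Illustrative range (an assumption): allow 0.0.8 and later 0.0.x releases only.
  html5lib: ">=0.0.8 <0.1.0"
```

A narrower range trades automatic fixes for stability; pub update then stays within the stated bounds.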

Implementation Status

Right now the tokenizer, html5parser, and simpletree are working.

These files from the html5lib directory still need to be ported:

  • ihatexml.py
  • sanitizer.py
  • filters/*
  • serializer/*
  • some of treebuilders/*
  • treewalkers/*
  • the tests corresponding to the above files

Running Tests

All tests should be passing.

# Make sure dependencies are installed
pub install

# Run command line tests
#export DART_SDK=path/to/dart/sdk
test/run.sh

Publisher

labs.dart.dev (verified publisher)

License

unknown (LICENSE)

Dependencies

args, logging, unittest
