html_parser_plus 0.1.0

Parse HTML documents using custom rules.

Html Parser #

This package parses HTML documents into nodes and strings.

Install #

flutter pub add html_parser_plus

Getting started #

import 'package:html_parser_plus/html_parser_plus.dart';

void main() {
  const String htmlString = '''
      <html lang="en">
      <body>
      <div><a href='https://github.com/simonkimi'>author</a></div>
      <div class="head">div head</div>
      <div class="container">
          <table>
              <tbody>
                <tr>
                    <td id="td1" class="first1">1</td>
                    <td id="td2" class="first1">2</td>
                    <td id="td3" class="first2">3</td>
                    <td id="td4" class="first2 form">4</td>

                    <td id="td5" class="second1">one</td>
                    <td id="td6" class="second1">two</td>
                    <td id="td7" class="second2">three</td>
                    <td id="td8" class="second2">four</td>
                </tr>
              </tbody>
          </table>
      </div>
      <div class="end">end</div>
      </body>
      </html>
      ''';
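  final parser = HtmlParser();
  // Parse the HTML string into a node that rules are evaluated against.
  var node = parser.parse(htmlString);
  // Query the text of the link inside a div.
  parser.query(node, '//div/a@text');
  // Query the link's href, remove the "https://" prefix, then apply substring(0,10).
  parser.query(
    node,
    '//div/a/@href|function:replace(https://,)|function:substring(0,10)',
  );
  // Select the td nodes and apply sublist(0,2) to the resulting list.
  parser.queryNodes(node, '//tr/td|function:sublist(0,2)');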

  const String jsonString = '''
      {"author":"Cals Ranna","website":"https://github.com/CalsRanna","books":[{"name":"Hello"},{"name":"World"},{"name":"!"}]}
      ''';

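  // The same parser also accepts a JSON string, queried with jsonPath rules.
  node = parser.parse(jsonString);
  // Query the "author" field.
  parser.query(node, r'$.author');
  // Query the "website" field, remove the "https://" prefix, then apply substring(0,10).
  parser.query(
    node,
    r'$.website|function:replace(https://,)|function:substring(0,10)',
  );
  // Select the "books" entries and apply sublist(0,2) to the resulting list.
  parser.queryNodes(node, r'$.books|function:sublist(0,2)');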
}


Usage #

So far, the package supports:

  • a subset of XPath syntax, via xpath_selector
  • almost all jsonPath syntax, via json_path

It also provides the five functions listed below:

  • sublist for List<HtmlParserNode>
  • replace for String
  • replaceRegExp for String
  • substring for String
  • trim for String

Function parameters may be wrapped in ' or ", or left unquoted.

A rule looks like //div/a@text|function:replace(Author,)|function:replace(' ','').

Use | to chain the steps of a rule: a selector followed by any number of functions.
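
To make the pipe and quoting conventions concrete, here is a minimal, self-contained sketch in plain Dart. It is not the package's implementation: FunctionCall, stripQuotes, parseFunction and applyFunctions are hypothetical names introduced for illustration, only replace and substring are modelled, and it assumes that replace replaces every occurrence. The split on | and on , is deliberately naive, so a | or , inside an argument is not handled.

```dart
/// A parsed `function:name(args)` step (hypothetical type, for illustration).
class FunctionCall {
  FunctionCall(this.name, this.args);
  final String name;
  final List<String> args;
}

/// Removes one pair of matching ' or " around [value], if present.
String stripQuotes(String value) {
  if (value.length >= 2) {
    final first = value[0];
    final last = value[value.length - 1];
    if (first == last && (first == "'" || first == '"')) {
      return value.substring(1, value.length - 1);
    }
  }
  return value;
}

/// Parses a step such as `function:replace(Author,)`.
FunctionCall parseFunction(String step) {
  final match = RegExp(r'^function:(\w+)\((.*)\)$').firstMatch(step);
  if (match == null) {
    throw FormatException('Not a function step', step);
  }
  // Naive split: a comma inside a quoted argument is not supported here.
  final args = match.group(2)!.split(',').map(stripQuotes).toList();
  return FunctionCall(match.group(1)!, args);
}

/// Applies each function step to [input] in order, left to right.
String applyFunctions(String input, List<String> steps) {
  var value = input;
  for (final step in steps) {
    final call = parseFunction(step);
    switch (call.name) {
      case 'replace':
        // Assumption: every occurrence is replaced.
        value = value.replaceAll(call.args[0], call.args[1]);
        break;
      case 'substring':
        final start = int.parse(call.args[0]);
        final end = int.parse(call.args[1]);
        // Clamp the end index so short strings do not throw.
        value = value.substring(start, end > value.length ? value.length : end);
        break;
      default:
        throw UnsupportedError('Unknown function: ${call.name}');
    }
  }
  return value;
}

void main() {
  const rule =
      "//div/a@text|function:replace(Author,)|function:replace(' ','')";
  final parts = rule.split('|');
  // The first part is the selector; the rest are function steps.
  print(parts.first); // //div/a@text
  // Pretend the selector matched the text "Author Cals Ranna".
  print(applyFunctions('Author Cals Ranna', parts.skip(1).toList())); // CalsRanna
}
```

In this sketch the unquoted call replace(Author,) and the quoted call replace(' ','') go through the same path: quotes are stripped only when they wrap an argument, which is what lets a single space be passed as a parameter.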