Beautiful Soup Dart #

A Dart-native package inspired by the Beautiful Soup 4 Python library. It provides easy ways of navigating, searching, and modifying the HTML tree.

Usage #

A simple usage example:

import 'package:beautiful_soup_dart/beautiful_soup.dart';

/// 1. parse a document String
BeautifulSoup bs = BeautifulSoup(html_doc_string);
// use BeautifulSoup.fragment(html_doc_string) to parse only a fragment of HTML

/// 2. navigate quickly to any element
bs.body!.a!; // navigate quickly by tag name; use outerHtml or toString() to get the outer HTML
bs.find('p', class_: 'story'); // first "p" element whose "class" attribute is "story"
bs.findAll('a', attrs: {'class': true}); // all "a" elements that have a "class" attribute, whatever its value
bs.find('', selector: '#link1'); // find by CSS selector (other parameters are ignored)
bs.find('*', id: 'link1'); // any element with id "link1"
bs.find('*', regex: r'^b'); // any element whose tag name starts with "b", for example: body, b, ...
bs.find('p', string: r'^Article #\d*'); // "p" element whose text starts with "Article #[number]"
bs.find('a', attrs: {'href': 'http://example.com/elsie'}); // find by "href" attribute value

/// 3. perform any other actions for the navigated element
Bs4Element bs4 = bs.body!.p!; // navigate quickly with tags
bs4.name; // get tag name
bs4.string; // get text
bs4.toString(); // get String representation of this element, same as outerHtml
bs4.innerHtml; // get html elements inside the element
bs4.className; // get class attribute value
bs4['class']; // get class attribute value
bs4['class'] = 'board'; // change class attribute value to 'board'
bs4.children; // get all element's children elements
bs4.replaceWith(otherBs4Element); // replace with other element
// ... and many more

Check the test folder for more examples.

Table of Contents #

The unlinked titles are not yet implemented.

Navigating the tree
- Going down
- Going up
  - .parent
  - .parents
- Going sideways
  - .nextSibling and .previousSibling
  - .nextSiblings and .previousSiblings
- Going back and forth
  - .nextElement and .previousElement - return the next/previous Bs4Element
  - .nextElements and .previousElements
  - .nextParsed and .previousParsed - return the next/previous parsed Node of any kind (comments, tags, text); use node.data to get its data as a String
  - .nextParsedAll and .previousParsedAll
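
The navigation getters listed above can be sketched as follows. This is a minimal illustration, not taken from the package's own examples: the HTML snippet is made up, and the null-aware access assumes that .parent and .nextSibling may return null when there is nothing to navigate to.

```dart
import 'package:beautiful_soup_dart/beautiful_soup.dart';

// Hypothetical document used only for this sketch.
const html = '<html><body><p class="story">'
    '<a id="link1" href="http://example.com/elsie">Elsie</a>'
    '<a id="link2" href="http://example.com/lacie">Lacie</a>'
    '</p></body></html>';

void main() {
  final bs = BeautifulSoup(html);
  final first = bs.find('a', id: 'link1');

  // Going up: the direct parent, then the tag names of all ancestors.
  print(first?.parent?.name);
  print(first?.parents.map((e) => e.name).toList());

  // Going sideways: the element that follows within the same parent.
  print(first?.nextSibling?['id']);
}
```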

Searching the tree
- findFirstAny() - returns the topmost (first) element of the parse tree, whatever its tag
- findAll()
- find()
- findParents() and findParent()
- findNextSiblings() and findNextSibling()
- findPreviousSiblings() and findPreviousSibling()
- findAllNextElements() and findNextElement()
- findAllPreviousElements() and findPreviousElement()
- findNextParsedAll() and findNextParsed()
- findPreviousParsedAll() and findPreviousParsed()
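
As a rough sketch of the search methods, the snippet below uses only find(), findAll(), and findFirstAny() with the parameters shown in the Usage section. The HTML is a made-up example, and the null check assumes find() returns null when nothing matches.

```dart
import 'package:beautiful_soup_dart/beautiful_soup.dart';

// Hypothetical document used only for this sketch.
const html = '<html><body><p class="story">'
    '<a class="sister" href="http://example.com/elsie">Elsie</a>'
    '<a class="sister" href="http://example.com/lacie">Lacie</a>'
    '</p></body></html>';

void main() {
  final bs = BeautifulSoup(html);

  // findAll() returns every match; collect the href of each link.
  final hrefs = bs.findAll('a').map((a) => a['href']).toList();
  print(hrefs);

  // find() returns only the first match, so check for null before using it.
  final missing = bs.find('a', class_: 'brother');
  print(missing == null);

  // findFirstAny() returns the topmost element of the parse tree.
  print(bs.findFirstAny()?.name);
}
```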

Modifying the tree
- Changing tag names and attributes
- Modifying .string
- append()
- extend()
- newTag()
- insert()
- insertBefore() and insertAfter()
- clear()
- extract()
- decompose()
- replaceWith()
- wrap()
- unwrap()
- smooth()
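
A small sketch of in-place modification, limited to attribute assignment (shown in the Usage section) and decompose(). It assumes decompose() removes the element from the tree as its Beautiful Soup 4 counterpart does; the HTML is a made-up example.

```dart
import 'package:beautiful_soup_dart/beautiful_soup.dart';

// Hypothetical document used only for this sketch.
const html = '<html><body>'
    '<p class="story">Once <b>upon</b> a time</p>'
    '</body></html>';

void main() {
  final bs = BeautifulSoup(html);
  final p = bs.find('p')!;

  // Change an attribute value through the index operator.
  p['class'] = 'board';

  // Remove the nested <b> element from the tree (assumed BS4-like behavior).
  p.find('b')?.decompose();

  // Inspect the result.
  print(p.outerHtml);
}
```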

Output
- prettify() - partial support
- .text and getText()
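
The output helpers can be tried out with a sketch like the one below. The HTML is a made-up example, getText() is called without arguments, and since prettify() is only partially supported its exact formatting is not shown here.

```dart
import 'package:beautiful_soup_dart/beautiful_soup.dart';

// Hypothetical document used only for this sketch.
const html = '<html><body><p>Hello <b>world</b></p></body></html>';

void main() {
  final bs = BeautifulSoup(html);
  final p = bs.find('p');

  // Text content of the element and its descendants, without markup.
  print(p?.text);
  print(p?.getText());

  // Indented representation of the whole document (partial support).
  print(bs.prettify());
}
```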

Other methods of the Element class from the html package can be accessed via bs4element.element.

Features and bugs #

Please file feature requests and bugs at the issue tracker or feel free to raise a PR.

beautiful_soup_dart 0.3.0