BeautifulSoup class

Beautiful Soup is a library for pulling data out of HTML files. It provides ways of navigating, searching, and modifying the parse tree. It commonly saves programmers hours or days of work.

How should it be used? In 3 easy steps.

1. parse a document

BeautifulSoup bs = BeautifulSoup(html_doc_string);
BeautifulSoup bs = BeautifulSoup.fragment(html_doc_string); // if it is just a part of html

2. navigate quickly to any element

Bs4Element? bs4 = bs.body?.p; // quickly with tags (each shortcut may return null)
Bs4Element? bs4 = bs.find('p', class_: 'story'); // finds the first "p" element whose "class" attribute has the value "story"
List<Bs4Element> all = bs.findAll('a', attrs: {'class': true}); // finds all "a" elements that define a "class" attribute, whatever its value
Bs4Element? bs4 = bs.find('', selector: '#link1'); // find with a custom CSS selector (other parameters are ignored)
Bs4Element? bs4 = bs.find('*', id: 'link1'); // find by id
Bs4Element? bs4 = bs.find('*', regex: r'^b'); // find any element whose tag name starts with "b", for example: body, b, ...
Bs4Element? bs4 = bs.find('p', string: r'^Article #\d*'); // find a "p" element whose text starts with "Article #[number]"
Bs4Element? bs4 = bs.find('a', attrs: {'href': 'http://example.com/elsie'}); // find by "href" attribute

3. perform any actions

bs4.name; // get tag name
bs4.string; // get text
bs4.toString(); // get String representation of this element, same as outerHtml
bs4.innerHtml; // get html elements inside the element
bs4.className; // get class attribute value
bs4['class']; // get class attribute value
bs4['class'] = 'board'; // change class attribute value to 'board'
bs4.children; // get all of the element's child elements
bs4.replaceWith(otherBs4Element); // replace with other element

and many more!
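
The three steps above can be combined into one small program. This is a minimal sketch, assuming the package is imported as `package:beautiful_soup_dart/beautiful_soup.dart` (the import path is not stated on this page). The sample markup and the helper name `sisterLinks` are made up for illustration.

```dart
import 'package:beautiful_soup_dart/beautiful_soup.dart'; // assumed import path

// Hypothetical sample markup, used only for illustration.
const htmlDoc = '''
<html><head><title>Sample story</title></head>
<body>
<p class="story">Once upon a time there were two little sisters:
<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a> and
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a>.</p>
</body></html>
''';

/// Returns the href of every <a> element that defines a class attribute.
List<String?> sisterLinks(String html) {
  // 1. Parse the document.
  final bs = BeautifulSoup(html);
  // 2. Navigate: findAll returns a (possibly empty) list.
  final anchors = bs.findAll('a', attrs: {'class': true});
  // 3. Act on each element: read its href attribute.
  return anchors.map((a) => a['href']).toList();
}

void main() {
  final bs = BeautifulSoup(htmlDoc);
  // find returns null when nothing matches, so guard the access with ?.
  print(bs.find('p', class_: 'story')?.name);
  print(sisterLinks(htmlDoc));
}
```

Because `find` and the tag shortcuts are nullable, the sketch uses `?.` wherever a lookup could come back empty instead of assuming a match.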

Constructors

BeautifulSoup(String html_doc)
Parses html_doc as a complete HTML document.
BeautifulSoup.fragment(String html_doc)
Parses html_doc as a fragment, for input that is only a part of an HTML document.

Properties

a Bs4Element?
Returns the first occurrence of this tag down the parse tree.
no setter, inherited
b Bs4Element?
Returns the first occurrence of this tag down the parse tree.
no setter, inherited
body Bs4Element?
Returns the first occurrence of this tag down the parse tree.
no setter, inherited
dl Bs4Element?
Returns the first occurrence of this tag down the parse tree.
no setter, inherited
doc ↔ dynamic
Returns Document or DocumentFragment, based on what parser was used with the BeautifulSoup constructor.
getter/setter pair, inherited
element ↔ Element?
getter/setter pair, inherited
h1 Bs4Element?
Returns the first occurrence of this tag down the parse tree.
no setter, inherited
h2 Bs4Element?
Returns the first occurrence of this tag down the parse tree.
no setter, inherited
h3 Bs4Element?
Returns the first occurrence of this tag down the parse tree.
no setter, inherited
h4 Bs4Element?
Returns the first occurrence of this tag down the parse tree.
no setter, inherited
h5 Bs4Element?
Returns the first occurrence of this tag down the parse tree.
no setter, inherited
h6 Bs4Element?
Returns the first occurrence of this tag down the parse tree.
no setter, inherited
hashCode int
The hash code for this object.
no setter, inherited
Returns the first occurrence of this tag down the parse tree.
no setter, inherited
html Bs4Element?
Returns the first occurrence of this tag down the parse tree.
no setter, inherited
i Bs4Element?
Returns the first occurrence of this tag down the parse tree.
no setter, inherited
img Bs4Element?
Returns the first occurrence of this tag down the parse tree.
no setter, inherited
ol Bs4Element?
Returns the first occurrence of this tag down the parse tree.
no setter, inherited
p Bs4Element?
Returns the first occurrence of this tag down the parse tree.
no setter, inherited
runtimeType Type
A representation of the runtime type of the object.
no setter, inherited
table Bs4Element?
Returns the first occurrence of this tag down the parse tree.
no setter, inherited
text String
Returns the text of an element.
no setter, inherited
title Bs4Element?
Returns the first occurrence of this tag down the parse tree.
no setter, inherited
ul Bs4Element?
Returns the first occurrence of this tag down the parse tree.
no setter, inherited
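
Every tag shortcut above is typed `Bs4Element?`, so it yields null when the document contains no such tag. The following sketch shows one way to handle that. The import path and the helper name `pageTitle` are assumptions for illustration, not part of this page.

```dart
import 'package:beautiful_soup_dart/beautiful_soup.dart'; // assumed import path

/// Returns the text of the first <title> element, or a fallback when
/// the document has none. `pageTitle` is a hypothetical helper.
String pageTitle(String html) {
  final bs = BeautifulSoup(html);
  // `title` is nullable: use ?. and ?? rather than assuming the tag exists.
  return bs.title?.string ?? '(no title)';
}

void main() {
  print(pageTitle('<html><head><title>Docs</title></head><body></body></html>'));
  print(pageTitle('<p>No title here</p>'));
}
```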

Methods

find(String name, {String? id, String? class_, Map<String, Object>? attrs, Pattern? regex, Pattern? string, String? selector}) Bs4Element?
Looks through a tag’s descendants and retrieves the first descendant that matches your filters.
inherited
findAll(String name, {String? id, String? class_, Map<String, Object>? attrs, Pattern? regex, Pattern? string, String? selector, int? limit}) List<Bs4Element>
Looks through a tag’s descendants and retrieves all descendants that match your filters.
inherited
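
As a rough illustration of how the `find` and `findAll` filters combine, the sketch below restricts a search by class, caps it with `limit`, and looks an element up by id. The import path, the sample markup, and the helper names `linkTexts` and `tagNameForId` are assumptions made for this example.

```dart
import 'package:beautiful_soup_dart/beautiful_soup.dart'; // assumed import path

// Hypothetical sample markup, used only for illustration.
const sampleHtml = '''
<p class="story">
<a class="sister" id="link1" href="/elsie">Elsie</a>
<a class="sister" id="link2" href="/lacie">Lacie</a>
<a class="sister" id="link3" href="/tillie">Tillie</a>
</p>
''';

/// Returns the text of at most [limit] <a> elements whose class is [cssClass].
List<String> linkTexts(String html, String cssClass, {int? limit}) {
  final bs = BeautifulSoup(html);
  return bs
      .findAll('a', class_: cssClass, limit: limit)
      .map((a) => a.getText())
      .toList();
}

/// Returns the tag name of the element with the given id, or null if absent.
String? tagNameForId(String html, String id) {
  final bs = BeautifulSoup(html);
  // '*' matches any tag name; find returns null when nothing matches.
  return bs.find('*', id: id)?.name;
}

void main() {
  print(linkTexts(sampleHtml, 'sister', limit: 2));
  print(tagNameForId(sampleHtml, 'link1'));
  print(tagNameForId(sampleHtml, 'missing'));
}
```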
findAllNextElements(String name, {String? id, String? class_, Map<String, Object>? attrs, Pattern? regex, Pattern? string, String? selector, int? limit}) List<Bs4Element>
These methods use nextElements to iterate over elements that come after it in the document.
inherited
findAllPreviousElements(String name, {String? id, String? class_, Map<String, Object>? attrs, Pattern? regex, Pattern? string, String? selector, int? limit}) List<Bs4Element>
These methods use previousElements to iterate over the tags and strings that came before it in the document.
inherited
findFirstAny() Bs4Element?
Returns the top most (first) element of the parse tree, of any tag type.
inherited
findNextElement(String name, {String? id, String? class_, Map<String, Object>? attrs, Pattern? regex, Pattern? string, String? selector}) Bs4Element?
These methods use nextElements to iterate over elements that come after it in the document.
inherited
findNextParsed({RegExp? pattern, int? nodeType}) Node?
These methods use nextParsed to iterate over the tags, comments, strings, etc. that came after it in the document.
inherited
findNextParsedAll({RegExp? pattern, int? nodeType, int? limit}) List<Node>
These methods use nextParsed to iterate over the tags, comments, strings, etc. that came after it in the document.
inherited
findNextSibling(String name, {String? id, String? class_, Map<String, Object>? attrs, Pattern? regex, Pattern? string, String? selector}) Bs4Element?
These methods use nextSiblings to iterate over the rest of an element’s siblings in the tree.
inherited
findNextSiblings(String name, {String? id, String? class_, Map<String, Object>? attrs, Pattern? regex, Pattern? string, String? selector, int? limit}) List<Bs4Element>
These methods use nextSiblings to iterate over the rest of an element’s siblings in the tree.
inherited
findParent(String name, {String? id, String? class_, Map<String, Object>? attrs, Pattern? regex, Pattern? string, String? selector}) Bs4Element?
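findAll and find work their way down the tree, looking at a tag’s descendants. This method does the opposite: it works its way up the tree and returns the first parent that matches your filters.
inherited
findParents(String name, {String? id, String? class_, Map<String, Object>? attrs, Pattern? regex, Pattern? string, String? selector, int? limit}) List<Bs4Element>
findAll and find work their way down the tree, looking at a tag’s descendants. This method does the opposite: it works its way up the tree and returns all parents that match your filters.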
inherited
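
The sideways and upward searches listed above start from an element that has already been located. The sketch below assumes that `Bs4Element` exposes the same `findParent` and `findNextSibling` methods as this class, as the "inherited" labels suggest, though this page does not confirm it. The import path and helper names are also hypothetical.

```dart
import 'package:beautiful_soup_dart/beautiful_soup.dart'; // assumed import path

/// Finds the element with [childId], then walks up the tree to the
/// nearest enclosing [parentTag]. Returns its tag name, or null.
String? enclosingTag(String html, String childId, String parentTag) {
  final bs = BeautifulSoup(html);
  final child = bs.find('*', id: childId);
  // Assumption: Bs4Element offers findParent with the signature shown above.
  return child?.findParent(parentTag)?.name;
}

/// Returns the id of the next <a> sibling after the element with [id].
String? nextAnchorId(String html, String id) {
  final bs = BeautifulSoup(html);
  final start = bs.find('a', id: id);
  // Assumption: Bs4Element offers findNextSibling as well.
  return start?.findNextSibling('a')?['id'];
}

void main() {
  const html = '<div><p><a id="link1">Elsie</a> <a id="link2">Lacie</a></p></div>';
  print(enclosingTag(html, 'link1', 'div'));
  print(nextAnchorId(html, 'link1'));
}
```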
findPreviousElement(String name, {String? id, String? class_, Map<String, Object>? attrs, Pattern? regex, Pattern? string, String? selector}) Bs4Element?
These methods use previousElements to iterate over the tags and strings that came before it in the document.
inherited
findPreviousParsed({RegExp? pattern, int? nodeType}) Node?
These methods use previousParsed to iterate over the tags, comments, strings, etc. that came before it in the document.
inherited
findPreviousParsedAll({RegExp? pattern, int? nodeType, int? limit}) List<Node>
These methods use previousParsed to iterate over the tags, comments, strings, etc. that came before it in the document.
inherited
findPreviousSibling(String name, {String? id, String? class_, Map<String, Object>? attrs, Pattern? regex, Pattern? string, String? selector}) Bs4Element?
These methods use previousSiblings to iterate over an element’s siblings that precede it in the tree.
inherited
findPreviousSiblings(String name, {String? id, String? class_, Map<String, Object>? attrs, Pattern? regex, Pattern? string, String? selector, int? limit}) List<Bs4Element>
These methods use previousSiblings to iterate over an element’s siblings that precede it in the tree.
inherited
getText({String separator = '', bool strip = false}) String
Returns the text of an element.
inherited
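
A short sketch of `getText` with its two named parameters follows. The exact whitespace in the result depends on the implementation, so no output is shown. The import path and the helper name `flatten` are assumptions for illustration.

```dart
import 'package:beautiful_soup_dart/beautiful_soup.dart'; // assumed import path

/// Collects the document's text, joining the pieces with [separator]
/// and stripping surrounding whitespace from each piece.
String flatten(String html, {String separator = ' '}) {
  final bs = BeautifulSoup(html);
  return bs.getText(separator: separator, strip: true);
}

void main() {
  const html = '<p> Hello <b> world </b></p>';
  print(flatten(html));
  print(flatten(html, separator: '|'));
}
```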
noSuchMethod(Invocation invocation) dynamic
Invoked when a nonexistent method or property is accessed.
inherited
prettify() String
The method will turn a BeautifulSoup parse tree into a nicely formatted String, with a separate line for each tag and each string.
inherited
toString() String
A string representation of this object.
override

Operators

operator ==(Object other) bool
The equality operator.
inherited

Static Methods

newTag(String? name, {Map<String, String>? attrs, String? string}) Bs4Element
Creates a new Bs4Element.
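
As an illustration, `newTag` can build an element that is then swapped into a parsed tree with `replaceWith`. This is a sketch under stated assumptions: the import path and the helper name `makeLink` are hypothetical, and the printed markup is not shown because its exact serialization is not documented on this page.

```dart
import 'package:beautiful_soup_dart/beautiful_soup.dart'; // assumed import path

/// Builds an <a> element pointing at [url] with [label] as its text.
/// `makeLink` is a hypothetical helper, not part of the library.
Bs4Element makeLink(String url, String label) =>
    BeautifulSoup.newTag('a', attrs: {'href': url}, string: label);

void main() {
  final bs = BeautifulSoup('<p>See <b>this page</b> for details.</p>');
  final link = makeLink('http://example.com/', 'this page');
  // Replace the first <b> element, if any, with the new link.
  bs.find('b')?.replaceWith(link);
  print(bs.toString());
}
```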