girasol library
Classes
- AfterAfterBodyPhase
- AfterAfterFramesetPhase
- AfterBodyPhase
- AfterFramesetPhase
- AfterHeadPhase
- BeforeHeadPhase
- BeforeHtmlPhase
- BrowserHttpClient
- A browser-like HTTP client that automatically sets appropriate headers
- CrawlDocument
- CrawlRequest
- Represents an HTTP request
- CrawlResponse
- Represents an HTTP response with its request
- CsvItem
- Required for items that are processed by the CSV pipeline
- CSVPipeline<Item extends CsvItem>
- The CSV pipeline is able to receive an item and save it into a CSV file
- FileDocument
- FileItem
- FilePipeline<Item extends FileItem>
- Girasol
- It's the orchestrator of the crawlers.
- HTMLDocument
- HtmlParser
- Parser for HTML, which generates a tree structure from a stream of (possibly malformed) characters.
- InBodyPhase
- InCaptionPhase
- InCellPhase
- InColumnGroupPhase
- InForeignContentPhase
- InFramesetPhase
- InHeadPhase
- InitialPhase
- InRowPhase
- InSelectInTablePhase
- InSelectPhase
- InTableBodyPhase
- InTablePhase
- InTableTextPhase
- JsonDocument
- JsonItem
- Required for items that are processed by the JSON pipeline
- JSONPipeline<Item extends JsonItem>
- The JSON pipeline is able to receive item data and save it into an outputFile
- ParsedData<T>
- A result containing data extracted from a page
- ParsedEmpty
- ParsedLink
- A result containing a discovered link to crawl
- ParseResult
- Base class for crawling results
- Phase
- Base class for helper objects that implement each phase of processing.
- Pipeline<InputType>
- A pipeline is used to interact with the scraped data
- StaticWebCrawler<ParsedItem>
- The static web crawler is a crawler that is based on HTTP requests only. It is light and fast, but it cannot run JavaScript.
- TextDocument
- TextPhase
- WebCrawler<ParsedItem>
- XmlAttribute
- XML attribute node.
- XmlBuilder
- A builder to create XML trees with code.
- XmlCDATA
- XML CDATA node.
- XmlComment
- XML comment node.
- XmlDeclaration
- XML document declaration.
- XmlDefaultEntityMapping
- Default entity mapping for XML, HTML, and HTML5 entities.
- XmlDoctype
- XML doctype node.
- XmlDocument
- XML document node.
- XmlDocumentFragment
- XML document fragment node.
- XmlElement
- XML element node.
- XmlEntityMapping
- Describes the decoding and encoding of character entities.
- XmlItem
- Required for items that are processed by the XML pipeline
- XmlName
- XML entity name.
- XmlNode
- Immutable abstract XML node.
- XmlNullEntityMapping
- Entity mapping that skips all entity conversion, both on decoding and encoding input.
- XMLPipeline<Item extends XmlItem>
- The XML pipeline is able to receive an item and save it into an XML file.
- XmlPrettyWriter
- A visitor that writes XML nodes correctly indented and with whitespaces adapted.
- XmlProcessing
- XML processing instruction.
- XmlText
- XML text node.
- XmlToken
- Shared tokens for XML reading and writing.
- XmlWriter
- A visitor that writes XML nodes exactly as they were parsed.
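The XML classes listed above can be combined to build and serialize a small tree. The following is a minimal sketch, not the library's documented usage. It assumes the library is imported as `package:girasol/girasol.dart` and that `XmlBuilder`, `XmlDocument`, and `XmlFindExtension` behave like their counterparts in the Dart `xml` package. The `buildItems` helper and the element names are hypothetical.

```dart
// Assumption: the public API is exposed from this import path.
import 'package:girasol/girasol.dart';

/// Hypothetical helper: builds <items><item id="0">...</item>...</items>.
XmlDocument buildItems(List<String> values) {
  final builder = XmlBuilder();
  builder.element('items', nest: () {
    for (var i = 0; i < values.length; i++) {
      builder.element('item', nest: () {
        builder.attribute('id', '$i');
        builder.text(values[i]);
      });
    }
  });
  return builder.buildDocument();
}

void main() {
  final document = buildItems(['first', 'second']);

  // pretty: true requests indented output; the default is compact output.
  print(document.toXmlString(pretty: true));
  print(document.toXmlString());

  // findAllElements walks all descendants with the given name.
  for (final item in document.findAllElements('item')) {
    print(item.getAttribute('id'));
  }
}
```

Building through `XmlBuilder` rather than string concatenation leaves escaping of text and attribute values to the serializer.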
Enums
- XmlAttributeType
- Enum of the attribute quote types.
- XmlNodeType
- Enum of the different XML node types.
Mixins
- XmlFormatException
- Mixin for exceptions that follow the FormatException of Dart.
- XmlHasAttributes
- Mixin for nodes with attributes.
- XmlHasChildren<T extends XmlNode>
- Mixin for nodes with children.
- XmlHasName
- Mixin for all nodes with a name.
- XmlHasParent<T extends XmlNode>
- Mixin for nodes with a parent.
- XmlHasVisitor
- Mixin for classes that can be visited using an XmlVisitor.
- XmlHasWriter
- Mixin to serialize XML to a StringBuffer.
- XmlVisitor
- Basic visitor over XmlHasVisitor nodes.
Extensions
- XmlAncestorsExtension on XmlNode
- XmlComparisonExtension on XmlNode
- XmlDescendantsExtension on XmlNode
- XmlFindExtension on XmlNode
- XmlFollowingExtension on XmlNode
- XmlMutatorExtension on XmlNode
- XmlNodesExtension on XmlNode
- XmlNormalizerExtension on XmlNode
- XmlParentExtension on XmlNode
- XmlPrecedingExtension on XmlNode
- XmlSiblingExtension on XmlNode
- XmlStringExtension on XmlNode
Properties
- defaultEntityMapping ↔ XmlEntityMapping
- The entity mapping used when nothing else is specified (getter/setter pair).
Functions
- getElementNameTuple(Element e) → (String, String?)
- Convenience function to get the pair of namespace and localName.
- parse(dynamic input, {String? encoding, bool generateSpans = false, String? sourceUrl}) → Document
- Parse an html5 document into a tree.
- parseFragment(dynamic input, {String container = 'div', String? encoding, bool generateSpans = false, String? sourceUrl}) → DocumentFragment
- Parse an html5 document fragment into a tree.
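As a rough illustration of the two parsing functions above, the sketch below parses a full document and a fragment. It assumes the import path `package:girasol/girasol.dart` and that the returned `Document` and `DocumentFragment` offer `querySelectorAll` and `nodes` as in the Dart `html` package. The helper names are hypothetical.

```dart
// Assumption: the public API is exposed from this import path.
import 'package:girasol/girasol.dart';

/// Hypothetical helper: counts <p> elements in a full HTML document.
int countParagraphs(String html) {
  // parse() tolerates malformed input such as unclosed <p> tags.
  final document = parse(html);
  return document.querySelectorAll('p').length;
}

/// Hypothetical helper: counts the top-level nodes of a fragment that is
/// parsed as if it were the content of a <ul> element.
int countListNodes(String html) {
  final fragment = parseFragment(html, container: 'ul');
  return fragment.nodes.length;
}

void main() {
  print(countParagraphs('<p>Hello<p>World'));
  print(countListNodes('<li>one</li><li>two</li>'));
}
```

The `container` argument matters because HTML5 parsing rules depend on the surrounding element; the default is `'div'`, as shown in the signature.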
Typedefs
- TokenProccessor = Token Function(Token token)
Exceptions / Errors
- ParseError
- Error in parsed document.
- XmlException
- Abstract exception class.
- XmlNodeTypeException
- Exception thrown when an unsupported node type is used.
- XmlParentException
- Exception thrown when the parent relationship between nodes is invalid.
- XmlParserException
- Exception thrown when parsing of an XML document fails.
- XmlTagException
- Exception thrown when the end tag does not match the open tag.
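The XML exceptions above share the abstract `XmlException` base, so a caller can catch that single type when it does not need to tell the failures apart. The following is a minimal sketch, assuming the import path `package:girasol/girasol.dart` and that `XmlDocument.parse` and `XmlException.message` behave as in the Dart `xml` package. The `xmlError` helper is hypothetical.

```dart
// Assumption: the public API is exposed from this import path.
import 'package:girasol/girasol.dart';

/// Hypothetical helper: returns null when [input] parses as XML,
/// otherwise the message of the exception that was thrown.
String? xmlError(String input) {
  try {
    XmlDocument.parse(input);
    return null;
  } on XmlException catch (error) {
    // Covers XmlParserException and XmlTagException alike.
    return error.message;
  }
}

void main() {
  print(xmlError('<a/>'));
  print(xmlError('<a></b>'));
}
```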