reader_mode library

A Dart port of Mozilla's Readability.js content extraction library.

This library extracts the main readable content from a web page, stripping away navigation, ads, and other non-content elements.

Licensing

This library is licensed under the Apache License 2.0, except for jsdom_parser.dart, which is licensed under the Mozilla Public License v2.0. See the LICENSE file for details.

Usage

import 'package:reader_mode/reader_mode.dart';

// Simple usage with parse() function
final article = parse(htmlString, baseUri: 'https://example.com');
print(article?.title);
print(article?.content);

// With options
final tunedArticle = parse(
  htmlString,
  parser: ParserType.html,  // Use html package instead of JSDOMParser
  charThreshold: 1000,
  keepClasses: true,
);
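
Because parse() returns a nullable Article, callers may want to handle the case where nothing was extracted. The following is a minimal sketch of that, not a prescribed pattern: buildPreview is a hypothetical helper, the sample HTML is a placeholder, and title and content are assumed to be printable values as in the usage above.

```dart
import 'package:reader_mode/reader_mode.dart';

/// Hypothetical helper: builds a short plain-text preview from an optional
/// title and content. Not part of reader_mode.
String buildPreview(Object? title, Object? content, {int maxChars = 80}) {
  final heading = title?.toString() ?? '(untitled)';
  final body = content?.toString() ?? '';
  // Truncate long content and mark the cut with an ellipsis.
  final snippet =
      body.length <= maxChars ? body : '${body.substring(0, maxChars)}...';
  return '$heading\n$snippet';
}

void main() {
  // Placeholder input; substitute a real page for meaningful output.
  const htmlString =
      '<html><head><title>Demo</title></head>'
      '<body><article><p>Some article text.</p></article></body></html>';

  final article = parse(htmlString, baseUri: 'https://example.com');
  if (article == null) {
    // parse() is declared to return Article?, so null must be handled.
    print('No article could be extracted.');
    return;
  }
  print(buildPreview(article.title, article.content));
}
```

With very short input like this, extraction may return null under the default charThreshold, so the fallback branch is worth keeping.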

Quick Readability Check

Before parsing, you can check if a page is likely readable:

import 'package:html/parser.dart' as html;
import 'package:reader_mode/reader_mode.dart';

final document = html.parse(htmlString);
if (isProbablyReaderable(document)) {
  final article = parse(htmlString);
}

Dual Parser Support

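This library supports two parsers, selected via the ParserType enum: ParserType.jsdom, the default for parse(), which uses the bundled JSDOMParser, and ParserType.html, which uses the html package.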
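
As a rough sketch of switching between parsers, the snippet below runs one HTML string through every ParserType value and prints the extracted title for each. The sample HTML, the describeParser helper, and its label wording are illustrative assumptions and not part of the library. With so little text, the result may well be null under the default charThreshold.

```dart
import 'package:reader_mode/reader_mode.dart';

/// Hypothetical helper: a human-readable label for a parser choice.
/// The label wording is this sketch's own, not taken from the library.
String describeParser(ParserType type) =>
    type == ParserType.jsdom ? 'JSDOMParser' : 'html package';

void main() {
  // Placeholder input; substitute a real page for meaningful output.
  const htmlString =
      '<html><head><title>Demo</title></head>'
      '<body><article><p>Some article text.</p></article></body></html>';

  for (final type in ParserType.values) {
    // parse() returns Article?, so the result may be null.
    final article = parse(htmlString, parser: type);
    print('${describeParser(type)}: ${article?.title}');
  }
}
```

Comparing both parsers on the same input can help when choosing one for a given set of pages.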

Classes

Article
Article result from Readability parsing.
Attribute
Represents an HTML/XML attribute name-value pair.
Comment
Represents an HTML comment node.
Document
Represents an HTML document.
DocumentFragment
Represents a document fragment.
DomAttribute
Interface for an attribute.
DomDocument
Interface for a document node.
DomDocumentFragment
Interface for a document fragment node.
DomElement
Interface for an HTML element node.
DomNode
Base interface for all DOM nodes.
DomStyle
CSS style interface for element styles.
Element
Represents an HTML element node.
HtmlDomAttribute
Adapter for html package attributes.
HtmlDomDocument
Adapter for html package Document.
HtmlDomDocumentFragment
Adapter for html package DocumentFragment.
HtmlDomElement
Adapter for html package Element.
HtmlDomNode
Adapter for html package Node.
JsdomDomAttribute
Adapter for JSDOMParser Attribute.
JsdomDomDocument
Adapter for JSDOMParser Document.
JsdomDomDocumentFragment
Adapter for JSDOMParser DocumentFragment.
JsdomDomElement
Adapter for JSDOMParser Element.
JsdomDomNode
Adapter for JSDOMParser Node.
JSDOMParser
A lightweight DOM parser that converts HTML strings to a DOM tree.
Node
Base class for all DOM nodes.
Readability
Main Readability parser class.
ReadabilityOptions
Configuration options for the Readability parser.
ReaderableOptions
Options for isProbablyReaderable.
Style
Represents the style property of an element, backed by the style attribute.
TextNode
Represents a text node.

Enums

NodeType
Node type constants matching the DOM specification.
ParserType
Parser type for HTML content extraction.
SpecialNodeName
Special node names for non-element nodes.

Extensions

ElementExtensions on Element
Extension methods for Element to provide common DOM operations.

Functions

isNodeVisible(Element node) bool
Checks whether a node is visible based on its style and attributes.
isProbablyReaderable(Document doc, [ReaderableOptions? options]) bool
Determines whether a document is likely to contain readable article content.
parse(String html, {ParserType parser = ParserType.jsdom, String? baseUri, bool debug = false, ReadabilityLogger? logger, int maxElemsToParse = 0, int numTopCandidates = 5, int charThreshold = 500, List<String> classesToPreserve = const [], bool keepClasses = false, String serializer(DomElement)?, bool enableJSONLD = true, RegExp? allowedVideoRegex, double linkDensityModifier = 0}) Article?
Parse HTML content and extract the main article.

Typedefs

ReadabilityLogger = void Function(List args)
Callback type for logging messages from the Readability parser.
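
A minimal sketch of supplying a ReadabilityLogger, based only on the typedef and the parse() signature listed above. The collectLog function, the logLines list, and the sample HTML are hypothetical names introduced for illustration. Whether the logger is invoked only when debug is true is an assumption here, so both are passed.

```dart
import 'package:reader_mode/reader_mode.dart';

/// Log lines captured in memory instead of being printed immediately.
final List<String> logLines = [];

/// Matches the ReadabilityLogger shape: void Function(List args).
/// Joins the arguments with spaces, one captured line per call.
void collectLog(List args) {
  logLines.add(args.join(' '));
}

void main() {
  // Placeholder input; substitute a real page for meaningful output.
  const htmlString =
      '<html><head><title>Demo</title></head>'
      '<body><article><p>Some article text.</p></article></body></html>';

  // Assumption: debug must be enabled for the logger to receive messages.
  final article = parse(htmlString, debug: true, logger: collectLog);

  print('Extracted title: ${article?.title}');
  print('Captured ${logLines.length} log line(s)');
}
```

Collecting messages in a list rather than printing them keeps parser diagnostics out of normal output while still making them available for inspection.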