TextPage class

Represents a parsed text page, equivalent to PyMuPDF's fitz.TextPage.

Extracts text from the page content stream using the PDF text operators. Supports output in multiple formats:

  • Plain text
  • Blocks (with bounding boxes)
  • Words (with bounding boxes)
  • Dict (nested structure)
  • HTML
  • XHTML
  • XML

Constructors

TextPage.fromPage(Page page, PdfParser parser)
Create a TextPage by parsing the page's content stream.
factory
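
A minimal usage sketch for the factory constructor, using only members listed on this page. It assumes a Page and a PdfParser have already been obtained elsewhere; how they are created is not covered here, and the import for the library is left out because its package path is not given in this reference. The helper name summarizePage is hypothetical.

```dart
// Assumption: the library that defines Page, PdfParser, and TextPage is
// imported here. Its package path is not stated in this reference.

/// Hypothetical helper: prints a short summary of the text found on [page].
void summarizePage(Page page, PdfParser parser) {
  // Parse the page's content stream once, then reuse the result.
  final textPage = TextPage.fromPage(page, parser);

  final text = textPage.extractText();     // plain text
  final blocks = textPage.extractBlocks(); // List<TextBlock>
  final words = textPage.extractWords();   // List<TextWord>

  print('Page rect: ${textPage.rect}');
  print('Characters: ${text.length}, '
      'blocks: ${blocks.length}, words: ${words.length}');
}
```

The sketch builds the TextPage once and calls several extract methods on it, rather than constructing a new TextPage for each output format.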

Properties

hashCode → int
The hash code for this object.
no setter, inherited
page → Page
The page this text page was extracted from.
final
rect → Rect
The page rectangle.
no setter
runtimeType → Type
A representation of the runtime type of the object.
no setter, inherited

Methods

debugRuns() → List<Map<String, dynamic>>
Debug: dump text runs with positions.
extractBlocks() → List<TextBlock>
Extract text as blocks with bounding boxes.
extractDict({bool raw = false}) → TextDict
Extract text as a detailed dictionary.
extractHtml() → String
Extract text as HTML.
extractText() → String
Extract plain text from the page.
extractWords() → List<TextWord>
Extract words with bounding boxes.
extractXhtml() → String
Extract text as XHTML.
extractXml() → String
Extract text as XML.
noSuchMethod(Invocation invocation) → dynamic
Invoked when a nonexistent method or property is accessed.
inherited
toString() → String
A string representation of this object.
inherited
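
The markup-producing methods above all return a String, so their output can be written straight to disk. The following is a sketch under stated assumptions: the helper name exportMarkup and the output file names are hypothetical, the library import is omitted because its package path is not given here, and what raw: true changes in the returned TextDict is not described on this page.

```dart
import 'dart:io';

// Assumption: the library that defines TextPage and TextDict is also
// imported. Its package path is not stated in this reference.

/// Hypothetical helper: saves each markup rendering of [textPage] under
/// [outputDir] and returns the dictionary produced with raw: true.
TextDict exportMarkup(TextPage textPage, Directory outputDir) {
  // Create the target directory (and any missing parents) if needed.
  outputDir.createSync(recursive: true);
  final base = outputDir.path;

  File('$base/page.html').writeAsStringSync(textPage.extractHtml());
  File('$base/page.xhtml').writeAsStringSync(textPage.extractXhtml());
  File('$base/page.xml').writeAsStringSync(textPage.extractXml());

  // raw is a named parameter that defaults to false.
  return textPage.extractDict(raw: true);
}
```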

Operators

operator ==(Object other) → bool
The equality operator.
inherited