PdfDocument class

Represents a parsed PDF document.

It exposes the parsed content in two forms:

  • elements: Document structure with text, tables, and images as DocxNodes. Images appear here as DocxImage nodes with layout context.
  • images: Convenience list of raw PdfExtractedImage for direct access to image bytes without traversing the document tree.

Constructors

PdfDocument({required List<DocxNode> elements, required List<PdfExtractedImage> images, List<String> warnings = const [], int pageCount = 0, double pageWidth = 612, double pageHeight = 792, String version = '1.4', PdfParser? parser})

Properties

annotations List<PdfAnnotation>
All annotations in the document.
no setter
attachments List<PdfAttachment>
Document attachments (embedded files).
no setter
elements List<DocxNode>
Extracted document elements (paragraphs, tables, images).
final
encryption PdfEncryption?
Encryption information (null if not encrypted).
no setter
formData Map<String, dynamic>
Gets form data as a map of field names to values.
no setter
formFields List<PdfFormField>
Form fields in the document.
no setter
hasAnnotations bool
Whether the document has annotations.
no setter
hasForm bool
Whether the document has a form.
no setter
hashCode int
The hash code for this object.
no setter
inherited
hasOutlines bool
Whether the document has bookmarks.
no setter
hasWarnings bool
Whether parsing had warnings.
no setter
images List<PdfExtractedImage>
Direct access to extracted images with metadata.
final
isEncrypted bool
Whether the document is encrypted.
no setter
isPdfA bool
Whether the document claims PDF/A compliance.
no setter
isTagged bool
Whether the PDF is a Tagged PDF (has logical structure).
no setter
layers List<PdfLayer>
Optional Content Groups (Layers).
no setter
metadata PdfMetadata
Document metadata (title, author, dates, etc.).
no setter
namedDestinations Map<String, int>
Named destinations in the document. Maps destination names to page numbers.
no setter
openDestination int?
The open destination (where the document opens). Returns the page number, or null if not set.
no setter
outlines List<PdfOutlineItem>
Document outline (bookmarks/table of contents).
no setter
pageCount int
Number of pages in the PDF.
final
pageHeight double
Page height in points.
final
pageInfos List<PdfPageInfo>
Detailed information for all pages.
no setter
pageLabels List<String>
Returns page labels for all pages (e.g., "i", "ii", "1", "2"). Returns "1", "2", etc. if no labels are defined.
no setter
pageWidth double
Page width in points.
final
paragraphCount int
Gets the number of extracted paragraphs.
no setter
pdfaConformance String?
PDF/A conformance level (e.g., "1a", "1b", "2u", "3b"). Returns null if not PDF/A.
no setter
permissions PdfPermissions?
Document permissions (from encryption). Returns null if the document is not encrypted.
no setter
runtimeType Type
A representation of the runtime type of the object.
no setter
inherited
structureTree PdfStructureTree?
The logical structure tree (if present).
no setter
text String
Gets all text content as a single string.
no setter
version String
PDF version (e.g., "1.4").
final
warnings List<String>
Warnings encountered during parsing.
final
xmpMetadata XmpMetadata?
XMP metadata (Dublin Core, etc.). Returns null if no XMP metadata exists.
no setter
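
Since pageWidth and pageHeight are reported in points, a small conversion helps when presenting them to users. The sketch below assumes the default PDF user-space unit of 1/72 inch; the helper names (pointsToInches, pointsToMillimetres, describePageSize) are illustrative and not part of this library.

```dart
/// Default PDF user space: 72 points per inch (assumption stated above).
const double pointsPerInch = 72.0;
const double millimetresPerInch = 25.4;

/// Converts a length in points to inches.
double pointsToInches(double points) => points / pointsPerInch;

/// Converts a length in points to millimetres.
double pointsToMillimetres(double points) =>
    points / pointsPerInch * millimetresPerInch;

/// Formats a page size given in points as inches with two decimals.
/// Hypothetical helper, not provided by PdfDocument.
String describePageSize(double widthPt, double heightPt) {
  final width = pointsToInches(widthPt).toStringAsFixed(2);
  final height = pointsToInches(heightPt).toStringAsFixed(2);
  return '$width x $height in';
}
```

With a parsed document in hand, the call would look like describePageSize(doc.pageWidth, doc.pageHeight). For the constructor defaults of 612 and 792 points this yields 8.50 x 11.00 in, which is US Letter.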

Methods

extractTextForPage(int pageNumber) String
Extracts text from a specific page. pageNumber is 0-indexed.
getAnnotationsForPage(int pageNumber) List<PdfAnnotation>
Gets annotations for a specific page.
getFormField(String name) PdfFormField?
Gets a form field by name.
getPageInfo(int pageNumber) PdfPageInfo?
Gets info for a specific page.
noSuchMethod(Invocation invocation) dynamic
Invoked when a nonexistent method or property is accessed.
inherited
toDocx() DocxBuiltDocument
Converts to a DocxBuiltDocument for export.
toString() String
A string representation of this object.
inherited
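
Because extractTextForPage takes a 0-indexed page number, a loop over pages runs from 0 to pageCount - 1. The following is a minimal sketch of that pattern. To stay self-contained it accepts the page count and an extractor callback instead of a PdfDocument; collectPageTexts and pagesContaining are hypothetical helpers, not library methods.

```dart
/// Collects the text of every page in order.
/// `extractPage` stands in for PdfDocument.extractTextForPage and is
/// called with 0-indexed page numbers.
List<String> collectPageTexts(
  int pageCount,
  String Function(int pageNumber) extractPage,
) {
  return [for (var page = 0; page < pageCount; page++) extractPage(page)];
}

/// Returns the 1-based numbers of pages whose text contains `term`,
/// ignoring case. The +1 converts the 0-indexed loop counter into the
/// page number a reader would expect to see.
List<int> pagesContaining(
  int pageCount,
  String Function(int pageNumber) extractPage,
  String term,
) {
  final needle = term.toLowerCase();
  return [
    for (var page = 0; page < pageCount; page++)
      if (extractPage(page).toLowerCase().contains(needle)) page + 1,
  ];
}
```

With a real document, the method can be passed as a tear-off, for example pagesContaining(doc.pageCount, doc.extractTextForPage, 'invoice'). Whether the other page-based methods (getAnnotationsForPage, getPageInfo) are also 0-indexed is not stated here, so verify that before reusing the same loop with them.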

Operators

operator ==(Object other) bool
The equality operator.
inherited