PivoxScraper class

A comprehensive web scraping solution built on Pivox

Constructors

PivoxScraper.new({required AdvancedWebScraper scraper, required RateLimiter rateLimiter, required UserAgentRotator userAgentRotator, required CookieManager cookieManager, required DataCacheManager storageManager, required ScrapingJobScheduler jobScheduler})
Creates a new PivoxScraper with the given components

Properties

cookieManager CookieManager
Gets the cookie manager
no setter
hashCode int
The hash code for this object.
no setter, inherited
jobScheduler ScrapingJobScheduler
Gets the scraping job scheduler
no setter
rateLimiter RateLimiter
Gets the rate limiter
no setter
runtimeType Type
A representation of the runtime type of the object.
no setter, inherited
scraper AdvancedWebScraper
Gets the advanced web scraper
no setter
storageManager DataCacheManager
Gets the data cache manager
no setter
userAgentRotator UserAgentRotator
Gets the user agent rotator
no setter

Methods

addUserAgent(String userAgent) → void
Adds a user agent to the rotator
cancelJob(String id) → void
Cancels a scraping job
clearCookies(String domain) → void
Clears cookies for a domain
close() → Future<void>
Closes all resources
exportData(String filename) → Future<String>
Exports data to a JSON file
extractData({required String html, required String selector, String? attribute, bool asText = true}) → List<String>
Extracts data from HTML content using CSS selectors
extractStructuredData({required String html, required Map<String, String> selectors, Map<String, String?>? attributes}) → List<Map<String, String>>
Extracts structured data from HTML content using CSS selectors
fetchHtml({required String url, Map<String, String>? headers, int? timeout, int? retries}) → Future<String>
Fetches HTML content from the given URL
fetchJson({required String url, Map<String, String>? headers, int? timeout, int? retries}) → Future<Map<String, dynamic>>
Fetches JSON content from the given URL
getAllStructuredData() → Future<List<Map<String, dynamic>>>
Gets all structured data from the storage manager
getData(String id) → dynamic
Gets data from the storage manager
getJobs() → List<ScrapingJob>
Gets all scheduled jobs
getStructuredData(String id) → Future
Gets structured data from the storage manager
importData(String filePath) → Future<void>
Imports data from a JSON file
noSuchMethod(Invocation invocation) → dynamic
Invoked when a nonexistent method or property is accessed.
inherited
restoreJobs({bool storeResults = true, void onResult(List<Map<String, String>>, String)?, void onError(Exception, String)?}) → void
Restores all jobs from storage
scheduleJob({required String id, required String url, required int interval, required Map<String, String> selectors, Map<String, String?>? attributes, bool storeResults = true, void onResult(List<Map<String, String>>)?, void onError(Exception)?}) → void
Schedules a scraping job
setDomainDelay(String domain, int delayMs) → void
Sets a custom delay for a domain
storeData(String id, dynamic data) → Future<void>
Stores data in the storage manager
storeStructuredData(String id, String source, dynamic data) → Future<void>
Stores structured data in the storage manager
submitForm({required String url, String method = 'POST', required Map<String, String> formData, Map<String, String>? headers, int? timeout, int? retries}) → Future<String>
Submits a form with the given data
toString() → String
A string representation of this object.
inherited
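
Example (a minimal sketch, not taken from the package itself): the methods above can be combined to fetch a page, extract fields, store them, and schedule a recurring job. It assumes the public API is importable from package:pivox/pivox.dart. The URLs, CSS selectors, storage id, job id, and interval value are hypothetical placeholders, and the unit of interval is not stated on this page.

```dart
// Assumption: the package exposes its public API through this import.
import 'package:pivox/pivox.dart';

// Hypothetical target page used for illustration only.
const newsUrl = 'https://example.com/news';

/// Fetches one page, extracts headline text and link targets, and stores them.
Future<List<Map<String, String>>> scrapeHeadlines(PivoxScraper scraper) async {
  final html = await scraper.fetchHtml(url: newsUrl);

  // Each key names an output field; each value is a CSS selector.
  // The selectors are placeholders for whatever the real page uses.
  final rows = scraper.extractStructuredData(
    html: html,
    selectors: {'title': 'h2.headline', 'link': 'h2.headline a'},
    // Assumption: a field listed here yields that attribute instead of text.
    attributes: {'link': 'href'},
  );

  // 'news-latest' is an arbitrary id chosen for this sketch.
  await scraper.storeStructuredData('news-latest', newsUrl, rows);
  return rows;
}

/// Schedules the same extraction as a recurring job.
void scheduleHeadlines(PivoxScraper scraper) {
  scraper.scheduleJob(
    id: 'news-job',
    url: newsUrl,
    interval: 3600, // unit is not documented on this page; check before use
    selectors: {'title': 'h2.headline'},
    onResult: (results) => print('Scraped ${results.length} rows'),
    onError: (error) => print('Job failed: $error'),
  );
}
```

The caller remains responsible for calling close() on the scraper once it is no longer needed, and cancelJob('news-job') stops the scheduled job.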

Operators

operator ==(Object other) → bool
The equality operator.
inherited

Static Methods

create({required ProxyManager proxyManager, int defaultTimeout = 30000, int maxRetries = 3, int defaultDelayMs = 1000, Map<String, int>? domainDelays, List<String>? userAgents, bool handleCookies = true, bool followRedirects = true, bool useDatabase = true}) → Future<PivoxScraper>
Static factory method that asynchronously creates a PivoxScraper with default components
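
Example (a sketch under stated assumptions): because create returns a Future, the result has to be awaited. The ProxyManager is taken as a parameter here because its construction is not covered on this page. The import path, the domain name, and the chosen values are assumptions for illustration. The timeout is presumably in milliseconds, judging by the 30000 default and the defaultDelayMs naming, but this page does not say so.

```dart
// Assumption: the package exposes its public API through this import.
import 'package:pivox/pivox.dart';

/// Builds a scraper that overrides a few of the documented defaults.
Future<PivoxScraper> buildScraper(ProxyManager proxyManager) {
  return PivoxScraper.create(
    proxyManager: proxyManager,
    defaultTimeout: 15000, // presumably milliseconds, like the 30000 default
    maxRetries: 5,
    // Hypothetical domain; the value presumably overrides defaultDelayMs.
    domainDelays: {'example.com': 2000},
    useDatabase: false,
  );
}
```

A caller would typically write final scraper = await buildScraper(proxyManager); and call scraper.close() when finished.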