PivoxScraper class
A comprehensive web scraping solution built on Pivox
Constructors
- PivoxScraper.new({required AdvancedWebScraper scraper, required RateLimiter rateLimiter, required UserAgentRotator userAgentRotator, required CookieManager cookieManager, required DataCacheManager storageManager, required ScrapingJobScheduler jobScheduler})
- Creates a new PivoxScraper with the given components
Properties
- cookieManager → CookieManager
-
Gets the cookie manager
no setter
- hashCode → int
-
The hash code for this object.
no setter, inherited
- jobScheduler → ScrapingJobScheduler
-
Gets the scraping job scheduler
no setter
- rateLimiter → RateLimiter
-
Gets the rate limiter
no setter
- runtimeType → Type
-
A representation of the runtime type of the object.
no setter, inherited
- scraper → AdvancedWebScraper
-
Gets the advanced web scraper
no setter
- storageManager → DataCacheManager
-
Gets the data cache manager
no setter
- userAgentRotator → UserAgentRotator
-
Gets the user agent rotator
no setter
Methods
-
addUserAgent(
String userAgent) → void - Adds a user agent to the rotator
-
cancelJob(
String id) → void - Cancels a scraping job
-
clearCookies(
String domain) → void - Clears cookies for a domain
-
close(
) → Future<void> - Closes all resources
-
exportData(
String filename) → Future<String> - Exports data to a JSON file
-
extractData(
{required String html, required String selector, String? attribute, bool asText = true}) → List<String> - Extracts data from HTML content using CSS selectors
-
extractStructuredData(
{required String html, required Map<String, String> selectors, Map<String, String?>? attributes}) → List<Map<String, String>> - Extracts structured data from HTML content using CSS selectors
-
fetchHtml(
{required String url, Map<String, String>? headers, int? timeout, int? retries}) → Future<String> - Fetches HTML content from the given URL
-
fetchJson(
{required String url, Map<String, String>? headers, int? timeout, int? retries}) → Future<Map<String, dynamic>> - Fetches JSON content from the given URL
-
getAllStructuredData(
) → Future<List<Map<String, dynamic>>> - Gets all structured data from the storage manager
-
getData(
String id) → dynamic - Gets data from the storage manager
-
getJobs(
) → List<ScrapingJob> - Gets all scheduled jobs
-
getStructuredData(
String id) → Future - Gets structured data from the storage manager
-
importData(
String filePath) → Future<void> - Imports data from a JSON file
-
noSuchMethod(
Invocation invocation) → dynamic -
Invoked when a nonexistent method or property is accessed.
inherited
-
restoreJobs(
{bool storeResults = true, void onResult(List<Map<String, String>>, String)?, void onError(Exception, String)?}) → void - Restores all jobs from storage
-
scheduleJob(
{required String id, required String url, required int interval, required Map<String, String> selectors, Map<String, String?>? attributes, bool storeResults = true, void onResult(List<Map<String, String>>)?, void onError(Exception)?}) → void - Schedules a scraping job
-
setDomainDelay(
String domain, int delayMs) → void - Sets a custom delay for a domain
-
storeData(
String id, dynamic data) → Future<void> - Stores data in the storage manager
-
storeStructuredData(
String id, String source, dynamic data) → Future<void> - Stores structured data in the storage manager
-
submitForm(
{required String url, String method = 'POST', required Map<String, String> formData, Map<String, String>? headers, int? timeout, int? retries}) → Future<String> - Submits a form with the given data
-
toString(
) → String -
A string representation of this object.
inherited
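The fetch and extract methods listed above are typically chained: fetch a page, then pull structured records out of it. The following is a minimal sketch of that flow, not an official Pivox example. To keep it compilable without the package, it takes the two methods as function values whose types are copied from the documented `fetchHtml` and `extractStructuredData` signatures. The typedef names, the `scrapeArticles` helper, and the selector values are all hypothetical.

```dart
// Function types copied from the documented fetchHtml and
// extractStructuredData signatures (typedef names are hypothetical).
typedef FetchHtml = Future<String> Function({
  required String url,
  Map<String, String>? headers,
  int? timeout,
  int? retries,
});

typedef ExtractStructured = List<Map<String, String>> Function({
  required String html,
  required Map<String, String> selectors,
  Map<String, String?>? attributes,
});

/// Fetches [url], then extracts records using example CSS selectors.
/// The field names and selectors below are illustrative only.
Future<List<Map<String, String>>> scrapeArticles({
  required FetchHtml fetchHtml,
  required ExtractStructured extractStructuredData,
  required String url,
}) async {
  // Ask for the page, allowing up to two retries.
  final html = await fetchHtml(url: url, retries: 2);

  return extractStructuredData(
    html: html,
    selectors: {'title': 'h2.title', 'link': 'a.more'},
    // Assumption: an entry here makes the extractor read that attribute
    // for the field instead of the element text.
    attributes: {'link': 'href'},
  );
}
```

Given a `PivoxScraper` instance named `scraper`, the method tear-offs should satisfy these types, so a call could look like `scrapeArticles(fetchHtml: scraper.fetchHtml, extractStructuredData: scraper.extractStructuredData, url: pageUrl)`.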
Operators
-
operator ==(
Object other) → bool -
The equality operator.
inherited
Static Methods
-
create(
{required ProxyManager proxyManager, int defaultTimeout = 30000, int maxRetries = 3, int defaultDelayMs = 1000, Map<String, int>? domainDelays, List<String>? userAgents, bool handleCookies = true, bool followRedirects = true, bool useDatabase = true}) → Future<PivoxScraper> - Factory constructor to create a PivoxScraper with default components
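The `defaultDelayMs` and `domainDelays` parameters of `create`, together with `setDomainDelay`, suggest a per-domain delay with a global fallback. The stand-alone sketch below illustrates that idea only. It is not Pivox's implementation: the `DomainDelays` class and its `delayFor` method are hypothetical, and the rule that a per-domain entry overrides the default is an assumption this reference does not state.

```dart
/// Hypothetical stand-in (not part of Pivox) that resolves the delay to
/// apply before requesting a URL.
class DomainDelays {
  DomainDelays({this.defaultDelayMs = 1000, Map<String, int>? domainDelays})
      : _delays = <String, int>{...?domainDelays};

  /// Fallback delay, mirroring the documented default of 1000.
  final int defaultDelayMs;

  final Map<String, int> _delays;

  /// Same parameter shape as the documented setDomainDelay method.
  void setDomainDelay(String domain, int delayMs) {
    if (delayMs < 0) {
      throw ArgumentError.value(delayMs, 'delayMs', 'must be >= 0');
    }
    _delays[domain] = delayMs;
  }

  /// Assumed rule: a per-domain entry wins, otherwise use the default.
  Duration delayFor(Uri url) =>
      Duration(milliseconds: _delays[url.host] ?? defaultDelayMs);
}
```

Under these assumptions, a host without its own entry waits for `defaultDelayMs`, and calling `setDomainDelay` changes the wait for that host only.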