Changelog #

1.1.0 - Performance and Reliability Improvements (2025-04-17) #

Performance Features #

Parallel Processing Framework: New TaskScheduler and ScrapingTask classes for advanced parallel scraping
Resource-Aware Concurrency: Adaptive concurrency limits based on CPU and memory usage
Priority-Based Task Scheduling: Execute tasks based on priority levels with dependency support
DataCacheX Integration: Efficient multi-level caching with DataCacheX for improved performance
Chunked Data Processing: Process large datasets in manageable chunks with the DataChunker class
Performance Extension Methods: New WebScraperPerformanceExtension with simplified performance optimization
Streaming HTML Parser: Process HTML incrementally to reduce memory usage for large documents
Concurrency Control: New ScrapingTaskQueue and ConcurrentWebScraper for parallel processing
Memory-Efficient Parsing: Chunking strategies for handling large HTML documents
Factory Methods: New PivoxFactory class with simplified component creation
Standardized Error Handling: Comprehensive ScrapingException hierarchy for better error management
Scraping Progress Dashboard: Real-time monitoring of scraping tasks with status indicators and performance metrics
Visual Status Indicators: Animated status badges, progress indicators, and notification system for important events
Interactive Selector Tools: CSS selector builder with syntax highlighting and validation
Element Inspector: Visual HTML element inspector with hierarchy navigation and attribute viewing
Selector Testing Playground: Interactive environment for testing CSS selectors on HTML content
Task Management UI: Detailed task information, cancellation, and retry capabilities
Notification System: Toast notifications for errors, warnings, and success messages

Performance Improvements #

Reduced Memory Footprint: Process large HTML documents without loading entirely into memory
Parallel Processing: Extract data from multiple URLs simultaneously
Priority-Based Scheduling: Prioritize important requests in the scraping queue
Batch Processing: Efficient bulk scraping with progress tracking
Improved Resource Management: Better control over CPU and memory usage during intensive operations
Increased Throughput: Up to 5x higher scraping throughput with parallel processing
Reduced Memory Usage: Up to 70% lower memory footprint with chunked processing
Faster Response Times: Up to 90% faster response times with efficient caching
Better Resource Utilization: Optimized CPU and memory usage with adaptive concurrency
Enhanced Scalability: Improved handling of large-scale scraping operations

Ethical Improvements #

Robots.txt Compliance: Enhanced support for respecting robots.txt directives
Rate Limiting: Domain-specific rate limiting with exponential backoff
Respectful Crawling: Automatic crawl delays based on site reputation

Interface Enhancements #

Modern Flat Design: Clean, responsive interface with consistent styling
Tabbed Interface: Organized access to different features through a tabbed layout
Interactive Components: Drag-and-drop, expandable panels, and tooltips for better usability
Responsive Layout: Adapts to different screen sizes and orientations
Visual Feedback: Immediate feedback for user actions with animations and transitions

Improvements #

Enhanced Logging: Added detailed logging throughout the scraping process
Better Fallback Mechanisms: Implemented fallbacks when proxies or data extraction fails
Structured Data Handling: Improved handling of structured data extraction
Performance Optimizations: Optimized proxy validation and rotation

Bug Fixes #

Fixed Proxy Validation: Resolved issues with parallel proxy validation using isolates
Enhanced Error Handling: Improved error handling throughout the web scraping process
Robust Data Extraction: Added safeguards against empty or malformed data during extraction
Improved ProxyHttpClient: Enhanced error handling and fallback mechanisms in the HTTP client
UI Improvements: Fixed issues with displaying scraped data in the example app

Documentation Updates #

Performance Optimization Guide: Comprehensive documentation on performance features
Example Application: Interactive example demonstrating parallel, cached, and chunked scraping
Best Practices: Guidelines for optimal performance configuration
UI Component Guide: Documentation for all UI components and their usage
Example Application: Comprehensive example demonstrating all UI features
Integration Guide: Instructions for integrating UI components into existing applications

1.0.0 - Initial Release (2025-04-15) #

Core Features #

Dynamic Free Proxy Sourcing: Automatically fetch and update proxies from trusted free sources
Smart Proxy Rotation: Multiple rotation strategies with intelligent proxy scoring
Parallel Proxy Health Validation: Built-in proxy verification system with parallel processing
Advanced Filtering: Powerful filtering options for proxies based on country, region, ISP, etc.
Performance Tracking: Comprehensive proxy performance tracking with detailed analytics
HTTP Client Integration: Seamless integration with http and dio packages

Web Scraping Features #

Dynamic User Agent Management: Automatically rotate through modern, realistic user agents
Specialized Site Handlers: Custom handlers for problematic websites with anti-scraping measures
Structured Data Extraction: Extract structured data from HTML content using CSS selectors
Rate Limiting: Respect website rate limits to avoid blocking
Cookie Management: Handle cookies for authenticated scraping
Adaptive Scraping Strategies: Adjust scraping behavior based on site reputation

Specialized Site Handlers #

OnlineKhabarHandler: Custom handler for onlinekhabar.com
VegaMoviesHandler: Custom handler for vegamovies sites

Initial Documentation #

Comprehensive API documentation
Detailed web scraping guide
Example applications

pivox 1.1.0
pivox: ^1.1.0 copied to clipboard

Metadata

Changelog #

1.1.0 - Performance and Reliability Improvements (2025-04-17) #

Performance Features #

Performance Improvements #

Ethical Improvements #

Interface Enhancements #

Improvements #

Bug Fixes #

Documentation Updates #

1.0.0 - Initial Release (2025-04-15) #

Core Features #

Web Scraping Features #

Specialized Site Handlers #

Initial Documentation #

← Metadata

Publisher

Weekly Downloads

Metadata

Topics

Documentation

License

Dependencies

More

pivox 1.1.0 pivox: ^1.1.0 copied to clipboard

Metadata

Changelog #

1.1.0 - Performance and Reliability Improvements (2025-04-17) #

Performance Features #

Performance Improvements #

Ethical Improvements #

Interface Enhancements #

Improvements #

Bug Fixes #

Documentation Updates #

1.0.0 - Initial Release (2025-04-15) #

Core Features #

Web Scraping Features #

Specialized Site Handlers #

Initial Documentation #

← Metadata

Publisher

Weekly Downloads

Metadata

Topics

Documentation

License

Dependencies

More

pivox 1.1.0
pivox: ^1.1.0 copied to clipboard