🥇Looter🥇
A simple yet fully-featured web scraper for both static and dynamically generated web pages.
This package is built upon multible packages with easier integration/abstraction, Check them in the dependencies section.
This package is still in it's early stages. If there's an issue, Please feel free to head to the repo and File a new issue.
Getting Started
1. Depend on it
Add this to your package's pubspec.yaml file:
dependencies:
looter: [latest version]
2. Install it
$ flutter pub get
3. Import it
import 'package:looter/looter.dart';
As easy as a couple of lines to scrape a web page!
void main() async {
//1. Initialize the Looter
// and specify wheather you are going to use a static or dynamic crawler.
// **Dynamic crawler uses puppeteer to initialize a headless browser.**
Looter looter = await Looter.initialize();
//2. Start Looting!
LootElement result = await looter
.from("http://books.toscrape.com")
.loot('article.product_pod h3 a', "bookTitle");
}
What can you do?
-
Loot one element with selector 1️⃣
LootElement result = await looter
.from("http://books.toscrape.com")
.loot('article.product_pod h3 a',
elementIdentifier: "bookTitle",
);
-
Loot all elements with selector 🔗
List<LootElement> result = await looter
.from("http://books.toscrape.com")
.lootAll('article.product_pod h3 a',
elementIdentifier: "bookTitle",
);
-
And my favorite, a Loot Loop ➰➰
List<LootElement?> result =
await looter.from("http://books.toscrape.com").loop(
'ol.row li', // give the looper the shared parents selector..
{
'article.product_pod h3 a': {"bookTitle": 'text'},
'div.image_container img': {"bookImage": 'src'},
'div.product_price p.price_color': {'bookPrice': 'text'},
'div.product_price instock availability': {'bookAvailability': 'text'},
// and if you want to loot multible children, use the array modifier! 'array:text', 'array:src', etc..
},
);
Checklist
- (x) Chrome Downloading Handler.
- () Loots Chaining.
- () Exporting as an Excel.
- () Creating a web API from the LootResult with a configurable JSON.
Contributing
Contributing is more than welcomed on any of my packages/plugins. I will try to keep adding suggested features as i go.
Versioning
- V1.0.0 - Initial Release.
- V1.1.0 - Refactored lootLoop function for easier handling.
- V1.2.0 - Added an Array modifier to the loop function.
- V1.2.5 - Abstracted headless chrome downloader function.
Authors
Michael Aziz - Github
License
This project is licensed under the MIT License - see the LICENSE.md file for details