trafilatura 1.0.1

Dart port of Trafilatura - A library for web scraping, text extraction, and metadata extraction from HTML documents. Ported from the original Python library by kamranxdev (Kamran Khan).

130 / 160
pub points
36
downloads

We analyzed this package 5 days ago, and awarded it 130 pub points (of a possible 160):

Failed report section
Follow Dart file conventions
20 / 30

Failed check 0/10 points: Provide a valid pubspec.yaml

The package description is too long.

Search engines display only the first part of the description. Try to keep the value of the description field in your package's pubspec.yaml file between 60 and 180 characters.
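As a rough illustration of the length rule above, the following Dart sketch checks whether a description falls in the 60 to 180 character range. The helper name `descriptionLengthOk` and the sample strings are hypothetical, and pana's own check may count characters differently (this sketch counts UTF-16 code units after trimming).

```dart
/// Returns true when [description] has between 60 and 180 characters,
/// the range recommended in the report above.
/// Hypothetical helper, not part of pana or the trafilatura package.
bool descriptionLengthOk(String description) {
  final length = description.trim().length;
  return length >= 60 && length <= 180;
}

void main() {
  const tooShort = 'Text extraction for Dart.';
  final tooLong = 'x' * 200; // 200 characters, above the limit
  final inRange = 'y' * 120; // 120 characters, inside the range

  print(descriptionLengthOk(tooShort)); // false
  print(descriptionLengthOk(tooLong)); // false
  print(descriptionLengthOk(inRange)); // true
}
```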

Passed check 5/5 points: Provide a valid README.md

Passed check 5/5 points: Provide a valid CHANGELOG.md

Passed check 10/10 points: Use an OSI-approved license

Detected license: Apache-2.0.

Passed report section
Provide documentation
20 / 20

Passed check 10/10 points: 20% or more of the public API has dartdoc comments

94 out of 150 API elements (62.7 %) have documentation comments.

Some symbols that are missing documentation: trafilatura.CrawlParameters.CrawlParameters.new, trafilatura.DefaultConfig.DefaultConfig.new, trafilatura.DefaultConfig.extensiveDateSearch, trafilatura.DefaultConfig.maxFileSize, trafilatura.DefaultConfig.maxLinks.

Passed check 10/10 points: Package has an example

Passed report section
Platform support
20 / 20

Passed check 20/20 points: Supports 5 of 6 possible platforms (iOS, Android, Web, Windows, macOS, Linux)

  • ✓ Android

  • ✓ iOS

  • ✓ Windows

  • ✓ Linux

  • ✓ macOS

These platforms are not supported:

Package not compatible with platform Web

Because:

  • package:trafilatura/trafilatura.dart that imports:
  • package:trafilatura/src/meta.dart that imports:
  • package:trafilatura/src/deduplication.dart that imports:
  • package:trafilatura/src/utils.dart that imports:
  • dart:io

Partially passed report section
Pass static analysis
40 / 50

Partially passed check 40/50 points: code has no errors, warnings, lints, or formatting issues

Found 7 issues. Showing the first 2:

INFO: Dangling library doc comment.

bin/trafilatura.dart:2:1

  ╷
2 │ /// Command-line interface for Trafilatura.
  │ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  ╵

To reproduce, make sure you are using lints_core and run `dart analyze bin/trafilatura.dart`.
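This message usually comes from the `dangling_library_doc_comments` lint, which fires when a `///` comment at the top of a file is not attached to any declaration. One common fix, sketched below, is to follow the comment with an unnamed `library;` directive (this needs Dart 2.19 or later). The `main` body is a hypothetical placeholder, since the report does not show the real contents of bin/trafilatura.dart.

```dart
/// Command-line interface for Trafilatura.
library;

// Hypothetical entry point; the package's real `main` is not shown in the
// report.
void main(List<String> arguments) {
  print('trafilatura CLI called with ${arguments.length} argument(s)');
}
```

If the comment was not meant as library-level documentation, changing it to a plain `//` comment is another option.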

INFO: Angle brackets will be interpreted as HTML.

lib/src/htmlprocessing.dart:383:30

    ╷
383 │ /// Convert <ul> and <ol> to <list> and underlying <li> elements to <item>.
    │                              ^^^^^^
    ╵

To reproduce, make sure you are using lints_core and run `dart analyze lib/src/htmlprocessing.dart`.
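This message usually comes from the `unintended_html_in_doc_comment` lint. A typical fix is to wrap each tag name in backticks so dartdoc renders it as code instead of HTML. In the sketch below, only the first doc comment sentence is taken from the report; the function `describeListConversion` is a hypothetical stand-in for whatever is declared at line 383.

```dart
/// Convert `<ul>` and `<ol>` to `<list>` and underlying `<li>` elements to
/// `<item>`.
///
/// Hypothetical stand-in: the real declaration in htmlprocessing.dart is not
/// shown in the report.
String describeListConversion() => 'ul/ol -> list, li -> item';

void main() {
  print(describeListConversion());
}
```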

Partially passed report section
Support up-to-date dependencies
30 / 40

Partially passed check 0/10 points: All of the package dependencies are supported in the latest version

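| Package | Constraint | Compatible | Latest | Notes |
| --- | --- | --- | --- | --- |
| args | ^2.4.2 | 2.7.0 | 2.7.0 | |
| charset | ^2.0.1 | 2.0.1 | 2.0.1 | |
| collection | ^1.18.0 | 1.19.1 | 1.19.1 | |
| convert | ^3.1.1 | 3.1.2 | 3.1.2 | |
| crypto | ^3.0.3 | 3.0.7 | 3.0.7 | |
| html | ^0.15.4 | 0.15.6 | 0.15.6 | |
| http | ^1.1.0 | 1.6.0 | 1.6.0 | |
| intl | ^0.18.1 | 0.18.1 | 0.20.2 | |
| path | ^1.8.3 | 1.9.1 | 1.9.1 | |
| xml | ^6.4.0 | 6.6.1 | 7.0.1 | |

Transitive dependencies

| Package | Constraint | Compatible | Latest | Notes |
| --- | --- | --- | --- | --- |
| async | - | 2.13.1 | 2.13.1 | |
| clock | - | 1.1.2 | 1.1.2 | |
| csslib | - | 1.0.2 | 1.0.2 | |
| http_parser | - | 4.1.2 | 4.1.2 | |
| meta | - | 1.18.2 | 1.18.2 | |
| petitparser | - | 7.0.2 | 7.0.2 | |
| source_span | - | 1.10.2 | 1.10.2 | |
| string_scanner | - | 1.4.1 | 1.4.1 | |
| term_glyph | - | 1.2.2 | 1.2.2 | |
| typed_data | - | 1.4.0 | 1.4.0 | |
| web | - | 1.1.1 | 1.1.1 | |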

To reproduce, run `dart pub outdated --no-dev-dependencies --up-to-date --no-dependency-overrides`.

The constraint `^0.18.1` on intl does not support the stable version `0.19.0`.

Try running `dart pub upgrade --major-versions intl` to update the constraint.

The constraint `^6.4.0` on xml does not support the stable version `7.0.0`, which was published 24 days ago.

Once that xml release is 30 days old, this package will no longer be awarded points in this category.

Try running `dart pub upgrade --major-versions xml` to update the constraint.
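To see why both constraints exclude the newer releases, the sketch below hand-rolls pub's caret rule: `^x.y.z` allows versions from `x.y.z` up to, but not including, the next breaking version (next major when the major is non-zero, otherwise next minor). All function names here are hypothetical, and pre-release or build suffixes are not handled. Real code would more likely use the `pub_semver` package.

```dart
/// Parses "major.minor.patch" into three integers.
/// Pre-release and build suffixes are not handled in this sketch.
List<int> parseVersion(String version) =>
    version.split('.').map(int.parse).toList();

/// Compares two parsed versions; negative when [a] is lower than [b].
int compareVersions(List<int> a, List<int> b) {
  for (var i = 0; i < 3; i++) {
    if (a[i] != b[i]) return a[i].compareTo(b[i]);
  }
  return 0;
}

/// Exclusive upper bound of a caret constraint: the next major version,
/// or the next minor version when the major version is 0.
List<int> caretUpperBound(List<int> min) {
  if (min[0] > 0) return [min[0] + 1, 0, 0];
  return [0, min[1] + 1, 0];
}

/// True when [version] satisfies the caret [constraint], e.g. "^6.4.0".
bool caretAllows(String constraint, String version) {
  final min = parseVersion(constraint.substring(1)); // drop the leading '^'
  final v = parseVersion(version);
  return compareVersions(v, min) >= 0 &&
      compareVersions(v, caretUpperBound(min)) < 0;
}

void main() {
  print(caretAllows('^0.18.1', '0.18.1')); // true
  print(caretAllows('^0.18.1', '0.19.0')); // false
  print(caretAllows('^6.4.0', '6.6.1')); // true
  print(caretAllows('^6.4.0', '7.0.0')); // false
}
```

Under this rule, `^0.18.1` means `>=0.18.1 <0.19.0` and `^6.4.0` means `>=6.4.0 <7.0.0`, which matches the two messages above.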

Passed check 10/10 points: Package supports latest stable Dart and Flutter SDKs

Passed check 20/20 points: Compatible with dependency constraint lower bounds

pub downgrade does not expose any static analysis error.

Analyzed with Pana 0.23.12, Dart 3.12.0.

Check the analysis log for details.

5
likes

Documentation

API reference

Publisher

verified publisher kamranx.dev

License

Apache-2.0 (license)

Dependencies

args, charset, collection, convert, crypto, html, http, intl, path, xml
