fetchFromHttp method

Future<MetaInfo> fetchFromHttp(Uri url)

Retrieves MetaInfo by sending an HTTP request to url.

If the scheme of url is neither http nor https, a NonHttpUrlException is thrown.
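The scheme check can be tried in isolation. The following is a minimal sketch that reuses the regular expression from the implementation listed on this page; the helper name isHttpUrl is hypothetical and is not part of the library, and the real method throws NonHttpUrlException rather than returning a boolean.

```dart
/// Hypothetical helper (not part of the library): returns true when [url]
/// uses the http or https scheme, using the same pattern as fetchFromHttp.
bool isHttpUrl(Uri url) => RegExp(r'^https?$').hasMatch(url.scheme);

void main() {
  print(isHttpUrl(Uri.parse('https://example.com/page'))); // true
  print(isHttpUrl(Uri.parse('ftp://example.com/file'))); // false
  // A relative reference has an empty scheme, so it is rejected as well.
  print(isHttpUrl(Uri.parse('/relative/path'))); // false
}
```

Callers that cannot guarantee the scheme in advance may prefer to run a check like this first, or wrap the call to fetchFromHttp in a try/catch for NonHttpUrlException.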

Optionally, userAgentString can be changed before the request is made, which allows the client to identify itself as a user agent other than DEFAULT_USER_AGENT_STRING.

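Once the response arrives, its body is parsed into a Document with html_parser.parse and passed to buildMetaInfo. Unless content type checking is disabled, a response whose Content-Type header does not map to the html or xhtml extension yields an empty MetaInfo instead.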
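Before parsing, the implementation listed on this page inspects the Content-Type header and only continues when it maps to an html or xhtml extension. The sketch below illustrates that decision under stated assumptions: extensionsByMimeType is a hypothetical two-entry stand-in for the Mime.getExtensionsFromType lookup, looksLikeHtml is a hypothetical name, and the trim and lowercase steps are additions that the original code does not perform.

```dart
// Hypothetical stand-in for the MIME-to-extension lookup used by the real
// implementation; only two entries are listed here for illustration.
const Map<String, List<String>> extensionsByMimeType = {
  'text/html': ['html', 'htm'],
  'application/xhtml+xml': ['xhtml'],
};

/// Hypothetical helper: returns true when [contentType] maps to an
/// "html" or "xhtml" extension.
bool looksLikeHtml(String? contentType) {
  if (contentType == null) return false;
  // Drop parameters such as "; charset=utf-8" before the lookup.
  // Trimming and lowercasing are assumptions added for this sketch.
  final mimeType = contentType.split(';').first.trim().toLowerCase();
  final extensions = extensionsByMimeType[mimeType] ?? const <String>[];
  return const ['html', 'xhtml'].any(extensions.contains);
}

void main() {
  print(looksLikeHtml('text/html; charset=utf-8')); // true
  print(looksLikeHtml('application/json')); // false
  print(looksLikeHtml(null)); // false
}
```

In the real method a failed check returns an empty MetaInfo rather than false, and the check is skipped entirely when _ignoreContentType is set.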

The HTTP status code is ignored by this method, since it only needs the HTML content returned from url.

Implementation

Future<MetaInfo> fetchFromHttp(Uri url) async {
  if (!RegExp(r"^https?$").hasMatch(url.scheme)) {
    throw NonHttpUrlException._(url);
  }

  Request req = Request("GET", url)
    ..headers['user-agent'] = userAgentString
    ..followRedirects = allowRedirect;

  Response resp = await req.send().then(Response.fromStream);
  String? mimeData = resp.headers["content-type"];

  final List<String> extTypes = [];

  if (mimeData != null) {
    extTypes.addAll(
        Mime.getExtensionsFromType(mimeData.split(';').first) ?? const []);
  }

  if (!_ignoreContentType &&
      !const <String>["html", "xhtml"].any(extTypes.contains)) {
    return MetaInfo();
  }

  return buildMetaInfo(html_parser.parse(resp.body));
}