fetchFromHttp method
Retrieves MetaInfo from an HTTP request to url.
If the scheme of url is not HTTP or HTTPS, NonHttpUrlException
will be thrown.
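The scheme check can be sketched in isolation as follows. This is a minimal illustration, not the package's own code: NonHttpUrlError, isHttpUrl and ensureHttp are hypothetical names standing in for NonHttpUrlException and the inline check, and only dart:core is used.

```dart
/// Hypothetical stand-in for the package's NonHttpUrlException.
class NonHttpUrlError implements Exception {
  final Uri url;
  NonHttpUrlError(this.url);

  @override
  String toString() => 'NonHttpUrlError: $url is not an HTTP(S) URL';
}

/// Returns true only when the scheme is exactly "http" or "https".
bool isHttpUrl(Uri url) => RegExp(r'^https?$').hasMatch(url.scheme);

/// Throws when the URL cannot be fetched over HTTP(S).
void ensureHttp(Uri url) {
  if (!isHttpUrl(url)) {
    throw NonHttpUrlError(url);
  }
}

void main() {
  print(isHttpUrl(Uri.parse('https://example.com'))); // true
  print(isHttpUrl(Uri.parse('ftp://example.com'))); // false

  try {
    ensureHttp(Uri.parse('file:///tmp/index.html'));
  } on NonHttpUrlError catch (e) {
    print(e); // the file: URL is rejected before any request is made
  }
}
```

Callers that accept arbitrary user-supplied URLs may want to run a check like this up front, or catch the exception around the fetchFromHttp call.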
Optionally, userAgentString can be changed before the request is made, which allows the client to identify itself as a user agent other than DEFAULT_USER_AGENT_STRING.
Once a response is received, its body is passed to html_parser.parse
to produce a Document, which is then handed to buildMetaInfo.
The HTTP response code is not checked by this method; it only
needs to retrieve HTML content from url.
Implementation
Future<MetaInfo> fetchFromHttp(Uri url) async {
  if (!RegExp(r"^https?$").hasMatch(url.scheme)) {
    throw NonHttpUrlException._(url);
  }

  Request req = Request("GET", url)
    ..headers['user-agent'] = userAgentString
    ..followRedirects = allowRedirect;

  Response resp = await req.send().then(Response.fromStream);
  String? mimeData = resp.headers["content-type"];
  final List<String> extTypes = [];

  if (mimeData != null) {
    extTypes.addAll(
        Mime.getExtensionsFromType(mimeData.split(';').first) ?? const []);
  }

  if (!_ignoreContentType &&
      !const <String>["html", "xhtml"].any(extTypes.contains)) {
    return MetaInfo();
  }

  return buildMetaInfo(html_parser.parse(resp.body));
}
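The content-type gate in the implementation above returns an empty MetaInfo when the response does not look like HTML or XHTML. A rough sketch of the same idea is shown below. It deliberately swaps the Mime extension lookup for a direct comparison of media types, and mediaTypeOf, htmlMediaTypes and looksLikeHtml are hypothetical names that are not part of the package.

```dart
/// Extracts the bare media type from a Content-Type header value,
/// for example "text/html; charset=utf-8" becomes "text/html".
String? mediaTypeOf(String? contentType) {
  if (contentType == null) return null;
  final type = contentType.split(';').first.trim().toLowerCase();
  return type.isEmpty ? null : type;
}

/// Media types treated as HTML in this sketch (an assumption; the
/// original code compares file extensions resolved through Mime).
const htmlMediaTypes = {'text/html', 'application/xhtml+xml'};

/// Mirrors the gate: skip the check entirely when [ignoreContentType]
/// is set, otherwise require an HTML-like media type.
bool looksLikeHtml(String? contentType, {bool ignoreContentType = false}) {
  if (ignoreContentType) return true;
  final type = mediaTypeOf(contentType);
  return type != null && htmlMediaTypes.contains(type);
}

void main() {
  print(looksLikeHtml('text/html; charset=utf-8')); // true
  print(looksLikeHtml('application/json')); // false
  print(looksLikeHtml(null)); // false
  print(looksLikeHtml('image/png', ignoreContentType: true)); // true
}
```

Trimming and lower-casing the header value before comparison is a choice made for this sketch; the implementation above passes the text before the first ';' to the Mime lookup as-is.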