fetchFromHttp method
Retrieves MetaInfo by making an HTTP request to url. If the scheme of url is not HTTP or HTTPS, NonHttpUrlException will be thrown.
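The scheme check can be illustrated with a minimal sketch that uses only dart:core. The helper name isHttpOrHttps is hypothetical and not part of this library. It applies the same regular expression as the implementation to Uri.scheme.

```dart
/// Hypothetical helper (not part of this library): reports whether [url]
/// uses the http or https scheme, using the same pattern as fetchFromHttp.
bool isHttpOrHttps(Uri url) => RegExp(r'^https?$').hasMatch(url.scheme);

void main() {
  // Only http and https pass; any other scheme would be rejected.
  print(isHttpOrHttps(Uri.parse('https://example.com/page')));
  print(isHttpOrHttps(Uri.parse('ftp://example.com/file')));
  print(isHttpOrHttps(Uri.parse('mailto:someone@example.com')));
}
```

In this sketch a URL that fails the check only produces false, whereas fetchFromHttp throws NonHttpUrlException before any request is sent.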
Optionally, userAgentString can be changed before the request is made, which allows the client to identify itself as a user agent other than DEFAULT_USER_AGENT_STRING.
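Once a response is received, its body is parsed directly into a Document with html_parser.parse and passed to buildMetaInfo. If the Content-Type of the response does not indicate HTML or XHTML, an empty MetaInfo is returned instead, unless content type checking is disabled.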
The HTTP response status code is ignored by this method, because it only needs to retrieve the HTML content from url.
Implementation
Future<MetaInfo> fetchFromHttp(Uri url) async {
  // Reject any scheme other than http or https.
  if (!RegExp(r"^https?$").hasMatch(url.scheme)) {
    throw NonHttpUrlException._(url);
  }

  Request req = Request("GET", url)
    ..headers['user-agent'] = userAgentString
    ..followRedirects = allowRedirect;
  Response resp = await req.send().then(Response.fromStream);

  // Map the MIME type (without parameters such as charset) to file extensions.
  String? mimeData = resp.headers["content-type"];
  final List<String> extTypes = [];
  if (mimeData != null) {
    extTypes.addAll(
        Mime.getExtensionsFromType(mimeData.split(';').first) ?? const []);
  }

  // Return an empty MetaInfo for non-HTML content unless the check is disabled.
  if (!_ignoreContentType &&
      !const <String>["html", "xhtml"].any(extTypes.contains)) {
    return MetaInfo();
  }

  return buildMetaInfo(html_parser.parse(resp.body));
}
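The Content-Type handling in the implementation above can be approximated with a simplified, dependency-free sketch. The names mimeTypeOf and looksLikeHtml are hypothetical. Unlike the implementation, which maps the MIME type to file extensions through Mime.getExtensionsFromType, this version compares the MIME type directly against text/html and application/xhtml+xml. That comparison is an assumption made for illustration.

```dart
/// Hypothetical helper: extracts the bare MIME type from a Content-Type
/// header value, dropping parameters such as "; charset=utf-8".
String? mimeTypeOf(String? contentType) {
  if (contentType == null) return null;
  final type = contentType.split(';').first.trim().toLowerCase();
  return type.isEmpty ? null : type;
}

/// Hypothetical helper: true when the header value names an HTML or XHTML
/// document. Assumes these two MIME types are the ones of interest.
bool looksLikeHtml(String? contentType) {
  const htmlTypes = {'text/html', 'application/xhtml+xml'};
  final type = mimeTypeOf(contentType);
  return type != null && htmlTypes.contains(type);
}

void main() {
  print(looksLikeHtml('text/html; charset=utf-8'));
  print(looksLikeHtml('application/json'));
  print(looksLikeHtml(null));
}
```

A missing header is treated as non-HTML here, which matches the implementation: when content-type is absent, extTypes stays empty and an empty MetaInfo is returned unless the check is disabled.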