getDocumentAnalysis method
Gets the results for an Amazon Textract asynchronous operation that analyzes text in a document.
You start asynchronous text analysis by calling StartDocumentAnalysis, which returns a job identifier (JobId). When the text analysis operation finishes, Amazon Textract publishes a completion status to the Amazon Simple Notification Service (Amazon SNS) topic that's registered in the initial call to StartDocumentAnalysis. To get the results of the text analysis operation, first check that the status value published to the Amazon SNS topic is SUCCEEDED. If so, call GetDocumentAnalysis, and pass the job identifier (JobId) from the initial call to StartDocumentAnalysis.
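The status check described above can be sketched as follows. This is a minimal illustration, not part of this package: the helper name succeededJobId is hypothetical, and it assumes the SNS message body is a JSON object carrying "JobId" and "Status" fields.

```dart
import 'dart:convert';

/// Returns the JobId from a completion notification if the job succeeded,
/// or null otherwise.
///
/// Assumption: [messageJson] is the SNS message body, a JSON object with
/// "JobId" and "Status" fields. The helper name is hypothetical.
String? succeededJobId(String messageJson) {
  final decoded = jsonDecode(messageJson);
  if (decoded is! Map<String, dynamic>) return null;
  final status = decoded['Status'];
  final jobId = decoded['JobId'];
  if (status == 'SUCCEEDED' && jobId is String && jobId.isNotEmpty) {
    return jobId;
  }
  return null;
}

void main() {
  // Hypothetical notification body, for illustration only.
  const message = '{"JobId": "example-job-id", "Status": "SUCCEEDED"}';
  final jobId = succeededJobId(message);
  if (jobId != null) {
    // At this point it is safe to call getDocumentAnalysis(jobId: jobId).
    print('Job $jobId finished; results can be fetched.');
  }
}
```

Returning null for any status other than SUCCEEDED keeps the caller from requesting results for a job that failed or is still running.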
GetDocumentAnalysis returns an array of Block objects. The following types of information are returned:
- Form data (key-value pairs). The related information is returned in two Block objects, each of type KEY_VALUE_SET: a KEY Block object and a VALUE Block object. For example, Name: Ana Silva Carolina contains a key and value. Name: is the key. Ana Silva Carolina is the value.
- Table and table cell data. A TABLE Block object contains information about a detected table. A CELL Block object is returned for each cell in a table.
- Lines and words of text. A LINE Block object contains one or more WORD Block objects. All lines and words that are detected in the document are returned (including text that doesn't have a relationship with the value of the StartDocumentAnalysis FeatureTypes input parameter).
Selection elements such as check boxes and option buttons (radio buttons) can be detected in form data and in tables. A SELECTION_ELEMENT Block object contains information about a selection element, including the selection status.
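To make the key-value pairing concrete, here is a rough sketch that works on blocks decoded from the raw JSON response as plain maps rather than on this package's Block class. It assumes each block carries "Id", "BlockType", "EntityTypes", "Relationships" (with "Type" and "Ids"), and "Text" fields; the function names and the sample data are hypothetical.

```dart
/// Joins the text of the WORD blocks that [block] references through its
/// CHILD relationship.
String childText(
  Map<String, dynamic> block,
  Map<String, Map<String, dynamic>> byId,
) {
  final words = <String>[];
  for (final rel in (block['Relationships'] as List? ?? const [])) {
    if (rel['Type'] != 'CHILD') continue;
    for (final id in (rel['Ids'] as List? ?? const [])) {
      final child = byId[id];
      if (child != null && child['BlockType'] == 'WORD') {
        words.add(child['Text'] as String? ?? '');
      }
    }
  }
  return words.join(' ');
}

/// Pairs every KEY block with the VALUE block(s) it points to and returns
/// a map from key text to value text.
Map<String, String> extractKeyValues(List<Map<String, dynamic>> blocks) {
  final byId = {for (final b in blocks) (b['Id'] as String): b};
  final result = <String, String>{};
  for (final block in blocks) {
    final isKey = block['BlockType'] == 'KEY_VALUE_SET' &&
        (block['EntityTypes'] as List? ?? const []).contains('KEY');
    if (!isKey) continue;
    final valueParts = <String>[];
    for (final rel in (block['Relationships'] as List? ?? const [])) {
      if (rel['Type'] != 'VALUE') continue;
      for (final id in (rel['Ids'] as List? ?? const [])) {
        final valueBlock = byId[id];
        if (valueBlock != null) valueParts.add(childText(valueBlock, byId));
      }
    }
    result[childText(block, byId)] = valueParts.join(' ');
  }
  return result;
}

/// Hypothetical sample mirroring the "Name: Ana Silva Carolina" example.
final sampleBlocks = <Map<String, dynamic>>[
  {
    'Id': 'k1',
    'BlockType': 'KEY_VALUE_SET',
    'EntityTypes': ['KEY'],
    'Relationships': [
      {'Type': 'VALUE', 'Ids': ['v1']},
      {'Type': 'CHILD', 'Ids': ['w1']},
    ],
  },
  {
    'Id': 'v1',
    'BlockType': 'KEY_VALUE_SET',
    'EntityTypes': ['VALUE'],
    'Relationships': [
      {'Type': 'CHILD', 'Ids': ['w2', 'w3', 'w4']},
    ],
  },
  {'Id': 'w1', 'BlockType': 'WORD', 'Text': 'Name:'},
  {'Id': 'w2', 'BlockType': 'WORD', 'Text': 'Ana'},
  {'Id': 'w3', 'BlockType': 'WORD', 'Text': 'Silva'},
  {'Id': 'w4', 'BlockType': 'WORD', 'Text': 'Carolina'},
];

void main() {
  final pairs = extractKeyValues(sampleBlocks);
  print(pairs['Name:']); // Ana Silva Carolina
}
```

Indexing the blocks by Id first avoids rescanning the whole list for every relationship, which matters for documents with thousands of blocks.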
Use the MaxResults parameter to limit the number of blocks that are returned. If there are more results than specified in MaxResults, the value of NextToken in the operation response contains a pagination token for getting the next set of results. To get the next page of results, call GetDocumentAnalysis, and populate the NextToken request parameter with the token value that's returned from the previous call to GetDocumentAnalysis.
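The pagination loop described above might look like the sketch below. To stay self-contained it does not call the service: Page, FetchPage, fetchAllBlocks, and fakeFetch are hypothetical names, and blocks are represented as plain strings. In real use, the callback would presumably wrap getDocumentAnalysis(jobId: ..., nextToken: token) and copy the blocks and next token out of the response.

```dart
/// One page of results: the blocks plus the token for the next page, if any.
class Page {
  Page(this.blocks, this.nextToken);
  final List<String> blocks;
  final String? nextToken;
}

/// A function that fetches a single page given an optional pagination token.
typedef FetchPage = Future<Page> Function(String? nextToken);

/// Follows the pagination token until no further token is returned,
/// collecting every block along the way.
Future<List<String>> fetchAllBlocks(FetchPage fetchPage) async {
  final all = <String>[];
  String? token;
  do {
    final page = await fetchPage(token);
    all.addAll(page.blocks);
    token = page.nextToken;
  } while (token != null);
  return all;
}

/// Stand-in for the real service: serves two fixed pages.
Future<Page> fakeFetch(String? nextToken) async {
  if (nextToken == null) return Page(['LINE-1', 'WORD-1'], 'page-2');
  return Page(['LINE-2'], null);
}

Future<void> main() async {
  final blocks = await fetchAllBlocks(fakeFetch);
  print(blocks.length); // 3
}
```

The do-while form issues the first request without a token and stops as soon as a response arrives without one, which matches the behavior described for NextToken.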
For more information, see Document Text Analysis.
May throw InvalidParameterException. May throw AccessDeniedException. May throw ProvisionedThroughputExceededException. May throw InvalidJobIdException. May throw InternalServerError. May throw ThrottlingException. May throw InvalidS3ObjectException.
Parameter jobId: A unique identifier for the text analysis job. The JobId is returned from StartDocumentAnalysis. A JobId value is only valid for 7 days.
Parameter maxResults: The maximum number of results to return per paginated call. The largest value that you can specify is 1,000. If you specify a value greater than 1,000, a maximum of 1,000 results is returned. The default value is 1,000.
Parameter nextToken: If the previous response was incomplete (because there are more blocks to retrieve), Amazon Textract returns a pagination token in the response. You can use this pagination token to retrieve the next set of blocks.
Implementation
Future<GetDocumentAnalysisResponse> getDocumentAnalysis({
  required String jobId,
  int? maxResults,
  String? nextToken,
}) async {
  ArgumentError.checkNotNull(jobId, 'jobId');
  _s.validateStringLength(
    'jobId',
    jobId,
    1,
    64,
    isRequired: true,
  );
  _s.validateNumRange(
    'maxResults',
    maxResults,
    1,
    1152921504606846976,
  );
  _s.validateStringLength(
    'nextToken',
    nextToken,
    1,
    255,
  );
  final headers = <String, String>{
    'Content-Type': 'application/x-amz-json-1.1',
    'X-Amz-Target': 'Textract.GetDocumentAnalysis'
  };
  final jsonResponse = await _protocol.send(
    method: 'POST',
    requestUri: '/',
    exceptionFnMap: _exceptionFns,
    // TODO queryParams
    headers: headers,
    payload: {
      'JobId': jobId,
      if (maxResults != null) 'MaxResults': maxResults,
      if (nextToken != null) 'NextToken': nextToken,
    },
  );
  return GetDocumentAnalysisResponse.fromJson(jsonResponse.body);
}