startStreamTranscription method
- required AudioStream audioStream,
- required MediaEncoding mediaEncoding,
- required int mediaSampleRateHertz,
- ContentIdentificationType? contentIdentificationType,
- ContentRedactionType? contentRedactionType,
- bool? enableChannelIdentification,
- bool? enablePartialResultsStabilization,
- bool? identifyLanguage,
- bool? identifyMultipleLanguages,
- LanguageCode? languageCode,
- String? languageModelName,
- String? languageOptions,
- int? numberOfChannels,
- PartialResultsStability? partialResultsStability,
- String? piiEntityTypes,
- LanguageCode? preferredLanguage,
- String? sessionId,
- int? sessionResumeWindow,
- bool? showSpeakerLabel,
- VocabularyFilterMethod? vocabularyFilterMethod,
- String? vocabularyFilterName,
- String? vocabularyFilterNames,
- String? vocabularyName,
- String? vocabularyNames,
Starts a bidirectional HTTP/2 or WebSocket stream where audio is streamed to Amazon Transcribe and the transcription results are streamed to your application.
The following parameters are required:
- language-code or identify-language or identify-multiple-language
- media-encoding
- sample-rate
May throw BadRequestException.
May throw ConflictException.
May throw InternalFailureException.
May throw LimitExceededException.
May throw ServiceUnavailableException.
Parameter audioStream :
An encoded stream of audio blobs. Audio streams are encoded as either
HTTP/2 or WebSocket data frames.
For more information, see Transcribing streaming audio.
Parameter mediaEncoding :
Specify the encoding of your input audio. Supported formats are:
- FLAC
- OPUS-encoded audio in an Ogg container
- PCM (only signed 16-bit little-endian audio formats, which does not include WAV)
Parameter mediaSampleRateHertz :
The sample rate of the input audio (in hertz). Low-quality audio, such as
telephone audio, is typically around 8,000 Hz. High-quality audio
typically ranges from 16,000 Hz to 48,000 Hz. Note that the sample rate
you specify must match that of your audio.
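Because the declared sample rate must match the audio, it also fixes how many bytes a given duration of PCM audio occupies. The following is a minimal sketch of that arithmetic for signed 16-bit PCM (2 bytes per sample). The helper name `pcmChunkBytes` and the 100 ms chunk duration are assumptions made for illustration; they are not part of this API.

```dart
/// Bytes needed to hold [chunkMillis] milliseconds of signed 16-bit PCM audio.
/// Hypothetical helper: each sample occupies 2 bytes per channel.
int pcmChunkBytes({
  required int sampleRateHertz,
  required int chunkMillis,
  int channels = 1,
}) {
  const bytesPerSample = 2; // signed 16-bit little-endian PCM
  return sampleRateHertz * bytesPerSample * channels * chunkMillis ~/ 1000;
}

void main() {
  // 100 ms of mono telephone-quality audio at 8,000 Hz.
  print(pcmChunkBytes(sampleRateHertz: 8000, chunkMillis: 100)); // 1600
  // 100 ms of mono audio at 16,000 Hz.
  print(pcmChunkBytes(sampleRateHertz: 16000, chunkMillis: 100)); // 3200
}
```

If the rate passed as mediaSampleRateHertz differs from the real rate of the audio, chunk sizes computed this way no longer correspond to the intended duration.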
Parameter contentIdentificationType :
Labels all personally identifiable information (PII) identified in your
transcript.
Content identification is performed at the segment level; PII specified in
PiiEntityTypes is flagged upon complete transcription of an
audio segment. If you don't include PiiEntityTypes in your
request, all PII is identified.
You can’t set ContentIdentificationType and
ContentRedactionType in the same request. If you set both,
your request returns a BadRequestException.
For more information, see Redacting or identifying personally identifiable information.
Parameter contentRedactionType :
Redacts all personally identifiable information (PII) identified in your
transcript.
Content redaction is performed at the segment level; PII specified in
PiiEntityTypes is redacted upon complete transcription of an
audio segment. If you don't include PiiEntityTypes in your
request, all PII is redacted.
You can’t set ContentRedactionType and
ContentIdentificationType in the same request. If you set
both, your request returns a BadRequestException.
For more information, see Redacting or identifying personally identifiable information.
Parameter enableChannelIdentification :
Enables channel identification in multi-channel audio.
Channel identification transcribes the audio on each channel independently, then appends the output for each channel into one transcript.
If you have multi-channel audio and do not enable channel identification, your audio is transcribed in a continuous manner and your transcript is not separated by channel.
If you include EnableChannelIdentification in your request,
you must also include NumberOfChannels.
For more information, see Transcribing multi-channel audio.
Parameter enablePartialResultsStabilization :
Enables partial result stabilization for your transcription. Partial
result stabilization can reduce latency in your output, but may impact
accuracy. For more information, see Partial-result
stabilization.
Parameter identifyLanguage :
Enables automatic language identification for your transcription.
If you include IdentifyLanguage, you must include a list of
language codes, using LanguageOptions, that you think may be
present in your audio stream.
You can also include a preferred language using
PreferredLanguage. Adding a preferred language can help
Amazon Transcribe identify the language faster than if you omit this
parameter.
If you have multi-channel audio that contains different languages on each channel, and you've enabled channel identification, automatic language identification identifies the dominant language on each audio channel.
Note that you must include either LanguageCode or
IdentifyLanguage or IdentifyMultipleLanguages in
your request. If you include more than one of these parameters, your
transcription job fails.
Streaming language identification can't be combined with custom language models or redaction.
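The "exactly one of LanguageCode, IdentifyLanguage, or IdentifyMultipleLanguages" rule can be checked on the client before a stream is opened. The sketch below is one possible pre-flight check, not something the library performs. The function name `checkLanguageSelection` is hypothetical, the language code is passed as a plain string to keep the example self-contained, and treating only `true` flags as "included" is an assumption.

```dart
/// Returns an error message when the language selection is invalid,
/// or null when exactly one selector is set. Hypothetical helper.
String? checkLanguageSelection({
  String? languageCode,
  bool? identifyLanguage,
  bool? identifyMultipleLanguages,
}) {
  // Assumption: a flag counts as "included" only when it is true.
  final selected = [
    languageCode != null,
    identifyLanguage == true,
    identifyMultipleLanguages == true,
  ].where((isSet) => isSet).length;

  if (selected == 0) {
    return 'Set one of languageCode, identifyLanguage, or '
        'identifyMultipleLanguages.';
  }
  if (selected > 1) {
    return 'Set only one of languageCode, identifyLanguage, or '
        'identifyMultipleLanguages.';
  }
  return null;
}

void main() {
  print(checkLanguageSelection(languageCode: 'en-US')); // null
  print(checkLanguageSelection(
    languageCode: 'en-US',
    identifyLanguage: true,
  ));
}
```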
Parameter identifyMultipleLanguages :
Enables automatic multi-language identification in your transcription job
request. Use this parameter if your stream contains more than one
language. If your stream contains only one language, use IdentifyLanguage
instead.
If you include IdentifyMultipleLanguages, you must include a
list of language codes, using LanguageOptions, that you think
may be present in your stream.
If you want to apply a custom vocabulary or a custom vocabulary filter to
your automatic multiple language identification request, include
VocabularyNames or VocabularyFilterNames.
Note that you must include one of LanguageCode,
IdentifyLanguage, or IdentifyMultipleLanguages
in your request. If you include more than one of these parameters, your
transcription job fails.
Parameter languageCode :
Specify the language code that represents the language spoken in your
audio.
If you're unsure of the language spoken in your audio, consider using
IdentifyLanguage to enable automatic language identification.
For a list of languages supported with Amazon Transcribe streaming, refer to the Supported languages table.
Parameter languageModelName :
Specify the name of the custom language model that you want to use when
processing your transcription. Note that language model names are case
sensitive.
The language of the specified language model must match the language code you specify in your transcription request. If the languages don't match, the custom language model isn't applied. There are no errors or warnings associated with a language mismatch.
For more information, see Custom language models.
Parameter languageOptions :
Specify two or more language codes that represent the languages you think
may be present in your media; including more than five is not recommended.
Including language options can improve the accuracy of language identification.
If you include LanguageOptions in your request, you must also
include IdentifyLanguage or
IdentifyMultipleLanguages.
For a list of languages supported with Amazon Transcribe streaming, refer to the Supported languages table.
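Since languageOptions is a single string, a list of codes has to be joined before it is passed in. The sketch below assumes the value is a comma-separated list of language codes and enforces the "two or more" rule; the helper name `buildLanguageOptions` is hypothetical.

```dart
/// Joins language codes into one string for languageOptions.
/// Hypothetical helper; assumes a comma-separated format.
String buildLanguageOptions(List<String> codes) {
  // Trim whitespace, drop empty entries, and remove duplicates
  // while keeping the original order.
  final unique =
      codes.map((c) => c.trim()).where((c) => c.isNotEmpty).toSet();
  if (unique.length < 2) {
    throw ArgumentError('languageOptions needs two or more language codes');
  }
  return unique.join(',');
}

void main() {
  print(buildLanguageOptions(['en-US', 'es-US', 'en-US'])); // en-US,es-US
}
```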
Parameter numberOfChannels :
Specify the number of channels in your audio stream. This value must be
2, as only two channels are supported. If your audio doesn't
contain multiple channels, do not include this parameter in your request.
If you include NumberOfChannels in your request, you must
also include EnableChannelIdentification.
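The pairing between EnableChannelIdentification and NumberOfChannels, and the requirement that the channel count be 2, can be expressed as a small client-side check. This is a sketch under the rules stated above, not behavior of the library; `checkChannelConfig` is a hypothetical name.

```dart
/// Returns an error message for an inconsistent channel configuration,
/// or null when the combination is acceptable. Hypothetical helper.
String? checkChannelConfig({
  bool? enableChannelIdentification,
  int? numberOfChannels,
}) {
  final enabled = enableChannelIdentification == true;
  if (enabled && numberOfChannels == null) {
    return 'enableChannelIdentification requires numberOfChannels.';
  }
  if (!enabled && numberOfChannels != null) {
    return 'numberOfChannels requires enableChannelIdentification.';
  }
  if (numberOfChannels != null && numberOfChannels != 2) {
    return 'numberOfChannels must be 2.';
  }
  return null;
}

void main() {
  print(checkChannelConfig(
    enableChannelIdentification: true,
    numberOfChannels: 2,
  )); // null
  print(checkChannelConfig(numberOfChannels: 2));
}
```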
Parameter partialResultsStability :
Specify the level of stability to use when you enable partial results
stabilization (EnablePartialResultsStabilization).
Low stability provides the highest accuracy. High stability transcribes faster, but with slightly lower accuracy.
For more information, see Partial-result stabilization.
Parameter piiEntityTypes :
Specify which types of personally identifiable information (PII) you want
to redact or identify in your transcript. You can include as many types as
you'd like, or you can select ALL.
Values must be comma-separated and can include: ADDRESS,
BANK_ACCOUNT_NUMBER, BANK_ROUTING,
CREDIT_DEBIT_CVV, CREDIT_DEBIT_EXPIRY,
CREDIT_DEBIT_NUMBER, EMAIL, NAME,
PHONE, PIN, SSN, AGE,
DATE_TIME, LICENSE_PLATE,
PASSPORT_NUMBER, PASSWORD,
USERNAME, VEHICLE_IDENTIFICATION_NUMBER, or
ALL.
Note that if you include PiiEntityTypes in your request, you
must also include ContentIdentificationType or
ContentRedactionType.
If you include ContentRedactionType or
ContentIdentificationType in your request, but do not include
PiiEntityTypes, all PII is redacted or identified.
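The PII rules described for contentIdentificationType, contentRedactionType, and piiEntityTypes can be combined into one pre-flight check. The sketch below takes the values as plain strings to stay self-contained; `checkPiiConfig` and `knownPiiTypes` are hypothetical names, and the set of types is copied from the list above.

```dart
/// PII entity types listed in the parameter description above.
const knownPiiTypes = {
  'ADDRESS', 'BANK_ACCOUNT_NUMBER', 'BANK_ROUTING', 'CREDIT_DEBIT_CVV',
  'CREDIT_DEBIT_EXPIRY', 'CREDIT_DEBIT_NUMBER', 'EMAIL', 'NAME', 'PHONE',
  'PIN', 'SSN', 'AGE', 'DATE_TIME', 'LICENSE_PLATE', 'PASSPORT_NUMBER',
  'PASSWORD', 'USERNAME', 'VEHICLE_IDENTIFICATION_NUMBER', 'ALL',
};

/// Returns an error message for an invalid PII configuration, or null.
/// Hypothetical helper; not part of the library.
String? checkPiiConfig({
  String? contentIdentificationType,
  String? contentRedactionType,
  String? piiEntityTypes,
}) {
  if (contentIdentificationType != null && contentRedactionType != null) {
    return 'Set contentIdentificationType or contentRedactionType, not both.';
  }
  if (piiEntityTypes == null) return null; // all PII is covered by default
  if (contentIdentificationType == null && contentRedactionType == null) {
    return 'piiEntityTypes requires contentIdentificationType or '
        'contentRedactionType.';
  }
  // Values are comma-separated; every entry must be a known type.
  for (final type in piiEntityTypes.split(',')) {
    if (!knownPiiTypes.contains(type.trim())) {
      return 'Unknown PII entity type: ${type.trim()}';
    }
  }
  return null;
}

void main() {
  print(checkPiiConfig(
    contentRedactionType: 'PII',
    piiEntityTypes: 'NAME,SSN',
  )); // null
  print(checkPiiConfig(piiEntityTypes: 'NAME'));
}
```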
Parameter preferredLanguage :
Specify a preferred language from the subset of language codes you
specified in LanguageOptions.
You can only use this parameter if you've included
IdentifyLanguage and LanguageOptions in your
request.
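The constraints on preferredLanguage (it needs IdentifyLanguage and LanguageOptions, and must be one of the listed options) can be sketched as a client-side check. The helper name `checkPreferredLanguage` is hypothetical, values are passed as plain strings, and languageOptions is assumed to be comma-separated.

```dart
/// Returns an error message when preferredLanguage is used incorrectly,
/// or null when it is absent or valid. Hypothetical helper.
String? checkPreferredLanguage({
  bool? identifyLanguage,
  String? languageOptions,
  String? preferredLanguage,
}) {
  if (preferredLanguage == null) return null;
  if (identifyLanguage != true || languageOptions == null) {
    return 'preferredLanguage requires identifyLanguage and languageOptions.';
  }
  // Assumption: languageOptions is a comma-separated list of codes.
  final options = languageOptions.split(',').map((c) => c.trim());
  if (!options.contains(preferredLanguage)) {
    return 'preferredLanguage must be one of the languageOptions.';
  }
  return null;
}

void main() {
  print(checkPreferredLanguage(
    identifyLanguage: true,
    languageOptions: 'en-US,es-US',
    preferredLanguage: 'es-US',
  )); // null
  print(checkPreferredLanguage(preferredLanguage: 'es-US'));
}
```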
Parameter sessionId :
Specify a name for your transcription session. If you don't include this
parameter in your request, Amazon Transcribe generates an ID and returns
it in the response.
Parameter sessionResumeWindow :
Specify the time window, in minutes, during which your transcription
session can be resumed, measured from the stream start time. This optional
parameter accepts integer values from 1 to 300 (5 hours).
For example, if your stream starts at 1 PM and you specify a
SessionResumeWindow of 30 minutes, you can reconnect to the
session as many times as you want until 1:30 PM.
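The 1 PM example above is simple date arithmetic: the resume deadline is the stream start time plus the window in minutes. A minimal sketch, with a hypothetical helper name `resumeDeadline` and the same 1 to 300 range that the description gives:

```dart
/// Latest time at which a session started at [streamStart] can be resumed.
/// Hypothetical helper; the window is measured from the stream start time.
DateTime resumeDeadline(DateTime streamStart, int sessionResumeWindow) {
  if (sessionResumeWindow < 1 || sessionResumeWindow > 300) {
    throw RangeError.range(
        sessionResumeWindow, 1, 300, 'sessionResumeWindow');
  }
  return streamStart.add(Duration(minutes: sessionResumeWindow));
}

void main() {
  // Stream starts at 1 PM (UTC) with a 30-minute resume window.
  final start = DateTime.utc(2024, 1, 1, 13, 0);
  print(resumeDeadline(start, 30)); // 2024-01-01 13:30:00.000Z
}
```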
Parameter showSpeakerLabel :
Enables speaker partitioning (diarization) in your transcription output.
Speaker partitioning labels the speech from individual speakers in your
media file.
For more information, see Partitioning speakers (diarization).
Parameter vocabularyFilterMethod :
Specify how you want your vocabulary filter applied to your transcript.
To replace words with ***, choose mask.
To delete words, choose remove.
To flag words without changing them, choose tag.
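To make the difference between the three methods concrete, the sketch below mimics their documented effect on a list of words. It is a local illustration only: the real filtering happens inside Amazon Transcribe, the function `applyVocabularyFilter` is hypothetical, and how the service reports words flagged by tag is not modeled here.

```dart
/// Local illustration of mask, remove, and tag on a list of words.
/// Hypothetical helper; not how the service is implemented.
List<String> applyVocabularyFilter(
  List<String> words,
  Set<String> filteredWords,
  String method,
) {
  bool isFiltered(String word) => filteredWords.contains(word.toLowerCase());

  if (method == 'mask') {
    // Replace each filtered word with ***.
    return [for (final w in words) isFiltered(w) ? '***' : w];
  }
  if (method == 'remove') {
    // Drop filtered words entirely.
    return [for (final w in words) if (!isFiltered(w)) w];
  }
  if (method == 'tag') {
    // Words are left unchanged; flagging is reported separately.
    return List.of(words);
  }
  throw ArgumentError.value(method, 'method', 'expected mask, remove, or tag');
}

void main() {
  const words = ['that', 'darn', 'printer'];
  const filter = {'darn'};
  print(applyVocabularyFilter(words, filter, 'mask')); // [that, ***, printer]
  print(applyVocabularyFilter(words, filter, 'remove')); // [that, printer]
  print(applyVocabularyFilter(words, filter, 'tag')); // [that, darn, printer]
}
```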
Parameter vocabularyFilterName :
Specify the name of the custom vocabulary filter that you want to use when
processing your transcription. Note that vocabulary filter names are case
sensitive.
If the language of the specified custom vocabulary filter doesn't match the language identified in your media, the vocabulary filter is not applied to your transcription. For more information, see Using vocabulary filtering with unwanted words.
Parameter vocabularyFilterNames :
Specify the names of the custom vocabulary filters that you want to use
when processing your transcription. Note that vocabulary filter names are
case sensitive.
If none of the languages of the specified custom vocabulary filters match the language identified in your media, your job fails. For more information, see Using vocabulary filtering with unwanted words.
Parameter vocabularyName :
Specify the name of the custom vocabulary that you want to use when
processing your transcription. Note that vocabulary names are case
sensitive.
If the language of the specified custom vocabulary doesn't match the language identified in your media, the custom vocabulary is not applied to your transcription. For more information, see Custom vocabularies.
Parameter vocabularyNames :
Specify the names of the custom vocabularies that you want to use when
processing your transcription. Note that vocabulary names are case
sensitive.
If none of the languages of the specified custom vocabularies match the language identified in your media, your job fails. For more information, see Custom vocabularies.
Implementation
Future<StartStreamTranscriptionResponse> startStreamTranscription({
  required AudioStream audioStream,
  required MediaEncoding mediaEncoding,
  required int mediaSampleRateHertz,
  ContentIdentificationType? contentIdentificationType,
  ContentRedactionType? contentRedactionType,
  bool? enableChannelIdentification,
  bool? enablePartialResultsStabilization,
  bool? identifyLanguage,
  bool? identifyMultipleLanguages,
  LanguageCode? languageCode,
  String? languageModelName,
  String? languageOptions,
  int? numberOfChannels,
  PartialResultsStability? partialResultsStability,
  String? piiEntityTypes,
  LanguageCode? preferredLanguage,
  String? sessionId,
  int? sessionResumeWindow,
  bool? showSpeakerLabel,
  VocabularyFilterMethod? vocabularyFilterMethod,
  String? vocabularyFilterName,
  String? vocabularyFilterNames,
  String? vocabularyName,
  String? vocabularyNames,
}) async {
  _s.validateNumRange(
    'mediaSampleRateHertz',
    mediaSampleRateHertz,
    8000,
    48000,
    isRequired: true,
  );
  _s.validateNumRange(
    'numberOfChannels',
    numberOfChannels,
    2,
    1152921504606846976,
  );
  _s.validateNumRange(
    'sessionResumeWindow',
    sessionResumeWindow,
    1,
    300,
  );
  final headers = <String, String>{
    'x-amzn-transcribe-media-encoding': mediaEncoding.value,
    'x-amzn-transcribe-sample-rate': mediaSampleRateHertz.toString(),
    if (contentIdentificationType != null)
      'x-amzn-transcribe-content-identification-type':
          contentIdentificationType.value,
    if (contentRedactionType != null)
      'x-amzn-transcribe-content-redaction-type': contentRedactionType.value,
    if (enableChannelIdentification != null)
      'x-amzn-transcribe-enable-channel-identification':
          enableChannelIdentification.toString(),
    if (enablePartialResultsStabilization != null)
      'x-amzn-transcribe-enable-partial-results-stabilization':
          enablePartialResultsStabilization.toString(),
    if (identifyLanguage != null)
      'x-amzn-transcribe-identify-language': identifyLanguage.toString(),
    if (identifyMultipleLanguages != null)
      'x-amzn-transcribe-identify-multiple-languages':
          identifyMultipleLanguages.toString(),
    if (languageCode != null)
      'x-amzn-transcribe-language-code': languageCode.value,
    if (languageModelName != null)
      'x-amzn-transcribe-language-model-name': languageModelName.toString(),
    if (languageOptions != null)
      'x-amzn-transcribe-language-options': languageOptions.toString(),
    if (numberOfChannels != null)
      'x-amzn-transcribe-number-of-channels': numberOfChannels.toString(),
    if (partialResultsStability != null)
      'x-amzn-transcribe-partial-results-stability':
          partialResultsStability.value,
    if (piiEntityTypes != null)
      'x-amzn-transcribe-pii-entity-types': piiEntityTypes.toString(),
    if (preferredLanguage != null)
      'x-amzn-transcribe-preferred-language': preferredLanguage.value,
    if (sessionId != null)
      'x-amzn-transcribe-session-id': sessionId.toString(),
    if (sessionResumeWindow != null)
      'x-amzn-transcribe-session-resume-window':
          sessionResumeWindow.toString(),
    if (showSpeakerLabel != null)
      'x-amzn-transcribe-show-speaker-label': showSpeakerLabel.toString(),
    if (vocabularyFilterMethod != null)
      'x-amzn-transcribe-vocabulary-filter-method':
          vocabularyFilterMethod.value,
    if (vocabularyFilterName != null)
      'x-amzn-transcribe-vocabulary-filter-name':
          vocabularyFilterName.toString(),
    if (vocabularyFilterNames != null)
      'x-amzn-transcribe-vocabulary-filter-names':
          vocabularyFilterNames.toString(),
    if (vocabularyName != null)
      'x-amzn-transcribe-vocabulary-name': vocabularyName.toString(),
    if (vocabularyNames != null)
      'x-amzn-transcribe-vocabulary-names': vocabularyNames.toString(),
  };
  final response = await _protocol.sendRaw(
    payload: audioStream,
    method: 'POST',
    requestUri: '/stream-transcription',
    headers: headers,
    exceptionFnMap: _exceptionFns,
  );
  final $json = await _s.jsonFromResponse(response);
  return StartStreamTranscriptionResponse(
    transcriptResultStream: TranscriptResultStream.fromJson($json),
    contentIdentificationType: _s
        .extractHeaderStringValue(
            response.headers, 'x-amzn-transcribe-content-identification-type')
        ?.let(ContentIdentificationType.fromString),
    contentRedactionType: _s
        .extractHeaderStringValue(
            response.headers, 'x-amzn-transcribe-content-redaction-type')
        ?.let(ContentRedactionType.fromString),
    enableChannelIdentification: _s.extractHeaderBoolValue(
        response.headers, 'x-amzn-transcribe-enable-channel-identification'),
    enablePartialResultsStabilization: _s.extractHeaderBoolValue(
        response.headers,
        'x-amzn-transcribe-enable-partial-results-stabilization'),
    identifyLanguage: _s.extractHeaderBoolValue(
        response.headers, 'x-amzn-transcribe-identify-language'),
    identifyMultipleLanguages: _s.extractHeaderBoolValue(
        response.headers, 'x-amzn-transcribe-identify-multiple-languages'),
    languageCode: _s
        .extractHeaderStringValue(
            response.headers, 'x-amzn-transcribe-language-code')
        ?.let(LanguageCode.fromString),
    languageModelName: _s.extractHeaderStringValue(
        response.headers, 'x-amzn-transcribe-language-model-name'),
    languageOptions: _s.extractHeaderStringValue(
        response.headers, 'x-amzn-transcribe-language-options'),
    mediaEncoding: _s
        .extractHeaderStringValue(
            response.headers, 'x-amzn-transcribe-media-encoding')
        ?.let(MediaEncoding.fromString),
    mediaSampleRateHertz: _s.extractHeaderIntValue(
        response.headers, 'x-amzn-transcribe-sample-rate'),
    numberOfChannels: _s.extractHeaderIntValue(
        response.headers, 'x-amzn-transcribe-number-of-channels'),
    partialResultsStability: _s
        .extractHeaderStringValue(
            response.headers, 'x-amzn-transcribe-partial-results-stability')
        ?.let(PartialResultsStability.fromString),
    piiEntityTypes: _s.extractHeaderStringValue(
        response.headers, 'x-amzn-transcribe-pii-entity-types'),
    preferredLanguage: _s
        .extractHeaderStringValue(
            response.headers, 'x-amzn-transcribe-preferred-language')
        ?.let(LanguageCode.fromString),
    requestId:
        _s.extractHeaderStringValue(response.headers, 'x-amzn-request-id'),
    sessionId: _s.extractHeaderStringValue(
        response.headers, 'x-amzn-transcribe-session-id'),
    sessionResumeWindow: _s.extractHeaderIntValue(
        response.headers, 'x-amzn-transcribe-session-resume-window'),
    showSpeakerLabel: _s.extractHeaderBoolValue(
        response.headers, 'x-amzn-transcribe-show-speaker-label'),
    vocabularyFilterMethod: _s
        .extractHeaderStringValue(
            response.headers, 'x-amzn-transcribe-vocabulary-filter-method')
        ?.let(VocabularyFilterMethod.fromString),
    vocabularyFilterName: _s.extractHeaderStringValue(
        response.headers, 'x-amzn-transcribe-vocabulary-filter-name'),
    vocabularyFilterNames: _s.extractHeaderStringValue(
        response.headers, 'x-amzn-transcribe-vocabulary-filter-names'),
    vocabularyName: _s.extractHeaderStringValue(
        response.headers, 'x-amzn-transcribe-vocabulary-name'),
    vocabularyNames: _s.extractHeaderStringValue(
        response.headers, 'x-amzn-transcribe-vocabulary-names'),
  );
}
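The implementation builds its request headers with Dart's collection `if`, so an optional parameter produces a header only when it is non-null. The sketch below isolates that pattern with a reduced, hypothetical function `buildHeaders` that takes plain strings instead of the library's enum types; the header names are the ones used above.

```dart
/// Builds a header map in the same style as the implementation above.
/// Hypothetical, reduced example covering four parameters only.
Map<String, String> buildHeaders({
  required String mediaEncoding,
  required int mediaSampleRateHertz,
  String? languageCode,
  bool? showSpeakerLabel,
}) {
  return <String, String>{
    // Required parameters are always present.
    'x-amzn-transcribe-media-encoding': mediaEncoding,
    'x-amzn-transcribe-sample-rate': mediaSampleRateHertz.toString(),
    // Optional parameters are added only when non-null.
    if (languageCode != null) 'x-amzn-transcribe-language-code': languageCode,
    if (showSpeakerLabel != null)
      'x-amzn-transcribe-show-speaker-label': showSpeakerLabel.toString(),
  };
}

void main() {
  final headers = buildHeaders(
    mediaEncoding: 'pcm',
    mediaSampleRateHertz: 16000,
    languageCode: 'en-US',
  );
  print(headers.length); // 3
  print(headers.containsKey('x-amzn-transcribe-show-speaker-label')); // false
}
```

Note that the check is against null, not against false: passing `showSpeakerLabel: false` still sends the header, with the value `false`.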