cloud_text_to_speech 2.3.0

Single interface to Google, Microsoft and Amazon Text-To-Speech.

Cloud Text-To-Speech #

Single interface to Google, Microsoft, and Amazon Text-To-Speech, implemented for Flutter.

Features #

  • Universal implementation for accessing all providers through one interface.
  • Separate implementation for each provider, so that every provider-specific feature stays accessible.
  • SSML input is sanitized per provider, so that only supported SSML elements are sent.
  • Locale names in English and in the native language, for building a language selector.
  • Fake names for Google voices, generated randomly based on the voice locale.
  • Configurable output format (per provider), rate, and pitch.

Getting Started #

There are essentially two ways to use Cloud Text-To-Speech:

  • Universal: Use TtsUniversal to configure the TTS provider dynamically and then use it.
    • Single: Use TtsProviders.google, TtsProviders.microsoft, or TtsProviders.amazon to work with one provider at a time.
    • Combine: Use TtsProviders.combine to combine all providers and get all voices at once.
  • Provider: Use TtsGoogle, TtsMicrosoft, or TtsAmazon to get the most from each provider's API.

Universal(Single) #

To initialize the configuration, use:

// Initialize once, before calling any other method
TtsUniversal.init(
  provider: TtsProviders.amazon,
  googleParams: InitParamsGoogle(apiKey: 'API-KEY'),
  microsoftParams: InitParamsMicrosoft(
    subscriptionKey: 'SUBSCRIPTION-KEY',
    region: 'eastus',
  ),
  amazonParams: InitParamsAmazon(
    keyId: 'KEY-ID',
    accessKey: 'ACCESS-KEY',
    region: 'us-east-1',
  ),
  withLogs: true,
);

To change the provider, use:

TtsUniversal.setProvider(TtsProviders.microsoft);

To get the list of all voices use:

//Get voices
final voicesResponse = await TtsUniversal.getVoices();

final voices = voicesResponse.voices;

//Print all available voices
print(voices);

//Pick an English Voice
final voice = voices
    .where((element) => element.locale.code.startsWith("en-"))
    .toList(growable: false)
    .first;
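
Note that `.first` throws a StateError when no voice matches the filter. A minimal sketch of a safer selection follows; the helper name `firstWithLocalePrefix` is hypothetical and not part of the package, and plain strings stand in for voice objects so the block runs on its own.

```dart
/// Returns the first item whose locale code starts with [prefix],
/// or null when nothing matches (instead of throwing like `.first`).
/// Hypothetical helper, not part of cloud_text_to_speech.
T? firstWithLocalePrefix<T>(
  Iterable<T> items,
  String Function(T item) localeCodeOf,
  String prefix,
) {
  for (final item in items) {
    if (localeCodeOf(item).startsWith(prefix)) return item;
  }
  return null;
}

void main() {
  // Plain locale codes stand in for voice objects here.
  const codes = ['de-DE', 'en-GB', 'en-US'];

  final english = firstWithLocalePrefix<String>(codes, (c) => c, 'en-');
  print(english); // en-GB

  final japanese = firstWithLocalePrefix<String>(codes, (c) => c, 'ja-');
  print(japanese); // null
}
```

With the voice list from above, the call would presumably look like `firstWithLocalePrefix(voices, (v) => v.locale.code, 'en-')`, followed by a null check before building the TTS params.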

To convert text to speech and get the audio, use:

// Generate audio for a text
const text = "Amazon, Microsoft and Google Text-to-Speech API are awesome";

final ttsParams = TtsParamsUniversal(
  voice: voice,
  audioFormat: AudioOutputFormatUniversal.mp3_64k,
  text: text,
  rate: 'slow', // optional
  pitch: 'default', // optional
);

final ttsResponse = await TtsUniversal.convertTts(ttsParams);

//Get the audio bytes.
final audioBytes = ttsResponse.audio.buffer.asByteData();
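
Since the audio is returned as a Uint8List, one option is to write it straight to a file. The sketch below assumes a platform where dart:io is available (so not the web); the helper name `saveAudio`, the file name, and the placeholder bytes are illustrative assumptions only.

```dart
import 'dart:io';
import 'dart:typed_data';

/// Writes [audio] to [path] and returns the resulting file.
/// Hypothetical helper, not part of cloud_text_to_speech.
Future<File> saveAudio(Uint8List audio, String path) {
  // flush: true asks for the data to be flushed to disk before completing.
  return File(path).writeAsBytes(audio, flush: true);
}

Future<void> main() async {
  // Placeholder bytes stand in for ttsResponse.audio.
  final audio = Uint8List.fromList([0x49, 0x44, 0x33]);

  final dir = await Directory.systemTemp.createTemp('tts_demo');
  final path = '${dir.path}${Platform.pathSeparator}speech.mp3';

  final file = await saveAudio(audio, path);
  print(await file.length()); // 3

  // Clean up the temporary directory created for this demo.
  await dir.delete(recursive: true);
}
```

In an app, `ttsResponse.audio` would take the place of the placeholder bytes, and the target path would come from wherever the app stores its files.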

Universal(Combine) #

To initialize the configuration, use:

// Initialize once, before calling any other method
TtsUniversal.init(
  provider: TtsProviders.combine,
  googleParams: InitParamsGoogle(apiKey: 'API-KEY'),
  microsoftParams: InitParamsMicrosoft(
    subscriptionKey: 'SUBSCRIPTION-KEY',
    region: 'eastus',
  ),
  amazonParams: InitParamsAmazon(
    keyId: 'KEY-ID',
    accessKey: 'ACCESS-KEY',
    region: 'us-east-1',
  ),
  withLogs: true,
);

To change the provider, use:

TtsUniversal.setProvider(TtsProviders.combine);

To get the list of all voices use:

//Get voices
final voicesResponse = await TtsUniversal.getVoices();

final voices = voicesResponse.voices;

//Print all available voices
print(voices);

//Pick an English Voice
final voice = voices
    .where((element) => element.locale.code.startsWith("en-"))
    .toList(growable: false)
    .first;
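
With all providers combined, the voice list can contain many voices for the same locale. As a rough sketch of how a language selector might be fed, the locale codes can be deduplicated and sorted first; the helper name `distinctLocaleCodes` is hypothetical, and plain strings stand in for the codes read from the voices.

```dart
/// Returns the distinct locale codes in [codes], sorted alphabetically.
/// Hypothetical helper, not part of cloud_text_to_speech.
List<String> distinctLocaleCodes(Iterable<String> codes) {
  // toSet() drops duplicates; the cascade sorts the new list in place.
  return codes.toSet().toList()..sort();
}

void main() {
  // Sample codes as they might appear across several providers.
  const codes = ['en-US', 'de-DE', 'en-US', 'en-GB'];
  print(distinctLocaleCodes(codes)); // [de-DE, en-GB, en-US]
}
```

With the voice list from above, the input would presumably be `voices.map((v) => v.locale.code)`.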

To convert text to speech and get the audio, use:

// Generate audio for a text
const text = "Amazon, Microsoft and Google Text-to-Speech API are awesome";

final ttsParams = TtsParamsUniversal(
  voice: voice,
  audioFormat: AudioOutputFormatUniversal.mp3_64k,
  text: text,
  rate: 'slow', // optional
  pitch: 'default', // optional
);

final ttsResponse = await TtsUniversal.convertTts(ttsParams);

//Get the audio bytes.
final audioBytes = ttsResponse.audio.buffer.asByteData();

Google #

To initialize the configuration, use:

// Initialize once, before calling any other method
TtsGoogle.init(
  params: InitParamsGoogle(apiKey: "API-KEY"),
  withLogs: true,
);

To get the list of all voices use:

//Get voices
final voicesResponse = await TtsGoogle.getVoices();

final voices = voicesResponse.voices;

//Print all voices
print(voices);

//Pick an English Voice
final voice = voices
    .where((element) => element.locale.code.startsWith("en-"))
    .toList(growable: false)
    .first;

To convert text to speech and get the audio, use:

// Generate audio for a text (the <break> element is self-closed so the SSML is well-formed)
final text = '<speak>Google<break time="2s"/> Speech Service Text-to-Speech API is awesome!</speak>';

TtsParamsGoogle ttsParams = TtsParamsGoogle(
  voice: voice,
  audioFormat: AudioOutputFormatGoogle.mp3,
  text: text,
  rate: 'slow', // optional
  pitch: 'default', // optional
);

final ttsResponse = await TtsGoogle.convertTts(ttsParams);

//Get the audio bytes.
final audioBytes = ttsResponse.audio.buffer.asByteData();

Microsoft #

To initialize the configuration, use:

// Initialize once, before calling any other method
TtsMicrosoft.init(
  params: InitParamsMicrosoft(
    subscriptionKey: "SUBSCRIPTION-KEY",
    region: "eastus",
  ),
  withLogs: true,
);

To get the list of all voices use:

//Get voices
final voicesResponse = await TtsMicrosoft.getVoices();

final voices = voicesResponse.voices;

//Print all voices
print(voices);

//Pick an English Voice
final voice = voices
    .where((element) => element.locale.code.startsWith("en-"))
    .toList(growable: false)
    .first;

To convert text to speech and get the audio, use:

// Generate audio for a text (the <break> element is self-closed so the SSML is well-formed)
final text = '<speak>Microsoft<break time="2s"/> Speech Service Text-to-Speech API is awesome!</speak>';

TtsParamsMicrosoft ttsParams = TtsParamsMicrosoft(
  voice: voice,
  audioFormat: AudioOutputFormatMicrosoft.audio48Khz192kBitrateMonoMp3,
  text: text,
  rate: 'slow', // optional
  pitch: 'default', // optional
);

final ttsResponse = await TtsMicrosoft.convertTts(ttsParams);

//Get the audio bytes.
final audioBytes = ttsResponse.audio.buffer.asByteData();

Amazon #

To initialize the configuration, use:

// Initialize once, before calling any other method
TtsAmazon.init(
  params: InitParamsAmazon(
    keyId: 'KEY-ID', 
    accessKey: 'ACCESS-KEY', 
    region: 'us-east-1'
  ),
  withLogs: true
);

To get the list of all voices use:

//Get voices
final voicesResponse = await TtsAmazon.getVoices();

final voices = voicesResponse.voices;

//Print all voices
print(voices);

//Pick an English Voice
final voice = voices
    .where((element) => element.locale.code.startsWith("en-"))
    .toList(growable: false)
    .first;

To convert text to speech and get the audio, use:

// Generate audio for a text (the <break> element is self-closed so the SSML is well-formed)
final text = '<speak>Amazon<break time="2s"/> Speech Service Text-to-Speech API is awesome!</speak>';

TtsParamsAmazon ttsParams = TtsParamsAmazon(
  voice: voice,
  audioFormat: AudioOutputFormatAmazon.audio48Khz192kBitrateMonoMp3,
  text: text,
  rate: 'slow', // optional
  pitch: 'default', // optional
);

final ttsResponse = await TtsAmazon.convertTts(ttsParams);

//Get the audio bytes.
final audioBytes = ttsResponse.audio.buffer.asByteData();

Notes #

There are a few things to take care of:

  • Secure your API keys and credentials; they can be extracted from a mobile or web app.
  • Amazon Polly sometimes does not work in an emulator, in which case you may get a 403 error.
  • To fix SSML/XML before passing it to the TTS params, you can use the xml package: XmlDocument.parse(ssml).toXmlString().
  • Audio has a uniform format across all providers: a Uint8List that you can play or save to a file.
  • Player packages that are a good fit include audioplayers and assets_audio_player.
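
The SSML note above can be sketched as a small wrapper around the xml package. This assumes `xml` is listed as a dependency in pubspec.yaml and that its parse errors are subclasses of XmlException; the function name `normalizeSsml` is hypothetical and not part of cloud_text_to_speech.

```dart
import 'package:xml/xml.dart';

/// Parses [ssml] and serializes it again, which normalizes the markup.
/// Returns null when the input is not well-formed XML.
/// Hypothetical helper; assumes the xml package is a dependency.
String? normalizeSsml(String ssml) {
  try {
    return XmlDocument.parse(ssml).toXmlString();
  } on XmlException {
    // Malformed input, e.g. an element that is never closed.
    return null;
  }
}

void main() {
  // Well-formed: the <break> element is self-closed.
  print(normalizeSsml('<speak>Hello<break time="2s"/> world</speak>'));

  // Not well-formed: <break> is opened but never closed.
  print(normalizeSsml('<speak>Hello<break time="2s"> world</speak>'));
}
```

A null result is a signal to correct the markup before calling convertTts, rather than sending SSML that the provider is likely to reject.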