Augnito Flutter SDK

Overview

Use the Augnito Flutter SDK to add Speech To Text and voice commands to a Flutter application.

Supported Platforms

  • Android
  • iOS

Requirements

The SDK requires microphone access (recording permission) in order to work.

Android

Add the following line to AndroidManifest.xml:

<uses-permission android:name="android.permission.RECORD_AUDIO"/>

iOS

Add the following to Info.plist:

<key>NSMicrophoneUsageDescription</key>
<string>Microphone access required</string>

Getting Started

Most of the functionality goes through the DictationManager class and its callbacks.

To instantiate a DictationManager, provide an AugnitoConfig with valid authorization details and keys.

late DictationManager _dictationManager;
final AugnitoConfig _config = AugnitoConfig(
    "<your server>",
    "<your accountcode>",
    "<your accesskey>",
    "<your lmid>",
    "<your usertag>",
    sourceApp: "<your sourceapp>");

Then, in an initialization method:

_dictationManager = DictationManager(_config,
    onConnected: _onConnected,
    onDisconnected: _onDisconnected,
    onError: _onError,
    onPartialResult: _onPartialResult,
    onFinalResult: _onFinalResult,
    onCommandResult: _onCommand,
    enableLogs: true);

Alternatively it can be initialized with a provided connection URL:

_dictationManager = DictationManager.fromCustomSpeechURL(
    "custom speech URL",
    onConnected: _onConnected,
    onDisconnected: _onDisconnected,
    onError: _onError,
    onPartialResult: _onPartialResult,
    onFinalResult: _onFinalResult,
    onCommandResult: _onCommand,
    enableLogs: true);

Dictation Manager

The DictationManager offers methods to initialize, receive, and stop communication with Augnito's server, providing Speech To Text and command support in a Flutter app.

It works through methods and callbacks.

Logging

To get more insight into what may be causing errors, enable logging via the enableLogs parameter of the constructor.

Dictation Manager Methods

method           notes
toggleDictation  Turns the audio processing and server communication on or off.
dispose          Disposes of and releases the resources used by the SDK.

Disposing

Call the dispose method when the DictationManager is no longer needed.

Example in a widget:

@override
void dispose() {
  _dictationManager.dispose();
  super.dispose();
}

Dictation Manager Callbacks

Callbacks are used to receive the output of the SDK as well as errors that may occur. The callbacks are provided via the DictationManager constructor as named optional parameters.

callback         notes
onConnected      Invoked when the connection to the server has been established. This does not guarantee that the audio stream has started.
onDisconnected   Invoked when the server closes the connection. The DictationManager can also trigger it on purpose under certain conditions, such as being unable to start audio streaming.
onError          Invoked when an error occurs. Not all errors terminate the connection.
onPartialResult  Invoked when a hypothesis or unprocessed final text has been obtained.
onFinalResult    Invoked when a final text output (not a command) has been processed.
onCommandResult  Invoked when a command has been processed.

Example usage

Future<void> initPlatformState() async {
  if (!mounted) return;
  _dictationManager = DictationManager(_config,
      onConnected: _onConnected,
      onDisconnected: _onDisconnected,
      onError: _onError,
      onPartialResult: _onPartialResult,
      onFinalResult: _onFinalResult,
      onCommandResult: _onCommand,
      enableLogs: true);
}

void _onPartialResult(String hypothesis) {
  print("partial result: $hypothesis");
}

void _onFinalResult(String transcription) {
  print("final result: $transcription");
}

void _onCommand(ActionRecipe actionRecipe) {
  print("command: ${actionRecipe.name}");
  print(actionRecipe.toString());
}

void _onConnected() {
  setState(() {
    _status = "online";
  });
}

void _onDisconnected() {
  setState(() {
    _status = "offline";
  });
}

void _onError(DictationError error) {
  print("error ${error.errorMessage}");
}
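
The partial and final callbacks are typically combined to show live text while the user speaks. The following is a minimal sketch, assuming that each partial result replaces the previous hypothesis and that each final result can be appended as received. TranscriptBuffer is a hypothetical helper and not part of the SDK; its two methods have the same String signature as the callbacks above.

```dart
/// Hypothetical helper (not part of the SDK) that keeps committed text
/// separate from the hypothesis that is still being refined.
class TranscriptBuffer {
  final StringBuffer _committed = StringBuffer();
  String _hypothesis = '';

  /// Intended for onPartialResult: the newest hypothesis replaces the old one.
  void onPartialResult(String hypothesis) {
    _hypothesis = hypothesis;
  }

  /// Intended for onFinalResult: commit the text and clear the hypothesis.
  void onFinalResult(String transcription) {
    _committed.write(transcription);
    _hypothesis = '';
  }

  /// Text that has been finalized so far.
  String get committedText => _committed.toString();

  /// Committed text followed by the pending hypothesis, for display.
  String get displayText => '$_committed$_hypothesis';
}

void main() {
  final buffer = TranscriptBuffer();
  buffer.onPartialResult('patient pres');
  print(buffer.displayText); // patient pres
  buffer.onFinalResult('Patient presents with fever. ');
  buffer.onPartialResult('no cou');
  print(buffer.displayText); // Patient presents with fever. no cou
}
```

In a widget, the two methods could be passed as onPartialResult and onFinalResult, with displayText pushed into the UI inside setState.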

Dictation Manager Error Types

The DictationManager reports errors via the onError callback. The callback receives a DictationError object with further details.

The errorType property is the best way to determine what went wrong. Possible types:

enum DictationErrorType {
  noNetworkConnection,
  lowBandwidth,
  serviceDown,
  noDictationStopMic,
  invalidAuthCredentials,
  socketDisconnect,
  socketConnectionError,
  audioRecorder,
  audioRecorderCouldNotInitializePermission,
  audioRecorderCouldNotInitializeGeneric,
  unknown
}

When possible, the errorMessage property provides further information.

A note on noDictationStopMic: this error does not terminate the connection, but it is an indicator that the microphone is open but idle. The consumer app may want to warn the user or terminate the session if this keeps happening.
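
One way to act on errorType is to map each type to a coarse reaction in the consumer app. The sketch below is only an illustration: the enum is copied from the list above, while the DictationError stand-in (its constructor in particular), the ErrorAction enum, and the grouping of error types are assumptions made so the example runs without the SDK.

```dart
// Copied from the list above so the sketch runs without the SDK.
enum DictationErrorType {
  noNetworkConnection,
  lowBandwidth,
  serviceDown,
  noDictationStopMic,
  invalidAuthCredentials,
  socketDisconnect,
  socketConnectionError,
  audioRecorder,
  audioRecorderCouldNotInitializePermission,
  audioRecorderCouldNotInitializeGeneric,
  unknown
}

// Stand-in for the SDK class; only the two documented properties are modeled
// and the constructor shape is an assumption.
class DictationError {
  final DictationErrorType errorType;
  final String errorMessage;
  const DictationError(this.errorType, this.errorMessage);
}

// Hypothetical reactions a consumer app might support.
enum ErrorAction { warnUser, retryLater, requestPermission, reauthenticate, report }

ErrorAction actionFor(DictationError error) {
  switch (error.errorType) {
    case DictationErrorType.noDictationStopMic:
      // The connection stays open; the microphone is idle.
      return ErrorAction.warnUser;
    case DictationErrorType.noNetworkConnection:
    case DictationErrorType.lowBandwidth:
    case DictationErrorType.serviceDown:
    case DictationErrorType.socketDisconnect:
    case DictationErrorType.socketConnectionError:
      return ErrorAction.retryLater;
    case DictationErrorType.audioRecorderCouldNotInitializePermission:
      return ErrorAction.requestPermission;
    case DictationErrorType.invalidAuthCredentials:
      return ErrorAction.reauthenticate;
    default:
      // audioRecorder, audioRecorderCouldNotInitializeGeneric, unknown.
      return ErrorAction.report;
  }
}

void main() {
  const idle = DictationError(DictationErrorType.noDictationStopMic, 'idle');
  print(actionFor(idle)); // ErrorAction.warnUser
}
```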


Commands

Commands are represented by the ActionRecipe object. It includes the information required to be processed by the consumer app.

Commands are returned by the SDK via the onCommandResult callback (DictationManager).

ActionRecipe

As stated, the result of a command (either Static or Dynamic) is an instance of ActionRecipe.

When a command is returned by the SDK the consumer needs to analyze it and determine what to do based on the properties of the ActionRecipe instance.

For example, the ActionRecipe resulting from a Static Command will always have its isStaticCommand property set to true.

Other fields such as searchText, chooseNumber or selectFor are used to determine what to do with the command and in what context. For instance, a Replace X with Y dynamic command will produce an ActionRecipe like:

// replace oranges with apples
{
  name: 'replace', // AugnitoCommands.replace entry
  isCommand: true,
  searchText: 'oranges',
  selectFor: 'apples'
  ...
}

ActionRecipe fields

property                  notes
name                      Command name. Can be matched against one of the entries of the AugnitoCommands class.
action
chooseNumber              Unit: on commands that affect items it indicates how many.
searchText                On most select/action commands this represents the item to be searched. For example paragraph, line, word, etc. On some commands this indicates what is being searched for.
selectFor                 Usually used to determine an action on most commands. For example delete, underline, etc.
isCommand                 Indicates the ActionRecipe is a command.
isStaticCommand           Indicates this ActionRecipe is a static command.
nextPrevious              Direction on how this command should affect the items.
receivedText              Original received text.
receivedTextWithoutSpace  Original text without spaces.

Using Commands

Commands are meant to be used on their own and not in conjunction with regular dictation.

For example:

Dictating "Select next 3 words" will produce a command.

Dictating "Patient presents with fever and chills select last 2 words" will only produce a transcription.

Matching commands

Commands can be matched against the entries of AugnitoCommands. In the earlier Replace X with Y example, the name equals AugnitoCommands.replace.
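
A minimal sketch of matching on the command name follows. The AugnitoCommands and ActionRecipe classes here are stand-ins written for this example: the 'replace' value is taken from the Replace X with Y example, only the fields needed are modeled, and their String and bool types are assumptions. The applyRecipe function is hypothetical.

```dart
// Stand-in for the SDK class; the value mirrors the Replace X with Y example.
class AugnitoCommands {
  static const String replace = 'replace';
}

// Stand-in with only the fields used below; field types are assumptions.
class ActionRecipe {
  final String name;
  final bool isCommand;
  final String searchText;
  final String selectFor;
  const ActionRecipe({
    required this.name,
    required this.isCommand,
    this.searchText = '',
    this.selectFor = '',
  });
}

/// Hypothetical handler: returns the text after applying the recipe.
String applyRecipe(String text, ActionRecipe recipe) {
  if (!recipe.isCommand) return text;
  switch (recipe.name) {
    case AugnitoCommands.replace:
      // searchText holds what to look for, selectFor holds the replacement.
      return text.replaceAll(recipe.searchText, recipe.selectFor);
    default:
      // Commands this sketch does not handle leave the text untouched.
      return text;
  }
}

void main() {
  const recipe = ActionRecipe(
    name: AugnitoCommands.replace,
    isCommand: true,
    searchText: 'oranges',
    selectFor: 'apples',
  );
  print(applyRecipe('I like oranges', recipe)); // I like apples
}
```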

Static Commands

Commands that have no effect on specific units or items are called static commands.

Microphone Control Static Commands

  • stopMic

Selection Static Commands

  • selectAll
  • selectChar
  • selectWord
  • selectLine
  • selectNextLine
  • selectPreviousLine
  • selectParagraph

Lists Static Commands

  • startNumberList
  • startBulletList
  • stopBulletList
  • stopNumberList
  • stopList

Action Static Commands

  • undoIt
  • redoIt

Text Modification Commands

  • startBoldText
  • stopBoldText
  • startBulletText
  • stopBulletText
  • startItalicText
  • stoptItalicText

Modification Commands

  • deleteIt
  • pasteIt
  • copyIt
  • cutIt
  • headerIt
  • underlineIt
  • boldIt
  • italicizeIt
  • capitalizeIt
  • bulletIt

Navigation Static Commands

  • moveUp
  • moveDown
  • moveLeft
  • moveRight
  • goToLineStart
  • goToLineEnd
  • giveSpace
  • backspace
  • goToDocumentEnd
  • goToDocumentStart
  • goToNextPage
  • goToPreviousPage
  • goToNextParagraph
  • goTo

Field Navigation Static Commands

  • nextField
  • previousField

Dynamic Commands

Dynamic commands are more complex commands built based on the speech input and may contain extra information (always within the ActionRecipe itself).

Line and Paragraph number

Commands that apply an action such as select, bold, or delete to a numbered line or paragraph.

^(select|choose|copy text|copytext|cut text|cuttext|correct|bold|underline|delete|header|capitalize|unbold|debold|dbold|uncapitalize|remove|capitalise|dcapitalise|decapitalicize|decapitalize|uncapitalize|Uncap|d capitalise|d capitalize|d underline|dunderline|deunderline|ununderline|goto|moveto|move|italicize|italicise|unitalicise|unitalicize)(the)?(line|para|paragraph|\n\n|\n)(number[ed]?)(.*?)(\.)?$

A correctly parsed Line and Paragraph number command produces an ActionRecipe with a structure similar to:

{
  name: AugnitoCommands.goToLineNumber // or AugnitoCommands.goToParaNumber 
  isCommand: true,
  chooseNumber: 10,
  searchText: "line",
  selectFor: "goToStart"
  ...
}

Example voice commands

  • choose line number 10
  • delete paragraph number 5
  • bold the para number 1
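
To see how a pattern of this shape splits a phrase, the sketch below applies a shortened version of it (only five of the action verbs are kept for readability). It assumes the phrase is lower-cased and stripped of spaces before matching, which the absence of whitespace in the pattern suggests but the documentation does not state. LineParagraphCommand and parseLineParagraph are hypothetical names, and this is not the SDK's own processor.

```dart
// Shortened variant of the documented pattern (fewer action verbs).
final RegExp lineParagraphNumber = RegExp(
    r'^(select|choose|bold|delete|goto)(the)?(line|para|paragraph|\n\n|\n)'
    r'(number[ed]?)(.*?)(\.)?$');

// Hypothetical result type for this sketch.
class LineParagraphCommand {
  final String action;
  final String unit;
  final int number;
  const LineParagraphCommand(this.action, this.unit, this.number);

  @override
  String toString() => '$action $unit $number';
}

/// Returns null when the phrase is not a line/paragraph number command.
LineParagraphCommand? parseLineParagraph(String spoken) {
  // Assumption: matching happens on the text without spaces.
  final compact = spoken.toLowerCase().replaceAll(' ', '');
  final match = lineParagraphNumber.firstMatch(compact);
  if (match == null) return null;
  // Group 5 holds whatever follows "number"; only digits are handled here.
  final number = int.tryParse(match.group(5)!);
  if (number == null) return null;
  return LineParagraphCommand(match.group(1)!, match.group(3)!, number);
}

void main() {
  print(parseLineParagraph('choose line number 10')); // choose line 10
  print(parseLineParagraph('bold the para number 1')); // bold para 1
  // The ^ anchor rejects commands mixed into regular dictation.
  print(parseLineParagraph('patient has fever delete line number 2')); // null
}
```

Numbers that arrive as words (for example "ten") are not handled by int.tryParse and would need a separate conversion step.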

Go To New Line or New Paragraph

A simple command to go to a new line or a new paragraph.

{
  name: AugnitoCommands.selectLine // or AugnitoCommands.selectParagraph 
  isCommand: true,
  searchText: "line",
  selectFor: "gotoend"
  ...
}

Example voice commands

  • go to new line
  • go to new paragraph

Active X Dynamic Command

Action against the active line, paragraph, space, etc.

^(select|choose|copy text|copytext|cut text|cuttext|correct|bold|underline|delete|header|capitalize|unbold|debold|dbold|uncapitalize|remove|capitalise|dcapitalise|decapitalicize|decapitalize|uncapitalize|Uncap|d capitalise|d capitalize|d underline|dunderline|deunderline|ununderline|goto|moveto|move|italicize|italicise|unitalicise|unitalicize)(the)?(line|para|paragraph|\n\n|\n)(number[ed]?)(.*?)(\.)?$

Example voice commands

  • select active paragraph
  • delete the current line

{
  name: AugnitoCommands.selectActiveChar
  isCommand: true,
  ...
}

{
  name: AugnitoCommands.selectActiveWord
  isCommand: true,
  ...
}

{
  name: AugnitoCommands.selectActiveLine
  isCommand: true,
  ...
}

{
  name: AugnitoCommands.selectActiveParagraph
  isCommand: true,
  ...
}

Processes a direction plus a combination of item and distance.

^(last|previous|next|down)(.*?)(word[s]?|line[s]?|sentence[s]?|paragraph[s]?|para[s]?|char[s]?|character[s]?|space|\n\n|\n)$

Example voice commands

  • next 3 lines
  • last 5 paras

Go To X Dynamic Command

Processor to identify movement with direction and unit.

^(go|goto|gotothe|move|moveto|movethe)(last|previous|next|down)(.*?)(word[s]?|line[s]?|sentence[s]?|paragraph[s]?|para[s]?|char[s]?|character[s]?|space|\n\n|\n)$

Example voice commands

  • go last 4 word
  • go to last space
  • move the previous 4 words

// go last 4 word
{
  name: AugnitoCommands.selectWord
  isCommand: true,
  chooseNumber: 4,
  nextPrevious: 'last',
  selectFor: 'gotostart'
  ...
}

// go to last space
{
  name: AugnitoCommands.selectChar
  isCommand: true,
  chooseNumber: 0,
  nextPrevious: 'last',
  selectFor: 'gotostart'
  ...
}

// move the previous 4 words
{
  name: AugnitoCommands.selectWord
  isCommand: true,
  chooseNumber: 4,
  nextPrevious: 'previous',
  selectFor: 'gotostart'
  ...
}

Select / Action on Item with Unit and Direction Dynamic Command

Identifies an action on an item that includes direction and unit amount.

^(select|choose|copy text|copytext|cut text|cuttext|correct|bold|underline|delete|header|capitalize|unbold|debold|dbold|uncapitalize|remove|capitalise|dcapitalise|decapitalicize|decapitalize|uncapitalize|Uncap|d capitalise|d capitalize|d underline|dunderline|deunderline|ununderline|goto|moveto|move|italicize|italicise|unitalicise|unitalicize)(the)?(last|previous|next)(.*?)(word[s]?|line[s]?|sentence[s]?|paragraph[s]?|para[s]?|char[s]?|character[s]?|space|\n\n|\n\ns|\n|\ns)$

Example voice commands

  • choose the next 10 characters
  • underline previous new line
  • capitalize the next 5 words
  • delete previous 5 paragraphs

// choose the next 10 characters
{
  name: AugnitoCommands.selectChar
  isCommand: true,
  chooseNumber: 10,
  nextPrevious: 'next',
  searchText: 'characters',
  selectFor: ''
  ...
}

// underline previous new line
{
  name: AugnitoCommands.selectLine
  isCommand: true,
  chooseNumber: 1,
  nextPrevious: 'previous',
  searchText: '\n',
  selectFor: 'underline'
  ...
}

// capitalize the next 5 words
{
  name: AugnitoCommands.selectWord
  isCommand: true,
  chooseNumber: 5,
  nextPrevious: 'next',
  searchText: 'words',
  selectFor: 'capitalize'
  ...
}

// delete previous 5 paragraphs
{
  name: AugnitoCommands.selectParagraph
  isCommand: true,
  chooseNumber: 5,
  nextPrevious: 'previous',
  searchText: '\n\n',
  selectFor: 'delete'
  ...
}

Replace X with Y Dynamic Command

Identifies the replace operation between two elements.

replace[d]?([A-Z a-z 0-9]+)with([A-Z a-z 0-9]+)

Example voice commands

  • replace oranges with apples
  • replace blue with red
  • replace cars with buses

// replace oranges with apples
{
  name: AugnitoCommands.replace
  isCommand: true,
  searchText: 'oranges',
  selectFor: 'apples'
  ...
}
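
The following sketch applies the pattern above to a phrase and fills the two fields shown in the example. It only illustrates how the two capture groups map to searchText and selectFor; ReplaceCommand and parseReplace are hypothetical names, and the trimming of surrounding spaces is an assumption about how the captured text is cleaned up.

```dart
// The pattern as documented above; both groups accept letters, digits and spaces.
final RegExp replacePattern =
    RegExp(r'replace[d]?([A-Z a-z 0-9]+)with([A-Z a-z 0-9]+)');

// Hypothetical result type mirroring the two ActionRecipe fields in use.
class ReplaceCommand {
  final String searchText;
  final String selectFor;
  const ReplaceCommand(this.searchText, this.selectFor);

  @override
  String toString() => '$searchText -> $selectFor';
}

/// Returns null when the phrase does not match the replace pattern.
ReplaceCommand? parseReplace(String spoken) {
  final match = replacePattern.firstMatch(spoken);
  if (match == null) return null;
  // The groups capture the surrounding spaces too, so trim them.
  return ReplaceCommand(match.group(1)!.trim(), match.group(2)!.trim());
}

void main() {
  print(parseReplace('replace oranges with apples')); // oranges -> apples
  print(parseReplace('replace blue with red')); // blue -> red
  print(parseReplace('select next 3 words')); // null
}
```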

Select/Action Group Dynamic Command

Identifies a Select/Action command.

^(select|choose|copy text|copytext|cut text|cuttext|correct|bold|underline|delete|header|capitalize|unbold|debold|dbold|uncapitalize|remove|capitalise|dcapitalise|decapitalicize|decapitalize|uncapitalize|Uncap|d capitalise|d capitalize|d underline|dunderline|deunderline|ununderline|goto|moveto|move|italicize|italicise|unitalicise|unitalicize)(\sthe)?\s?(.*?)$

Example voice commands

  • delete mistaken
  • bold firstname
  • capitalize last name

// delete mistaken
{
  name: AugnitoCommands.select
  isCommand: true,
  searchText: 'mistaken',
  selectFor: 'delete'
  ...
}

Utils and Helpers

TextFieldProcessor

The SDK includes a processor to handle basic operations on the TextField widget (via the TextEditingController).

The methods included for the processor are:

  • addContent
  • selectLastLines
  • selectLastWords
  • deleteSelection

The usage of these methods is shown in the example app:

// Selects the last N lines and deletes them when the command asks for it.
void _processSelectLine(ActionRecipe actionRecipe) {
  TextFieldProcessor.selectLastLines(focusedEditor!.textEditingController,
      offset: actionRecipe.chooseNumber);

  if (actionRecipe.selectFor == Commands.Delete) {
    TextFieldProcessor.deleteSelection(focusedEditor!.textEditingController);
  }
}

// Selects the last N words and deletes them when the command asks for it.
void _processSelectWord(ActionRecipe actionRecipe) {
  TextFieldProcessor.selectLastWords(focusedEditor!.textEditingController,
      offset: actionRecipe.chooseNumber);

  if (actionRecipe.selectFor == Commands.Delete) {
    TextFieldProcessor.deleteSelection(focusedEditor!.textEditingController);
  }
}

iOS and background UI modes

To keep the app working in the background, Info.plist requires the following entries:

<key>UIBackgroundModes</key>
<array>
  <string>audio</string>
  <string>fetch</string>
</array>

Libraries

dictation/models/action_recipe
dictation/commands/dynamic/processors/active_x_command_processor
config/augnito_api_server
dictation/core/augnito_audio_stream
dictation/commands/augnito_command_builder
dictation/support/augnito_commands
config/augnito_config
dictation/commands/dynamic/augnito_dynamic_command_processor
dictation/support/augnito_event_types
augnito_flutter_sdk
audio/augnito_mic_stream
support/logging/augnito_print_logger
dictation/dto/augnito_server_response
support/augnito_source
dictation/commands/static/augnito_static_command_processor
dictation/core/augnito_web_client
dictation/commands/dynamic/processors/base_regex_command_processor
dictation/commands/utils/command_utils
dictation/support/dictation_error
dictation/dictation_manager
support/domain_utils
dictation/commands/dynamic/processors/generic_goto_processor
dictation/commands/dynamic/processors/goto_moveto_command_processor
dictation/commands/dynamic/processors/goto_moveto_paragraph_command_processor
dictation/commands/dynamic/processors/goto_x_command_processor
dictation/commands/dynamic/processors/line_paragraph_number_command_processor
dictation/dto/meta_event_response
dictation/commands/dynamic/processors/number_command_processor
dictation/commands/dynamic/processors/replace_command_processor
dictation/dto/result_response
config/sdk_config
dictation/commands/dynamic/processors/select_group_processor
dictation/commands/dynamic/processors/select_item_direction_processor
dictation/models/speech_to_text_output
text/utils/text_field_processor
dictation/commands/utils/word_to_integer_utils