GeminiRealtimeClient class

Client for the Gemini Live (BidiGenerateContent) WebSocket API.

Audio formats (per Google docs):

  • Input: PCM 16-bit, 16 kHz, mono, little-endian — base64-encoded into JSON realtime_input.media_chunks messages.
  • Output: PCM 16-bit, 24 kHz, mono, little-endian — base64 blobs inside serverContent.modelTurn.parts[].inlineData.data.
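As a rough illustration of the two formats above, the sketch below wraps input PCM in a realtime_input.media_chunks envelope and pulls output PCM back out of serverContent.modelTurn.parts[].inlineData.data. The helper names encodeAudioChunk and decodeAudioParts are hypothetical, and the mime_type value is an assumption about the wire format rather than something this page states.

```dart
import 'dart:convert';
import 'dart:typed_data';

/// Wraps one chunk of 16-bit, 16 kHz mono PCM in a realtime_input envelope.
/// Hypothetical helper; the mime_type string is an assumed value.
String encodeAudioChunk(Uint8List pcm16) {
  return jsonEncode({
    'realtime_input': {
      'media_chunks': [
        {
          'mime_type': 'audio/pcm;rate=16000', // assumption, not from this page
          'data': base64Encode(pcm16),
        },
      ],
    },
  });
}

/// Collects the PCM bytes carried by one decoded server message.
/// Returns an empty buffer when the message holds no inline audio.
Uint8List decodeAudioParts(Map<String, Object?> message) {
  final out = BytesBuilder();
  final content = message['serverContent'];
  if (content is Map) {
    final turn = content['modelTurn'];
    if (turn is Map) {
      final parts = turn['parts'];
      if (parts is List) {
        for (final part in parts) {
          if (part is Map) {
            final inline = part['inlineData'];
            if (inline is Map && inline['data'] is String) {
              out.add(base64Decode(inline['data'] as String));
            }
          }
        }
      }
    }
  }
  return out.toBytes();
}

void main() {
  // Two little-endian samples: 0x0001 and 0x7fff.
  final chunk = Uint8List.fromList([0x01, 0x00, 0xff, 0x7f]);
  print(encodeAudioChunk(chunk));

  final reply = jsonDecode(
    '{"serverContent":{"modelTurn":{"parts":[{"inlineData":{"data":"AQD/fw=="}}]}}}',
  ) as Map<String, Object?>;
  print(decodeAudioParts(reply));
}
```

Callers of this class do not need to do any of this by hand; the sketch only shows what the documented formats imply for the bytes on the wire.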

Lifecycle:

  1. connect() opens the WebSocket and sends the setup message.
  2. The server responds with a setup-complete signal, which is emitted as GeminiSetupComplete on events.
  3. Caller streams audio chunks in via sendAudio, optional text via sendText, and signals end-of-turn with sendTurnComplete (or lets the server's VAD do it).
  4. The server replies on events with audio and text deltas, followed by GeminiTurnComplete when the turn ends.
  5. close() shuts the socket cleanly.
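Step 1 mentions a setup message. The sketch below shows what such an envelope might look like when built from the constructor's inputs (model, responseModalities, systemInstruction). The function buildSetupMessage, the field names and the model id are all assumptions for illustration; this page does not document the exact payload the client sends.

```dart
import 'dart:convert';

/// Builds a setup envelope from the same inputs the constructor takes.
/// Hypothetical helper: the field names are assumptions about the wire format.
Map<String, Object?> buildSetupMessage({
  required String model,
  List<String> responseModalities = const ['AUDIO'],
  String? systemInstruction,
}) {
  return {
    'setup': {
      'model': model,
      'generation_config': {'response_modalities': responseModalities},
      // Only included when the caller supplied instructions.
      if (systemInstruction != null)
        'system_instruction': {
          'parts': [
            {'text': systemInstruction},
          ],
        },
    },
  };
}

void main() {
  final setup = buildSetupMessage(
    model: 'models/example-live-model', // hypothetical id, not defaultModel
    systemInstruction: 'Answer briefly.',
  );
  print(jsonEncode(setup));
}
```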

Audio agnostic. This client never touches a microphone or a speaker. Callers wire in their own audio source and sink (the record package on Itzli mobile, the Web Audio API in the browser, etc.) and feed PCM bytes through. That keeps neomage pure-Dart with no platform shims.

Network policy. Uses WebSocketChannel.connect, which honours the system proxy on the Dart VM. TLS certificates are validated against the platform's default trust store.

Constructors

GeminiRealtimeClient({required String apiKey, String model = defaultModel, String endpoint = defaultEndpoint, List<String> responseModalities = const ['AUDIO'], String? systemInstruction, WebSocketChannel connector(Uri uri)?})

Properties

apiKey → String
final
connector → WebSocketChannel Function(Uri uri)?
Override for the WebSocket connector — tests inject a fake. Production callers leave it null.
final
endpoint → String
final
events → Stream<GeminiRealtimeEvent>
Hot stream of every event from the server. Subscribe before calling connect so you don't miss GeminiSetupComplete.
no setter
hashCode → int
The hash code for this object.
no setter, inherited
isClosed → bool
Whether close has run (or the server closed first).
no setter
isReady → bool
true once the initial setup handshake has completed.
no setter
model → String
final
responseModalities → List<String>
Modalities the server should produce. Common values: ['AUDIO'] for pure voice, ['TEXT'] for speech-to-text mode, ['AUDIO', 'TEXT'] for both (the API limits combinations — check current docs).
final
runtimeType → Type
A representation of the runtime type of the object.
no setter, inherited
systemInstruction → String?
Optional system instructions sent in the setup message.
final
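The events property is described as a hot stream, with the advice to subscribe before calling connect. The sketch below illustrates why, using a plain broadcast StreamController as a stand-in (an assumption; this page does not say how events is implemented). The function observe and the string event names are hypothetical placeholders for the real event types.

```dart
import 'dart:async';

/// Emits two events on a broadcast stream and returns what one listener saw.
/// A broadcast controller drops events that are added while nobody listens.
Future<List<String>> observe({required bool subscribeFirst}) async {
  final controller = StreamController<String>.broadcast();
  final seen = <String>[];

  // Early subscription, mirroring "subscribe before calling connect".
  final early = subscribeFirst ? controller.stream.listen(seen.add) : null;

  controller.add('GeminiSetupComplete'); // stands in for the server ack

  // Late subscription happens only after the first event was emitted.
  final sub = early ?? controller.stream.listen(seen.add);

  controller.add('GeminiTurnComplete');

  // Let pending events reach the listener before tearing down.
  await Future<void>.delayed(Duration.zero);
  await sub.cancel();
  await controller.close();
  return seen;
}

Future<void> main() async {
  print(await observe(subscribeFirst: true));
  print(await observe(subscribeFirst: false));
}
```

In this stand-in, the late subscriber never sees the first event, which is the failure mode the events documentation warns about.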

Methods

close() → Future<void>
Closes the WebSocket. Safe to call more than once.
connect() → Future<void>
Opens the WebSocket and sends the setup envelope. Returns once the channel is open — GeminiSetupComplete arrives later on events after the server acks.
debugInjectServerMessage(Map<String, Object?> message) → void
Test-only entry point. Forwards a server message into the event stream without going through a WebSocket. Mirrors what _onMessage does with real input.
noSuchMethod(Invocation invocation) → dynamic
Invoked when a nonexistent method or property is accessed.
inherited
sendAudio(Uint8List pcm16) → Future<void>
Sends one chunk of microphone audio. pcm16 must be 16-bit PCM, 16 kHz, mono, little-endian. Chunks of 80–200 ms (1280–3200 samples, or 2560–6400 bytes) work best. Throws if called before connect.
sendText(String text) → Future<void>
Sends a text turn (e.g. when the user types instead of speaking).
sendTurnComplete() → Future<void>
Signals that the user has finished their current turn. Use when you've disabled server-side VAD and want explicit boundaries.
toString() → String
A string representation of this object.
inherited
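The chunk-size guidance for sendAudio follows from the input format: at 16 kHz with 2 bytes per sample, 80 ms is 1280 samples (2560 bytes) and 200 ms is 3200 samples (6400 bytes). A minimal sketch of that arithmetic, plus one possible way to slice a larger buffer into chunks before passing each to sendAudio, is shown below. The helpers bytesForMillis and chunkPcm are hypothetical and not part of this class.

```dart
import 'dart:typed_data';

const int sampleRateHz = 16000; // input rate required by sendAudio
const int bytesPerSample = 2; // 16-bit mono PCM

/// Number of bytes that hold [ms] milliseconds of input audio.
int bytesForMillis(int ms) => sampleRateHz * ms ~/ 1000 * bytesPerSample;

/// Splits [pcm16] into views of roughly [chunkMs] milliseconds each.
/// The last chunk may be shorter. No bytes are copied.
Iterable<Uint8List> chunkPcm(Uint8List pcm16, {int chunkMs = 100}) sync* {
  if (chunkMs <= 0) {
    throw ArgumentError.value(chunkMs, 'chunkMs', 'must be positive');
  }
  final size = bytesForMillis(chunkMs);
  for (var offset = 0; offset < pcm16.length; offset += size) {
    final end = offset + size < pcm16.length ? offset + size : pcm16.length;
    yield Uint8List.sublistView(pcm16, offset, end);
  }
}

void main() {
  print(bytesForMillis(80)); // lower end of the suggested range
  print(bytesForMillis(200)); // upper end of the suggested range

  // One second of silence, split into 100 ms chunks.
  final oneSecond = Uint8List(sampleRateHz * bytesPerSample);
  final chunks = chunkPcm(oneSecond).toList();
  print('${chunks.length} chunks of ${chunks.first.length} bytes');
}
```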

Operators

operator ==(Object other) → bool
The equality operator.
inherited

Constants

defaultEndpoint → const String
Default Gemini Live endpoint. Override with endpoint when pointing at a regional or proxy URL.
defaultModel → const String
Default model id for low-latency voice. Override per session.