GeminiRealtimeClient class

Client for the Gemini Live (BidiGenerateContent) WebSocket API.

Audio formats (per Google docs):

Input: PCM 16-bit, 16 kHz, mono, little-endian — base64-encoded into JSON realtime_input.media_chunks messages.
Output: PCM 16-bit, 24 kHz, mono, little-endian — base64 blobs inside serverContent.modelTurn.parts[].inlineData.data.
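
The two formats above can be sketched in plain Dart, using only dart:convert and dart:typed_data. This is an illustration, not the client's actual code: the helper names are hypothetical, and the mime_type / data keys inside each media chunk are assumptions that this page does not spell out. Only the realtime_input.media_chunks and serverContent.modelTurn.parts[].inlineData.data paths come from the description above.

```dart
import 'dart:convert';
import 'dart:typed_data';

/// Packs signed 16-bit samples into little-endian PCM bytes.
Uint8List pcm16LeBytes(List<int> samples) {
  final data = ByteData(samples.length * 2);
  for (var i = 0; i < samples.length; i++) {
    data.setInt16(i * 2, samples[i], Endian.little);
  }
  return data.buffer.asUint8List();
}

/// Wraps one input chunk in a realtime_input.media_chunks envelope.
/// ASSUMPTION: the 'mime_type' and 'data' keys are not taken from this page.
String encodeInputChunk(Uint8List pcm16) => jsonEncode({
      'realtime_input': {
        'media_chunks': [
          {'mime_type': 'audio/pcm;rate=16000', 'data': base64Encode(pcm16)},
        ],
      },
    });

/// Collects the samples from every audio blob found under
/// serverContent.modelTurn.parts[].inlineData.data (24 kHz output).
List<int> decodeOutputSamples(Map<String, Object?> message) {
  final content = message['serverContent'] as Map<String, Object?>?;
  final turn = content?['modelTurn'] as Map<String, Object?>?;
  final parts = (turn?['parts'] as List<Object?>?) ?? const <Object?>[];
  final samples = <int>[];
  for (final part in parts) {
    final partMap = part as Map<String, Object?>?;
    final inline = partMap?['inlineData'] as Map<String, Object?>?;
    final b64 = inline?['data'] as String?;
    if (b64 == null) continue; // not an audio part
    final bytes = base64Decode(b64);
    final view = ByteData.sublistView(bytes);
    // Read two bytes at a time as one little-endian signed sample.
    for (var i = 0; i + 1 < bytes.length; i += 2) {
      samples.add(view.getInt16(i, Endian.little));
    }
  }
  return samples;
}
```

For example, pcm16LeBytes([1, -2, 32767]) yields the bytes [1, 0, 254, 255, 255, 127]: the low byte of each sample comes first.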

Lifecycle:

1. connect() opens the WebSocket and sends the setup message.
2. The server responds with a setup-complete signal, emitted as GeminiSetupComplete on events.
3. The caller streams audio chunks in via sendAudio, sends optional text via sendText, and signals end-of-turn with sendTurnComplete (or lets the server's VAD do it).
4. The server replies on events with audio and text deltas, followed by GeminiTurnComplete when the turn is done.
5. close() shuts the socket cleanly.
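
One consequence of this lifecycle: events is documented as a hot stream, so a listener attached after connect() can miss GeminiSetupComplete. The sketch below shows that behaviour with a plain broadcast StreamController from dart:async standing in for the client. That stand-in is an assumption, since the client's internals are not shown on this page. The event names are plain strings used for illustration, and observeHotStream is a hypothetical helper.

```dart
import 'dart:async';

/// Records what an early and a late listener observe on a broadcast stream.
Future<Map<String, List<String>>> observeHotStream() async {
  final controller = StreamController<String>.broadcast();
  final earlyEvents = <String>[];
  final lateEvents = <String>[];

  // Attached before anything is emitted, like listening before connect().
  final earlySub = controller.stream.listen(earlyEvents.add);

  controller.add('GeminiSetupComplete');
  await Future<void>.delayed(Duration.zero); // let the event be delivered

  // Attached afterwards: a broadcast stream does not replay past events.
  final lateSub = controller.stream.listen(lateEvents.add);

  controller.add('GeminiTurnComplete');
  await Future<void>.delayed(Duration.zero);

  await earlySub.cancel();
  await lateSub.cancel();
  await controller.close();
  return {'early': earlyEvents, 'late': lateEvents};
}
```

Here the early listener records both events, while the late listener records only GeminiTurnComplete. With the real client, a listener attached too late could be left waiting for a setup-complete event that has already passed.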

Audio agnostic. This client never touches a microphone or a speaker. Callers wire in their own audio I/O (the record package on Itzli mobile, the Web Audio API in the browser, etc.) and feed PCM bytes through. That keeps neomage pure Dart with no platform shims.

Network policy. Uses WebSocketChannel.connect, which honours the system proxy on the Dart VM. TLS certificates are validated against the platform's default trust store.

Constructors

GeminiRealtimeClient({required String apiKey, String model = defaultModel, String endpoint = defaultEndpoint, List<String> responseModalities = const ['AUDIO'], String? systemInstruction, WebSocketChannel connector(Uri uri)?})

Properties

apiKey → String: final
connector → WebSocketChannel Function(Uri uri)?: Override for the WebSocket connector — tests inject a fake. Production callers leave it null.
final
endpoint → String: final
events → Stream<GeminiRealtimeEvent>: Hot stream of every event from the server. Subscribe before calling connect so you don't miss GeminiSetupComplete.
no setter
hashCode → int: The hash code for this object.
no setter, inherited
isClosed → bool: Whether close has run (or the server closed first).
no setter
isReady → bool: true once the initial setup handshake has completed.
no setter
model → String: final
responseModalities → List<String>: Modalities the server should produce. Common values: ['AUDIO'] for pure voice, ['TEXT'] for speech-to-text mode, ['AUDIO', 'TEXT'] for both (the API limits combinations — check current docs).
final
runtimeType → Type: A representation of the runtime type of the object.
no setter, inherited
systemInstruction → String?: Optional system instructions sent in the setup message.
final

Methods

close() → Future<void>: Closes the WebSocket. Safe to call more than once.
connect() → Future<void>: Opens the WebSocket and sends the setup envelope. Returns once the channel is open — GeminiSetupComplete arrives later on events after the server acks.
debugInjectServerMessage(Map<String, Object?> message) → void: Test-only entry point. Forwards a server message into the event stream without going through a WebSocket. Mirrors what _onMessage does with real input.
noSuchMethod(Invocation invocation) → dynamic: Invoked when a nonexistent method or property is accessed.
inherited
sendAudio(Uint8List pcm16) → Future<void>: Sends one chunk of microphone audio. pcm16 must be 16-bit PCM, 16 kHz, mono, little-endian. Chunk size: 80–200 ms is the sweet spot (1280–3200 samples, i.e. 2560–6400 bytes). Throws if called before connect.
sendText(String text) → Future<void>: Sends a text turn (e.g. when the user types instead of speaking).
sendTurnComplete() → Future<void>: Signals that the user has finished their current turn. Use when you've disabled server-side VAD and want explicit boundaries.
toString() → String: A string representation of this object.
inherited
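
The chunk-size guidance for sendAudio comes down to simple arithmetic: at 16 kHz, 16-bit mono, one millisecond is 16 samples or 32 bytes. The sketch below shows one way a caller might slice a recorded buffer before handing each piece to sendAudio. bytesForMillis and chunkPcm16 are hypothetical helpers, not part of this class, and the 100 ms default is just a value inside the suggested range.

```dart
import 'dart:typed_data';

const int inputSampleRateHz = 16000; // input rate from the format notes
const int bytesPerSample = 2; // 16-bit mono

/// Number of PCM bytes covering [millis] of 16 kHz, 16-bit mono audio.
int bytesForMillis(int millis) =>
    inputSampleRateHz * millis ~/ 1000 * bytesPerSample;

/// Splits [pcm16] into chunks of [chunkMillis]; the last chunk may be shorter.
List<Uint8List> chunkPcm16(Uint8List pcm16, {int chunkMillis = 100}) {
  final chunkBytes = bytesForMillis(chunkMillis);
  if (chunkBytes <= 0) {
    throw ArgumentError.value(chunkMillis, 'chunkMillis', 'must be positive');
  }
  final chunks = <Uint8List>[];
  for (var start = 0; start < pcm16.length; start += chunkBytes) {
    final remaining = pcm16.length - start;
    final end = start + (remaining < chunkBytes ? remaining : chunkBytes);
    // A view over the original buffer; no bytes are copied.
    chunks.add(Uint8List.sublistView(pcm16, start, end));
  }
  return chunks;
}
```

A caller could then loop over the result and await sendAudio(chunk) for each piece. Since the chunks are views onto the source buffer, that buffer should not be mutated until every chunk has been sent.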

Operators

operator ==(Object other) → bool: The equality operator.
inherited

Constants

defaultEndpoint → const String: Default Gemini Live endpoint. Override with endpoint when pointing at a regional or proxy URL.
defaultModel → const String: Default model id for low-latency voice. Override per session.