Flutter Gemini Live

[English](https://github.com/JAICHANGPARK/flutter_gemini_live/blob/main/README.md) | [한국어](https://github.com/JAICHANGPARK/flutter_gemini_live/blob/main/README_KR.md)

A Flutter package for using the experimental Gemini Live API, enabling real-time, multimodal conversations with Google's Gemini models.
No Firebase / Firebase AI Logic dependency
Supports current Gemini Live model families (for example gemini-live-2.5-flash-preview and gemini-2.5-flash-native-audio-preview-12-2025).
Supports both TEXT and AUDIO response modalities (one modality per session).

https://github.com/user-attachments/assets/7d826f37-196e-4ddd-8828-df66db252e8e

✨ Features

Real-time Communication: Establishes a WebSocket connection for low-latency, two-way interaction.
Multimodal Input: Send text, images, and audio in a single conversational turn.
Streaming Responses: Receive text responses from the model as they are being generated.
Easy-to-use Callbacks: Simple event-based handlers for onOpen, onMessage, onError, and onClose.
Function Calling: Model can call external functions and receive results.
Session Resumption: Resume sessions after connection drops.
Voice Activity Detection (VAD): Automatic or manual voice activity detection.
Realtime Media Chunks: Send audio/image chunks in real-time.
Audio Transcription: Transcribe voice input and output to text.

Demo 1: Chihuahua vs muffin	Demo 2: Labradoodle vs fried chicken

Chihuahua vs muffin	Labradoodle vs fried chicken

🏁 Getting Started

Prerequisites

You need a Google Gemini API key to use this package. You can get your key from Google AI Studio.

Installation

Add the package to your pubspec.yaml file:

dependencies:
  gemini_live: ^2026.3.21 # Use the latest published version

or run this command (Recommend):

flutter pub add gemini_live

Install the package from your terminal:

flutter pub get

Now, import the package in your Dart code:

import 'package:gemini_live/gemini_live.dart';

🚀 Usage

Basic Example

Here is a basic example of how to use the gemini_live package to start a session and send a message.

Security Note: Do not hardcode your API key. It is highly recommended to use a .env file with a package like flutter_dotenv to keep your credentials secure.

import 'package:gemini_live/gemini_live.dart';

// 1. Initialize Gemini with your API key
final genAI = GoogleGenAI(apiKey: 'YOUR_API_KEY_HERE');
LiveSession? session;

// 2. Connect to the Live API
Future<void> connect() async {
  try {
    session = await genAI.live.connect(
      LiveConnectParameters(
        model: 'gemini-live-2.5-flash-preview',
        config: GenerationConfig(responseModalities: [Modality.TEXT]),
        callbacks: LiveCallbacks(
          onOpen: () => print('✅ Connection opened'),
          onMessage: (LiveServerMessage message) {
            // 3. Handle incoming messages from the model
            if (message.text != null) {
              print('Received chunk: ${message.text}');
            }
            if (message.serverContent?.turnComplete ?? false) {
              print('✅ Turn complete!');
            }
          },
          onError: (e, s) => print('🚨 Error: $e'),
          onClose: (code, reason) => print('🚪 Connection closed'),
        ),
      ),
    );
  } catch (e) {
    print('Connection failed: $e');
  }
}

// 4. Send a message to the model
void sendMessage(String text) {
  session?.sendText(text);
}

🆕 Key Live Features

Function Calling

The model can call external functions and receive results:

late final LiveSession session;

session = await genAI.live.connect(
  LiveConnectParameters(
    model: 'gemini-live-2.5-flash-preview',
    tools: [
      Tool(
        functionDeclarations: [
          FunctionDeclaration(
            name: 'get_weather',
            description: 'Get weather by city',
            parameters: {
              'type': 'OBJECT',
              'properties': {
                'city': {'type': 'STRING'},
              },
              'required': ['city'],
            },
          ),
        ],
      ),
    ],
    callbacks: LiveCallbacks(
      onMessage: (LiveServerMessage message) {
        // Handle function calls
        for (final call in message.toolCall?.functionCalls ?? const <FunctionCall>[]) {
          if (call.id == null || call.name == null) continue;
          print('Function call: ${call.name}');

          // Execute function and send response
          session.sendFunctionResponse(
            id: call.id!,
            name: call.name!,
            response: {'result': 'success'},
          );
        }
      },
    ),
  ),
);

Realtime Input

Send audio, image frames, and text in real-time. The video field currently accepts image/* MIME types such as image/jpeg or image/png:

// Send real-time text
session.sendRealtimeText('Realtime text input');

// Send media chunks
session.sendMediaChunks([
  Blob(mimeType: 'audio/pcm', data: base64Audio),
]);

// Combined real-time input
session.sendRealtimeInput(
  audio: Blob(mimeType: 'audio/pcm', data: base64Audio),
  video: Blob(mimeType: 'image/jpeg', data: base64Image),
  text: 'Text description',
);

// Signal audio stream end
session.sendAudioStreamEnd();

Manual Activity Detection

Disable automatic VAD and control manually:

final session = await genAI.live.connect(
  LiveConnectParameters(
    model: 'gemini-live-2.5-flash-preview',
    realtimeInputConfig: RealtimeInputConfig(
      automaticActivityDetection: AutomaticActivityDetection(
        disabled: true, // Disable automatic detection
      ),
    ),
  ),
);

// Signal activity start
session.sendActivityStart();

// Send voice data...

// Signal activity end
session.sendActivityEnd();

Session Resumption

Resume sessions after connection drops:

// First connection with session resumption
final session = await genAI.live.connect(
  LiveConnectParameters(
    model: 'gemini-live-2.5-flash-preview',
    sessionResumption: SessionResumptionConfig(
      handle: previousSessionHandle, // Previous session handle
    ),
  ),
);

// Receive session handle updates
if (message.sessionResumptionUpdate != null) {
  final newHandle = message.sessionResumptionUpdate!.newHandle;
  // Save newHandle for later use
}

Gemini API note: SessionResumptionConfig.transparent is not supported and will throw during setup validation.

Advanced Configuration

final session = await genAI.live.connect(
  LiveConnectParameters(
    model: 'gemini-2.5-flash-native-audio-preview-12-2025',
    // Realtime input configuration
    realtimeInputConfig: RealtimeInputConfig(
      automaticActivityDetection: AutomaticActivityDetection(
        disabled: false,
        startOfSpeechSensitivity: StartSensitivity.START_SENSITIVITY_HIGH,
        endOfSpeechSensitivity: EndSensitivity.END_SENSITIVITY_LOW,
        prefixPaddingMs: 300,
        silenceDurationMs: 500,
      ),
      activityHandling: ActivityHandling.START_OF_ACTIVITY_INTERRUPTS,
      turnCoverage: TurnCoverage.TURN_INCLUDES_ALL_INPUT,
    ),
    // Audio transcription
    inputAudioTranscription: AudioTranscriptionConfig(),
    outputAudioTranscription: AudioTranscriptionConfig(),
    // Context window compression
    contextWindowCompression: ContextWindowCompressionConfig(
      triggerTokens: '10000',
      slidingWindow: SlidingWindow(targetTokens: '5000'),
    ),
    // Proactivity
    proactivity: ProactivityConfig(proactiveAudio: true),
  ),
);

Gemini API note: AudioTranscriptionConfig.languageCodes is not currently supported in Gemini Live. Leave it unset.

Ephemeral Token (Client-to-Server)

Use apiVersion: 'v1alpha' and pass the issued ephemeral token (auth_tokens/...) as apiKey:

final genAI = GoogleGenAI(
  apiKey: 'auth_tokens/your_ephemeral_token',
  apiVersion: 'v1alpha',
);

💬 Live Chat Demo

This repository includes a comprehensive example application demonstrating the features of the gemini_live package.

Running the Demo App

Get an API Key: Make sure you have a Gemini API key from Google AI Studio.
Set Up the Project:
- Clone this repository.
- The example app now supports API key input from the UI.
- Run the app and open Settings (top-right icon) to paste your API key.
- Configure platform permissions for microphone and photo library access as needed.
- Run flutter pub get in the example directory.
Run the App:
```
cd example
flutter run
```

Demo Pages

The example app includes the following demo pages:

Chat Interface - Basic chat (text, image, audio)
Live API Features - Comprehensive demo of all new features
- VAD, transcription, session resumption, context compression, etc.
Function Calling - Function calling demo (weather/time/fx/place/reminder)
Realtime Media - Real-time audio/image-frame input demo

CLI Script Examples

Additional runnable scripts are available under examples/:

examples/basic_usage.dart
examples/function_calling.dart
examples/realtime_audio_video.dart
examples/manual_activity_detection.dart
examples/session_resumption.dart
examples/complete_features.dart
examples/ephemeral_token.dart (uses GEMINI_EPHEMERAL_TOKEN, apiVersion: 'v1alpha')

See examples/README.md for usage details.

How to Use the App

Connect: The app will attempt to connect to the Gemini API automatically. If the connection fails, tap the "Reconnect" button.
Send a Text Message:
- Type your message in the text field at the bottom.
- Tap the send (▶️) icon.
Send a Message with an Image:
- Tap the image (🖼️) icon to open your gallery.
- Select an image. A preview will appear.
- (Optional) Type a question about the image.
- Tap the send (▶️) icon.
Send a Voice Message:
- Tap the microphone (🎤) icon. Recording will start, and the icon will change to a red stop (⏹️) icon.
- Speak your message.
- Tap the stop (⏹️) icon again to finish. The audio will be sent automatically.

📚 API Reference

LiveSession Methods

sendText(String text) - Send text message
sendClientContent({List<Content>? turns, bool turnComplete}) - Send multi-turn content
sendRealtimeInput({...}) - Send real-time input (audio, image frames, text)
sendMediaChunks(List<Blob> mediaChunks) - Send media chunks
sendAudioStreamEnd() - Signal audio stream end
sendRealtimeText(String text) - Send real-time text
sendActivityStart() / sendActivityEnd() - Signal activity start/end
sendToolResponse({required List<FunctionResponse> functionResponses}) - Send tool response
sendFunctionResponse({required String id, required String name, required Map<String, dynamic> response}) - Send single function response
sendVideo(List<int> videoBytes, {String mimeType}) - Send image bytes through the Live API video field (image/* MIME types)
sendAudio(List<int> audioBytes) - Send audio
close() - Close connection
isClosed - Check connection status

LiveServerMessage Properties

text - Text response
data - Base64 encoded inline data
serverContent - Server content (modelTurn, turnComplete, etc.)
toolCall - Tool call request
toolCallCancellation - Tool call cancellation
sessionResumptionUpdate - Session resumption update
voiceActivity - Voice activity status
voiceActivityDetectionSignal - Voice activity detection signal
goAway - Server disconnect warning
usageMetadata - Token usage metadata

🤝 Contributing

Contributions of all kinds are welcome, including bug reports, feature requests, and pull requests! Please feel free to open an issue on the issue tracker.

Fork this repository.
Create your feature branch (git checkout -b feature/AmazingFeature).
Commit your changes (git commit -m 'Add some AmazingFeature').
Push to the branch (git push origin feature/AmazingFeature).
Open a Pull Request.

📜 License

See the LICENSE file for more details.