Google AI & Vertex AI Gemini API Dart Client
Unofficial Dart client for the Google AI Gemini Developer API and Vertex AI Gemini API with a unified interface.
Note
The official google_generative_ai Dart package has been deprecated in favor of firebase_ai. However, since firebase_ai is a Flutter package rather than a pure Dart package, this unofficial client bridges the gap by providing a pure Dart, fully type-safe API client for both Google AI and Vertex AI.
Ideal for server-side Dart, CLI tools, and backend services needing a type-safe Gemini API client without Flutter dependencies.
Features
Core Features (Both Google AI & Vertex AI)
Generation & Streaming
- ✅ Content generation (generateContent)
- ✅ Streaming support (streamGenerateContent) with SSE
- ✅ Request abortion (cancelable requests via abortTrigger)
- ✅ Token counting (countTokens)
- ✅ Tool support:
  - Function calling (custom function declarations)
  - Code execution (Python sandbox)
  - Google Search grounding
  - URL Context (fetch and analyze web content)
  - File search (Semantic Retrieval with FileSearchStores)
  - Google Maps (geospatial context)
  - MCP servers (Model Context Protocol)
- ✅ Safety settings support
Embeddings
- ✅ Content embeddings (embedContent)
- ✅ Batch embeddings with automatic fallback
Models
- ✅ Model listing & discovery (listModels, getModel)
- ✅ Content caching (full CRUD operations)
Google AI Only (Not available on Vertex AI)
Files API
- ✅ File management (upload, list, get, download, delete)
- ✅ Multiple upload methods: file path (IO), bytes (all platforms), streaming (IO)
- ℹ️ Vertex AI: Use Cloud Storage URIs or base64
Tuned Models
- ✅ Tuned models (create, list, get, patch, delete, listOperations)
- ✅ Generation with tuned models
- ℹ️ Vertex AI: Use Vertex AI Tuning API
Corpora & File Search
- ✅ Corpus management (create, list, get, update, delete)
- ✅ Document management within corpora
- ✅ FileSearchStores for semantic retrieval (create, list, get, delete, import/upload files)
- ℹ️ Vertex AI: Use RAG Engine for enterprise RAG capabilities
Permissions
- ✅ Permission management (create, list, get, update, delete)
- ✅ Ownership transfer (transferOwnership)
- ✅ Grantee types (user, group, everyone)
- ✅ Role-based access (owner, writer, reader)
Prediction (Video Generation)
- ✅ Synchronous prediction (predict)
- ✅ Long-running prediction (predictLongRunning) for video generation with Veo
- ✅ Operation status polling for async predictions
- ✅ RAI (Responsible AI) filtering support
Batch Operations
- ✅ Batch content generation (batchGenerateContent)
- ✅ Synchronous batch embeddings (batchEmbedContents)
- ✅ Asynchronous batch embeddings (asyncBatchEmbedContent)
- ✅ Batch management (list, get, update, delete, cancel)
- ✅ LRO polling for async batch jobs
Interactions API (Experimental)
- ✅ Server-side conversation state management
- ✅ Multi-turn conversations with previousInteractionId
- ✅ Streaming responses with SSE events
- ✅ Function calling with automatic result handling
- ✅ Agent support (e.g., Deep Research)
- ✅ 17 content types (text, image, audio, function calls, etc.)
- ✅ Background interactions with cancel support
Live API (WebSocket Streaming)
- ✅ Real-time bidirectional WebSocket communication
- ✅ Audio streaming (16kHz input, 24kHz output PCM)
- ✅ Text streaming with real-time responses
- ✅ Voice Activity Detection (VAD) - automatic and manual modes
- ✅ Input/output audio transcription
- ✅ Tool/function calling in live sessions
- ✅ Session resumption with tokens
- ✅ Context window compression for long conversations
- ✅ Multiple voice options (Puck, Charon, Kore, etc.)
- ✅ Ephemeral tokens for secure client-side authentication
Quick Comparison
| Aspect | Google AI | Vertex AI |
|---|---|---|
| Auth | API Key | OAuth 2.0 |
| Core Features | ✅ Full support | ✅ Full support |
| Files API | ✅ Supported | ❌ Use Cloud Storage URIs |
| Tuned Models | ✅ Supported | ❌ Different tuning API |
| File Search | ✅ FileSearchStores | ❌ RAG Engine |
| Enterprise | ❌ None | ✅ VPC, CMEK, HIPAA |
Why choose this client?
- ✅ Type-safe with sealed classes
- ✅ Multiple auth methods (API key, OAuth)
- ✅ Minimal dependencies (http, logging only)
- ✅ Works on all compilation targets (native, web, WASM)
- ✅ Interceptor-driven architecture
- ✅ Comprehensive error handling
- ✅ Automatic retry with exponential backoff
- ✅ Long-running operations (LRO polling)
- ✅ Pagination support (Paginator utility)
- ✅ 560+ tests
Quickstart
import 'package:googleai_dart/googleai_dart.dart';
// Initialize from environment variable (GOOGLE_GENAI_API_KEY)
final client = GoogleAIClient.fromEnvironment();
final response = await client.models.generateContent(
model: 'gemini-3-flash-preview',
request: GenerateContentRequest(
contents: [Content.text('Hello Gemini!')], // Convenience factory
),
);
// Use .text extension to get the response text
print(response.text);
client.close();
Or with an explicit API key:
final client = GoogleAIClient(
config: GoogleAIConfig.googleAI(
authProvider: ApiKeyProvider('YOUR_API_KEY'),
),
);
Installation
dependencies:
googleai_dart: {version}
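Alternatively, add the latest published version from the command line:
dart pub add googleai_dart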
API Versions
This client supports both stable and beta versions of the API:
| Version | Stability | Use Case | Features |
|---|---|---|---|
| v1 | Stable | Production apps | Guaranteed stability; breaking changes trigger a new major version |
| v1beta | Beta | Testing & development | Early-access features, subject to rapid/breaking changes |
Default: The client defaults to v1beta for broadest feature access.
API Version Configuration
import 'package:googleai_dart/googleai_dart.dart';
// Use stable v1
final client = GoogleAIClient(
config: GoogleAIConfig.googleAI(
apiVersion: ApiVersion.v1,
authProvider: ApiKeyProvider('YOUR_API_KEY'),
),
);
// Use beta v1beta
final betaClient = GoogleAIClient(
config: GoogleAIConfig.googleAI(
apiVersion: ApiVersion.v1beta,
authProvider: ApiKeyProvider('YOUR_API_KEY'),
),
);
Vertex AI Support
To use Vertex AI, you need:
- A Google Cloud Platform (GCP) project
- Vertex AI Gemini API enabled
- Service account with the roles/aiplatform.user role
- OAuth 2.0 credentials
Complete Vertex AI Setup
import 'package:googleai_dart/googleai_dart.dart';
// Create an OAuth provider (implement token refresh logic)
class MyOAuthProvider implements AuthProvider {
@override
Future<AuthCredentials> getCredentials() async {
// Your OAuth token refresh logic here
final token = await getAccessToken(); // Implement this
// You can use https://pub.dev/packages/googleapis_auth
return BearerTokenCredentials(token);
}
}
// Configure for Vertex AI
final vertexClient = GoogleAIClient(
config: GoogleAIConfig.vertexAI(
projectId: 'your-gcp-project-id',
location: 'us-central1', // Or 'global', 'europe-west1', etc.
apiVersion: ApiVersion.v1, // Stable version
authProvider: MyOAuthProvider(),
),
);
// Use the same API as Google AI
final response = await vertexClient.models.generateContent(
model: 'gemini-3-flash-preview',
request: GenerateContentRequest(
contents: [
Content(
parts: [TextPart('Explain quantum computing')],
role: 'user',
),
],
),
);
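The comment above points to the googleapis_auth package. As one possible sketch (ServiceAccountOAuthProvider and its caching logic are illustrative, not part of this package), a service-account-backed provider could look like this:
import 'package:googleai_dart/googleai_dart.dart';
import 'package:googleapis_auth/auth_io.dart' as auth;
import 'package:http/http.dart' as http;
// Illustrative provider (not part of this package): exchanges a service
// account key for an OAuth access token via googleapis_auth and caches it.
class ServiceAccountOAuthProvider implements AuthProvider {
  ServiceAccountOAuthProvider(this.serviceAccountJson);
  final Map<String, dynamic> serviceAccountJson;
  auth.AccessCredentials? _cached;
  @override
  Future<AuthCredentials> getCredentials() async {
    // Refresh when there is no cached token or it has expired
    if (_cached == null || _cached!.accessToken.hasExpired) {
      final httpClient = http.Client();
      try {
        _cached = await auth.obtainAccessCredentialsViaServiceAccount(
          auth.ServiceAccountCredentials.fromJson(serviceAccountJson),
          ['https://www.googleapis.com/auth/cloud-platform'],
          httpClient,
        );
      } finally {
        httpClient.close();
      }
    }
    return BearerTokenCredentials(_cached!.accessToken.data);
  }
}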
Advanced Configuration
Custom Configuration Options
For advanced use cases, you can use the main GoogleAIConfig constructor with full control over all parameters:
import 'package:googleai_dart/googleai_dart.dart';
final client = GoogleAIClient(
config: GoogleAIConfig(
baseUrl: 'https://my-custom-proxy.example.com',
apiMode: ApiMode.googleAI,
apiVersion: ApiVersion.v1,
authProvider: ApiKeyProvider('YOUR_API_KEY'),
timeout: Duration(minutes: 5),
retryPolicy: RetryPolicy(maxRetries: 5),
// ... other parameters
),
);
Use cases for custom base URLs:
- Proxy servers: Route requests through a corporate proxy
- Testing: Point to a mock server for integration tests
- Regional endpoints: Use region-specific URLs if needed
- Custom deployments: Self-hosted or specialized endpoints
The factory constructors (GoogleAIConfig.googleAI() and GoogleAIConfig.vertexAI()) are convenience methods that set sensible defaults, but you can always use the main constructor for full control.
Usage
Authentication
All Authentication Methods
The client uses an AuthProvider pattern for flexible authentication:
import 'package:googleai_dart/googleai_dart.dart';
// API Key authentication (query parameter)
final client = GoogleAIClient(
config: GoogleAIConfig(
authProvider: ApiKeyProvider('YOUR_API_KEY'),
),
);
// API Key as header
final clientWithHeader = GoogleAIClient(
config: GoogleAIConfig(
authProvider: ApiKeyProvider(
'YOUR_API_KEY',
placement: AuthPlacement.header,
),
),
);
// Bearer token (for OAuth)
final clientWithBearer = GoogleAIClient(
config: GoogleAIConfig(
authProvider: BearerTokenProvider('YOUR_BEARER_TOKEN'),
),
);
// Custom OAuth with automatic token refresh
class CustomOAuthProvider implements AuthProvider {
@override
Future<AuthCredentials> getCredentials() async {
// Your token refresh logic here
// Called on each request, including retries
return BearerTokenCredentials(await refreshToken());
}
}
final oauthClient = GoogleAIClient(
config: GoogleAIConfig(
authProvider: CustomOAuthProvider(),
),
);
Basic Generation
Basic Generation Example
import 'package:googleai_dart/googleai_dart.dart';
// Initialize from environment variable (GOOGLE_GENAI_API_KEY)
final client = GoogleAIClient.fromEnvironment();
final response = await client.models.generateContent(
model: 'gemini-3-flash-preview',
request: GenerateContentRequest(
contents: [Content.text('Explain quantum computing')],
),
);
// Use .text extension for easy text extraction
print(response.text);
client.close();
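Token counting follows the same call pattern. The sketch below is hedged: CountTokensRequest, its contents field, and totalTokens on the response are assumed names (the countTokens endpoint itself is listed under client.models in the API coverage section); check the package docs for the exact types.
// Count tokens for a prompt before sending it (field names are assumptions)
final tokenCount = await client.models.countTokens(
  model: 'gemini-3-flash-preview',
  request: CountTokensRequest(
    contents: [Content.text('Explain quantum computing')],
  ),
);
print('Prompt tokens: ${tokenCount.totalTokens}');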
Streaming
Streaming Example
import 'package:googleai_dart/googleai_dart.dart';
// Assumes you have a configured client instance
await for (final chunk in client.models.streamGenerateContent(
model: 'gemini-3-flash-preview',
request: GenerateContentRequest(
contents: [Content.text('Write a poem about AI')],
),
)) {
// Use .text extension for each chunk
final text = chunk.text;
if (text != null) print(text);
}
Canceling Requests
Request Cancellation Examples
You can cancel long-running requests using an abort trigger:
import 'dart:async';
import 'package:googleai_dart/googleai_dart.dart';
// Assumes you have a configured client instance
final abortController = Completer<void>();
// Start request with abort capability
final requestFuture = client.models.generateContent(
model: 'gemini-3-flash-preview',
request: request,
abortTrigger: abortController.future,
);
// To cancel:
abortController.complete();
// Handle cancellation
try {
final response = await requestFuture;
} on AbortedException {
print('Request was canceled');
}
This works for both regular and streaming requests. You can also use it with timeouts:
// Auto-cancel after 30 seconds
final response = await client.models.generateContent(
model: 'gemini-3-flash-preview',
request: request,
abortTrigger: Future.delayed(Duration(seconds: 30)),
);
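The same trigger can be attached to a streaming call. This sketch reuses only the calls shown above and assumes streamGenerateContent accepts the same abortTrigger parameter:
// Cancel a stream after a few chunks
final abort = Completer<void>();
var received = 0;
try {
  await for (final chunk in client.models.streamGenerateContent(
    model: 'gemini-3-flash-preview',
    request: GenerateContentRequest(
      contents: [Content.text('Write a very long story')],
    ),
    abortTrigger: abort.future,
  )) {
    print(chunk.text);
    if (++received >= 3) abort.complete(); // Stop after three chunks
  }
} on AbortedException {
  print('Stream canceled');
}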
Function Calling
Function Calling Example
import 'package:googleai_dart/googleai_dart.dart';
// Assumes you have a configured client instance
final tools = [
Tool(
functionDeclarations: [
FunctionDeclaration(
name: 'get_weather',
description: 'Get current weather',
parameters: Schema(
type: SchemaType.object,
properties: {
'location': Schema(type: SchemaType.string),
},
required: ['location'],
),
),
],
),
];
final response = await client.models.generateContent(
model: 'gemini-3-flash-preview',
request: GenerateContentRequest(
contents: [/* ... */],
tools: tools,
),
);
Grounding Tools
googleai_dart supports multiple grounding tools that enhance model responses with real-world data.
Google Search Grounding
Ground responses with real-time web information:
final response = await client.models.generateContent(
model: 'gemini-3-flash-preview',
request: GenerateContentRequest(
contents: [Content.text('Who won Euro 2024?')],
// Enable Google Search grounding with an empty map
tools: [Tool(googleSearch: {})],
),
);
// Use .text extension for easy text extraction
print(response.text);
// Access grounding metadata
final metadata = response.candidates?.first.groundingMetadata;
if (metadata != null) {
// Search queries executed by the model
print('Queries: ${metadata.webSearchQueries}');
// Web sources used
for (final chunk in metadata.groundingChunks ?? []) {
if (chunk.web != null) {
print('Source: ${chunk.web!.title} - ${chunk.web!.uri}');
}
}
// Search entry point widget (required for attribution)
if (metadata.searchEntryPoint?.renderedContent != null) {
print('Widget HTML available for display');
}
}
Or use the Interactions API for streaming:
await for (final event in client.interactions.createStream(
model: 'gemini-3-flash-preview',
input: 'What are today\'s top technology news?',
tools: [GoogleSearchTool()],
)) {
if (event case ContentDeltaEvent(:final delta)) {
if (delta is TextDelta) {
print(delta.text);
} else if (delta is GoogleSearchCallDelta) {
print('Searching: ${delta.queries?.join(", ")}');
}
}
}
See google_search_example.dart for a complete example.
URL Context
Fetch and analyze content from specific URLs (up to 20 URLs, max 34MB per URL):
final response = await client.models.generateContent(
model: 'gemini-3-flash-preview',
request: GenerateContentRequest(
contents: [
Content.text('Summarize the main points from: https://dart.dev/overview'),
],
// Enable URL Context with an empty map
tools: [Tool(urlContext: {})],
),
);
// Use .text extension for easy text extraction
print(response.text);
Or with the Interactions API:
await for (final event in client.interactions.createStream(
model: 'gemini-3-flash-preview',
input: 'Summarize https://pub.dev/packages/googleai_dart',
tools: [UrlContextTool()],
)) {
if (event case ContentDeltaEvent(:final delta)) {
if (delta is TextDelta) {
stdout.write(delta.text);
} else if (delta is UrlContextCallDelta) {
print('Fetching: ${delta.urls?.join(", ")}');
}
}
}
See url_context_example.dart for a complete example.
Google Maps
Add geospatial context for location-based queries:
final response = await client.models.generateContent(
model: 'gemini-3-flash-preview',
request: GenerateContentRequest(
contents: [Content.text('Find Italian restaurants nearby')],
// Enable Google Maps with widget support
tools: [Tool(googleMaps: GoogleMaps(enableWidget: true))],
// Provide user location context (as Map)
toolConfig: {
'retrievalConfig': {
'latLng': {
'latitude': 40.758896,
'longitude': -73.985130,
},
},
},
),
);
// Access place information
final metadata = response.candidates?.first.groundingMetadata;
for (final chunk in metadata?.groundingChunks ?? []) {
if (chunk.maps != null) {
print('Place: ${chunk.maps!.title}');
print('Place ID: ${chunk.maps!.placeId}');
}
}
// Widget token for rendering Google Maps widget
if (metadata?.googleMapsWidgetContextToken != null) {
print('Widget token: ${metadata!.googleMapsWidgetContextToken}');
}
See google_maps_example.dart for a complete example.
File Search (Semantic Retrieval)
Search your own documents using FileSearchStores:
// Create a FileSearchStore
final store = await client.fileSearchStores.create(
displayName: 'My Knowledge Base',
);
// Upload a document with custom chunking and metadata
final uploadResponse = await client.fileSearchStores.upload(
parent: store.name!,
filePath: '/path/to/document.pdf',
mimeType: 'application/pdf',
request: UploadToFileSearchStoreRequest(
displayName: 'Technical Documentation',
chunkingConfig: ChunkingConfig(
whiteSpaceConfig: WhiteSpaceConfig(
maxTokensPerChunk: 200,
maxOverlapTokens: 20,
),
),
customMetadata: [
FileSearchCustomMetadata(key: 'author', stringValue: 'Jane Doe'),
FileSearchCustomMetadata(key: 'year', numericValue: 2024),
],
),
);
// Use FileSearch in generation with optional metadata filter
final response = await client.models.generateContent(
model: 'gemini-3-flash-preview',
request: GenerateContentRequest(
contents: [
Content.text('What does the documentation say about X?'),
],
tools: [
Tool(
fileSearch: FileSearch(
fileSearchStoreNames: [store.name!],
topK: 5,
metadataFilter: 'author = "Jane Doe"',
),
),
],
),
);
// Access grounding metadata (citations)
final metadata = response.candidates?.first.groundingMetadata;
for (final chunk in metadata?.groundingChunks ?? []) {
if (chunk.retrievedContext != null) {
print('Source: ${chunk.retrievedContext!.title}');
}
}
// Cleanup
await client.fileSearchStores.delete(name: store.name!);
See file_search_example.dart for a complete example.
Embeddings
Embeddings Example
import 'package:googleai_dart/googleai_dart.dart';
// Assumes you have a configured client instance
final response = await client.models.embedContent(
model: 'gemini-embedding-001',
request: EmbedContentRequest(
content: Content(
parts: [TextPart('Hello world')],
),
taskType: TaskType.retrievalDocument,
),
);
print(response.embedding.values); // List<double>
File Management
Complete File Management Example
Upload files for use in multimodal prompts:
import 'dart:io' as io;
import 'package:googleai_dart/googleai_dart.dart';
// Assumes you have a configured client instance
// Upload a file
var file = await client.files.upload(
filePath: '/path/to/image.jpg',
mimeType: 'image/jpeg',
displayName: 'My Image',
);
print('File uploaded: ${file.name}');
print('State: ${file.state}');
print('URI: ${file.uri}');
// Wait for file to be processed (if needed)
while (file.state == FileState.processing) {
await Future.delayed(Duration(seconds: 2));
file = await client.files.get(name: file.name);
}
// Use the file in a prompt
final response = await client.models.generateContent(
model: 'gemini-3-flash-preview',
request: GenerateContentRequest(
contents: [
Content(
parts: [
TextPart('Describe this image'),
FileData(
fileUri: file.uri,
mimeType: file.mimeType,
),
],
role: 'user',
),
],
),
);
// List all files
final listResponse = await client.files.list(pageSize: 10);
for (final f in listResponse.files ?? []) {
print('${f.displayName}: ${f.state}');
}
// Download file content (if needed)
final bytes = await client.files.download(name: file.name);
// Save to disk or process bytes
await io.File('downloaded_file.jpg').writeAsBytes(bytes);
// Delete the file when done
await client.files.delete(name: file.name);
Note: Files are automatically deleted after 48 hours.
Context Caching
Complete Context Caching Example
Context caching allows you to save frequently used content and reuse it across requests, reducing latency and costs:
import 'package:googleai_dart/googleai_dart.dart';
// Assumes you have a configured client instance
// Create cached content with system instructions
final cachedContent = await client.cachedContents.create(
cachedContent: const CachedContent(
model: 'models/gemini-3-flash-preview',
displayName: 'Math Expert Cache',
systemInstruction: Content(
parts: [TextPart('You are an expert mathematician...')],
),
ttl: '3600s', // Cache for 1 hour
),
);
// Use cached content in requests (saves tokens!)
final response = await client.models.generateContent(
model: 'gemini-3-flash-preview', // Must match the cached content's model
request: GenerateContentRequest(
cachedContent: cachedContent.name,
contents: [Content.text('Explain the Pythagorean theorem')],
),
);
// Update cache TTL
await client.cachedContents.update(
name: cachedContent.name!,
cachedContent: const CachedContent(
model: 'models/gemini-3-flash-preview',
ttl: '7200s', // Extend to 2 hours
),
updateMask: 'ttl',
);
// Clean up when done
await client.cachedContents.delete(name: cachedContent.name!);
Benefits:
- â Reduced latency for requests with large contexts
- â Lower costs by reusing cached content
- â Consistent system instructions across requests
Grounded Question Answering
Complete generateAnswer Examples
The generateAnswer API provides answers grounded in specific sources, ideal for Retrieval Augmented Generation (RAG):
import 'package:googleai_dart/googleai_dart.dart';
// Assumes you have a configured client instance
// Using inline passages (for small knowledge bases)
final response = await client.models.generateAnswer(
model: 'aqa',
request: const GenerateAnswerRequest(
contents: [
Content(
parts: [TextPart('What is the capital of France?')],
role: 'user',
),
],
answerStyle: AnswerStyle.abstractive, // Or: extractive, verbose
inlinePassages: GroundingPassages(
passages: [
GroundingPassage(
id: 'passage-1',
content: Content(
parts: [TextPart('Paris is the capital of France.')],
),
),
],
),
temperature: 0.2, // Low temperature for factual answers
),
);
// Check answerability
if (response.answerableProbability != null &&
response.answerableProbability! < 0.5) {
print('⚠️ Answer may not be grounded in sources');
}
// Using semantic retriever (for large corpora)
final ragResponse = await client.models.generateAnswer(
model: 'aqa',
request: const GenerateAnswerRequest(
contents: [
Content(
parts: [TextPart('What are the key features of Dart?')],
role: 'user',
),
],
answerStyle: AnswerStyle.verbose,
semanticRetriever: SemanticRetrieverConfig(
source: 'corpora/my-corpus',
query: Content(
parts: [TextPart('Dart programming language features')],
),
maxChunksCount: 5,
minimumRelevanceScore: 0.5,
),
),
);
Features:
- â Multiple answer styles (abstractive, extractive, verbose)
- â Inline passages or semantic retriever grounding
- â Answerability probability for quality control
- â Safety settings support
- â Input feedback for blocked content
Batch Operations
Batch Operations Example
import 'package:googleai_dart/googleai_dart.dart';
// Assumes you have a configured client instance
// Create a batch for processing multiple requests
// The model in the batch is auto-populated from the method parameter
final batch = await client.models.batchGenerateContent(
model: 'gemini-3-flash-preview',
batch: const GenerateContentBatch(
displayName: 'My Batch Job',
inputConfig: InputConfig(
requests: InlinedRequests(
requests: [
InlinedRequest(
request: GenerateContentRequest(
contents: [Content.text('What is 2+2?')],
),
),
InlinedRequest(
request: GenerateContentRequest(
contents: [Content.text('What is 3+3?')],
),
),
],
),
),
),
);
// Monitor batch status
final status = await client.batches.getGenerateContentBatch(name: batch.name!);
print('Batch state: ${status.state}');
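Because batch jobs run asynchronously, you typically poll until the job reaches a terminal state. A minimal fixed-interval sketch using only the call shown above (a real loop should stop once the returned state is terminal, e.g. succeeded or failed):
// Poll the batch periodically and print its state
for (var attempt = 1; attempt <= 10; attempt++) {
  await Future.delayed(const Duration(seconds: 30));
  final current = await client.batches.getGenerateContentBatch(name: batch.name!);
  print('Attempt $attempt: ${current.state}');
}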
Corpus & RAG
Corpus Example
⚠️ Important: Document, chunk, and RAG features are only available in Vertex AI. Google AI only supports basic corpus management.
import 'package:googleai_dart/googleai_dart.dart';
// Assumes you have a configured Google AI client
// Google AI supports basic corpus CRUD operations
final corpus = await client.corpora.create(
corpus: const Corpus(displayName: 'My Knowledge Base'),
);
// List corpora
final corpora = await client.corpora.list(pageSize: 10);
// Get corpus
final retrieved = await client.corpora.get(name: corpus.name!);
// Update corpus
final updated = await client.corpora.update(
name: corpus.name!,
corpus: const Corpus(displayName: 'Updated Name'),
updateMask: 'displayName',
);
// Delete corpus
await client.corpora.delete(name: corpus.name!);
For full RAG capabilities (documents, chunks, semantic search):
- Use Vertex AI with RAG Stores and Vector Search
- The Semantic Retriever API has been succeeded by Vertex AI Vector Search
Permissions
Permission Management Examples
import 'package:googleai_dart/googleai_dart.dart';
// Assumes you have a configured client instance
// Grant permissions to a resource
final permission = await client.tunedModels.permissions(parent: 'tunedModels/my-model').create(
permission: const Permission(
granteeType: GranteeType.user,
emailAddress: 'user@example.com',
role: PermissionRole.reader,
),
);
// List permissions
final permissions = await client.tunedModels.permissions(parent: 'tunedModels/my-model').list();
// Transfer ownership
await client.tunedModels.permissions(parent: 'tunedModels/my-model').transferOwnership(
emailAddress: 'newowner@example.com',
);
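Revoking access uses the same scoped accessor. A short sketch, assuming the Permission returned by create exposes its resource name:
// Remove a previously granted permission
await client.tunedModels
    .permissions(parent: 'tunedModels/my-model')
    .delete(name: permission.name!);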
Tuned Models
List and Inspect Tuned Models
import 'package:googleai_dart/googleai_dart.dart';
// Assumes you have a configured client instance
// List all tuned models
final tunedModels = await client.tunedModels.list(
pageSize: 10,
filter: 'owner:me', // Filter by ownership
);
for (final model in tunedModels.tunedModels) {
print('Model: ${model.displayName} (${model.name})');
print('State: ${model.state}');
print('Base model: ${model.baseModel}');
}
// Get specific tuned model details
final myModel = await client.tunedModels.get(
name: 'tunedModels/my-model-abc123',
);
print('Model: ${myModel.displayName}');
print('State: ${myModel.state}');
print('Training examples: ${myModel.tuningTask?.trainingData?.examples?.exampleCount}');
// List operations for a tuned model (monitor training progress)
final operations = await client.tunedModels.operations(parent: 'tunedModels/my-model-abc123').list();
for (final operation in operations.operations) {
print('Operation: ${operation.name}');
print('Done: ${operation.done}');
if (operation.metadata != null) {
print('Progress: ${operation.metadata}');
}
}
Using Tuned Models for Generation
Generate Content with Tuned Models
Once you have a tuned model, you can use it for content generation just like base models:
import 'package:googleai_dart/googleai_dart.dart';
// Assumes you have a configured client instance
// Generate content with a tuned model
final response = await client.tunedModels.generateContent(
tunedModel: 'my-model-abc123', // Your tuned model ID
request: GenerateContentRequest(
contents: [Content.text('Explain quantum computing')],
),
);
// Use the convenience extension to get text
print(response.text);
// Stream responses with a tuned model
await for (final chunk in client.tunedModels.streamGenerateContent(
tunedModel: 'my-model-abc123',
request: request,
)) {
final text = chunk.text;
if (text != null) print(text);
}
// Batch generation with a tuned model
// The model in the batch is auto-populated from the tunedModel parameter
final batch = await client.tunedModels.batchGenerateContent(
tunedModel: 'my-model-abc123',
batch: GenerateContentBatch(
displayName: 'My Batch Job',
inputConfig: InputConfig(
requests: InlinedRequests(
requests: [
InlinedRequest(
request: GenerateContentRequest(
contents: [Content.text('Question 1')],
),
),
],
),
),
),
);
Benefits of tuned models:
- â Customized behavior for your specific domain
- â Improved accuracy for specialized tasks
- â Consistent output style and format
- â Reduced need for extensive prompting
Prediction (Video Generation)
Video Generation with Veo Example
import 'package:googleai_dart/googleai_dart.dart';
// Assumes you have a configured client instance
// Synchronous prediction
final response = await client.models.predict(
model: 'veo-3.0-generate-001',
instances: [
{'prompt': 'A cat playing piano in a jazz club'},
],
);
print('Predictions: ${response.predictions}');
// Long-running prediction for video generation
final operation = await client.models.predictLongRunning(
model: 'veo-3.0-generate-001',
instances: [
{'prompt': 'A golden retriever running on a beach at sunset'},
],
parameters: {
'aspectRatio': '16:9',
},
);
print('Operation: ${operation.name}');
print('Done: ${operation.done}');
// Check for generated videos
if (operation.done == true && operation.response != null) {
final videoResponse = operation.response!.generateVideoResponse;
if (videoResponse?.generatedSamples != null) {
for (final media in videoResponse!.generatedSamples!) {
if (media.video?.uri != null) {
print('Video URI: ${media.video!.uri}');
}
}
}
// Check for RAI filtering
if (videoResponse?.raiMediaFilteredCount != null) {
print('Filtered: ${videoResponse!.raiMediaFilteredCount} videos');
print('Reasons: ${videoResponse.raiMediaFilteredReasons}');
}
}
Live API (Real-time Streaming)
The Live API provides bidirectional WebSocket streaming for real-time audio/text conversations.
Live API Example
import 'package:googleai_dart/googleai_dart.dart';
// Create the main client
final client = GoogleAIClient(
config: GoogleAIConfig(
authProvider: ApiKeyProvider('YOUR_API_KEY'),
),
);
// Create a Live client for WebSocket streaming
final liveClient = client.createLiveClient();
try {
// Connect to the Live API with configuration
final session = await liveClient.connect(
model: 'gemini-2.0-flash-live-001',
liveConfig: LiveConfig(
generationConfig: LiveGenerationConfig.textAndAudio(
speechConfig: SpeechConfig.withVoice(LiveVoices.puck),
temperature: 0.7,
),
// Enable transcription for both input and output
inputAudioTranscription: AudioTranscriptionConfig.enabled(),
outputAudioTranscription: AudioTranscriptionConfig.enabled(),
// Configure voice activity detection
realtimeInputConfig: RealtimeInputConfig.withVAD(
silenceDurationMs: 500,
activityHandling: ActivityHandling.startOfActivityInterrupts,
),
),
);
print('Connected! Session ID: ${session.sessionId}');
// Send a text message
session.sendText('Hello! Tell me a short joke.');
// Listen for responses
messageLoop:
await for (final message in session.messages) {
switch (message) {
case BidiGenerateContentSetupComplete(:final sessionId):
print('Setup complete, session: $sessionId');
case BidiGenerateContentServerContent(
:final modelTurn,
:final turnComplete,
:final inputTranscription,
:final outputTranscription,
):
// Handle model response
if (modelTurn != null) {
for (final part in modelTurn.parts) {
if (part is TextPart) {
print('Model: ${part.text}');
} else if (part is InlineDataPart) {
// Audio data (24kHz PCM)
print('Audio: ${part.inlineData.data.length} bytes');
}
}
}
// Show transcriptions
if (inputTranscription?.text != null) {
print('You said: ${inputTranscription!.text}');
}
if (outputTranscription?.text != null) {
print('Model said: ${outputTranscription!.text}');
}
if (turnComplete ?? false) {
print('--- Turn complete ---');
// A bare break would only exit the switch, so break the labeled loop instead
break messageLoop;
}
case BidiGenerateContentToolCall(:final functionCalls):
// Handle tool calls and send responses
final responses = functionCalls.map((call) => FunctionResponse(
name: call.name,
response: {'result': 'executed'},
)).toList();
session.sendToolResponse(responses);
case GoAway(:final timeLeft):
print('Server disconnect in: $timeLeft');
// Save session.resumptionToken for later resumption
case SessionResumptionUpdate(:final resumable):
print('Session resumable: $resumable');
default:
// Ignore other server messages (e.g., tool call cancellations)
break;
}
}
await session.close();
} on LiveConnectionException catch (e) {
print('Connection failed: ${e.message}');
} finally {
await liveClient.close();
client.close();
}
Sending Audio
// Audio must be 16kHz, 16-bit, mono PCM
void sendAudio(LiveSession session, List<int> pcmBytes) {
session.sendAudio(pcmBytes);
}
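For example, a raw 16kHz/16-bit mono PCM file could be streamed in small chunks (a sketch; the file path is a placeholder):
import 'dart:io' as io;
// Read a raw PCM file and forward it to the live session in ~0.25 s chunks
// (8 KB at 32 KB/s raw audio)
Future<void> sendPcmFile(LiveSession session, String path) async {
  final bytes = await io.File(path).readAsBytes();
  const chunkSize = 8 * 1024;
  for (var offset = 0; offset < bytes.length; offset += chunkSize) {
    final end =
        (offset + chunkSize < bytes.length) ? offset + chunkSize : bytes.length;
    session.sendAudio(bytes.sublist(offset, end));
  }
}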
Manual Voice Activity Detection
// For manual VAD mode (when automaticActivityDetection.disabled = true)
void manualVAD(LiveSession session) {
// Signal when user starts speaking
session.signalActivityStart();
// ... user speaks ...
// Signal when user stops speaking
session.signalActivityEnd();
}
Session Resumption
// Resume a previous session
final session = await liveClient.resume(
model: 'gemini-2.0-flash-live-001',
resumptionToken: savedToken, // From previous session
liveConfig: LiveConfig(
generationConfig: LiveGenerationConfig.textAndAudio(),
),
);
print('Session resumed! ID: ${session.sessionId}');
Ephemeral Tokens (Secure Client-Side Auth)
For client-side applications (mobile apps, web apps), use ephemeral tokens instead of exposing your API key:
Server-side (create token):
// On your backend server
final token = await client.authTokens.create(
authToken: AuthToken(
expireTime: DateTime.now().add(Duration(minutes: 30)),
uses: 1, // Single use
),
);
// Send token.name to client securely
Client-side (use token):
// On mobile/web client - no API key needed!
final liveClient = LiveClient(
config: GoogleAIConfig.googleAI(
authProvider: NoAuthProvider(), // No API key
),
);
final session = await liveClient.connect(
model: 'gemini-2.0-flash-live-001',
accessToken: tokenFromServer, // Use ephemeral token
);
Note: Ephemeral tokens are only available with Google AI (not Vertex AI).
Platform Notes
Web (Browser) Limitations:
- Browser WebSocket API does NOT support custom HTTP headers during handshake
- Google AI: Works on web via query parameter authentication (?key=... or ?access_token=...)
- Vertex AI OAuth: Requires a Bearer token in headers, which doesn't work in browsers
- Recommendation: For Vertex AI on web, use a backend proxy or ephemeral tokens with Google AI
Audio Streaming Notes:
- Audio data is base64-encoded before sending (~33% size overhead)
- The underlying WebSocket handles buffering automatically
- Audio format: 16kHz, 16-bit PCM mono input (32 KB/s raw, ~43 KB/s encoded)
- Output audio: 24kHz, 16-bit PCM mono
Migrating from Google AI to Vertex AI
Switching from Google AI to Vertex AI requires minimal code changes:
Step 1: Change Configuration
Configuration Migration Examples
Before (Google AI):
import 'package:googleai_dart/googleai_dart.dart';
final client = GoogleAIClient(
config: GoogleAIConfig.googleAI(
authProvider: ApiKeyProvider('YOUR_API_KEY'),
),
);
After (Vertex AI):
import 'package:googleai_dart/googleai_dart.dart';
final client = GoogleAIClient(
config: GoogleAIConfig.vertexAI(
projectId: 'your-project-id',
location: 'us-central1',
authProvider: MyOAuthProvider(), // Implement OAuth
),
);
Step 2: Replace Google AI-Only Features
If you use any of these features, you'll need to use Vertex AI alternatives:
| Google AI Feature | Vertex AI Alternative |
|---|---|
| client.files.upload() | Upload to Cloud Storage, use gs:// URIs in requests |
| client.tunedModels.* | Use Vertex AI Tuning API directly |
| client.corpora.* | Use Vertex AI RAG Stores |
| client.models.generateAnswer() | Use generateContent() with grounding |
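For example, instead of client.files.upload(), a Vertex AI request references the Cloud Storage object directly via the FileData part shown earlier (a sketch; the bucket path is a placeholder):
// Reference a gs:// object on Vertex AI instead of uploading via the Files API
final response = await vertexClient.models.generateContent(
  model: 'gemini-3-flash-preview',
  request: GenerateContentRequest(
    contents: [
      Content(
        parts: [
          TextPart('Describe this image'),
          FileData(
            fileUri: 'gs://your-bucket/images/photo.jpg', // placeholder URI
            mimeType: 'image/jpeg',
          ),
        ],
        role: 'user',
      ),
    ],
  ),
);
print(response.text);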
Examples
See the example/ directory for comprehensive examples:
- generate_content.dart - Basic content generation
- streaming_example.dart - Real-time SSE streaming
- function_calling_example.dart - Tool usage with functions
- embeddings_example.dart - Text embeddings generation
- error_handling_example.dart - Exception handling patterns
- abort_example.dart - Request cancellation
- models_example.dart - Model listing and inspection
- files_example.dart - File upload and management
- pagination_example.dart - Paginated API calls
- batch_example.dart - Batch operations
- caching_example.dart - Context caching for cost/latency optimization
- permissions_example.dart - Permission management
- oauth_refresh_example.dart - OAuth token refresh with AuthProvider
- prediction_example.dart - Video generation with Veo model
- generate_answer_example.dart - Grounded question answering with inline passages or semantic retriever
- tuned_model_generation_example.dart - Generate content with custom tuned models
- api_versions_example.dart - Using v1 (stable) vs v1beta (beta)
- vertex_ai_example.dart - Using Vertex AI with OAuth authentication
- complete_api_example.dart - Demonstrating 100% API coverage
- interactions_example.dart - Interactions API for server-side state management (experimental)
- google_search_example.dart - Google Search grounding for real-time web information
- url_context_example.dart - URL Context for fetching and analyzing web content
- google_maps_example.dart - Google Maps grounding for geospatial context
- file_search_example.dart - File Search with FileSearchStores for semantic retrieval (RAG)
- live_example.dart - Live API for real-time WebSocket streaming (audio/text)
API Coverage
This client implements 78 endpoints covering 100% of all non-deprecated Gemini API operations:
Models Resource (client.models)
- Generation: generateContent, streamGenerateContent, countTokens, generateAnswer
- Dynamic Content: dynamicGenerateContent, dynamicStreamGenerateContent (for live content with dynamic IDs)
- Embeddings: embedContent, batchEmbedContents (synchronous), asyncBatchEmbedContent (asynchronous)
- Prediction: predict, predictLongRunning (for video generation with Veo)
- Model Management: list, get, listOperations, create, patch, delete
Tuned Models Resource (client.tunedModels)
- Generation: generateContent, streamGenerateContent, batchGenerateContent, asyncBatchEmbedContent
- Management: list, get
- Operations: operations(parent).list
- Permissions: permissions(parent).create, permissions(parent).list, permissions(parent).get, permissions(parent).update, permissions(parent).delete, permissions(parent).transferOwnership
Files Resource (client.files)
- Management: upload, list, get, delete, download
- Upload Methods:
  - filePath: Upload from file system (IO platforms, streaming)
  - bytes: Upload from memory (all platforms)
  - contentStream: Upload large files via streaming (IO platforms, memory efficient)
- Generated Files: generatedFiles.list, generatedFiles.get, generatedFiles.getOperation (for video outputs)
Cached Contents Resource (client.cachedContents)
- Management: create, get, update, delete, list
Batches Resource (client.batches)
- Management: list, getGenerateContentBatch, getEmbedBatch, updateGenerateContentBatch, updateEmbedContentBatch, delete, cancel
Corpora Resource (client.corpora)
- Corpus Management: create, list, get, update, delete
- Document Management: documents(corpus).create, documents(corpus).list, documents(corpus).get, documents(corpus).update, documents(corpus).delete
- Permissions: permissions(parent).create, permissions(parent).list, permissions(parent).get, permissions(parent).update, permissions(parent).delete
FileSearchStores Resource (client.fileSearchStores)
- Store Management: create, list, get, delete
- Document Operations: importFile, uploadToFileSearchStore
- Document Management: documents.list, documents.get, documents.delete
- Operations: getOperation, getUploadOperation
Interactions Resource (client.interactions) - Experimental
- Creation: create, createWithAgent, createStream
- Management: get, cancel, delete
- Streaming: createStream, resumeStream (SSE with event types)
- Content Types: 17 types including text, image, audio, function calls, code execution, etc.
- Events: InteractionStart, ContentDelta, ContentStop, InteractionComplete, Error
Auth Tokens Resource (client.authTokens)
- Management: create (creates ephemeral tokens for secure client-side authentication)
Live API (client.createLiveClient())
- Connection: connect, resume (WebSocket streaming, ephemeral token support)
- Client Messages: setup, clientContent, realtimeInput, toolResponse
- Server Messages: setupComplete, serverContent, toolCall, toolCallCancellation, goAway, sessionResumptionUpdate, unknownServerMessage
- Session Methods: sendText, sendAudio, sendContent, sendToolResponse, signalActivityStart, signalActivityEnd
- Configuration: LiveConfig, LiveGenerationConfig, SpeechConfig, RealtimeInputConfig, SessionResumptionConfig
- Audio Format: 16kHz/16-bit/mono PCM input, 24kHz/16-bit/mono PCM output
Universal Operations
- getOperation: Available for all long-running operations
Development
# Install dependencies
dart pub get
# Run tests
dart test
# Format code
dart format .
# Analyze
dart analyze
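# Try an example (assumes it reads GOOGLE_GENAI_API_KEY, as in the Quickstart)
export GOOGLE_GENAI_API_KEY=your-api-key
dart run example/generate_content.dart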
License
googleai_dart is licensed under the MIT License.