
Google AI & Vertex AI Gemini API Dart Client #


Unofficial Dart client for the Google AI Gemini Developer API and the Vertex AI Gemini API, with a unified interface.

Note

The official google_generative_ai Dart package has been deprecated in favor of firebase_ai. However, since firebase_ai is a Flutter package rather than a pure Dart package, this unofficial client bridges the gap by providing a pure Dart, fully type-safe API client for both Google AI and Vertex AI.

Ideal for server-side Dart, CLI tools, and backend services needing a type-safe Gemini API client without Flutter dependencies.


Features #

🌍 Core Features (Both Google AI & Vertex AI) #

Generation & Streaming

  • ✅ Content generation (generateContent)
  • ✅ Streaming support (streamGenerateContent) with SSE
  • ✅ Request cancellation (cancel in-flight requests via abortTrigger)
  • ✅ Token counting (countTokens)
  • ✅ Tool support:
    • Function calling (custom function declarations)
    • Code execution (Python sandbox)
    • Google Search grounding
    • URL Context (fetch and analyze web content)
    • File search (Semantic Retrieval with FileSearchStores)
    • Google Maps (geospatial context)
    • MCP servers (Model Context Protocol)
  • ✅ Safety settings support

Embeddings

  • ✅ Content embeddings (embedContent)
  • ✅ Batch embeddings with automatic fallback

Models

  • ✅ Model listing & discovery (listModels, getModel)
  • ✅ Content caching (full CRUD operations)

🔵 Google AI Only (Not available on Vertex AI) #

Files API

  • ✅ File management (upload, list, get, download, delete)
  • ✅ Multiple upload methods: file path (IO), bytes (all platforms), streaming (IO)
  • â„šī¸ Vertex AI: Use Cloud Storage URIs or base64

Tuned Models

  • ✅ Tuned models (create, list, get, patch, delete, listOperations)
  • ✅ Generation with tuned models
  • â„šī¸ Vertex AI: Use Vertex AI Tuning API
  • ✅ Corpus management (create, list, get, update, delete)
  • ✅ Document management within corpora
  • ✅ FileSearchStores for semantic retrieval (create, list, get, delete, import/upload files)
  • â„šī¸ Vertex AI: Use RAG Engine for enterprise RAG capabilities

Permissions

  • ✅ Permission management (create, list, get, update, delete)
  • ✅ Ownership transfer (transferOwnership)
  • ✅ Grantee types (user, group, everyone)
  • ✅ Role-based access (owner, writer, reader)

Prediction (Video Generation)

  • ✅ Synchronous prediction (predict)
  • ✅ Long-running prediction (predictLongRunning) for video generation with Veo
  • ✅ Operation status polling for async predictions
  • ✅ RAI (Responsible AI) filtering support

Batch Operations

  • ✅ Batch content generation (batchGenerateContent)
  • ✅ Synchronous batch embeddings (batchEmbedContents)
  • ✅ Asynchronous batch embeddings (asyncBatchEmbedContent)
  • ✅ Batch management (list, get, update, delete, cancel)
  • ✅ LRO polling for async batch jobs

Interactions API (Experimental)

  • ✅ Server-side conversation state management
  • ✅ Multi-turn conversations with previousInteractionId
  • ✅ Streaming responses with SSE events
  • ✅ Function calling with automatic result handling
  • ✅ Agent support (e.g., Deep Research)
  • ✅ 17 content types (text, image, audio, function calls, etc.)
  • ✅ Background interactions with cancel support

Live API (WebSocket Streaming)

  • ✅ Real-time bidirectional WebSocket communication
  • ✅ Audio streaming (16kHz input, 24kHz output PCM)
  • ✅ Text streaming with real-time responses
  • ✅ Voice Activity Detection (VAD) - automatic and manual modes
  • ✅ Input/output audio transcription
  • ✅ Tool/function calling in live sessions
  • ✅ Session resumption with tokens
  • ✅ Context window compression for long conversations
  • ✅ Multiple voice options (Puck, Charon, Kore, etc.)
  • ✅ Ephemeral tokens for secure client-side authentication

Quick Comparison #

Aspect          Google AI             Vertex AI
Auth            API Key               OAuth 2.0
Core Features   ✅ Full support       ✅ Full support
Files API       ✅ Supported          ❌ Use Cloud Storage URIs
Tuned Models    ✅ Supported          ❌ Different tuning API
File Search     ✅ FileSearchStores   ✅ RAG Engine
Enterprise      ❌ None               ✅ VPC, CMEK, HIPAA

Why choose this client? #

  • ✅ Type-safe with sealed classes
  • ✅ Multiple auth methods (API key, OAuth)
  • ✅ Minimal dependencies (http, logging, meta, web_socket)
  • ✅ Works on all compilation targets (native, web, WASM)
  • ✅ Interceptor-driven architecture
  • ✅ Comprehensive error handling
  • ✅ Automatic retry with exponential backoff
  • ✅ Long-running operations (LRO polling)
  • ✅ Pagination support (Paginator utility)
  • ✅ 560+ tests

Quickstart #

import 'package:googleai_dart/googleai_dart.dart';

final client = GoogleAIClient(
  config: GoogleAIConfig.googleAI(
    authProvider: ApiKeyProvider('YOUR_API_KEY'),
  ),
);

final response = await client.models.generateContent(
  model: 'gemini-2.5-flash',
  request: GenerateContentRequest(
    contents: [
      Content(
        parts: [TextPart('Hello Gemini!')], 
        role: 'user',
      ),
    ],
  ),
);

print(response.candidates?.first.content?.parts.first);
client.close();

Installation #

dependencies:
  googleai_dart: {version}
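
Alternatively, let pub resolve the latest version from the command line:

dart pub add googleai_dart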

API Versions #

This client supports both stable and beta versions of the API:

Version   Stability   Use Case                Features
v1        Stable      Production apps         Guaranteed stability; breaking changes trigger a new major version
v1beta    Beta        Testing & development   Early-access features, subject to rapid/breaking changes

Default: the client uses v1beta for the broadest feature access.

API Version Configuration
import 'package:googleai_dart/googleai_dart.dart';

// Use stable v1
final client = GoogleAIClient(
  config: GoogleAIConfig.googleAI(
    apiVersion: ApiVersion.v1,
    authProvider: ApiKeyProvider('YOUR_API_KEY'),
  ),
);

// Use beta v1beta
final betaClient = GoogleAIClient(
  config: GoogleAIConfig.googleAI(
    apiVersion: ApiVersion.v1beta,
    authProvider: ApiKeyProvider('YOUR_API_KEY'),
  ),
);

Vertex AI Support #

To use Vertex AI, you need:

  1. A Google Cloud Platform (GCP) project
  2. Vertex AI Gemini API enabled
  3. Service account with roles/aiplatform.user role
  4. OAuth 2.0 credentials

Complete Vertex AI Setup

import 'package:googleai_dart/googleai_dart.dart';

// Create an OAuth provider (implement token refresh logic)
class MyOAuthProvider implements AuthProvider {
  @override
  Future<AuthCredentials> getCredentials() async {
    // Your OAuth token refresh logic here
    final token = await getAccessToken(); // Implement this
    // You can use https://pub.dev/packages/googleapis_auth
    return BearerTokenCredentials(token);
  }
}

// Configure for Vertex AI
final vertexClient = GoogleAIClient(
  config: GoogleAIConfig.vertexAI(
    projectId: 'your-gcp-project-id',
    location: 'us-central1', // Or 'global', 'europe-west1', etc.
    apiVersion: ApiVersion.v1, // Stable version
    authProvider: MyOAuthProvider(),
  ),
);

// Use the same API as Google AI
final response = await vertexClient.models.generateContent(
  model: 'gemini-2.5-flash',
  request: GenerateContentRequest(
    contents: [
      Content(
        parts: [TextPart('Explain quantum computing')],
        role: 'user',
      ),
    ],
  ),
);

Advanced Configuration #

Custom Configuration Options

For advanced use cases, you can use the main GoogleAIConfig constructor with full control over all parameters:

import 'package:googleai_dart/googleai_dart.dart';

final client = GoogleAIClient(
  config: GoogleAIConfig(
    baseUrl: 'https://my-custom-proxy.example.com',
    apiMode: ApiMode.googleAI,
    apiVersion: ApiVersion.v1,
    authProvider: ApiKeyProvider('YOUR_API_KEY'),
    timeout: Duration(minutes: 5),
    retryPolicy: RetryPolicy(maxRetries: 5),
    // ... other parameters
  ),
);

Use cases for custom base URLs:

  • Proxy servers: Route requests through a corporate proxy
  • Testing: Point to a mock server for integration tests
  • Regional endpoints: Use region-specific URLs if needed
  • Custom deployments: Self-hosted or specialized endpoints

The factory constructors (GoogleAIConfig.googleAI() and GoogleAIConfig.vertexAI()) are convenience methods that set sensible defaults, but you can always use the main constructor for full control.

Usage #

Authentication #

All Authentication Methods

The client uses an AuthProvider pattern for flexible authentication:

import 'package:googleai_dart/googleai_dart.dart';

// API Key authentication (query parameter)
final client = GoogleAIClient(
  config: GoogleAIConfig(
    authProvider: ApiKeyProvider('YOUR_API_KEY'),
  ),
);

// API Key as header
final clientWithHeader = GoogleAIClient(
  config: GoogleAIConfig(
    authProvider: ApiKeyProvider(
      'YOUR_API_KEY',
      placement: AuthPlacement.header,
    ),
  ),
);

// Bearer token (for OAuth)
final clientWithBearer = GoogleAIClient(
  config: GoogleAIConfig(
    authProvider: BearerTokenProvider('YOUR_BEARER_TOKEN'),
  ),
);

// Custom OAuth with automatic token refresh
class CustomOAuthProvider implements AuthProvider {
  @override
  Future<AuthCredentials> getCredentials() async {
    // Your token refresh logic here
    // Called on each request, including retries
    return BearerTokenCredentials(await refreshToken());
  }
}

final oauthClient = GoogleAIClient(
  config: GoogleAIConfig(
    authProvider: CustomOAuthProvider(),
  ),
);

Basic Generation #

Basic Generation Example
import 'package:googleai_dart/googleai_dart.dart';

final client = GoogleAIClient(
  config: GoogleAIConfig(
    authProvider: ApiKeyProvider('YOUR_API_KEY'),
  ),
);

final response = await client.models.generateContent(
  model: 'gemini-2.5-flash',
  request: GenerateContentRequest(
    contents: [
      Content(
        parts: [TextPart('Explain quantum computing')],
        role: 'user',
      ),
    ],
  ),
);

print(response.candidates?.first.content?.parts.first);
client.close();
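
You can also check a prompt's size before sending it with countTokens. A minimal sketch: the CountTokensRequest and totalTokens names are assumptions here, mirroring the generateContent types, so verify them against the API reference.

// Assumes a configured client instance; CountTokensRequest/totalTokens
// are assumed names mirroring the generateContent types.
final tokenCount = await client.models.countTokens(
  model: 'gemini-2.5-flash',
  request: CountTokensRequest(
    contents: [
      Content(parts: [TextPart('Explain quantum computing')], role: 'user'),
    ],
  ),
);
print('Prompt tokens: ${tokenCount.totalTokens}');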

Streaming #

Streaming Example
import 'package:googleai_dart/googleai_dart.dart';

// Assumes you have a configured client instance
await for (final chunk in client.models.streamGenerateContent(
  model: 'gemini-2.5-flash',
  request: request,
)) {
  // Process each chunk as it arrives
  final text = chunk.candidates?.first.content?.parts.first;
  if (text is TextPart) print(text.text);
}

Canceling Requests #

Request Cancellation Examples

You can cancel long-running requests using an abort trigger:

import 'dart:async';
import 'package:googleai_dart/googleai_dart.dart';

// Assumes you have a configured client instance
final abortController = Completer<void>();

// Start request with abort capability
final requestFuture = client.models.generateContent(
  model: 'gemini-2.5-flash',
  request: request,
  abortTrigger: abortController.future,
);

// To cancel:
abortController.complete();

// Handle cancellation
try {
  final response = await requestFuture;
} on AbortedException {
  print('Request was canceled');
}

This works for both regular and streaming requests. You can also use it with timeouts:

// Auto-cancel after 30 seconds
final response = await client.models.generateContent(
  model: 'gemini-2.5-flash',
  request: request,
  abortTrigger: Future.delayed(Duration(seconds: 30)),
);

Function Calling #

Function Calling Example
import 'package:googleai_dart/googleai_dart.dart';

// Assumes you have a configured client instance
final tools = [
  Tool(
    functionDeclarations: [
      FunctionDeclaration(
        name: 'get_weather',
        description: 'Get current weather',
        parameters: Schema(
          type: SchemaType.object,
          properties: {
            'location': Schema(type: SchemaType.string),
          },
          required: ['location'],
        ),
      ),
    ],
  ),
];

final response = await client.models.generateContent(
  model: 'gemini-2.5-flash',
  request: GenerateContentRequest(
    contents: [/* ... */],
    tools: tools,
  ),
);
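
When the model responds with a function call, you execute it yourself and return the result in a follow-up request. The round trip below is a sketch: FunctionCallPart, FunctionResponsePart, and fetchWeather are assumed or hypothetical names, so check function_calling_example.dart for the package's actual types.

// Hedged sketch: FunctionCallPart/FunctionResponsePart are assumed part
// types, and fetchWeather is your own hypothetical implementation.
final part = response.candidates?.first.content?.parts.first;
if (part is FunctionCallPart) {
  final weather = await fetchWeather(part.args?['location']);
  final followUp = await client.models.generateContent(
    model: 'gemini-2.5-flash',
    request: GenerateContentRequest(
      contents: [
        // ...the previous turns, then the function result:
        Content(
          parts: [
            FunctionResponsePart(
              name: 'get_weather',
              response: {'weather': weather},
            ),
          ],
        ),
      ],
      tools: tools,
    ),
  );
  print(followUp.candidates?.first.content?.parts.first);
}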

Grounding Tools #

googleai_dart supports multiple grounding tools that enhance model responses with real-world data.

Google Search Grounding

Ground responses with real-time web information:

final response = await client.models.generateContent(
  model: 'gemini-2.5-flash',
  request: GenerateContentRequest(
    contents: [
      Content(
        parts: [TextPart('Who won Euro 2024?')],
        role: 'user',
      ),
    ],
    // Enable Google Search grounding with an empty map
    tools: [Tool(googleSearch: {})],
  ),
);

// Extract text from response
final text = response.candidates?.first.content?.parts
    .whereType<TextPart>()
    .map((p) => p.text)
    .join() ?? '';
print(text);

// Access grounding metadata
final metadata = response.candidates?.first.groundingMetadata;
if (metadata != null) {
  // Search queries executed by the model
  print('Queries: ${metadata.webSearchQueries}');

  // Web sources used
  for (final chunk in metadata.groundingChunks ?? []) {
    if (chunk.web != null) {
      print('Source: ${chunk.web!.title} - ${chunk.web!.uri}');
    }
  }

  // Search entry point widget (required for attribution)
  if (metadata.searchEntryPoint?.renderedContent != null) {
    print('Widget HTML available for display');
  }
}

Or use the Interactions API for streaming:

await for (final event in client.interactions.createStream(
  model: 'gemini-2.5-flash',
  input: 'What are today\'s top technology news?',
  tools: [GoogleSearchTool()],
)) {
  if (event case ContentDeltaEvent(:final delta)) {
    if (delta is TextDelta) {
      print(delta.text);
    } else if (delta is GoogleSearchCallDelta) {
      print('Searching: ${delta.queries?.join(", ")}');
    }
  }
}

See google_search_example.dart for a complete example.

URL Context

Fetch and analyze content from specific URLs (up to 20 URLs, max 34MB per URL):

final response = await client.models.generateContent(
  model: 'gemini-2.5-flash',
  request: GenerateContentRequest(
    contents: [
      Content(
        parts: [
          TextPart('Summarize the main points from: https://dart.dev/overview'),
        ],
        role: 'user',
      ),
    ],
    // Enable URL Context with an empty map
    tools: [Tool(urlContext: {})],
  ),
);

// Extract text from response
final text = response.candidates?.first.content?.parts
    .whereType<TextPart>()
    .map((p) => p.text)
    .join() ?? '';
print(text);

Or with the Interactions API:

await for (final event in client.interactions.createStream(
  model: 'gemini-2.5-flash',
  input: 'Summarize https://pub.dev/packages/googleai_dart',
  tools: [UrlContextTool()],
)) {
  if (event case ContentDeltaEvent(:final delta)) {
    if (delta is TextDelta) {
      print(delta.text);
    } else if (delta is UrlContextCallDelta) {
      print('Fetching: ${delta.urls?.join(", ")}');
    }
  }
}

See url_context_example.dart for a complete example.

Google Maps

Add geospatial context for location-based queries:

final response = await client.models.generateContent(
  model: 'gemini-2.5-flash',
  request: GenerateContentRequest(
    contents: [
      Content(
        parts: [TextPart('Find Italian restaurants nearby')],
        role: 'user',
      ),
    ],
    // Enable Google Maps with widget support
    tools: [Tool(googleMaps: GoogleMaps(enableWidget: true))],
    // Provide user location context (as Map)
    toolConfig: {
      'retrievalConfig': {
        'latLng': {
          'latitude': 40.758896,
          'longitude': -73.985130,
        },
      },
    },
  ),
);

// Access place information
final metadata = response.candidates?.first.groundingMetadata;
for (final chunk in metadata?.groundingChunks ?? []) {
  if (chunk.maps != null) {
    print('Place: ${chunk.maps!.title}');
    print('Place ID: ${chunk.maps!.placeId}');
  }
}

// Widget token for rendering Google Maps widget
if (metadata?.googleMapsWidgetContextToken != null) {
  print('Widget token: ${metadata!.googleMapsWidgetContextToken}');
}

See google_maps_example.dart for a complete example.

File Search (Semantic Retrieval)

Search your own documents using FileSearchStores:

// Create a FileSearchStore
final store = await client.fileSearchStores.create(
  displayName: 'My Knowledge Base',
);

// Upload a document with custom chunking and metadata
final uploadResponse = await client.fileSearchStores.upload(
  parent: store.name!,
  filePath: '/path/to/document.pdf',
  mimeType: 'application/pdf',
  request: UploadToFileSearchStoreRequest(
    displayName: 'Technical Documentation',
    chunkingConfig: ChunkingConfig(
      whiteSpaceConfig: WhiteSpaceConfig(
        maxTokensPerChunk: 200,
        maxOverlapTokens: 20,
      ),
    ),
    customMetadata: [
      FileSearchCustomMetadata(key: 'author', stringValue: 'Jane Doe'),
      FileSearchCustomMetadata(key: 'year', numericValue: 2024),
    ],
  ),
);

// Use FileSearch in generation with optional metadata filter
final response = await client.models.generateContent(
  model: 'gemini-2.5-flash',
  request: GenerateContentRequest(
    contents: [
      Content(
        parts: [TextPart('What does the documentation say about X?')],
        role: 'user',
      ),
    ],
    tools: [
      Tool(
        fileSearch: FileSearch(
          fileSearchStoreNames: [store.name!],
          topK: 5,
          metadataFilter: 'author = "Jane Doe"',
        ),
      ),
    ],
  ),
);

// Access grounding metadata (citations)
final metadata = response.candidates?.first.groundingMetadata;
for (final chunk in metadata?.groundingChunks ?? []) {
  if (chunk.retrievedContext != null) {
    print('Source: ${chunk.retrievedContext!.title}');
  }
}

// Cleanup
await client.fileSearchStores.delete(name: store.name!);

See file_search_example.dart for a complete example.

Embeddings #

Embeddings Example
import 'package:googleai_dart/googleai_dart.dart';

// Assumes you have a configured client instance
final response = await client.models.embedContent(
  model: 'gemini-embedding-001',
  request: EmbedContentRequest(
    content: Content(
      parts: [TextPart('Hello world')],
    ),
    taskType: TaskType.retrievalDocument,
  ),
);

print(response.embedding.values); // List<double>
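
Embeddings are typically compared with cosine similarity. The helper below is plain Dart over the List<double> vectors returned above:

import 'dart:math' as math;

/// Cosine similarity between two embedding vectors of equal length
/// (1.0 = same direction, 0.0 = orthogonal).
double cosineSimilarity(List<double> a, List<double> b) {
  var dot = 0.0, normA = 0.0, normB = 0.0;
  for (var i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (math.sqrt(normA) * math.sqrt(normB));
}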

File Management #

Complete File Management Example

Upload files for use in multimodal prompts:

import 'dart:io' as io;
import 'package:googleai_dart/googleai_dart.dart';

// Assumes you have a configured client instance
// Upload a file
var file = await client.files.upload(
  filePath: '/path/to/image.jpg',
  mimeType: 'image/jpeg',
  displayName: 'My Image',
);

print('File uploaded: ${file.name}');
print('State: ${file.state}');
print('URI: ${file.uri}');

// Wait for file to be processed (if needed)
while (file.state == FileState.processing) {
  await Future.delayed(Duration(seconds: 2));
  file = await client.files.get(name: file.name);
}

// Use the file in a prompt
final response = await client.models.generateContent(
  model: 'gemini-2.0-flash-exp',
  request: GenerateContentRequest(
    contents: [
      Content(
        parts: [
          TextPart('Describe this image'),
          FileData(
            fileUri: file.uri,
            mimeType: file.mimeType,
          ),
        ],
        role: 'user',
      ),
    ],
  ),
);

// List all files
final listResponse = await client.files.list(pageSize: 10);
for (final f in listResponse.files ?? []) {
  print('${f.displayName}: ${f.state}');
}

// Download file content (if needed)
final bytes = await client.files.download(name: file.name);
// Save to disk or process bytes
await io.File('downloaded_file.jpg').writeAsBytes(bytes);

// Delete the file when done
await client.files.delete(name: file.name);
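
To walk beyond the first page of results, list endpoints follow the usual Google API page-token pattern. The loop below is a sketch: the pageToken and nextPageToken names are assumed from that convention, and the package also ships a Paginator utility (see pagination_example.dart).

// Hedged sketch: pageToken/nextPageToken follow the standard Google API
// convention; see pagination_example.dart for the package's supported API.
String? pageToken;
do {
  final page = await client.files.list(pageSize: 50, pageToken: pageToken);
  for (final f in page.files ?? []) {
    print('${f.displayName}: ${f.state}');
  }
  pageToken = page.nextPageToken;
} while (pageToken != null);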

Note: Files are automatically deleted after 48 hours.

Context Caching #

Complete Context Caching Example

Context caching allows you to save frequently used content and reuse it across requests, reducing latency and costs:

import 'package:googleai_dart/googleai_dart.dart';

// Assumes you have a configured client instance
// Create cached content with system instructions
final cachedContent = await client.cachedContents.create(
  cachedContent: const CachedContent(
    model: 'models/gemini-1.5-flash-8b',
    displayName: 'Math Expert Cache',
    systemInstruction: Content(
      parts: [TextPart('You are an expert mathematician...')],
    ),
    ttl: '3600s', // Cache for 1 hour
  ),
);

// Use cached content in requests (saves tokens!)
final response = await client.models.generateContent(
  model: 'gemini-1.5-flash-8b',
  request: GenerateContentRequest(
    cachedContent: cachedContent.name,
    contents: [
      Content(parts: [TextPart('Explain the Pythagorean theorem')], role: 'user'),
    ],
  ),
);

// Update cache TTL
await client.cachedContents.update(
  name: cachedContent.name!,
  cachedContent: const CachedContent(
    model: 'models/gemini-1.5-flash-8b',
    ttl: '7200s', // Extend to 2 hours
  ),
  updateMask: 'ttl',
);

// Clean up when done
await client.cachedContents.delete(name: cachedContent.name!);

Benefits:

  • ✅ Reduced latency for requests with large contexts
  • ✅ Lower costs by reusing cached content
  • ✅ Consistent system instructions across requests

Grounded Question Answering #

Complete generateAnswer Examples

The generateAnswer API provides answers grounded in specific sources, ideal for Retrieval Augmented Generation (RAG):

import 'package:googleai_dart/googleai_dart.dart';

// Assumes you have a configured client instance
// Using inline passages (for small knowledge bases)
final response = await client.models.generateAnswer(
  model: 'aqa',
  request: const GenerateAnswerRequest(
    contents: [
      Content(
        parts: [TextPart('What is the capital of France?')],
        role: 'user',
      ),
    ],
    answerStyle: AnswerStyle.abstractive, // Or: extractive, verbose
    inlinePassages: GroundingPassages(
      passages: [
        GroundingPassage(
          id: 'passage-1',
          content: Content(
            parts: [TextPart('Paris is the capital of France.')],
          ),
        ),
      ],
    ),
    temperature: 0.2, // Low temperature for factual answers
  ),
);

// Check answerability
if (response.answerableProbability != null &&
    response.answerableProbability! < 0.5) {
  print('⚠️ Answer may not be grounded in sources');
}

// Using semantic retriever (for large corpora)
final ragResponse = await client.models.generateAnswer(
  model: 'aqa',
  request: const GenerateAnswerRequest(
    contents: [
      Content(
        parts: [TextPart('What are the key features of Dart?')],
        role: 'user',
      ),
    ],
    answerStyle: AnswerStyle.verbose,
    semanticRetriever: SemanticRetrieverConfig(
      source: 'corpora/my-corpus',
      query: Content(
        parts: [TextPart('Dart programming language features')],
      ),
      maxChunksCount: 5,
      minimumRelevanceScore: 0.5,
    ),
  ),
);

Features:

  • ✅ Multiple answer styles (abstractive, extractive, verbose)
  • ✅ Inline passages or semantic retriever grounding
  • ✅ Answerability probability for quality control
  • ✅ Safety settings support
  • ✅ Input feedback for blocked content

Batch Operations #

Batch Operations Example
import 'package:googleai_dart/googleai_dart.dart';

// Assumes you have a configured client instance
// Create a batch for processing multiple requests
final batch = await client.models.batchGenerateContent(
  model: 'gemini-2.0-flash-exp',
  batch: const GenerateContentBatch(
    displayName: 'My Batch Job',
    model: 'models/gemini-2.0-flash-exp',
    inputConfig: InputConfig(
      requests: InlinedRequests(
        requests: [
          InlinedRequest(
            request: GenerateContentRequest(
              contents: [
                Content(parts: [TextPart('What is 2+2?')], role: 'user'),
              ],
            ),
          ),
          InlinedRequest(
            request: GenerateContentRequest(
              contents: [
                Content(parts: [TextPart('What is 3+3?')], role: 'user'),
              ],
            ),
          ),
        ],
      ),
    ),
  ),
);

// Monitor batch status
final status = await client.batches.getGenerateContentBatch(name: batch.name!);
print('Batch state: ${status.state}');
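
Batch jobs are long-running, so in practice you poll until a terminal state is reached. A minimal sketch; the BatchState member names used here are assumptions, so check the enum the package actually exposes.

// Hedged sketch: BatchState.pending/running are assumed member names;
// inspect the package's BatchState enum for the real values.
var current = status;
while (current.state == BatchState.pending ||
    current.state == BatchState.running) {
  await Future.delayed(const Duration(seconds: 30));
  current = await client.batches.getGenerateContentBatch(name: batch.name!);
}
print('Finished with state: ${current.state}');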

Corpus & RAG #

Corpus Example

âš ī¸ Important: Document, chunk, and RAG features are only available in Vertex AI. Google AI only supports basic corpus management.

import 'package:googleai_dart/googleai_dart.dart';

// Assumes you have a configured Google AI client
// Google AI supports basic corpus CRUD operations
final corpus = await client.corpora.create(
  corpus: const Corpus(displayName: 'My Knowledge Base'),
);

// List corpora
final corpora = await client.corpora.list(pageSize: 10);

// Get corpus
final retrieved = await client.corpora.get(name: corpus.name!);

// Update corpus
final updated = await client.corpora.update(
  name: corpus.name!,
  corpus: const Corpus(displayName: 'Updated Name'),
  updateMask: 'displayName',
);

// Delete corpus
await client.corpora.delete(name: corpus.name!);

For full RAG capabilities (documents, chunks, semantic search):

  • Use Vertex AI with RAG Stores and Vector Search
  • The Semantic Retriever API has been succeeded by Vertex AI Vector Search

Permissions #

Permission Management Examples
import 'package:googleai_dart/googleai_dart.dart';

// Assumes you have a configured client instance
// Grant permissions to a resource
final permission = await client.tunedModels.permissions(parent: 'tunedModels/my-model').create(
  permission: const Permission(
    granteeType: GranteeType.user,
    emailAddress: 'user@example.com',
    role: PermissionRole.reader,
  ),
);

// List permissions
final permissions = await client.tunedModels.permissions(parent: 'tunedModels/my-model').list();

// Transfer ownership
await client.tunedModels.permissions(parent: 'tunedModels/my-model').transferOwnership(
  emailAddress: 'newowner@example.com',
);

Tuned Models #

List and Inspect Tuned Models
import 'package:googleai_dart/googleai_dart.dart';

// Assumes you have a configured client instance
// List all tuned models
final tunedModels = await client.tunedModels.list(
  pageSize: 10,
  filter: 'owner:me', // Filter by ownership
);

for (final model in tunedModels.tunedModels) {
  print('Model: ${model.displayName} (${model.name})');
  print('State: ${model.state}');
  print('Base model: ${model.baseModel}');
}

// Get specific tuned model details
final myModel = await client.tunedModels.get(
  name: 'tunedModels/my-model-abc123',
);

print('Model: ${myModel.displayName}');
print('State: ${myModel.state}');
print('Training examples: ${myModel.tuningTask?.trainingData?.examples?.exampleCount}');

// List operations for a tuned model (monitor training progress)
final operations = await client.tunedModels.operations(parent: 'tunedModels/my-model-abc123').list();

for (final operation in operations.operations) {
  print('Operation: ${operation.name}');
  print('Done: ${operation.done}');
  if (operation.metadata != null) {
    print('Progress: ${operation.metadata}');
  }
}

Using Tuned Models for Generation #

Generate Content with Tuned Models

Once you have a tuned model, you can use it for content generation just like base models:

import 'package:googleai_dart/googleai_dart.dart';

// Assumes you have a configured client instance
// Generate content with a tuned model
final response = await client.tunedModels.generateContent(
  tunedModel: 'my-model-abc123', // Your tuned model ID
  request: const GenerateContentRequest(
    contents: [
      Content(
        parts: [TextPart('Explain quantum computing')],
        role: 'user',
      ),
    ],
  ),
);

print(response.candidates?.first.content?.parts.first);

// Stream responses with a tuned model
await for (final chunk in client.tunedModels.streamGenerateContent(
  tunedModel: 'my-model-abc123',
  request: request,
)) {
  final text = chunk.candidates?.first.content?.parts.first;
  if (text is TextPart) print(text.text);
}

// Batch generation with a tuned model
final batch = await client.tunedModels.batchGenerateContent(
  tunedModel: 'my-model-abc123',
  batch: const GenerateContentBatch(
    model: 'models/placeholder',
    displayName: 'My Batch Job',
    inputConfig: InputConfig(
      requests: InlinedRequests(
        requests: [
          InlinedRequest(
            request: GenerateContentRequest(
              contents: [
                Content(parts: [TextPart('Question 1')], role: 'user'),
              ],
            ),
          ),
        ],
      ),
    ),
  ),
);

Benefits of tuned models:

  • ✅ Customized behavior for your specific domain
  • ✅ Improved accuracy for specialized tasks
  • ✅ Consistent output style and format
  • ✅ Reduced need for extensive prompting

Prediction (Video Generation) #

Video Generation with Veo Example
import 'package:googleai_dart/googleai_dart.dart';

// Assumes you have a configured client instance
// Synchronous prediction
final response = await client.models.predict(
  model: 'veo-3.0-generate-001',
  instances: [
    {'prompt': 'A cat playing piano in a jazz club'},
  ],
);

print('Predictions: ${response.predictions}');

// Long-running prediction for video generation
final operation = await client.models.predictLongRunning(
  model: 'veo-3.0-generate-001',
  instances: [
    {'prompt': 'A golden retriever running on a beach at sunset'},
  ],
  parameters: {
    'aspectRatio': '16:9',
  },
);

print('Operation: ${operation.name}');
print('Done: ${operation.done}');

// Check for generated videos
if (operation.done == true && operation.response != null) {
  final videoResponse = operation.response!.generateVideoResponse;
  if (videoResponse?.generatedSamples != null) {
    for (final media in videoResponse!.generatedSamples!) {
      if (media.video?.uri != null) {
        print('Video URI: ${media.video!.uri}');
      }
    }
  }

  // Check for RAI filtering
  if (videoResponse?.raiMediaFilteredCount != null) {
    print('Filtered: ${videoResponse!.raiMediaFilteredCount} videos');
    print('Reasons: ${videoResponse.raiMediaFilteredReasons}');
  }
}
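
The operation is rarely done on the first response, so poll it until done is true. A sketch, assuming the operation can be re-fetched by name through a getOperation accessor (getOperation is available for all long-running operations, but its exact location in this client is an assumption here; consult the API reference).

// Hedged sketch: `client.models.getOperation` is an assumed accessor;
// see the API reference for where getOperation lives in this client.
var op = operation;
while (op.done != true) {
  await Future.delayed(const Duration(seconds: 10));
  op = await client.models.getOperation(name: op.name!);
}
print('Video generation finished');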

Live API (Real-time Streaming) #

The Live API provides bidirectional WebSocket streaming for real-time audio/text conversations.

Live API Example
import 'package:googleai_dart/googleai_dart.dart';

// Create the main client
final client = GoogleAIClient(
  config: GoogleAIConfig(
    authProvider: ApiKeyProvider('YOUR_API_KEY'),
  ),
);

// Create a Live client for WebSocket streaming
final liveClient = client.createLiveClient();

try {
  // Connect to the Live API with configuration
  final session = await liveClient.connect(
    model: 'gemini-2.0-flash-live-001',
    liveConfig: LiveConfig(
      generationConfig: LiveGenerationConfig.textAndAudio(
        speechConfig: SpeechConfig.withVoice(LiveVoices.puck),
        temperature: 0.7,
      ),
      // Enable transcription for both input and output
      inputAudioTranscription: AudioTranscriptionConfig.enabled(),
      outputAudioTranscription: AudioTranscriptionConfig.enabled(),
      // Configure voice activity detection
      realtimeInputConfig: RealtimeInputConfig.withVAD(
        silenceDurationMs: 500,
        activityHandling: ActivityHandling.startOfActivityInterrupts,
      ),
    ),
  );

  print('Connected! Session ID: ${session.sessionId}');

  // Send a text message
  session.sendText('Hello! Tell me a short joke.');

  // Listen for responses
  // The label lets us break out of the loop from inside the switch below
  messages:
  await for (final message in session.messages) {
    switch (message) {
      case BidiGenerateContentSetupComplete(:final sessionId):
        print('Setup complete, session: $sessionId');

      case BidiGenerateContentServerContent(
          :final modelTurn,
          :final turnComplete,
          :final inputTranscription,
          :final outputTranscription,
        ):
        // Handle model response
        if (modelTurn != null) {
          for (final part in modelTurn.parts) {
            if (part is TextPart) {
              print('Model: ${part.text}');
            } else if (part is InlineDataPart) {
              // Audio data (24kHz PCM)
              print('Audio: ${part.inlineData.data.length} bytes');
            }
          }
        }

        // Show transcriptions
        if (inputTranscription?.text != null) {
          print('You said: ${inputTranscription!.text}');
        }
        if (outputTranscription?.text != null) {
          print('Model said: ${outputTranscription!.text}');
        }

        if (turnComplete ?? false) {
          print('--- Turn complete ---');
          break messages; // a plain `break` would only exit the switch
        }

      case BidiGenerateContentToolCall(:final functionCalls):
        // Handle tool calls and send responses
        final responses = functionCalls.map((call) => FunctionResponse(
          name: call.name,
          response: {'result': 'executed'},
        )).toList();
        session.sendToolResponse(responses);

      case GoAway(:final timeLeft):
        print('Server disconnect in: $timeLeft');
        // Save session.resumptionToken for later resumption

      case SessionResumptionUpdate(:final resumable):
        print('Session resumable: $resumable');
    }
  }

  await session.close();
} on LiveConnectionException catch (e) {
  print('Connection failed: ${e.message}');
} finally {
  await liveClient.close();
  client.close();
}

Sending Audio

// Audio must be 16kHz, 16-bit, mono PCM
void sendAudio(LiveSession session, List<int> pcmBytes) {
  session.sendAudio(pcmBytes);
}

Manual Voice Activity Detection

// For manual VAD mode (when automaticActivityDetection.disabled = true)
void manualVAD(LiveSession session) {
  // Signal when user starts speaking
  session.signalActivityStart();

  // ... user speaks ...

  // Signal when user stops speaking
  session.signalActivityEnd();
}

Session Resumption

// Resume a previous session
final session = await liveClient.resume(
  model: 'gemini-2.0-flash-live-001',
  resumptionToken: savedToken, // From previous session
  liveConfig: LiveConfig(
    generationConfig: LiveGenerationConfig.textAndAudio(),
  ),
);

print('Session resumed! ID: ${session.sessionId}');

Ephemeral Tokens (Secure Client-Side Auth)

For client-side applications (mobile apps, web apps), use ephemeral tokens instead of exposing your API key:

Server-side (create token):

// On your backend server
final token = await client.authTokens.create(
  authToken: AuthToken(
    expireTime: DateTime.now().add(Duration(minutes: 30)),
    uses: 1, // Single use
  ),
);
// Send token.name to client securely

Client-side (use token):

// On mobile/web client - no API key needed!
final liveClient = LiveClient(
  config: GoogleAIConfig.googleAI(
    authProvider: NoAuthProvider(), // No API key
  ),
);

final session = await liveClient.connect(
  model: 'gemini-2.0-flash-live-001',
  accessToken: tokenFromServer, // Use ephemeral token
);

Note: Ephemeral tokens are only available with Google AI (not Vertex AI).

Platform Notes

Web (Browser) Limitations:

  • Browser WebSocket API does NOT support custom HTTP headers during handshake
  • Google AI: Works on web via query parameter authentication (?key=... or ?access_token=...)
  • Vertex AI OAuth: Requires Bearer token in headers, which doesn't work in browsers
  • Recommendation: For Vertex AI on web, use a backend proxy or ephemeral tokens with Google AI

Audio Streaming Notes:

  • Audio data is base64-encoded before sending (~33% size overhead)
  • The underlying WebSocket handles buffering automatically
  • Audio format: 16kHz, 16-bit PCM mono input (32 KB/s raw, ~43 KB/s encoded)
  • Output audio: 24kHz, 16-bit PCM mono
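
If your audio source produces floating-point samples, convert them to 16-bit little-endian PCM before calling sendAudio. A self-contained sketch in plain Dart (it assumes the input is already resampled to 16kHz mono):

import 'dart:typed_data';

/// Converts normalized Float32 samples (-1.0..1.0) into the 16-bit
/// little-endian PCM bytes the Live API expects as input.
Uint8List floatToPcm16(List<double> samples) {
  final bytes = ByteData(samples.length * 2);
  for (var i = 0; i < samples.length; i++) {
    final clamped = samples[i].clamp(-1.0, 1.0);
    bytes.setInt16(i * 2, (clamped * 32767).round(), Endian.little);
  }
  return bytes.buffer.asUint8List();
}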

Migrating from Google AI to Vertex AI #

Switching from Google AI to Vertex AI requires minimal code changes:

Step 1: Change Configuration #

Configuration Migration Examples

Before (Google AI):

import 'package:googleai_dart/googleai_dart.dart';

final client = GoogleAIClient(
  config: GoogleAIConfig.googleAI(
    authProvider: ApiKeyProvider('YOUR_API_KEY'),
  ),
);

After (Vertex AI):

import 'package:googleai_dart/googleai_dart.dart';

final client = GoogleAIClient(
  config: GoogleAIConfig.vertexAI(
    projectId: 'your-project-id',
    location: 'us-central1',
    authProvider: MyOAuthProvider(), // Implement OAuth
  ),
);

Step 2: Replace Google AI-Only Features #

If you use any of these features, you'll need to use Vertex AI alternatives:

Google AI Feature                Vertex AI Alternative
client.files.upload()            Upload to Cloud Storage; use gs:// URIs in requests
client.tunedModels.*             Use the Vertex AI Tuning API directly
client.corpora.*                 Use Vertex AI RAG Stores
client.models.generateAnswer()   Use generateContent() with grounding
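
For example, where you previously attached an uploaded file's URI, on Vertex AI you reference the Cloud Storage object directly. FileData usage matches the File Management example above; the bucket and object names below are placeholders:

// The gs:// bucket/object below are placeholders for your own storage.
final response = await vertexClient.models.generateContent(
  model: 'gemini-2.5-flash',
  request: GenerateContentRequest(
    contents: [
      Content(
        parts: [
          TextPart('Describe this image'),
          FileData(
            fileUri: 'gs://your-bucket/images/photo.jpg',
            mimeType: 'image/jpeg',
          ),
        ],
        role: 'user',
      ),
    ],
  ),
);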

Examples #

See the example/ directory for comprehensive examples:

  1. generate_content.dart - Basic content generation
  2. streaming_example.dart - Real-time SSE streaming
  3. function_calling_example.dart - Tool usage with functions
  4. embeddings_example.dart - Text embeddings generation
  5. error_handling_example.dart - Exception handling patterns
  6. abort_example.dart - Request cancellation
  7. models_example.dart - Model listing and inspection
  8. files_example.dart - File upload and management
  9. pagination_example.dart - Paginated API calls
  10. batch_example.dart - Batch operations
  11. caching_example.dart - Context caching for cost/latency optimization
  12. permissions_example.dart - Permission management
  13. oauth_refresh_example.dart - OAuth token refresh with AuthProvider
  14. prediction_example.dart - Video generation with Veo model
  15. generate_answer_example.dart - Grounded question answering with inline passages or semantic retriever
  16. tuned_model_generation_example.dart - Generate content with custom tuned models
  17. api_versions_example.dart - Using v1 (stable) vs v1beta (beta)
  18. vertex_ai_example.dart - Using Vertex AI with OAuth authentication
  19. complete_api_example.dart - Demonstrating 100% API coverage
  20. interactions_example.dart - Interactions API for server-side state management (experimental)
  21. google_search_example.dart - Google Search grounding for real-time web information
  22. url_context_example.dart - URL Context for fetching and analyzing web content
  23. google_maps_example.dart - Google Maps grounding for geospatial context
  24. file_search_example.dart - File Search with FileSearchStores for semantic retrieval (RAG)
  25. live_example.dart - Live API for real-time WebSocket streaming (audio/text)

API Coverage #

This client implements 78 endpoints covering 100% of all non-deprecated Gemini API operations:

Models Resource (client.models) #

  • Generation: generateContent, streamGenerateContent, countTokens, generateAnswer
  • Dynamic Content: dynamicGenerateContent, dynamicStreamGenerateContent (for live content with dynamic IDs)
  • Embeddings: embedContent, batchEmbedContents (synchronous), asyncBatchEmbedContent (asynchronous)
  • Prediction: predict, predictLongRunning (for video generation with Veo)
  • Model Management: list, get, listOperations, create, patch, delete

Tuned Models Resource (client.tunedModels) #

  • Generation: generateContent, streamGenerateContent, batchGenerateContent, asyncBatchEmbedContent
  • Management: list, get
  • Operations: operations(parent).list
  • Permissions: permissions(parent).create, permissions(parent).list, permissions(parent).get, permissions(parent).update, permissions(parent).delete, permissions(parent).transferOwnership

Files Resource (client.files) #

  • Management: upload, list, get, delete, download
  • Upload Methods:
    • filePath: Upload from file system (IO platforms, streaming)
    • bytes: Upload from memory (all platforms)
    • contentStream: Upload large files via streaming (IO platforms, memory efficient)
  • Generated Files: generatedFiles.list, generatedFiles.get, generatedFiles.getOperation (for video outputs)

Cached Contents Resource (client.cachedContents) #

  • Management: create, get, update, delete, list

Batches Resource (client.batches) #

  • Management: list, getGenerateContentBatch, getEmbedBatch, updateGenerateContentBatch, updateEmbedContentBatch, delete, cancel

Corpora Resource (client.corpora) #

  • Corpus Management: create, list, get, update, delete
  • Document Management: documents(corpus).create, documents(corpus).list, documents(corpus).get, documents(corpus).update, documents(corpus).delete
  • Permissions: permissions(parent).create, permissions(parent).list, permissions(parent).get, permissions(parent).update, permissions(parent).delete

FileSearchStores Resource (client.fileSearchStores) #

  • Store Management: create, list, get, delete
  • Document Operations: importFile, uploadToFileSearchStore
  • Document Management: documents.list, documents.get, documents.delete
  • Operations: getOperation, getUploadOperation

Interactions Resource (client.interactions) - Experimental #

  • Creation: create, createWithAgent, createStream
  • Management: get, cancel, delete
  • Streaming: createStream, resumeStream (SSE with event types)
  • Content Types: 17 types including text, image, audio, function calls, code execution, etc.
  • Events: InteractionStart, ContentDelta, ContentStop, InteractionComplete, Error

Auth Tokens Resource (client.authTokens) #

  • Management: create (creates ephemeral tokens for secure client-side authentication)

Live API (client.createLiveClient()) #

  • Connection: connect, resume (WebSocket streaming, ephemeral token support)
  • Client Messages: setup, clientContent, realtimeInput, toolResponse
  • Server Messages: setupComplete, serverContent, toolCall, toolCallCancellation, goAway, sessionResumptionUpdate, unknownServerMessage
  • Session Methods: sendText, sendAudio, sendContent, sendToolResponse, signalActivityStart, signalActivityEnd
  • Configuration: LiveConfig, LiveGenerationConfig, SpeechConfig, RealtimeInputConfig, SessionResumptionConfig
  • Audio Format: 16kHz/16-bit/mono PCM input, 24kHz/16-bit/mono PCM output

Universal Operations #

  • getOperation: Available for all long-running operations

Development #

# Install dependencies
dart pub get

# Run tests
dart test

# Format code
dart format .

# Analyze
dart analyze

License #

googleai_dart is licensed under the MIT License.
