gemini_live library
Public entry point for the gemini_live package.
Import this library to access the high-level GoogleGenAI client, Live API session helpers, and the package's request and response models.
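A minimal sketch of the import, assuming the package exposes this library at the conventional path `package:gemini_live/gemini_live.dart`. The path is inferred from the library name and is not stated on this page.

```dart
// Assumed import path, following Dart's package:<name>/<name>.dart convention.
import 'package:gemini_live/gemini_live.dart';
```

To keep the namespace small, a `show` clause can restrict the import to the types actually used, for example `show GoogleGenAI, LiveSession`.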
Classes
- ActivityEnd
- A signal that marks the end of explicit user activity.
- ActivityStart
- A signal that marks the start of explicit user activity.
- AudioTranscriptionConfig
- Audio transcription settings for input or output streams.
- AutomaticActivityDetection
- Automatic voice activity detection settings for realtime audio.
- AvatarConfig
- Avatar options for live video-capable sessions.
- Blob
- Inline binary data encoded for API transport.
- CodeExecutionResult
- The result of model-executed code.
- ComputerUse
- Computer-use tool configuration.
- Content
- A conversational turn made of one or more Part values.
- ContextWindowCompressionConfig
- Context compression settings for long-running sessions.
- CustomizedAvatar
- A customized avatar reference image.
- DynamicRetrievalConfig
- Dynamic retrieval thresholds for grounded search.
- ExecutableCode
- Executable code emitted by the model.
- FileData
- URI-based media content referenced by a part.
- FunctionCall
- A tool invocation requested by the model.
- FunctionDeclaration
- A function schema exposed to the model as a callable tool.
- FunctionResponse
- A tool result sent back to the model.
- FunctionResponseBlob
- Inline binary data returned from a function response.
- FunctionResponseFileData
- File metadata returned from a function response.
- FunctionResponsePart
- A single payload part inside a function response.
- GenerationConfig
- Generation parameters used when starting a Live API session.
- GoogleGenAI
- The primary class for interacting with the Google Generative AI API.
- GoogleSearch
- Google Search tool configuration.
- GoogleSearchRetrieval
- Retrieval settings for the Google Search tool.
- Interval
- A time interval used by search filters.
- LiveCallbacks
- Callbacks for Live API events.
- LiveClientContent
- Client-authored conversation turns sent to the model.
- LiveClientMessage
- A top-level client message sent over the Live API socket.
- LiveClientRealtimeInput
- Realtime media or text input sent while a session is active.
- LiveClientSetup
- The initial setup message sent when opening a Live API session.
- LiveClientToolResponse
- A batch of tool results returned to the server.
- LiveConnectParameters
- Parameters for establishing a Live API connection.
- LiveSendClientContentParameters
- Parameters for sending conversational turns to the session.
- LiveSendRealtimeInputParameters
- Parameters for sending realtime media or text input.
- LiveSendToolResponseParameters
- Parameters for sending tool results back to the model.
- LiveServerContent
- Server-generated content and turn lifecycle updates.
- LiveServerGoAway
- A shutdown warning indicating when the session will expire.
- LiveServerMessage
- A top-level server message received over the Live API socket.
- LiveServerSessionResumptionUpdate
- A session resumption token update from the server.
- LiveServerSetupComplete
- Acknowledgement payload returned after session setup completes.
- LiveServerToolCall
- A tool call request emitted by the server.
- LiveServerToolCallCancellation
- A cancellation notice for previously issued tool calls.
- LiveService
- A service for connecting to the Gemini Live API over WebSocket.
- LiveSession
- An active Live API session.
- ModalityTokenCount
- Token counts broken down by media modality.
- MultiSpeakerVoiceConfig
- Speech settings for two-speaker text-to-speech output.
- Part
- A single multimodal part within a content turn.
- PartialArg
- One streamed partial argument value for a function call.
- PartMediaResolution
- Input media tokenization hints attached to a part.
- PrebuiltVoiceConfig
- A prebuilt voice selection for synthesized audio output.
- ProactivityConfig
- Proactivity options for realtime audio sessions.
- RealtimeInputConfig
- Realtime input settings sent during session setup.
- ReplicatedVoiceConfig
- Voice cloning settings for custom speech output.
- SafetySetting
- Safety settings to block unsafe content in Gemini responses.
- SessionResumptionConfig
- Session resumption settings for reconnectable sessions.
- SlidingWindow
- Sliding window targets used during context compression.
- SpeakerVoiceConfig
- Voice assignment for one speaker in a multi-speaker response.
- SpeechConfig
- Speech generation settings for audio responses.
- StreamTranslationConfig
- Stream translation settings for Live sessions.
- ThinkingConfig
- Thinking controls for models that can emit thought content.
- Tool
- A tool bundle that can be attached to a model session.
- ToolCall
- A server-side tool call embedded in a model part.
- ToolResponse
- The client-side result of a server-side tool call.
- Transcription
- A transcription update for input or output audio.
- UsageMetadata
- Usage statistics attached to a server response.
- VideoMetadata
- Additional video metadata attached to inline or URI-based media.
- VoiceActivity
- A higher-level voice activity event emitted by the server.
- VoiceActivityDetectionSignal
- A low-level VAD signal emitted by the server.
- VoiceConfig
- Voice settings applied to spoken responses.
Enums
- ActivityHandling
- How detected user activity affects model generation.
- Behavior
- Execution modes for server-side behaviors such as function calling.
- EndSensitivity
- Sensitivity levels for detecting the end of speech.
- Environment
- Environments supported by the computer-use tool.
- FunctionResponseScheduling
- Scheduling strategies for tool responses.
- HarmBlockMethod
- Safety blocking methods.
- HarmBlockThreshold
- Safety thresholds used to block unsafe content.
- HarmCategory
- Harm categories reported by Gemini safety metadata.
- MediaModality
- Media kinds used in token accounting details.
- MediaResolution
- Media resolution presets for multimodal responses.
- Modality
- Modalities that a request or response can contain.
- PartMediaResolutionLevel
- Media tokenization quality used for a specific part.
- StartSensitivity
- Sensitivity levels for detecting the start of speech.
- ThinkingLevel
- Thinking effort levels for models that support thought generation.
- ToolType
- Tool categories reported in server-side tool call parts.
- TrafficType
- Traffic classes used for usage accounting.
- TurnCompleteReason
- Reasons a model turn completed without a final response.
- TurnCoverage
- How much of the user turn is forwarded to the model.
- VadSignalType
- Voice activity detection signals emitted by the server.
- VoiceActivityType
- Voice activity events detected for an audio stream.
Exceptions / Errors
- TimeoutException
- An exception that signals a timeout error.