gemini_live library

Public entry point for the gemini_live package.

Import this library to access the high-level GoogleGenAI client, Live API session helpers, and the package's request and response models.

Classes

ActivityEnd
A signal that marks the end of explicit user activity.
ActivityStart
A signal that marks the start of explicit user activity.
AudioTranscriptionConfig
Audio transcription settings for input or output streams.
AutomaticActivityDetection
Automatic voice activity detection settings for realtime audio.
AvatarConfig
Avatar options for live video-capable sessions.
Blob
Inline binary data encoded for API transport.
CodeExecutionResult
The result of model-executed code.
ComputerUse
Computer-use tool configuration.
Content
A conversational turn made of one or more Part values.
ContextWindowCompressionConfig
Context compression settings for long-running sessions.
CustomizedAvatar
A customized avatar reference image.
DynamicRetrievalConfig
Dynamic retrieval thresholds for grounded search.
ExecutableCode
Executable code emitted by the model.
FileData
URI-based media content referenced by a part.
FunctionCall
A tool invocation requested by the model.
FunctionDeclaration
A function schema exposed to the model as a callable tool.
FunctionResponse
A tool result sent back to the model.
FunctionResponseBlob
Inline binary data returned from a function response.
FunctionResponseFileData
File metadata returned from a function response.
FunctionResponsePart
A single payload part inside a function response.
GenerationConfig
Generation parameters used when starting a Live API session.
GoogleGenAI
The primary class for interacting with the Google Generative AI API.
GoogleSearch
Google Search tool configuration.
GoogleSearchRetrieval
Retrieval settings for the Google Search tool.
Interval
A time interval used by search filters.
LiveCallbacks
Callbacks for Live API events.
LiveClientContent
Client-authored conversation turns sent to the model.
LiveClientMessage
A top-level client message sent over the Live API socket.
LiveClientRealtimeInput
Realtime media or text input sent while a session is active.
LiveClientSetup
The initial setup message sent when opening a Live API session.
LiveClientToolResponse
A batch of tool results returned to the server.
LiveConnectParameters
Parameters for establishing a Live API connection.
LiveSendClientContentParameters
Parameters for sending conversational turns to the session.
LiveSendRealtimeInputParameters
Parameters for sending realtime media or text input.
LiveSendToolResponseParameters
Parameters for sending tool results back to the model.
LiveServerContent
Server-generated content and turn lifecycle updates.
LiveServerGoAway
A shutdown warning indicating when the session will expire.
LiveServerMessage
A top-level server message received over the Live API socket.
LiveServerSessionResumptionUpdate
A session resumption token update from the server.
LiveServerSetupComplete
Acknowledgement payload returned after session setup completes.
LiveServerToolCall
A tool call request emitted by the server.
LiveServerToolCallCancellation
A cancellation notice for previously issued tool calls.
LiveService
Service for connecting to the Gemini Live API over a WebSocket.
LiveSession
An active Live API session.
ModalityTokenCount
Token counts broken down by media modality.
MultiSpeakerVoiceConfig
Speech settings for two-speaker text-to-speech output.
Part
A single multimodal part within a content turn.
PartialArg
One streamed partial argument value for a function call.
PartMediaResolution
Input media tokenization hints attached to a part.
PrebuiltVoiceConfig
A prebuilt voice selection for synthesized audio output.
ProactivityConfig
Proactivity options for realtime audio sessions.
RealtimeInputConfig
Realtime input settings sent during session setup.
ReplicatedVoiceConfig
Voice cloning settings for custom speech output.
SafetySetting
Safety settings to block unsafe content in Gemini responses.
SessionResumptionConfig
Session resumption settings for reconnectable sessions.
SlidingWindow
Sliding window targets used during context compression.
SpeakerVoiceConfig
Voice assignment for one speaker in a multi-speaker response.
SpeechConfig
Speech generation settings for audio responses.
StreamTranslationConfig
Stream translation settings for Live sessions.
ThinkingConfig
Thinking controls for models that can emit thought content.
Tool
A tool bundle that can be attached to a model session.
ToolCall
A server-side tool call embedded in a model part.
ToolResponse
The client-side result of a server-side tool call.
Transcription
A transcription update for input or output audio.
UsageMetadata
Usage statistics attached to a server response.
VideoMetadata
Additional video metadata attached to inline or URI-based media.
VoiceActivity
A higher-level voice activity event emitted by the server.
VoiceActivityDetectionSignal
A low-level VAD signal emitted by the server.
VoiceConfig
Voice settings applied to spoken responses.

Enums

ActivityHandling
How detected user activity affects model generation.
Behavior
Execution modes for server-side behaviors such as function calling.
EndSensitivity
Sensitivity levels for detecting the end of speech.
Environment
Environments supported by the computer-use tool.
FunctionResponseScheduling
Scheduling strategies for tool responses.
HarmBlockMethod
Safety blocking methods.
HarmBlockThreshold
Safety thresholds used to block unsafe content.
HarmCategory
Harm categories reported by Gemini safety metadata.
MediaModality
Media kinds used in token accounting details.
MediaResolution
Media resolution presets for multimodal responses.
Modality
Modalities that a request or response can contain.
PartMediaResolutionLevel
Media tokenization quality used for a specific part.
StartSensitivity
Sensitivity levels for detecting the start of speech.
ThinkingLevel
Thinking effort levels for models that support thought generation.
ToolType
Tool categories reported in server-side tool call parts.
TrafficType
Traffic classes used for usage accounting.
TurnCompleteReason
Reasons a model turn completed without a final response.
TurnCoverage
How much of the user turn is forwarded to the model.
VadSignalType
Voice activity detection signals emitted by the server.
VoiceActivityType
Voice activity events detected for an audio stream.

Functions

addWavHeader(Uint8List pcmBytes, {required int sampleRate, int numChannels = 1, int bitsPerSample = 16}) Uint8List
Adds a WAV header to PCM audio data, producing the bytes of a complete WAV file.
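
To illustrate what adding a WAV header involves, here is a minimal sketch of a function with the same signature. It is not the package's implementation: the name `addWavHeaderSketch` is hypothetical, and it assumes the canonical 44-byte RIFF/WAVE header for uncompressed integer PCM with little-endian fields. It also assumes the PCM payload has an even byte length, as 16-bit audio always does, so no pad byte is written.

```dart
import 'dart:typed_data';

/// Hypothetical sketch: prepends a canonical 44-byte PCM WAV header.
/// This is an illustration, not the gemini_live implementation.
Uint8List addWavHeaderSketch(
  Uint8List pcmBytes, {
  required int sampleRate,
  int numChannels = 1,
  int bitsPerSample = 16,
}) {
  const headerSize = 44;
  final blockAlign = numChannels * bitsPerSample ~/ 8; // bytes per frame
  final byteRate = sampleRate * blockAlign; // bytes per second
  final dataSize = pcmBytes.length;

  final out = Uint8List(headerSize + dataSize);
  final view = ByteData.sublistView(out);

  // Writes a four-character ASCII chunk tag at the given offset.
  void writeTag(int offset, String tag) {
    for (var i = 0; i < tag.length; i++) {
      out[offset + i] = tag.codeUnitAt(i);
    }
  }

  writeTag(0, 'RIFF');
  view.setUint32(4, 36 + dataSize, Endian.little); // file size minus 8
  writeTag(8, 'WAVE');
  writeTag(12, 'fmt ');
  view.setUint32(16, 16, Endian.little); // fmt chunk size for PCM
  view.setUint16(20, 1, Endian.little); // audio format 1 = integer PCM
  view.setUint16(22, numChannels, Endian.little);
  view.setUint32(24, sampleRate, Endian.little);
  view.setUint32(28, byteRate, Endian.little);
  view.setUint16(32, blockAlign, Endian.little);
  view.setUint16(34, bitsPerSample, Endian.little);
  writeTag(36, 'data');
  view.setUint32(40, dataSize, Endian.little);

  // Copy the raw samples directly after the header.
  out.setRange(headerSize, headerSize + dataSize, pcmBytes);
  return out;
}

void main() {
  // One second of 16-bit mono silence at 24 kHz is 48000 bytes.
  final pcm = Uint8List(48000);
  final wav = addWavHeaderSketch(pcm, sampleRate: 24000);
  print(wav.length); // 48044
}
```

The `sampleRate` argument must match the rate of the PCM stream being wrapped. A mismatch still yields a valid file, but it plays back at the wrong speed.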

Exceptions / Errors

TimeoutException
Exception thrown when an operation times out.