gemini_live library

Classes

ActivityEnd: A signal that marks the end of explicit user activity.
ActivityStart: A signal that marks the start of explicit user activity.
AudioTranscriptionConfig: Audio transcription settings for input or output streams.
AutomaticActivityDetection: Automatic voice activity detection settings for realtime audio.
AvatarConfig: Avatar options for live video-capable sessions.
Blob: Inline binary data encoded for API transport.
CodeExecutionResult: The result of model-executed code.
ComputerUse: Computer-use tool configuration.
Content: A conversational turn made of one or more Part values.
ContextWindowCompressionConfig: Context compression settings for long-running sessions.
CustomizedAvatar: A customized avatar reference image.
DynamicRetrievalConfig: Dynamic retrieval thresholds for grounded search.
ExecutableCode: Executable code emitted by the model.
FileData: URI-based media content referenced by a part.
FunctionCall: A tool invocation requested by the model.
FunctionDeclaration: A function schema exposed to the model as a callable tool.
FunctionResponse: A tool result sent back to the model.
FunctionResponseBlob: Inline binary data returned from a function response.
FunctionResponseFileData: File metadata returned from a function response.
FunctionResponsePart: A single payload part inside a function response.
GenerationConfig: Generation parameters used when starting a Live API session.
GoogleGenAI: The primary class for interacting with the Google Generative AI API.
GoogleSearch: Google Search tool configuration.
GoogleSearchRetrieval: Retrieval settings for the Google Search tool.
Interval: A time interval used by search filters.
LiveCallbacks: Callbacks for Live API events.
LiveClientContent: Client-authored conversation turns sent to the model.
LiveClientMessage: A top-level client message sent over the Live API socket.
LiveClientRealtimeInput: Realtime media or text input sent while a session is active.
LiveClientSetup: The initial setup message sent when opening a Live API session.
LiveClientToolResponse: A batch of tool results returned to the server.
LiveConnectParameters: Parameters for establishing a Live API connection.
LiveSendClientContentParameters: Parameters for sending conversational turns to the session.
LiveSendRealtimeInputParameters: Parameters for sending realtime media or text input.
LiveSendToolResponseParameters: Parameters for sending tool results back to the model.
LiveServerContent: Server-generated content and turn lifecycle updates.
LiveServerGoAway: A shutdown warning indicating when the session will expire.
LiveServerMessage: A top-level server message received over the Live API socket.
LiveServerSessionResumptionUpdate: A session resumption token update from the server.
LiveServerSetupComplete: Acknowledgement payload returned after session setup completes.
LiveServerToolCall: A tool call request emitted by the server.
LiveServerToolCallCancellation: A cancellation notice for previously issued tool calls.
LiveService: A service for connecting to the Gemini Live API via WebSocket.
LiveSession: An active Live API session.
ModalityTokenCount: Token counts broken down by media modality.
MultiSpeakerVoiceConfig: Speech settings for two-speaker text-to-speech output.
Part: A single multimodal part within a content turn.
PartialArg: One streamed partial argument value for a function call.
PartMediaResolution: Input media tokenization hints attached to a part.
PrebuiltVoiceConfig: A prebuilt voice selection for synthesized audio output.
ProactivityConfig: Proactivity options for realtime audio sessions.
RealtimeInputConfig: Realtime input settings sent during session setup.
ReplicatedVoiceConfig: Voice cloning settings for custom speech output.
SafetySetting: Safety settings to block unsafe content in Gemini responses.
SessionResumptionConfig: Session resumption settings for reconnectable sessions.
SlidingWindow: Sliding window targets used during context compression.
SpeakerVoiceConfig: Voice assignment for one speaker in a multi-speaker response.
SpeechConfig: Speech generation settings for audio responses.
StreamTranslationConfig: Stream translation settings for Live sessions.
ThinkingConfig: Thinking controls for models that can emit thought content.
Tool: A tool bundle that can be attached to a model session.
ToolCall: A server-side tool call embedded in a model part.
ToolResponse: The client-side result of a server-side tool call.
Transcription: A transcription update for input or output audio.
UsageMetadata: Usage statistics attached to a server response.
VideoMetadata: Additional video metadata attached to inline or URI-based media.
VoiceActivity: A higher-level voice activity event emitted by the server.
VoiceActivityDetectionSignal: A low-level VAD signal emitted by the server.
VoiceConfig: Voice settings applied to spoken responses.

Enums

ActivityHandling: How detected user activity affects model generation.
Behavior: Execution modes for server-side behaviors such as function calling.
EndSensitivity: Sensitivity levels for detecting the end of speech.
Environment: Environments supported by the computer-use tool.
FunctionResponseScheduling: Scheduling strategies for tool responses.
HarmBlockMethod: Safety blocking methods.
HarmBlockThreshold: Safety thresholds used to block unsafe content.
HarmCategory: Harm categories reported by Gemini safety metadata.
MediaModality: Media kinds used in token accounting details.
MediaResolution: Media resolution presets for multimodal responses.
Modality: Modalities that a request or response can contain.
PartMediaResolutionLevel: Media tokenization quality used for a specific part.
StartSensitivity: Sensitivity levels for detecting the start of speech.
ThinkingLevel: Thinking effort levels for models that support thought generation.
ToolType: Tool categories reported in server-side tool call parts.
TrafficType: Traffic classes used for usage accounting.
TurnCompleteReason: Reasons a model turn completed without a final response.
TurnCoverage: How much of the user turn is forwarded to the model.
VadSignalType: Voice activity detection signals emitted by the server.
VoiceActivityType: Voice activity events detected for an audio stream.

Functions

addWavHeader(Uint8List pcmBytes, {required int sampleRate, int numChannels = 1, int bitsPerSample = 16}) → Uint8List: Adds a WAV header to PCM audio data, producing the bytes of a complete WAV file.
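
To make the idea of adding a WAV header concrete, here is a minimal sketch of how a canonical 44-byte RIFF/WAVE header can be prepended to raw PCM bytes. It is an illustration only, not the package's source: the name `sketchAddWavHeader` is hypothetical, its parameters simply mirror the signature listed above, and it assumes uncompressed little-endian PCM (format tag 1).

```dart
import 'dart:typed_data';

/// Hypothetical illustration; NOT the gemini_live implementation.
/// Prepends a canonical 44-byte RIFF/WAVE header to raw PCM bytes.
Uint8List sketchAddWavHeader(
  Uint8List pcmBytes, {
  required int sampleRate,
  int numChannels = 1,
  int bitsPerSample = 16,
}) {
  final blockAlign = numChannels * bitsPerSample ~/ 8; // bytes per sample frame
  final byteRate = sampleRate * blockAlign; // bytes per second of audio
  final dataSize = pcmBytes.length;

  final header = ByteData(44);

  // Writes a four-character ASCII chunk tag at the given offset.
  void writeTag(int offset, String tag) {
    for (var i = 0; i < tag.length; i++) {
      header.setUint8(offset + i, tag.codeUnitAt(i));
    }
  }

  writeTag(0, 'RIFF');
  header.setUint32(4, 36 + dataSize, Endian.little); // file size minus 8
  writeTag(8, 'WAVE');
  writeTag(12, 'fmt ');
  header.setUint32(16, 16, Endian.little); // fmt chunk size for PCM
  header.setUint16(20, 1, Endian.little); // audio format 1 = PCM
  header.setUint16(22, numChannels, Endian.little);
  header.setUint32(24, sampleRate, Endian.little);
  header.setUint32(28, byteRate, Endian.little);
  header.setUint16(32, blockAlign, Endian.little);
  header.setUint16(34, bitsPerSample, Endian.little);
  writeTag(36, 'data');
  header.setUint32(40, dataSize, Endian.little);

  // Concatenate header and PCM payload into one buffer.
  return Uint8List(44 + dataSize)
    ..setRange(0, 44, header.buffer.asUint8List())
    ..setRange(44, 44 + dataSize, pcmBytes);
}

void main() {
  // 4800 bytes = 0.1 s of silence at 24 kHz, mono, 16-bit.
  final pcm = Uint8List(4800);
  final wav = sketchAddWavHeader(pcm, sampleRate: 24000);
  print(wav.length); // 4844
}
```

The 24 kHz sample rate in `main` is an arbitrary example value; pass whatever rate the PCM stream was actually recorded or generated at, since a mismatched rate plays back at the wrong speed.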

Exceptions / Errors

TimeoutException: An exception for timeout errors.
