models/audio_models library

Audio-related models for Text-to-Speech (TTS) and Speech-to-Text (STT) functionality

Classes

AudioAlignment
Character-level timing alignment for TTS (ElevenLabs-specific)
AudioDataEvent
Audio data chunk event
AudioErrorEvent
Audio error event
AudioMetadataEvent
Audio metadata event
AudioStreamEvent
Audio stream event for streaming TTS
AudioTimingEvent
Audio timing event for character-level alignment
AudioTranslationRequest
Audio translation request (OpenAI-specific)
EnhancedWordTiming
Enhanced word timing with speaker information (ElevenLabs-specific)
LanguageInfo
Language information for STT
STTRequest
Speech-to-Text request configuration
STTResponse
Speech-to-Text response with metadata
TranscriptionSegment
Transcription segment information (OpenAI-specific)
TTSRequest
Text-to-Speech request configuration
TTSResponse
Text-to-Speech response with metadata
VoiceInfo
Voice information
WordTiming
Word timing information for STT

Enums

AudioFormat
Audio format enumeration, used in place of raw strings for type safety
AudioProcessingMode
Audio processing mode for different use cases
AudioQuality
Audio quality settings
TextNormalization
Text normalization mode for TTS
TimestampGranularity
Timestamp granularity for audio processing