models/audio_models library
Audio-related models for Text-to-Speech (TTS) and Speech-to-Text (STT) functionality
Classes
- AudioAlignment: Character-level timing alignment for TTS (ElevenLabs specific)
- AudioDataEvent: Audio data chunk event
- AudioErrorEvent: Audio error event
- AudioMetadataEvent: Audio metadata event
- AudioStreamEvent: Audio stream event for streaming TTS
- AudioTimingEvent: Audio timing event for character-level alignment
- AudioTranslationRequest: Audio translation request (OpenAI specific)
- EnhancedWordTiming: Enhanced word timing with speaker information (ElevenLabs specific)
- LanguageInfo: Language information for STT
- STTRequest: Speech-to-Text request configuration
- STTResponse: Speech-to-Text response with metadata
- TranscriptionSegment: Transcription segment information (OpenAI specific)
- TTSRequest: Text-to-Speech request configuration
- TTSResponse: Text-to-Speech response with metadata
- VoiceInfo: Voice information
- WordTiming: Word timing information for STT
Enums
- AudioFormat: Audio format enumeration for better type safety
- AudioProcessingMode: Audio processing mode for different use cases
- AudioQuality: Audio quality settings
- TextNormalization: Text normalization mode for TTS
- TimestampGranularity: Timestamp granularity for audio processing