TranscriberConfig

sampling_rate
int

The sampling rate of the audio in samples per second (Hz). A higher sampling rate provides better audio quality but may increase processing time and data size.

audio_encoding
AudioEncoding

The encoding format of the audio data. Options include: LINEAR16, MULAW.

chunk_size
int

The size of each chunk of audio data sent to the transcriber, in bytes. A larger chunk size can reduce network overhead but may increase latency.
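To make the latency trade-off concrete, the sketch below converts a chunk size in bytes into the amount of audio it holds. It assumes mono audio, 2 bytes per sample for LINEAR16 and 1 byte per sample for MULAW; the function name `chunk_duration_ms` is illustrative and not part of the library.

```python
# Bytes per sample for the two encodings listed above (mono audio assumed).
BYTES_PER_SAMPLE = {"LINEAR16": 2, "MULAW": 1}


def chunk_duration_ms(chunk_size: int, sampling_rate: int, audio_encoding: str) -> float:
    """Return how many milliseconds of audio fit in one chunk (hypothetical helper)."""
    samples = chunk_size / BYTES_PER_SAMPLE[audio_encoding]
    return 1000.0 * samples / sampling_rate


print(chunk_duration_ms(2048, 16000, "LINEAR16"))  # 64.0
print(chunk_duration_ms(3200, 8000, "MULAW"))      # 400.0
```

A transcriber cannot receive a chunk before it has been fully captured, so the chunk duration is roughly a lower bound on the latency added by chunking.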

endpointing_config
Optional[EndpointingConfig]

Optional configuration for endpointing, which determines when to split the transcript based on criteria such as time or punctuation. If not provided, the default endpointing behavior will be used.

min_interrupt_confidence
Optional[float]

Optional minimum confidence threshold for interrupting the transcription. Confidence values range from 0 to 1, with higher values indicating greater confidence. If provided, transcriptions will only be interrupted when the confidence exceeds the threshold. If not provided, the default interrupting behavior will be used.
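The threshold rule described above can be sketched as a small predicate. This is only an illustration of the stated behavior, not the library's implementation: `should_interrupt` and `default_decision` are hypothetical names, and the default behavior is represented by a placeholder flag because it is not specified here.

```python
from typing import Optional


def should_interrupt(
    confidence: float,
    min_interrupt_confidence: Optional[float],
    default_decision: bool = True,
) -> bool:
    """Hypothetical helper: decide whether an interruption is allowed."""
    if min_interrupt_confidence is None:
        # No threshold configured: defer to the default behavior,
        # stood in for here by a placeholder flag.
        return default_decision
    # "Exceeds" is read strictly: equal confidence does not interrupt.
    return confidence > min_interrupt_confidence


print(should_interrupt(0.9, 0.6))   # True
print(should_interrupt(0.6, 0.6))   # False
```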

The TranscriberConfig class provides helper methods that fill in some of these fields for you.

For example, when using an input device like MicrophoneInput, you can use:

TranscriberConfig.from_input_device(MicrophoneInput())

You can also do this for telephone calls, which all share the same sampling rate, chunk size and audio encoding:

TranscriberConfig.from_telephone_input_device()
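To illustrate the idea behind such a helper, here is a self-contained sketch using stand-in classes rather than the library's own. `SketchTranscriberConfig` and the enum values are hypothetical, and the concrete numbers are assumptions: 8 kHz mu-law is common in telephony, and the chunk size is chosen arbitrarily.

```python
from dataclasses import dataclass
from enum import Enum


class AudioEncoding(Enum):
    """Stand-in for the real enum; the string values are assumed."""
    LINEAR16 = "linear16"
    MULAW = "mulaw"


@dataclass
class SketchTranscriberConfig:
    """Hypothetical stand-in, not the library's TranscriberConfig."""
    sampling_rate: int
    audio_encoding: AudioEncoding
    chunk_size: int

    @classmethod
    def from_telephone_input_device(cls) -> "SketchTranscriberConfig":
        # Assumed telephony defaults; the real helper may use other values.
        return cls(
            sampling_rate=8000,
            audio_encoding=AudioEncoding.MULAW,
            chunk_size=3200,
        )


config = SketchTranscriberConfig.from_telephone_input_device()
print(config.sampling_rate, config.audio_encoding.name, config.chunk_size)
```

The point of the pattern is that callers never have to pick these three values by hand, so they cannot end up with a mismatched combination.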

DeepgramTranscriberConfig

language
Optional[str]

The language code for the transcription, e.g., ‘en-US’ for American English. If not provided, the default language will be used.

model
Optional[str]

The model used for transcription. For Deepgram, options include ‘phonecall’ and ‘voicemail’.

tier
Optional[str]

The tier of the Deepgram API to use for the transcription, e.g., ‘enhanced’. If not provided, the default tier will be used.

version
Optional[str]

The version of the Deepgram API to use for the transcription. If not provided, the latest version will be used.

keywords
Optional[list]

A list of keywords to be used for keyword spotting during transcription with the Deepgram API. If not provided, no keyword spotting will be performed.
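Because every field above is optional, one plausible way to picture how they are used is as request parameters that are simply omitted when unset. The sketch below is an assumption for illustration, not the library's actual request-building code: `build_query` is a hypothetical helper, and the parameter names are assumed to mirror the field names.

```python
from typing import List, Optional
from urllib.parse import urlencode


def build_query(
    language: Optional[str] = None,
    model: Optional[str] = None,
    tier: Optional[str] = None,
    version: Optional[str] = None,
    keywords: Optional[List[str]] = None,
) -> str:
    """Hypothetical helper: encode only the options that were provided."""
    params = []
    for name, value in (
        ("language", language),
        ("model", model),
        ("tier", tier),
        ("version", version),
    ):
        if value is not None:
            params.append((name, value))
    # Each keyword becomes its own repeated parameter (assumed convention).
    for keyword in keywords or []:
        params.append(("keywords", keyword))
    return urlencode(params)


print(build_query(language="en-US", model="phonecall", keywords=["vocode"]))
# language=en-US&model=phonecall&keywords=vocode
```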

GoogleTranscriberConfig

language_code
str

The language code for the transcription, e.g., ‘en-US’ for American English. Defaults to ‘en-US’.

model
Optional[str]

The model used for transcription. For Google, it can be ‘default’, ‘video’, or ‘phone_call’. If not provided, the default model will be used.

AssemblyAITranscriberConfig

WhisperCPPTranscriberConfig

RevAITranscriberConfig