Voxta docs

Chatterbox TTS

Local TTS using a Diffusion Transformer with ConvNeXt V2.

Chatterbox TTS is a text-to-speech engine that runs locally on your machine. It uses a Diffusion Transformer with ConvNeXt V2, which trains and runs inference faster than older diffusion-based TTS models.

Setup

Add the service

Go to Manage Services → + Add Services → Chatterbox TTS → Add. Voxta installs the Python runtime and model weights automatically. The first run performs the install, and you can follow its progress in the Terminal.

Pick a voice

In the Chatterbox TTS service configuration, browse the available voices and select one.
