Kokoro TTS

Kokoro TTS is an open-source text-to-speech (TTS) model that converts text into natural-sounding speech. At 82 million parameters it is compact and efficient to run, yet its voice quality is comparable to that of models roughly ten times its size.

Configuration Options

Model Settings

Model: Specify the Kokoro ONNX model file.
- Example: hf:hexgrad/Kokoro-82M:kokoro-v0_19.onnx
Models Directory: Define the directory where models are stored.
- Default: Data/HuggingFace
- Note: After changing this setting, save and refresh the page to update the models list.
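
The Model example above appears to pack three colon-separated parts into one string. A minimal sketch of how such a value could be split, assuming the layout is source:repository:filename and that "hf" stands for Hugging Face (both are inferred from the example, not confirmed; `ModelRef` and `parse_model_ref` are hypothetical names):

```python
from typing import NamedTuple


class ModelRef(NamedTuple):
    source: str    # e.g. "hf" (assumed to mean Hugging Face)
    repo: str      # e.g. "hexgrad/Kokoro-82M"
    filename: str  # e.g. "kokoro-v0_19.onnx"


def parse_model_ref(value: str) -> ModelRef:
    """Split an assumed 'source:repo:filename' string into its three parts."""
    parts = value.strip().split(":", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected 'source:repo:filename', got {value!r}")
    return ModelRef(*parts)


ref = parse_model_ref("hf:hexgrad/Kokoro-82M:kokoro-v0_19.onnx")
print(ref.source, ref.repo, ref.filename)
```

Rejecting malformed values early gives a clearer error than a failed download or a missing file later on.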

Options

Thinking Speech:

Specify the sound files to play while the AI “thinks”, that is, while it is still processing. Enter one sound file per line.
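
As an illustration of the one-entry-per-line format, the sketch below turns the field's text into a list. It assumes blank lines and surrounding whitespace should be ignored, which the description does not state; the function name and the file names are hypothetical.

```python
def parse_sound_list(text: str) -> list[str]:
    """Return the non-empty, whitespace-trimmed lines of a multi-line field."""
    return [line.strip() for line in text.splitlines() if line.strip()]


# Hypothetical field content: two entries with a stray blank line between them.
field_value = "hmm.wav\n\n  let_me_think.wav  \n"
print(parse_sound_list(field_value))
```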

Defaults

Default Female Voice: The voice used for a female character that has no voice specified.
- Default: bf_isabella
Default Male Voice: The voice used for a male character that has no voice specified.
- Default: am_adam
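
The fallback described above can be sketched as a small lookup. This is only an illustration: the function name, the gender strings, and the choice to raise an error for any other gender are assumptions, since the settings above cover only the female and male cases.

```python
from typing import Optional

# Defaults taken from the settings above.
DEFAULT_VOICES = {
    "female": "bf_isabella",
    "male": "am_adam",
}


def resolve_voice(character_voice: Optional[str], gender: str) -> str:
    """Use the character's own voice if set, else the default for its gender."""
    if character_voice:
        return character_voice
    try:
        return DEFAULT_VOICES[gender.lower()]
    except KeyError:
        # Assumption: the source does not say what happens for other values.
        raise ValueError(f"no default voice configured for gender {gender!r}")


print(resolve_voice(None, "female"))
print(resolve_voice("bm_george", "male"))
```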

Available Voices

The following voices are available in Kokoro:

Female Voices:
- af_bella
- af_nicole
- af_sarah
- af_sky
- bf_emma
- bf_isabella
Male Voices:
- am_adam
- am_michael
- bm_george
- bm_lewis
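
The voice IDs above share a two-letter prefix. The sketch below decodes it, assuming the usual Kokoro naming convention in which the first letter marks the accent (a for American English, b for British English) and the second the gender (f or m). The helper name is hypothetical, and the mapping covers only the prefixes that appear in this list.

```python
ACCENTS = {"a": "American English", "b": "British English"}
GENDERS = {"f": "female", "m": "male"}


def describe_voice(voice_id: str) -> tuple[str, str, str]:
    """Split an ID such as 'bf_isabella' into (accent, gender, name)."""
    prefix, _, name = voice_id.partition("_")
    if (
        len(prefix) != 2
        or prefix[0] not in ACCENTS
        or prefix[1] not in GENDERS
        or not name
    ):
        raise ValueError(f"unrecognized voice ID: {voice_id!r}")
    return ACCENTS[prefix[0]], GENDERS[prefix[1]], name


for voice in ("af_bella", "bf_isabella", "am_adam", "bm_lewis"):
    print(voice, "->", describe_voice(voice))
```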

Device Settings

Use Cuda:

Enable this option to run the model on a CUDA-capable NVIDIA GPU for faster speech synthesis. If disabled, the model runs on the CPU instead.
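
Since the model is an ONNX file, one plausible way to honor this toggle is through ONNX Runtime's execution providers. The sketch below is an assumption about how that could work, not a description of the application's actual code; `select_providers` is a hypothetical helper, while the provider names are the ones ONNX Runtime uses.

```python
def select_providers(use_cuda: bool, available: list[str]) -> list[str]:
    """Build an ordered provider list: CUDA first if requested and present."""
    providers = []
    if use_cuda and "CUDAExecutionProvider" in available:
        providers.append("CUDAExecutionProvider")
    # The CPU provider is always kept as the fallback.
    providers.append("CPUExecutionProvider")
    return providers


# In real use, `available` could come from onnxruntime.get_available_providers()
# and the result be passed as `providers=` to onnxruntime.InferenceSession.
print(select_providers(True, ["CUDAExecutionProvider", "CPUExecutionProvider"]))
print(select_providers(True, ["CPUExecutionProvider"]))
```

Keeping the CPU provider last means a machine without a working GPU still produces speech, only more slowly.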