Kokoro TTS

Kokoro TTS is an open-source text-to-speech (TTS) model that converts text into natural-sounding speech. Despite its compact size of 82 million parameters, it delivers voice quality comparable to that of considerably larger models.

Configuration Options

Model Settings

  • Model: Specify the Kokoro ONNX model file.

    • Example: hf:hexgrad/Kokoro-82M:kokoro-v0_19.onnx
  • Models Directory: Define the directory where models are stored.

    • Default: Data/HuggingFace
    • Note: Save and refresh the page to update the models list after making changes.
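
The example value above appears to follow an hf:<repository>:<file> pattern. As a minimal sketch, assuming that pattern holds for every model value, the string could be split as follows. The names ModelRef and parse_model_ref are hypothetical and are not part of Kokoro or of any library.

```python
from typing import NamedTuple


class ModelRef(NamedTuple):
    """Hypothetical container for the two parts of a model value."""
    repo_id: str   # e.g. "hexgrad/Kokoro-82M"
    filename: str  # e.g. "kokoro-v0_19.onnx"


def parse_model_ref(value: str) -> ModelRef:
    """Split an 'hf:<repo>:<file>' string (assumed format) into its parts."""
    prefix, sep, rest = value.partition(":")
    if prefix != "hf" or not sep:
        raise ValueError(f"expected 'hf:<repo>:<file>', got {value!r}")
    repo_id, sep, filename = rest.partition(":")
    if not sep or not repo_id or not filename:
        raise ValueError(f"expected 'hf:<repo>:<file>', got {value!r}")
    return ModelRef(repo_id, filename)


ref = parse_model_ref("hf:hexgrad/Kokoro-82M:kokoro-v0_19.onnx")
print(ref.repo_id, ref.filename)  # prints: hexgrad/Kokoro-82M kokoro-v0_19.onnx
```

Rejecting values that do not match the assumed pattern early makes a mistyped model entry easier to spot than a later download or load failure.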

Options

  • Thinking Speech:

    Specify the sounds generated while the AI “thinks” or processes speech. Enter one sound file per line.

Defaults

  • Default Female Voice: The default voice used when a female character does not specify one.

    • Default: bf_isabella
  • Default Male Voice: The default voice used when a male character does not specify one.

    • Default: am_adam

Available Voices

The following voices are available in Kokoro:

  • Female Voices:

    • af_bella
    • af_nicole
    • af_sarah
    • af_sky
    • bf_emma
    • bf_isabella
  • Male Voices:

    • am_adam
    • am_michael
    • bm_george
    • bm_lewis
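
The interaction between the default voices and the voice list can be illustrated with a short sketch. It assumes that a character's voice is either one of the names listed above or left empty, and that an unrecognized name also falls back to the default (the latter is an assumption, not documented behavior). The function resolve_voice and the "female"/"male" keys are hypothetical.

```python
# Voice names copied from the list above.
FEMALE_VOICES = ["af_bella", "af_nicole", "af_sarah", "af_sky", "bf_emma", "bf_isabella"]
MALE_VOICES = ["am_adam", "am_michael", "bm_george", "bm_lewis"]

# Defaults as documented in the Defaults section.
DEFAULT_VOICES = {"female": "bf_isabella", "male": "am_adam"}


def resolve_voice(requested, gender):
    """Return the requested voice if it is listed, else the default for the gender.

    `requested` may be None or an empty string when the character does not
    specify a voice. Falling back on unknown names is an assumption.
    """
    if requested in FEMALE_VOICES or requested in MALE_VOICES:
        return requested
    return DEFAULT_VOICES[gender]


print(resolve_voice(None, "female"))      # prints: bf_isabella
print(resolve_voice("bm_lewis", "male"))  # prints: bm_lewis
```

In the names listed above, the second letter matches the group the voice belongs to (f for female, m for male), which makes a mismatched voice easy to recognize at a glance.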

Device Settings

  • Use Cuda:

    Enable to run the model on the GPU (via CUDA) for faster synthesis. If disabled, the CPU is used instead.
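
Because the model is an ONNX file, one plausible way such a switch could work is by choosing ONNX Runtime execution providers. The sketch below assumes ONNX Runtime is the inference backend, which this page does not confirm. The function select_providers is hypothetical, and the fallback to the CPU when CUDA is unavailable is an assumption. "CUDAExecutionProvider" and "CPUExecutionProvider" are the provider names ONNX Runtime uses.

```python
def select_providers(use_cuda: bool, available: list[str]) -> list[str]:
    """Return an ordered provider list, using CUDA only when enabled and present."""
    providers = []
    if use_cuda and "CUDAExecutionProvider" in available:
        providers.append("CUDAExecutionProvider")
    # The CPU provider is always included as the final fallback.
    providers.append("CPUExecutionProvider")
    return providers


# With ONNX Runtime installed, `available` could come from
# onnxruntime.get_available_providers(), and the result could be passed as
# the `providers` argument of onnxruntime.InferenceSession.
print(select_providers(True, ["CUDAExecutionProvider", "CPUExecutionProvider"]))
print(select_providers(True, ["CPUExecutionProvider"]))
```

Keeping the selection in a pure function makes the CPU fallback testable on machines without a GPU.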