About Text-To-Speech HTTP API
This allows you to call any text-to-speech service, given the right configuration.
Content Type
: The MIME type of the generated audio, such as audio/wav or audio/mpeg.
Url Template
: The URL to call, with {text} as a placeholder for the text to generate.
Request Body
: The application/json body used to generate speech, with {text} as a placeholder for the text to generate, and {{ culture }} or {{ language }} if needed. Other fields from the voice are available too. The template uses Scriban, so you can add conditions if you need them.
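As a rough illustration of how the {text} placeholder in the Url Template behaves, the Python sketch below substitutes the text into a template. The helper name fill_url_template and the example endpoint are hypothetical, and percent-encoding the text before substitution is an assumption, not a confirmed description of what Voxta does.

```python
from urllib.parse import quote


def fill_url_template(template: str, text: str) -> str:
    """Replace the {text} placeholder with percent-encoded text."""
    # Assumption: the text is percent-encoded so that spaces and
    # punctuation remain valid inside a URL.
    return template.replace("{text}", quote(text, safe=""))


if __name__ == "__main__":
    # Hypothetical endpoint, used only for illustration.
    print(fill_url_template("http://localhost:5000/tts?text={text}", "Hello, world!"))
```

If your service expects the text in the JSON body instead, the Url Template can stay free of placeholders and the Request Body carries the text.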
Voices
You have two ways to list voices: dynamically, if the service has an API, or manually using Voices.
Voices Url
: The URL that should return a json array of voices.
Voices Format
: How to convert a voice from the API to Voxta’s VoiceInfo format. You can use Scriban if you need conditions.
Default voices
: Specify a part of the label or properties to select from the list.
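To make the Default voices idea concrete, here is a minimal Python sketch of selecting a voice by a partial match on its label or properties. The function name pick_default_voice is hypothetical, and the matching rule (case-insensitive substring, first match wins) is an assumption; Voxta's actual matching may differ.

```python
def pick_default_voice(voices, hint):
    """Return the first voice whose label or parameter values contain hint."""
    needle = hint.lower()
    for voice in voices:
        # Search the label and every parameter value (assumed matching rule).
        fields = [voice.get("label", "")]
        fields += [str(value) for value in voice.get("parameters", {}).values()]
        if any(needle in field.lower() for field in fields):
            return voice
    return None


if __name__ == "__main__":
    # Illustrative voice list, shaped like the label/parameters format.
    voices = [
        {"label": "Alice", "parameters": {"speaker_wav": "alice.wav"}},
        {"label": "Bob", "parameters": {"speaker_wav": "bob.wav"}},
    ]
    print(pick_default_voice(voices, "bob"))
```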
xtts-api-server
This allows you to run xtts-v2, one of the best open-source text-to-speech systems right now.
More information on how to install and run it.
Content Type
: audio/wav
Url Template
: http://localhost:8020/tts_to_audio/
Request Body
:
{
  "text": "{{ text }}",
  "speaker_wav": "{{ speaker_wav }}",
  "language": "{{ if !language.empty? }}{{ language }}{{ else }}en{{ end }}"
}
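For testing the endpoint outside Voxta, the following Python sketch builds the same JSON body as the template above, including the fallback to "en" when no language is set, and posts it to the Url Template. The function names are hypothetical, and it assumes xtts-api-server is running locally on port 8020 and returns the audio bytes directly in the response.

```python
import json
import urllib.request


def build_request_body(text, speaker_wav, language=""):
    """Mirror the template: fall back to "en" when language is empty."""
    return {
        "text": text,
        "speaker_wav": speaker_wav,
        "language": language if language else "en",
    }


def synthesize(text, speaker_wav, language="",
               url="http://localhost:8020/tts_to_audio/"):
    """POST the JSON body and return the raw response bytes."""
    data = json.dumps(build_request_body(text, speaker_wav, language)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # Assumption: the server answers with the audio/wav payload itself.
    with urllib.request.urlopen(request) as response:
        return response.read()
```

Calling synthesize("Hello", "female.wav") requires the server to be running; build_request_body can be checked on its own without any network access.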
Voices Url
: http://localhost:8020/speakers/
Voices Format
:
{
  "label": "{{ name }}",
  "parameters": {
    "speaker_wav": "{{ voice_id }}.wav"
  }
}
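The effect of this Voices Format template can be sketched in Python as a plain mapping from each entry returned by the Voices Url to a label and a set of parameters. The function name to_voice_info and the sample entries are hypothetical; the sketch only assumes that each entry has the name and voice_id fields the template refers to.

```python
def to_voice_info(speaker):
    """Map one speakers entry to the label/parameters shape of the template."""
    return {
        "label": speaker["name"],
        "parameters": {"speaker_wav": f'{speaker["voice_id"]}.wav'},
    }


# Assumed sample of what the Voices Url might return; values are illustrative.
sample_speakers = [
    {"name": "female", "voice_id": "female"},
    {"name": "male", "voice_id": "male"},
]

voices = [to_voice_info(speaker) for speaker in sample_speakers]

if __name__ == "__main__":
    for voice in voices:
        print(voice)
```

The speaker_wav parameter produced here is the value that the Request Body template then reads as {{ speaker_wav }}.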