Getting Started
Sign up, generate an API key on the portal, and connect Voxta Cloud to your Voxta Server.
Voxta Cloud connects to your Voxta Server through the same Services system as everything else. Drop in an API key, save, and it's available to your characters.
Prerequisites
- Voxta Cloud credits — subscribe directly with Voxta, via Patreon, or buy a one-time credit pack. See Plans & Billing for the options.
- Voxta Server installed and running.
Connect Voxta Server to Voxta Cloud
Sign in to the portal
Go to portal.voxta.ai and sign in with Google, Patreon, or Discord. The portal is where your subscription, credits, and API key live.
Sign in with the same method you used to subscribe — each sign-in method starts its own account. See Plans & Billing → Which sign-in should I use? if you need to link methods.
Generate an API key
In the portal, generate your API key. Copy it immediately and store it somewhere safe — you can't view it again after generation, only rotate it.
Only one API key is active at a time. Generating a new one revokes the previous one. This is intentional — if your key leaks, rotate it and the old key stops working everywhere.
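To make the rotation behavior concrete, here is a minimal toy model of a single-active-key policy. It is only an illustration of the rule described above, not Voxta's actual implementation; the `KeyStore` class and its method names are hypothetical.

```python
import secrets
from typing import Optional


class KeyStore:
    """Toy model of a one-active-key policy (hypothetical, not Voxta's code)."""

    def __init__(self) -> None:
        self._active: Optional[str] = None

    def generate(self) -> str:
        # Issuing a new key overwrites the stored one, which revokes the old key.
        self._active = secrets.token_urlsafe(32)
        return self._active

    def is_valid(self, key: str) -> bool:
        # Constant-time comparison avoids leaking key contents through timing.
        return self._active is not None and secrets.compare_digest(key, self._active)


store = KeyStore()
old_key = store.generate()
new_key = store.generate()  # rotation: old_key is no longer accepted
```

In this sketch, any client still holding `old_key` fails validation as soon as `new_key` is generated, which is why a rotated key must be pasted into the Voxta Cloud service config again.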
In Voxta, install the Voxta Cloud service
Two paths, same result:
- Wizard — if you're on first-time setup, the wizard offers Voxta Cloud as an option.
- Manage Services — go to Services → Add Services, find Voxta Cloud, click to install.
Paste your API key
In the Voxta Cloud service config, paste your API key into the top field, then click Save & Install Service at the bottom.
That's it — Voxta Cloud is now available as a backend for LLM, TTS, and STT.
(Optional) Customize models
The default models work for most users. To change them (for example, to use a different LLM or voice provider), clone the installed Voxta Cloud service in Manage Services and edit the clone's settings. Click Show Advanced Settings to expose the model picker.
What you get
- LLM: Voxta tunes the default model continuously. It is currently routed via OpenRouter to a high-quality general-purpose model, and you can override it per clone.
- TTS: three options out of the box:
  - UnrealSpeech: cheapest in credits per minute, good quality.
  - Cartesia: low-latency neural TTS, a good middle ground.
  - ElevenLabs: pricier, with the best quality and emotion.
  Pick a voice per character on the Character Card.
- STT: Deepgram-backed, low latency.
Want a free TTS option instead? The Coqui XTTS local service runs on ~2 GB of VRAM and is built into Voxta Server.