Large Language Models

Text generation providers — the brain behind every character reply, across cloud and self-hosted options.

The LLM is the AI's brain. It handles three jobs in Voxta:

  • Reply — the conversation text the character speaks.
  • Action Inference — picking which action to fire.
  • Summarization — compressing old chat history into manageable summaries.

You can route all three through one service or split them across different ones (e.g. a high-quality model for Reply, a cheap fast model for Action Inference and Summarization).
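The per-job routing described above amounts to a simple job-to-service mapping. A minimal sketch, assuming hypothetical service names (Voxta's real configuration is done in its UI, not in code):

```python
from dataclasses import dataclass

@dataclass
class LlmRouting:
    """Which LLM service handles each of the three jobs.

    Service names below are illustrative placeholders, not
    real Voxta identifiers.
    """
    reply: str
    action_inference: str
    summarization: str

# High-quality model for Reply; cheap, fast model for the rest.
routing = LlmRouting(
    reply="cloud-large-model",
    action_inference="local-small-model",
    summarization="local-small-model",
)

def service_for(job: str, routing: LlmRouting) -> str:
    """Look up the configured service for a given job."""
    return getattr(routing, job)

print(service_for("reply", routing))          # cloud-large-model
print(service_for("summarization", routing))  # local-small-model
```

Splitting jobs this way trades a little setup effort for lower cost and latency: Action Inference and Summarization tolerate weaker models much better than Reply does.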

The Voxta UI groups LLMs into three hosting tiers — these docs mirror that.

Cloud-Based

Services hosted by external companies. Each requires an API key, and you pay per use.

Self-Hosted: Zero-Setup

Voxta installs the runtime and Python dependencies automatically on first use. You only provide a model.

Self-Hosted: Requires External Software

You install and run the upstream LLM server separately; Voxta connects to its API.
