Prompt formatting
How Voxta builds the final prompt sent to the LLM, and how to pick the right format for the model you're running.
Prompt formatting is the wrapper Voxta puts around character data, scenario state, and chat history before sending it to the LLM. Different LLMs expect different formats — Llama 3 wants a different prompt shape than Mistral, Mistral wants a different shape than ChatML, and so on.
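To make the difference concrete, here is a minimal sketch (plain Python, not Voxta code) that renders the same system prompt and user message in ChatML and in Llama 3's instruct format. The character name and helper function names are made up for illustration; the token strings follow each format's public documentation.

```python
# Hypothetical illustration: the same two-turn exchange rendered in two
# common formats. The helper names are invented for this example.

def render_chatml(system: str, user: str) -> str:
    # ChatML wraps every turn in <|im_start|>role ... <|im_end|>
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

def render_llama3(system: str, user: str) -> str:
    # Llama 3 uses role header tokens and <|eot_id|> to end each turn
    return (
        "<|begin_of_text|>"
        f"<|start_header_id|>system<|end_header_id|>\n\n{system}<|eot_id|>"
        f"<|start_header_id|>user<|end_header_id|>\n\n{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

if __name__ == "__main__":
    print(render_chatml("You are Mia.", "Hi!"))
    print(render_llama3("You are Mia.", "Hi!"))
```

Feed a Llama 3 model a ChatML-shaped prompt (or vice versa) and it will still generate text, but quality drops and the model's special tokens start leaking into replies.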
Voxta ships prompt formatting templates for the common formats and auto-detects the right one when you load a model. You can also override the template manually per-service if auto-detect picks wrong.
When the default works
For most LLMs Voxta picks the right format automatically. Cloud services (OpenAI, Anthropic, Voxta Cloud, etc.) use their providers' native formats and don't need configuration. Local GGUF/EXL2 models usually come with metadata that tells Voxta which format to use.
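Voxta reads that metadata itself, but if you want to double-check what a model expects, the tokenizer that ships with most Hugging Face models can render its own chat template. A minimal sketch, assuming the model repo includes a chat template; the model ID is just an example:

```python
from transformers import AutoTokenizer

# Example model ID; substitute the model you're actually running.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

messages = [
    {"role": "system", "content": "You are Mia."},
    {"role": "user", "content": "Hi!"},
]

# apply_chat_template renders the conversation with the model's own template,
# so you can compare its shape against the template Voxta selected.
print(tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))
```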
You only need to think about prompt formatting when:
- Auto-detect picks wrong — you see weird repetitions, formatting noise, or "stop" tokens leaking into output.
- You're using a model with an unusual template — fine-tunes sometimes diverge from their base model's format.
- You want a custom template for a specific use case.
Where to find it
In Manage Services → [your LLM service] → Configuration, look for Prompt Formatting Template. The dropdown shows:
- Automatic — Voxta picks based on model metadata. Recommended default.
- Specific templates — pick one manually if auto-detect is wrong (Llama 3, ChatML, Mistral Instruct, Alpaca, Vicuna, etc.).
Symptoms of wrong formatting
If you suspect the wrong template is active, look for:
- Character output that includes literal tokens like <|im_start|>, [INST], or ### Response:.
- Replies that repeat the system prompt back at you.
- Replies that cut off mid-sentence at unusual stop tokens.
- Characters that speak as a different role (the LLM thinks it's still in user turn).
If you see any of these, switch to a specific template that matches your model's documentation, save, and retry.
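If you want a quick automated check, a small script can scan replies for the most common leaked template tokens. A rough sketch; the token list below is illustrative, not exhaustive:

```python
# Common template tokens that should never appear verbatim in a reply.
LEAKED_TOKENS = [
    "<|im_start|>", "<|im_end|>",      # ChatML
    "[INST]", "[/INST]",                # Mistral Instruct
    "### Response:", "### Instruction:",  # Alpaca
    "<|eot_id|>", "</s>",               # Llama 3 / generic end tokens
]

def find_leaked_tokens(reply: str) -> list[str]:
    # Returns every known template token found in the reply text.
    return [tok for tok in LEAKED_TOKENS if tok in reply]

print(find_leaked_tokens("Sure, let's go!<|im_end|>"))  # -> ['<|im_end|>']
```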