Voxta docs

Diagnostics

Per-service latency, response sizes, and the fully rendered prompt the LLM sees.

Diagnostics is where you go when you want to know what Voxta is actually doing under the hood. It shows per-service performance and the fully-rendered prompt the LLM received on the last turn.

What's on this screen

Per-service performance

For each active service:

  • Latency — how long the last call took.
  • Token counts — input / output where applicable.
  • Error rates — anything that failed.

Use this when a chat feels sluggish to find which service is the bottleneck.
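To make the bottleneck hunt concrete, here is a minimal sketch of what "per-service latency" means in practice. This is illustrative only, not Voxta's actual diagnostics code; the class and service names (`stt`, `llm`, `tts`) are hypothetical.

```python
from collections import defaultdict

class ServiceDiagnostics:
    """Hypothetical per-service call tracker (illustrative only)."""

    def __init__(self):
        # service name -> list of (latency_seconds, succeeded) samples
        self.calls = defaultdict(list)

    def record(self, service, latency_s, ok=True):
        self.calls[service].append((latency_s, ok))

    def slowest(self):
        """Return the service with the highest average latency."""
        def avg(samples):
            return sum(l for l, _ in samples) / len(samples)
        return max(self.calls, key=lambda s: avg(self.calls[s]))

diag = ServiceDiagnostics()
diag.record("stt", 0.12)
diag.record("llm", 1.85)
diag.record("tts", 0.40)
print(diag.slowest())  # the service dominating turn time, here "llm"
```

In a voice chat, the LLM call usually dominates, but a misbehaving STT or TTS service can add seconds per turn; comparing averages like this is exactly what the Diagnostics screen saves you from doing by hand.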

Rendered prompt

The full prompt as sent to the LLM, with all templates resolved, contexts injected, history included, and summarization applied. This is what the model actually saw.

Indispensable when:

  • A character is acting weird and you want to know why — usually the prompt has the answer.
  • A context isn't activating and you want to confirm whether its text actually made it into the prompt.
  • You're prompt-engineering and want to see how your template edits actually render.
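The pipeline behind the rendered prompt can be sketched roughly as follows. This is a simplified stand-in, not Voxta's actual template engine; the function name, template placeholders, and data shapes are assumptions for illustration.

```python
def render_prompt(template, persona, contexts, history, max_history=20):
    """Illustrative prompt assembly: resolve the template, inject only the
    active contexts, and append recent history (not Voxta's real renderer)."""
    # Only contexts marked active are injected into the prompt.
    active = "\n".join(c["text"] for c in contexts if c.get("active"))
    # History is truncated to the most recent turns.
    turns = "\n".join(f"{t['role']}: {t['text']}" for t in history[-max_history:])
    return template.format(persona=persona, contexts=active, history=turns)

prompt = render_prompt(
    "{persona}\n\n{contexts}\n\n{history}",
    persona="You are Ava, a friendly assistant.",
    contexts=[
        {"text": "User likes sci-fi.", "active": True},
        {"text": "Dormant lore entry.", "active": False},
    ],
    history=[
        {"role": "user", "text": "Hi!"},
        {"role": "assistant", "text": "Hello!"},
    ],
)
print(prompt)  # the dormant context never appears in the output
```

Reading the real rendered prompt on this screen answers the same question this sketch makes explicit: which contexts were active, in what order everything was injected, and how much history survived summarization.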
