Welcome
Documentation hub for Muster — AI agent observability and governance built on Langfuse.
Muster gives you visibility and control over the AI agents running across your organization — where they live, what they cost, when they go wrong, and whether they should be running at all.
Start here
If you're new to Muster, work through these pages in order:
- Get Started — sign up, create a project, send your first trace (a minimal example follows this list).
- Concepts — the core data model: trace, observation, score, session, dataset.
- Self-hosting — run Muster on your own infrastructure with Docker.
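To give a feel for what "send your first trace" looks like before you open Get Started, here is a minimal sketch assuming Muster accepts the standard Langfuse Python SDK (v2-style API), since Muster is built on Langfuse. The host URL, keys, agent name, and model are placeholders; the Get Started page has the real values.

```python
# Minimal sketch: send one trace containing a single LLM generation.
# Assumes the standard Langfuse Python SDK (v2-style API) pointed at your
# Muster host; keys, host, and model name below are placeholders.
from langfuse import Langfuse

langfuse = Langfuse(
    public_key="pk-...",                 # from your Muster project settings (placeholder)
    secret_key="sk-...",                 # from your Muster project settings (placeholder)
    host="https://muster.example.com",   # your Muster instance (placeholder)
)

trace = langfuse.trace(name="first-trace", metadata={"agent": "demo-agent"})
trace.generation(
    name="hello-llm",
    model="gpt-4o-mini",
    input="Say hello",
    output="Hello!",
)

langfuse.flush()  # make sure the trace is delivered before the script exits
```

Once this runs without errors, the trace should appear in your project within a few seconds, and the Concepts page explains how observations, scores, and sessions hang off it.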
Operate it
Once tracing works, these pages cover the observability layer:
- Agent Inventory — the lifecycle every agent moves through, from discovery to retirement.
- Discovery Engine — find shadow agents already running in your AWS account via DNS query logs.
- Cost Aggregation — track LLM and cloud infrastructure spend per agent.
- Anomaly Detection — spot cost spikes, error surges, and accuracy degradation (a rough sketch of a spike check follows this list).
- Hallucination Detection — flag arithmetic errors, broken references, and unstable outputs.
- Auto-Instrumentation — let Muster suggest threshold tuning based on your traffic.
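As a rough mental model for the cost-spike checks described on the Anomaly Detection page, a spike can be flagged when today's spend sits far above a rolling baseline. The sketch below is illustrative only; the z-score threshold and the fallback ratio are assumptions, not Muster's actual detection logic.

```python
# Illustrative only: flag a cost spike when today's spend is far above the
# recent baseline. Not Muster's actual algorithm; thresholds are assumptions.
from statistics import mean, stdev

def is_cost_spike(daily_costs: list[float], today: float, z_threshold: float = 3.0) -> bool:
    """Return True if `today` deviates from the baseline by more than z_threshold sigmas."""
    if len(daily_costs) < 7:               # not enough history to judge
        return False
    baseline, spread = mean(daily_costs), stdev(daily_costs)
    if spread == 0:
        return today > baseline * 1.5      # flat history: fall back to a simple ratio check
    return (today - baseline) / spread > z_threshold

# Example: a week of ~$12/day followed by a $48 day trips the check.
history = [11.8, 12.1, 12.4, 11.9, 12.0, 12.3, 12.2]
print(is_cost_spike(history, 48.0))        # True
```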
Govern it
The governance layer turns observability into action:
- Business Rules — project-level policies that score every trace as it lands.
- Risk Scoring — a daily composite score per agent that gates approval workflows (illustrated after this list).
- Weekly Report — Monday email summarizing costs, anomalies, and risk movers.
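To make "daily composite score" concrete, here is a toy illustration of how several per-agent signals might be combined into a single 0–100 number. The signal names and weights are assumptions made for this example; the Risk Scoring page defines the real inputs and how the score feeds approval workflows.

```python
# Toy illustration of a composite risk score. Signal names and weights are
# assumptions for this example, not Muster's actual formula.

def risk_score(error_rate: float, cost_anomaly_rate: float, hallucination_rate: float) -> float:
    """Combine per-agent daily signals (each in 0..1) into a 0-100 risk score."""
    weights = {"errors": 0.4, "cost": 0.3, "hallucinations": 0.3}  # assumed weights
    score = (
        weights["errors"] * error_rate
        + weights["cost"] * cost_anomaly_rate
        + weights["hallucinations"] * hallucination_rate
    )
    return round(100 * score, 1)

# An agent with 10% errored traces, one costly day out of ten, and 5% flagged outputs:
print(risk_score(error_rate=0.10, cost_anomaly_rate=0.10, hallucination_rate=0.05))  # 8.5
```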
What's not yet here
Still to come:
- API reference auto-generated from the OpenAPI spec (the infrastructure is wired up; see scripts/generate-api-docs.mjs, pending a performance fix for dev compile time)
- SDK reference (Python, TypeScript, OpenTelemetry)
In the meantime, the engineering team's runbooks live in the docs/ folder of the repository; they're internal but accurate.