Muster Docs

Business Rules

Project-level policies that score every trace as it lands.

Business rules are policies you write once and Muster evaluates against every trace your agents produce. Each rule emits a Langfuse score (0 = failed, 1 = passed) so you can filter, alert, and dashboard on violations the same way you do any other score.

Four rule types

Type | What it checks | Example use
INPUT_OUTPUT_CONSISTENCY | Compare a field on the input to a field on the output | "The agent's response mentions the user_query topic"
THRESHOLD_RANGE | A numeric field is within bounds | "Latency under 5 seconds"
PATTERN_FORMAT | A field matches a regex or preset (EMAIL, URL, PHONE) | "Output never contains a credit-card number"
COMPLETENESS | Required fields are non-empty | "Every response includes a citations array"
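
To make the three single-field checks concrete, here is a minimal Python sketch. It is not Muster's implementation: the function names, the preset regexes, the inclusive bounds, and the definition of "non-empty" are all assumptions chosen for illustration.

```python
import re

# Hypothetical preset patterns; Muster's real EMAIL/URL/PHONE presets may differ.
PRESETS = {
    "EMAIL": r"[^@\s]+@[^@\s]+\.[^@\s]+",
    "URL": r"https?://\S+",
    "PHONE": r"\+?[\d\s().-]{7,20}",
}


def threshold_range(value, minimum=None, maximum=None):
    """THRESHOLD_RANGE: pass when the number sits inside the (assumed inclusive) bounds."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False
    if minimum is not None and value < minimum:
        return False
    if maximum is not None and value > maximum:
        return False
    return True


def pattern_format(value, pattern=None, preset=None):
    """PATTERN_FORMAT: pass when the whole string matches the regex or the named preset."""
    regex = PRESETS[preset] if preset else pattern
    return isinstance(value, str) and re.fullmatch(regex, value) is not None


def completeness(record, required_fields):
    """COMPLETENESS: pass when every required field is present and non-empty."""
    return all(record.get(name) not in (None, "", [], {}) for name in required_fields)
```

For example, `threshold_range(3.2, maximum=5)` passes a latency check, while `completeness({"citations": []}, ["citations"])` fails because the array is empty.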

Each rule is stored in MusterBusinessRule with these fields:

  • name and optional description (the score name auto-derives as BR: <name>)
  • checkType — one of the four above
  • ruleConfig — JSON config tailored to the type (which fields, comparators, presets, etc.)
  • targetTraceName — null to apply to every trace, or a specific trace name to scope it
  • enabled — flip to disable without deleting
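
As a rough mental model, the fields above can be mirrored in a small Python dataclass. This is a sketch only: the class name `BusinessRule`, the defaults, and the `ruleConfig` keys (`field`, `max`) are hypothetical, since the exact config schema per type is not spelled out here.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional

CHECK_TYPES = {
    "INPUT_OUTPUT_CONSISTENCY",
    "THRESHOLD_RANGE",
    "PATTERN_FORMAT",
    "COMPLETENESS",
}


@dataclass
class BusinessRule:
    name: str
    checkType: str
    ruleConfig: Dict[str, Any] = field(default_factory=dict)
    description: Optional[str] = None
    targetTraceName: Optional[str] = None  # None applies the rule to every trace
    enabled: bool = True

    def __post_init__(self):
        # Reject anything that is not one of the four documented rule types.
        if self.checkType not in CHECK_TYPES:
            raise ValueError(f"unknown checkType: {self.checkType}")

    @property
    def score_name(self) -> str:
        # The score name is derived from the rule name as "BR: <name>".
        return f"BR: {self.name}"


latency_rule = BusinessRule(
    name="Latency under 5 seconds",
    checkType="THRESHOLD_RANGE",
    ruleConfig={"field": "latency", "max": 5},  # hypothetical config keys
    targetTraceName="research-agent",
)
```

Here `latency_rule.score_name` evaluates to `"BR: Latency under 5 seconds"`, which is the name you would filter on in score views.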

How evaluation runs

trace completes
   └── businessRuleEvaluation worker enqueued
         └── fetch the trace + all enabled rules whose targetTraceName matches
               └── evaluate each rule against trace input/output/metadata
                     └── write a Langfuse score (0 or 1) with reason comment
                           — source: EVAL
                           — metadata: { source: "muster-business-rule" }

Scores are stored on the same table as every other Langfuse score, so they show up in trace detail view, score analytics, and dashboard widgets without any extra wiring.
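
The flow above can be sketched in a few lines of Python. Everything here is illustrative rather than the worker's actual code: rules and traces are plain dicts, `targetTraceName` matching is assumed to be exact equality, the score payload keys (`name`, `value`, `comment`) are assumed, and `evaluators` is a hypothetical mapping from checkType to a function returning `(passed, reason)`.

```python
def rules_for_trace(rules, trace_name):
    """Keep enabled rules that are unscoped (None) or scoped to this trace name."""
    return [
        rule for rule in rules
        if rule["enabled"] and rule["targetTraceName"] in (None, trace_name)
    ]


def build_score(rule, passed, reason):
    """Shape one result like the score described above (payload keys are assumed)."""
    return {
        "name": f"BR: {rule['name']}",
        "value": 1 if passed else 0,
        "comment": reason,
        "source": "EVAL",
        "metadata": {"source": "muster-business-rule"},
    }


def evaluate_trace(trace, rules, evaluators):
    """Run every matching rule against the trace and collect one score per rule."""
    scores = []
    for rule in rules_for_trace(rules, trace["name"]):
        passed, reason = evaluators[rule["checkType"]](trace, rule["ruleConfig"])
        scores.append(build_score(rule, passed, reason))
    return scores


def citations_present(trace, config):
    """Hypothetical COMPLETENESS evaluator: every listed output field is non-empty."""
    missing = [name for name in config["fields"] if not trace["output"].get(name)]
    return (not missing, "all fields present" if not missing else f"missing: {missing}")


example_rules = [
    {"name": "Has citations", "checkType": "COMPLETENESS",
     "ruleConfig": {"fields": ["citations"]},
     "targetTraceName": "research-agent", "enabled": True},
    {"name": "Tool output set", "checkType": "COMPLETENESS",
     "ruleConfig": {"fields": ["result"]},
     "targetTraceName": "tool-call", "enabled": True},
]
example_trace = {"name": "research-agent", "input": {}, "output": {"citations": []}}
example_scores = evaluate_trace(
    example_trace, example_rules, {"COMPLETENESS": citations_present}
)
```

Only the rule scoped to `research-agent` runs against the example trace, and it yields a single failing score because the citations array is empty.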

Wire up your first rule

In Business Rules → New rule:

  1. Name: Response stays on topic
  2. Type: INPUT_OUTPUT_CONSISTENCY
  3. Input field: user_query
  4. Output field: response.text
  5. Comparison: CONTAINS
  6. Target trace name: leave empty to apply to every trace
  7. Enabled: on

Save the rule. The next trace your agent emits is scored as BR: Response stays on topic with a value of 0 or 1. Filter the Traces table by that score to triage failures.
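
To see what this rule might compute, here is a small Python sketch of the comparison. It rests on assumptions that are not confirmed here: that `response.text` is a dotted path into the trace output, that CONTAINS means the output text contains the input value, and that the match is case-insensitive. The helper names are hypothetical.

```python
from functools import reduce


def get_path(data, dotted_path):
    """Resolve a dotted path such as "response.text"; return None if a key is missing."""
    def step(node, key):
        return node.get(key) if isinstance(node, dict) else None
    return reduce(step, dotted_path.split("."), data)


def stays_on_topic(trace):
    """Return (score, reason) for the CONTAINS comparison configured in the steps above."""
    query = get_path(trace["input"], "user_query")
    text = get_path(trace["output"], "response.text")
    if not isinstance(query, str) or not isinstance(text, str):
        return 0, "input or output field missing"
    # Assumption: CONTAINS is a case-insensitive substring test.
    if query.lower() in text.lower():
        return 1, "output contains input value"
    return 0, "output does not contain input value"


on_topic = {
    "input": {"user_query": "refund policy"},
    "output": {"response": {"text": "Our refund policy allows returns within 30 days."}},
}
off_topic = {
    "input": {"user_query": "refund policy"},
    "output": {"response": {"text": "Here is today's weather."}},
}
```

Under these assumptions `stays_on_topic(on_topic)` scores 1 and `stays_on_topic(off_topic)` scores 0. A plain substring test is strict, so a long free-form `user_query` would rarely appear verbatim in the response.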

Tips

  • Rules cost very little to leave on: evaluation is cheap, and scores are cheap to store. Prefer enabling more rules and triaging with score filters over running fewer rules.
  • Use targetTraceName to keep rules scoped. A "must include citations" rule probably only applies to your research-agent trace, not your tool-call trace.
  • Disable instead of deleting when iterating. This keeps history queryable and lets you turn the rule back on without reconfiguring it.