Weco
Optimise LLM applications with Weco using Muster as the evaluation backend.
Weco is a code optimisation platform that automatically iterates on LLM applications. With Muster (langfuse) wired in as the evaluation backend, each Weco iteration becomes a tracked experiment in Muster, so variants are easy to compare and roll back.
How it works
Each Weco optimisation cycle:
- Edits your source code.
- Runs the modified version against a Muster dataset.
- Collects evaluation scores from Muster.
- Retains the highest-performing variant.
Each iteration produces a new experiment run in Muster.
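The cycle above can be pictured as a simple greedy search. The sketch below is only an illustration of that loop, not Weco's actual search strategy, which this page does not describe. The names `optimise`, `propose`, and `evaluate` are hypothetical; `evaluate` stands in for "run against the dataset and collect scores".

```python
from typing import Callable, List, Tuple


def optimise(
    initial: str,
    propose: Callable[[str], str],
    evaluate: Callable[[str], float],
    steps: int,
) -> Tuple[str, float, List[float]]:
    """Greedy loop: propose an edit, score it, keep it only if it beats the best so far."""
    best, best_score = initial, evaluate(initial)
    history = [best_score]  # one score per evaluated variant (one "experiment run" each)
    for _ in range(steps):
        candidate = propose(best)      # step 1: edit the current best source
        score = evaluate(candidate)    # steps 2-3: run on the dataset, collect a score
        history.append(score)
        if score > best_score:         # step 4: retain the highest-performing variant
            best, best_score = candidate, score
    return best, best_score, history


# Toy stand-ins (assumptions): the "source" is a string, an edit appends a
# character, and the score peaks when the string is exactly 10 characters long.
best, best_score, history = optimise(
    initial="abc",
    propose=lambda src: src + "x",
    evaluate=lambda src: -abs(len(src) - 10),
    steps=10,
)
```

Keeping the full `history` rather than only the winner mirrors the point of tracking every iteration: rejected variants remain available for comparison.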
Setup
!pip install "weco[langfuse]" langfuse openai -q
!weco login
import os
os.environ["LANGFUSE_PUBLIC_KEY"] = "pk-lf-***"
os.environ["LANGFUSE_SECRET_KEY"] = "sk-lf-***"
os.environ["LANGFUSE_BASE_URL"] = "https://app.getmuster.io"
Workflow
- Create a dataset in Muster with test inputs and expected outputs.
- Write a target function for Weco to optimise:

      def answer_question(inputs: dict) -> dict:
          question = inputs.get("question", "")
          # Your LLM logic here: build `response` from `question`.
          response = ""  # placeholder until the LLM call is added
          return {"answer": response}

- Define evaluators that score outputs and a metric function combining them.
- Run optimisation via the Weco CLI, specifying the dataset, target function, and evaluators.
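For the first step, a dataset can also be seeded from code rather than through the UI. The sketch below assumes the client exposes `create_dataset` and `create_dataset_item` in the way the `langfuse` Python SDK does; the helper `seed_dataset`, the dataset name `qa-smoke-test`, and the sample rows are hypothetical. A small recording stub stands in for the real client so the example runs without credentials.

```python
from typing import Any, Dict, List


def seed_dataset(client: Any, name: str, rows: List[Dict[str, Any]]) -> int:
    """Create a dataset and add one item per row; returns the number of items added."""
    client.create_dataset(name=name)
    for row in rows:
        client.create_dataset_item(
            dataset_name=name,
            input=row["input"],
            expected_output=row["expected_output"],
        )
    return len(rows)


class RecordingClient:
    """Offline stand-in that records calls instead of contacting a server."""

    def __init__(self) -> None:
        self.datasets: List[str] = []
        self.items: List[Dict[str, Any]] = []

    def create_dataset(self, name: str) -> None:
        self.datasets.append(name)

    def create_dataset_item(self, dataset_name: str, input: Any, expected_output: Any) -> None:
        self.items.append(
            {"dataset_name": dataset_name, "input": input, "expected_output": expected_output}
        )


ROWS = [
    {"input": {"question": "What is 2 + 2?"}, "expected_output": {"answer": "4"}},
    {"input": {"question": "What is the capital of France?"}, "expected_output": {"answer": "Paris"}},
]

stub = RecordingClient()
added = seed_dataset(stub, "qa-smoke-test", ROWS)

# With the environment variables from Setup in place, the assumed real usage is:
#   from langfuse import Langfuse
#   seed_dataset(Langfuse(), "qa-smoke-test", ROWS)
```

The input shape matches the target function above, which reads `inputs["question"]` and returns a dict with an `answer` key.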
Weco iteratively refines prompts and logic; Muster tracks every variant side by side for comparison.
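To make the evaluator and metric step concrete, here is a minimal sketch of two evaluators and a metric that combines them into a single number. The function names, the `(output, expected)` signature, and the weights are illustrative assumptions; the exact evaluator interface that Weco or Muster expects is not specified on this page.

```python
from statistics import mean
from typing import Callable, Dict, List

Evaluator = Callable[[dict, dict], float]


def exact_match(output: dict, expected: dict) -> float:
    """1.0 if the answers match, ignoring case and surrounding whitespace."""
    got = output.get("answer", "").strip().lower()
    want = expected.get("answer", "").strip().lower()
    return 1.0 if got == want else 0.0


def brevity(output: dict, expected: dict, max_chars: int = 200) -> float:
    """1.0 if the answer is at most max_chars characters long."""
    return 1.0 if len(output.get("answer", "")) <= max_chars else 0.0


def metric(scores: Dict[str, List[float]], weights: Dict[str, float]) -> float:
    """Weighted average of each evaluator's mean score."""
    total = sum(weights[name] for name in scores)
    return sum(weights[name] * mean(values) for name, values in scores.items()) / total


# Hypothetical outputs from one variant, paired with expected outputs.
outputs = [{"answer": "4"}, {"answer": "paris "}, {"answer": "London"}]
expected = [{"answer": "4"}, {"answer": "Paris"}, {"answer": "Berlin"}]

evaluators: Dict[str, Evaluator] = {"exact_match": exact_match, "brevity": brevity}
scores = {
    name: [fn(out, exp) for out, exp in zip(outputs, expected)]
    for name, fn in evaluators.items()
}
combined = metric(scores, weights={"exact_match": 0.75, "brevity": 0.25})
```

Collapsing the evaluators into one scalar gives the optimiser a single value to compare across variants, while the per-evaluator scores remain available for inspection.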