LlamaIndex

Trace LlamaIndex applications in Muster via the OpenInference instrumentation, capturing query engines, retrievers, and LLM calls.

LlamaIndex is a data framework for augmenting LLMs with private data, offering query engines, retrievers, agents, and more. Muster captures LlamaIndex executions through the OpenInference instrumentation, which emits OpenTelemetry spans that the Muster ingestion endpoint converts into traces.

Setup

Setup takes five short steps.

1. Install dependencies

pip install langfuse openinference-instrumentation-llama-index llama-index-llms-openai llama-index

2. Configure credentials

import os

os.environ["LANGFUSE_PUBLIC_KEY"] = "pk-lf-..."
os.environ["LANGFUSE_SECRET_KEY"] = "sk-lf-..."
os.environ["LANGFUSE_BASE_URL"] = "https://app.getmuster.io"  # or your self-hosted URL
os.environ["OPENAI_API_KEY"] = "sk-..."
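
A missing or empty key is easy to overlook, so it can help to check the environment before moving on to instrumentation. The following is a minimal sketch using only the standard library; the helper name missing_credentials is hypothetical and not part of the SDK, and the variable names are the ones set above.

```python
import os

# Variable names taken from the configuration step above.
REQUIRED_VARS = (
    "LANGFUSE_PUBLIC_KEY",
    "LANGFUSE_SECRET_KEY",
    "LANGFUSE_BASE_URL",
)


def missing_credentials(env=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; pass a dict to check something
    other than the real process environment.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling missing_credentials() with no arguments inspects os.environ; an empty list means all three variables are present.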

3. Instrument LlamaIndex

from openinference.instrumentation.llama_index import LlamaIndexInstrumentor

LlamaIndexInstrumentor().instrument()

Once instrumented, every LlamaIndex operation (query engine runs, retriever calls, LLM calls) is captured as an OpenTelemetry span and forwarded to Muster.
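
Forwarding depends on the SDK client being set up with valid credentials. The sketch below assumes the langfuse Python SDK v3 or later, where get_client() builds the client from the environment variables configured in step 2 and auth_check() reports whether the keys are accepted. The helper name connect_to_muster is hypothetical.

```python
def connect_to_muster():
    """Create the SDK client and confirm the credentials are accepted.

    Assumption: langfuse >= 3, which reads the LANGFUSE_* variables
    from the environment when the client is created.
    """
    # Imported lazily so this module still loads where langfuse is absent.
    from langfuse import get_client

    client = get_client()
    if not client.auth_check():
        raise RuntimeError(
            "Credentials were not accepted; check the keys and the base URL."
        )
    return client
```

Calling connect_to_muster() once at startup is usually enough. In short-lived scripts, calling flush() on the returned client before exit helps ensure buffered spans are sent.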

4. Build a query engine

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("data/").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()

response = query_engine.query("What does the document say about X?")
print(response)

5. View traces

Open your project in Muster. Each query appears as its own trace, broken down into the LLM call, retrieval, and post-processing steps.

SDK Enhancement Features

The integration plays nicely with the Muster (langfuse) SDK helpers:

@observe() decorator — automatically wraps instrumented code and adds attributes (user_id, session_id, tags, metadata, version).
Context manager — uses with statements to wrap code and propagate attributes across the execution scope.

from langfuse import observe, propagate_attributes

@observe()
def run_query(question: str):
    with propagate_attributes(
        user_id="user_123",
        session_id="session_abc",
        tags=["llamaindex", "rag"],
    ):
        return query_engine.query(question)

Troubleshooting

Common issues:

Missing observations — enable LANGFUSE_DEBUG=true to see what the SDK is doing.
Unwanted spans — instrumentations from unrelated libraries can clutter traces. Filter them via OpenTelemetry span processors.
Attribute mapping — fields without an explicit langfuse.* prefix end up in the catch-all metadata bucket.
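
For the unwanted-spans case, one possible approach is a span processor that wraps the exporting processor and forwards only spans from selected instrumentation scopes. The sketch below is duck-typed to the OpenTelemetry SDK SpanProcessor interface (on_start, on_end, shutdown, force_flush) so it needs no imports. The class name ScopeFilteringProcessor is hypothetical, and how the wrapper is attached to the tracer provider depends on your SDK setup and is not shown.

```python
class ScopeFilteringProcessor:
    """Forward only spans whose instrumentation scope name starts with
    one of the allowed prefixes; all other spans are dropped.

    Hypothetical wrapper for illustration. `delegate` is any object that
    implements the SpanProcessor methods (for example a batch processor).
    """

    def __init__(self, delegate, allowed_prefixes):
        self._delegate = delegate
        self._allowed = tuple(allowed_prefixes)

    def _is_allowed(self, span):
        # Spans expose their scope via `instrumentation_scope.name`.
        scope = getattr(span, "instrumentation_scope", None)
        name = getattr(scope, "name", "") or ""
        return name.startswith(self._allowed)

    def on_start(self, span, parent_context=None):
        if self._is_allowed(span):
            self._delegate.on_start(span, parent_context)

    def on_end(self, span):
        # Only allowed spans reach the delegate, and therefore the exporter.
        if self._is_allowed(span):
            self._delegate.on_end(span)

    def shutdown(self):
        self._delegate.shutdown()

    def force_flush(self, timeout_millis=30000):
        return self._delegate.force_flush(timeout_millis)
```

A typical allow-list might contain "openinference.instrumentation.llama_index"; that scope name is an assumption based on the instrumentor's module path, so check it against the scope names shown on your own spans before relying on it.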