OpenAI SDK (Python)
This integration is a drop-in replacement for the OpenAI Python SDK that adds full observability through Muster. Changing only the import statement enables automatic capture of prompts, completions, latencies, errors, and token usage.
Key Features
The wrapper automatically captures:
- All prompts/completions with support for streaming, async, and functions
- Latencies and API errors
- Model usage (tokens) and cost in USD
Installation & Setup
Requirements
The integration requires OpenAI SDK version >=0.27.8, with async and
streaming support available in >=1.0.0.
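The two version thresholds above can be checked programmatically before relying on async or streaming capture. The following is a minimal sketch using only the standard library; the helper names `parse_version`, `feature_tier`, and `installed_openai_tier` and the tier labels are hypothetical and not part of either SDK.

```python
import re
from importlib import metadata

MIN_SUPPORTED = (0, 27, 8)    # minimum OpenAI SDK version for the integration
MIN_ASYNC_STREAM = (1, 0, 0)  # async and streaming support start here


def parse_version(version: str) -> tuple:
    """Turn '1.92.0' into (1, 92, 0); pre-release and local suffixes are ignored."""
    parts = re.findall(r"\d+", version.split("+")[0])[:3]
    numbers = [int(p) for p in parts]
    # Pad short versions such as '1.2' to three components
    return tuple(numbers + [0] * (3 - len(numbers)))


def feature_tier(version: str) -> str:
    """Classify an OpenAI SDK version string against the documented thresholds."""
    v = parse_version(version)
    if v < MIN_SUPPORTED:
        return "unsupported"
    if v < MIN_ASYNC_STREAM:
        return "basic"  # supported, but without async and streaming
    return "full"


def installed_openai_tier() -> str:
    """Look up the installed `openai` distribution and classify it."""
    try:
        return feature_tier(metadata.version("openai"))
    except metadata.PackageNotFoundError:
        return "not-installed"
```

Calling `installed_openai_tier()` at startup gives an early signal when an environment pins an older SDK than the code expects.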
pip install langfuse openai
Configuration
Option 1: Environment Variables
LANGFUSE_SECRET_KEY="sk-lf-..."
LANGFUSE_PUBLIC_KEY="pk-lf-..."
LANGFUSE_BASE_URL="https://app.getmuster.io" # or your self-hosted Muster URL
Option 2: Direct Attributes
from langfuse.openai import openai  # the wrapper module provides the langfuse_* attributes
openai.langfuse_public_key = "pk-lf-..."
openai.langfuse_secret_key = "sk-lf-..."
openai.langfuse_enabled = True
Basic Usage
Simply change the import:
from langfuse.openai import openai
completion = openai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Tell me a story about a dog."}],
)
Alternative imports are available for the OpenAI, AsyncOpenAI, AzureOpenAI, and AsyncAzureOpenAI classes.
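The client classes can be swapped in the same way as the module import. The sketch below assumes the classes named above are importable from `langfuse.openai`, mirroring the module import shown earlier. `make_client` and `ask` are hypothetical helpers, and the import sits inside `make_client` so that nothing requires credentials or network access until it is called.

```python
def make_client(use_async: bool = False):
    """Return a traced OpenAI client; assumes OPENAI_API_KEY is set in the environment."""
    # Assumption: the client classes are exported by langfuse.openai,
    # alongside the `openai` module used in the basic example.
    from langfuse.openai import AsyncOpenAI, OpenAI

    return AsyncOpenAI() if use_async else OpenAI()


def ask(client, prompt: str, model: str = "gpt-4o") -> str:
    """Send one user message through a synchronous client and return the reply text."""
    completion = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return completion.choices[0].message.content
```

With the packages installed and keys configured, `ask(make_client(), "Tell me a story about a dog.")` would issue the same request as the module-level example, using an explicit client instance.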
Advanced Features
Custom Trace Properties
You can add name, metadata, trace_id, and parent_observation_id to
OpenAI method calls. Trace attributes like session_id, user_id, and
tags can be set via metadata or enclosing spans.
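One way to keep these extra arguments consistent across calls is a small helper that assembles them. This is a sketch under stated assumptions: `trace_kwargs` is a hypothetical helper, and the `langfuse_`-prefixed metadata keys are an assumption about how trace attributes are passed via metadata, so verify them against the upstream docs for your SDK version.

```python
from typing import Any, Optional


def trace_kwargs(
    name: str,
    session_id: Optional[str] = None,
    user_id: Optional[str] = None,
    tags: Optional[list] = None,
    **extra_metadata: Any,
) -> dict:
    """Build the extra keyword arguments for a traced OpenAI call."""
    metadata = dict(extra_metadata)
    # Assumption: trace attributes are recognized under these prefixed keys
    if session_id is not None:
        metadata["langfuse_session_id"] = session_id
    if user_id is not None:
        metadata["langfuse_user_id"] = user_id
    if tags:
        metadata["langfuse_tags"] = list(tags)
    return {"name": name, "metadata": metadata}


# Usage (requires langfuse and openai to be installed and configured):
# from langfuse.openai import openai
# completion = openai.chat.completions.create(
#     model="gpt-4o",
#     messages=[{"role": "user", "content": "Tell me a story about a dog."}],
#     **trace_kwargs("dog-story", session_id="session-123", user_id="user-456"),
# )
```

Building the arguments in one place avoids typos in metadata keys, which would otherwise be stored as ordinary metadata rather than applied as trace attributes.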
Structured Output Support
For SDK versions >=1.92.0:
completion = openai.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[...],
    response_format=CalendarEvent,  # a Pydantic model class defined by the caller
    name="extract-calendar-event",
)
Streaming with Token Usage
When using streaming responses with include_usage=True, OpenAI returns
token usage information in a final chunk that has an empty choices list.
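Because that final chunk has no choices, code that indexes `choices[0]` on every chunk would raise an `IndexError`. A consumer might guard against this as in the sketch below. `collect_stream` is a hypothetical helper; it relies only on the `choices`, `delta.content`, and `usage` attributes of OpenAI chat completion chunks and works on any iterable of objects with that shape.

```python
def collect_stream(stream):
    """Concatenate streamed text and return it together with the usage object, if any."""
    parts = []
    usage = None
    for chunk in stream:
        if chunk.choices:
            # Regular content chunk; delta.content can be None (e.g. role-only deltas)
            delta = chunk.choices[0].delta
            if delta.content:
                parts.append(delta.content)
        # The final usage chunk has an empty choices list and a populated usage field
        if getattr(chunk, "usage", None) is not None:
            usage = chunk.usage
    return "".join(parts), usage
```

The helper returns `None` for usage when the stream was created without `include_usage=True`, so callers can tell the two cases apart.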
stream = openai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Count to ten."}],
    stream=True,
    stream_options={"include_usage": True},
)
for chunk in stream:
    print(chunk)
Configuration & Troubleshooting
Flushing Events
For short-lived applications, flush events before exit:
from langfuse import get_client
langfuse = get_client()
langfuse.flush()
Debug Mode
from langfuse import Langfuse
langfuse = Langfuse(debug=True)
# OR via env var:
# export LANGFUSE_DEBUG=true
Sampling
Control trace volume with sampling configuration:
from langfuse import Langfuse
langfuse = Langfuse(sample_rate=0.1)
# OR:
# export LANGFUSE_SAMPLE_RATE=0.1
Disable Tracing
from langfuse import Langfuse
langfuse = Langfuse(tracing_enabled=False)
# OR:
# export LANGFUSE_TRACING_ENABLED=false
Limitations
The integration does not support the OpenAI Assistants API, because that API keeps its state on the server. For this use case, see the upstream guide, which points to a separate notebook demonstrating manual tracking approaches.
See also
- OpenAI SDK (JS/TS)
- Upstream Langfuse OpenAI (Python) docs for the latest cookbook examples