
Groq

Trace Groq calls in Muster, either through the OpenAI-compatible endpoint or through the Groq-native OpenInference instrumentation.

Groq hosts large language models on its LPU inference engine and exposes them through an OpenAI-compatible API. Muster offers two ways to trace Groq calls.

Option 1: OpenAI SDK route

Use the Muster OpenAI wrapper pointed at Groq's base URL.

%pip install langfuse openai --upgrade

import os

# Muster project keys and host (read by the langfuse SDK)
os.environ["LANGFUSE_PUBLIC_KEY"] = "pk-lf-..."
os.environ["LANGFUSE_SECRET_KEY"] = "sk-lf-..."
os.environ["LANGFUSE_BASE_URL"] = "https://app.getmuster.io"

# Groq API key
os.environ["GROQ_API_KEY"] = "gsk_..."

# Drop-in wrapper used in place of `from openai import OpenAI`
from langfuse.openai import OpenAI

client = OpenAI(
    base_url="https://api.groq.com/openai/v1",
    api_key=os.environ["GROQ_API_KEY"],
)

completion = client.chat.completions.create(
    model="llama3-8b-8192",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Write a poem about language models"},
    ],
)
print(completion.choices[0].message.content)
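
Both options depend on the four environment variables set above, and a missing one tends to surface later as an authentication error or as traces that never appear. The following is a minimal sketch of a preflight check, not part of the Muster or Groq SDKs; `missing_env_vars` and `require_env` are hypothetical helper names introduced here for illustration.

```python
import os

# Variable names taken from the setup cell above.
REQUIRED_VARS = (
    "LANGFUSE_PUBLIC_KEY",
    "LANGFUSE_SECRET_KEY",
    "LANGFUSE_BASE_URL",
    "GROQ_API_KEY",
)


def missing_env_vars(env=None, required=REQUIRED_VARS):
    """Return the required variable names that are unset or blank in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not str(env.get(name, "")).strip()]


def require_env(env=None):
    """Raise a readable error if any required variable is missing (hypothetical helper)."""
    missing = missing_env_vars(env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```

Calling `require_env()` before constructing the client would stop the notebook early with the names of the variables that still need to be set.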

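Option 2: Native Groq SDK + OpenInference

Use the native Groq SDK and instrument it with OpenInference so that its calls are traced in Muster.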

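%pip install groq langfuse openinference-instrumentation-groq

import os

from langfuse import get_client
from openinference.instrumentation.groq import GroqInstrumentor

# Reuses the LANGFUSE_* and GROQ_API_KEY environment variables set in Option 1
langfuse = get_client()
assert langfuse.auth_check()  # fail fast if the keys or host are wrong

# Instrument the Groq SDK so its calls are traced
GroqInstrumentor().instrument()

from groq import Groq

groq_client = Groq(api_key=os.environ["GROQ_API_KEY"])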

chat_completion = groq_client.chat.completions.create(
    messages=[{"role": "user", "content": "Explain the importance of fast language models"}],
    model="llama-3.3-70b-versatile",
)
print(chat_completion.choices[0].message.content)

Trace details

Muster displays request parameters, response content, token usage, and latency.

[Figure: Groq trace example in Muster]
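
To cross-check what appears in the trace view, the same fields can be read locally from the response object. The sketch below assumes the response follows the OpenAI-compatible chat completion shape (`model`, `choices[0].message.content`, and a `usage` object with `prompt_tokens`, `completion_tokens`, `total_tokens`). `summarize_completion` and `timed` are hypothetical helpers, and the stand-in object replaces a real API call so the example runs offline.

```python
import time
from types import SimpleNamespace


def timed(call):
    """Run a zero-argument callable and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = call()
    return result, time.perf_counter() - start


def summarize_completion(completion, latency_s):
    """Collect the fields a trace would typically show (hypothetical helper)."""
    usage = getattr(completion, "usage", None)  # usage may be absent on some responses
    return {
        "model": completion.model,
        "content": completion.choices[0].message.content,
        "prompt_tokens": getattr(usage, "prompt_tokens", None),
        "completion_tokens": getattr(usage, "completion_tokens", None),
        "total_tokens": getattr(usage, "total_tokens", None),
        "latency_s": round(latency_s, 3),
    }


# Stand-in for a real response; the values here are made up for illustration.
fake_completion = SimpleNamespace(
    model="llama-3.3-70b-versatile",
    choices=[SimpleNamespace(message=SimpleNamespace(content="hello"))],
    usage=SimpleNamespace(prompt_tokens=12, completion_tokens=3, total_tokens=15),
)

result, latency = timed(lambda: fake_completion)
summary = summarize_completion(result, latency)
print(summary)
```

With a real client, the lambda would wrap the `chat.completions.create(...)` call from either option. Locally measured latency includes network time from your machine, so it may differ somewhat from the value shown in Muster.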
