Hugging Face Inference
Trace Hugging Face Inference Endpoints in Muster via the OpenAI-compatible API.
Hugging Face Inference exposes hosted models — Llama, Mistral, Qwen, Falcon, and many others — via an OpenAI-compatible endpoint. Trace them in Muster by pointing the Muster OpenAI wrapper at the HF endpoint.
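As a rough illustration of what pointing the wrapper at the HF endpoint involves, the sketch below builds a base URL from a model ID. It assumes the per-model URL pattern that appears in the Setup section; the helper name `hf_base_url` and the constant `HF_INFERENCE_ROOT` are hypothetical and not part of any library, so confirm the URL for your own endpoint before relying on it.

```python
# Root taken from the base_url used in the Setup section (an assumption for other models).
HF_INFERENCE_ROOT = "https://api-inference.huggingface.co/models"


def hf_base_url(model_id: str) -> str:
    """Hypothetical helper: build an OpenAI-compatible base URL for a model ID."""
    cleaned = model_id.strip().strip("/")
    if not cleaned:
        raise ValueError("model_id must not be empty")
    # The trailing "/v1/" mirrors the URL shown in the Setup section.
    return f"{HF_INFERENCE_ROOT}/{cleaned}/v1/"


print(hf_base_url("meta-llama/Meta-Llama-3-8B-Instruct"))
```

The resulting string can be passed as `base_url` when constructing the client, which keeps the model ID in one place if you switch models.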
Setup
%pip install langfuse openai --upgrade
import os
os.environ["LANGFUSE_SECRET_KEY"] = "sk-lf-..."
os.environ["LANGFUSE_PUBLIC_KEY"] = "pk-lf-..."
os.environ["LANGFUSE_BASE_URL"] = "https://app.getmuster.io"
os.environ["HUGGINGFACE_ACCESS_TOKEN"] = "hf_..."
from langfuse.openai import OpenAI
from langfuse import observe
# Point the OpenAI-compatible client at the Hugging Face endpoint for the chosen model.
client = OpenAI(
    base_url="https://api-inference.huggingface.co/models/meta-llama/Meta-Llama-3-8B-Instruct/v1/",
    api_key=os.environ["HUGGINGFACE_ACCESS_TOKEN"],
)
Examples
Chat completion
completion = client.chat.completions.create(
    model="model-name",  # placeholder model name
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Write a poem about language models"},
    ],
)
print(completion.choices[0].message.content)
Wrap the call with @observe()
@observe()
def generate_rap():
    # name and metadata are extra arguments accepted by the wrapped client.
    completion = client.chat.completions.create(
        name="rap-generator",
        model="tgi",
        messages=[
            {"role": "system", "content": "You are a poet."},
            {"role": "user", "content": "Compose a rap about Muster."},
        ],
        metadata={"category": "rap"},
    )
    return completion.choices[0].message.content

rap = generate_rap()
print(rap)
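If the examples fail with authentication errors, one thing worth checking is whether the environment variables from Setup are actually set. The following is a minimal sketch using only the standard library; the helper `missing_env_vars` is hypothetical, and the variable names are copied from the Setup section.

```python
import os

# Variable names copied from the Setup section.
REQUIRED_VARS = (
    "LANGFUSE_SECRET_KEY",
    "LANGFUSE_PUBLIC_KEY",
    "LANGFUSE_BASE_URL",
    "HUGGINGFACE_ACCESS_TOKEN",
)


def missing_env_vars(env=None, required=REQUIRED_VARS):
    """Hypothetical helper: return the required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required environment variables are set.")
```

This only checks that the values are present, not that the keys are valid.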