Instrumentation
To instrument your application to send traces to Langfuse, you can either use native library instrumentations that work out of the box or use custom instrumentation methods for fine-grained control.
Custom Instrumentation
There are three main ways to create spans with the Langfuse Python SDK. All of them are fully interoperable with each other.
The @observe() decorator provides a convenient way to automatically trace function executions, including capturing their inputs, outputs, execution time, and any errors. It supports both synchronous and asynchronous functions.
```python
from langfuse import observe

@observe()
def my_data_processing_function(data, parameter):
    # ... processing logic ...
    return {"processed_data": data, "status": "ok"}

@observe(name="llm-call", as_type="generation")
async def my_async_llm_call(prompt_text):
    # ... async LLM call ...
    return "LLM response"
```

Parameters:
- name: Optional[str]: Custom name for the created span/generation. Defaults to the function name.
- as_type: Optional[Literal["generation"]]: If set to "generation", a Langfuse generation object is created, suitable for LLM calls. Otherwise, a regular span is created.
- capture_input: bool: Whether to capture function arguments as input. Defaults to the env var LANGFUSE_OBSERVE_DECORATOR_IO_CAPTURE_ENABLED, or True if not set.
- capture_output: bool: Whether to capture the function return value as output. Defaults to the env var LANGFUSE_OBSERVE_DECORATOR_IO_CAPTURE_ENABLED, or True if not set.
- transform_to_string: Optional[Callable[[Iterable], str]]: For functions that return generators (sync or async), this callable can be provided to transform the collected chunks into a single string for the output field. If not provided, and all chunks are strings, they will be concatenated. Otherwise, the list of chunks is stored.
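As a rough illustration of the transform_to_string parameter, the sketch below shows one shape such a callable could take. The name join_text_chunks and the way it treats None and non-string chunks are assumptions made for this example, not SDK behavior.

```python
from typing import Any, Iterable


def join_text_chunks(chunks: Iterable[Any]) -> str:
    """Collapse streamed chunks into one string (hypothetical helper)."""
    parts = []
    for chunk in chunks:
        if chunk is None:
            continue  # assumption: empty keep-alive chunks are skipped
        parts.append(chunk if isinstance(chunk, str) else str(chunk))
    return "".join(parts)


# Stand-in for the chunks a decorated generator function might yield.
streamed = ["Hello", ", ", None, "world"]
print(join_text_chunks(streamed))
```

A callable like this would be passed as the transform_to_string argument of the decorator; it only needs to accept an iterable of chunks and return a string.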
Trace Context and Special Keyword Arguments:
The @observe decorator automatically propagates the OTEL trace context. If a decorated function is called from within an active Langfuse span (or another OTEL span), the new observation will be nested correctly.
You can also pass special keyword arguments to a decorated function to control its tracing behavior:
- langfuse_trace_id: str: Explicitly set the trace ID for this function call. Must be a valid W3C Trace Context trace ID (32-char hex). If you have a trace ID from an external system, you can use Langfuse.create_trace_id(seed=external_trace_id) to generate a valid deterministic ID.
- langfuse_parent_observation_id: str: Explicitly set the parent observation ID. Must be a valid W3C Trace Context span ID (16-char hex).
```python
from langfuse import observe

@observe()
def my_function(a, b):
    return a + b

# Call with a specific trace context
my_function(1, 2, langfuse_trace_id="1234567890abcdef1234567890abcdef")
```

The observe decorator captures the args, kwargs, and return value of decorated functions by default. This may lead to performance issues in your application if these objects are large or deeply nested. To avoid this, explicitly disable function IO capture on the decorated function by passing capture_input / capture_output with value False, or globally by setting the environment variable LANGFUSE_OBSERVE_DECORATOR_IO_CAPTURE_ENABLED=False.
You can create spans or generations anywhere in your application. If you need more control than the @observe decorator offers, the primary way to do this is with context managers (using with statements), which ensure that observations are properly started and ended.
- langfuse.start_as_current_observation(as_type="span"): Creates a new span and sets it as the currently active observation in the OTel context for its duration. Any new observations created within this block will be its children.
- langfuse.start_as_current_observation(as_type="generation"): Similar to the above, but creates a specialized “generation” observation type for LLM calls.
- You can see an overview of the different observation types here.
```python
from langfuse import get_client, propagate_attributes

langfuse = get_client()

with langfuse.start_as_current_observation(
    as_type="span",
    name="user-request-pipeline",
    input={"user_query": "Tell me a joke about OpenTelemetry"},
) as root_span:
    # This span is now active in the context.

    # Propagate trace attributes to all child observations
    with propagate_attributes(
        user_id="user_123",
        session_id="session_abc",
        tags=["experimental", "comedy"]
    ):
        # Create a nested generation
        with langfuse.start_as_current_observation(
            as_type="generation",
            name="joke-generation",
            model="gpt-4o",
            input=[{"role": "user", "content": "Tell me a joke about OpenTelemetry"}],
            model_parameters={"temperature": 0.7}
        ) as generation:
            # Simulate an LLM call
            joke_response = "Why did the OpenTelemetry collector break up with the span? Because it needed more space... for its attributes!"
            token_usage = {"input_tokens": 10, "output_tokens": 25}

            generation.update(
                output=joke_response,
                usage_details=token_usage
            )
            # Generation ends automatically here

        root_span.update(output={"final_joke": joke_response})

# Root span ends automatically here
```

For scenarios where you need to create an observation (a span or generation) without altering the currently active OpenTelemetry context, you can use langfuse.start_span() or langfuse.start_generation().
```python
from langfuse import get_client

langfuse = get_client()

span = langfuse.start_span(name="my-span")

span.end()  # Important: Manually end the span
```

If you use langfuse.start_span() or langfuse.start_generation(), you are responsible for calling .end() on the returned observation object. Failure to do so will result in incomplete or missing observations in Langfuse. Their start_as_current_... counterparts used with a with statement handle this automatically.
Key Characteristics:
- No Context Shift: Unlike their start_as_current_... counterparts, these methods do not set the new observation as the active one in the OpenTelemetry context. The previously active span (if any) remains the current context for subsequent operations in the main execution flow.
- Parenting: The observation created by start_span() or start_generation() will still be a child of the span that was active in the context at the moment of its creation.
- Manual Lifecycle: These observations are not managed by a with block and therefore must be explicitly ended by calling their .end() method.
- Nesting Children:
  - Subsequent observations created using the global langfuse.start_as_current_observation() (or similar global methods) will not be children of these “manual” observations. Instead, they will be parented by the original active span.
  - To create children directly under a “manual” observation, you would use methods on that specific observation object (e.g., manual_span.start_as_current_observation(...)).
When to Use:
This approach is useful when you need to:
- Record work that is self-contained or happens in parallel to the main execution flow but should still be part of the same overall trace (e.g., a background task initiated by a request).
- Manage the observation’s lifecycle explicitly, perhaps because its start and end are determined by non-contiguous events.
- Obtain an observation object reference before it’s tied to a specific context block.
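Because a manual observation that is never ended shows up as incomplete, it can help to guarantee the .end() call even when the work fails. The sketch below is one possible pattern, not an SDK feature: FakeSpan and ended_on_exit are hypothetical names, and FakeSpan only stands in for the object that langfuse.start_span() would return so the example runs without the SDK.

```python
from contextlib import contextmanager


class FakeSpan:
    """Hypothetical stand-in exposing only .update() and .end()."""

    def __init__(self, name):
        self.name = name
        self.ended = False
        self.fields = {}

    def update(self, **kwargs):
        self.fields.update(kwargs)

    def end(self):
        self.ended = True


@contextmanager
def ended_on_exit(span):
    """Yield the span and always call .end(), even if the body raises."""
    try:
        yield span
    except Exception as exc:
        # Record the failure before closing; the keys mirror the
        # level / status_message update parameters.
        span.update(level="ERROR", status_message=str(exc))
        raise
    finally:
        span.end()


side_task = FakeSpan("manual-side-task")
try:
    with ended_on_exit(side_task) as span:
        span.update(input="Data for side task")
        raise RuntimeError("side task failed")
except RuntimeError:
    pass

print(side_task.ended, side_task.fields["level"])
```

Unlike the start_as_current_... context managers, a wrapper of this kind does not change the active context; it only ties the .end() call to a block of code.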
Example with more complex nesting:
```python
from langfuse import get_client

langfuse = get_client()

# This outer span establishes an active context.
with langfuse.start_as_current_observation(as_type="span", name="main-operation") as main_operation_span:
    # 'main_operation_span' is the current active context.

    # 1. Create a "manual" span using langfuse.start_span().
    #    - It becomes a child of 'main_operation_span'.
    #    - Crucially, 'main_operation_span' REMAINS the active context.
    #    - 'manual_side_task' does NOT become the active context.
    manual_side_task = langfuse.start_span(name="manual-side-task")
    manual_side_task.update(input="Data for side task")

    # 2. Start another operation that DOES become the active context.
    #    This will be a child of 'main_operation_span', NOT 'manual_side_task',
    #    because 'manual_side_task' did not alter the active context.
    with langfuse.start_as_current_observation(as_type="span", name="core-step-within-main") as core_step_span:
        # 'core_step_span' is now the active context.
        # 'manual_side_task' is still open but not active in the global context.
        core_step_span.update(input="Data for core step")
        # ... perform core step logic ...
        core_step_span.update(output="Core step finished")
    # 'core_step_span' ends. 'main_operation_span' is the active context again.

    # 3. Complete and end the manual side task.
    #    This could happen at any point after its creation, even after 'core_step_span'.
    manual_side_task.update(output="Side task completed")
    manual_side_task.end()  # Manual end is crucial for 'manual_side_task'

    main_operation_span.update(output="Main operation finished")
# 'main_operation_span' ends automatically here.

# Expected trace structure in Langfuse:
# - main-operation
#   |- manual-side-task
#   |- core-step-within-main
#     (Note: 'core-step-within-main' is a sibling to 'manual-side-task', both children of 'main-operation')
```

Nesting Observations
The function call hierarchy is automatically captured by the @observe decorator and reflected in the trace.
```python
from langfuse import observe

@observe
def my_data_processing_function(data, parameter):
    # ... processing logic ...
    return {"processed_data": data, "status": "ok"}

@observe
def main_function(data, parameter):
    return my_data_processing_function(data, parameter)
```

Nesting is handled automatically by OpenTelemetry’s context propagation. When you create a new observation using start_as_current_observation, it becomes a child of the observation that was active in the context when it was created.
```python
from langfuse import get_client

langfuse = get_client()

with langfuse.start_as_current_observation(as_type="span", name="outer-process") as outer_span:
    # outer_span is active

    with langfuse.start_as_current_observation(as_type="generation", name="llm-step-1") as gen1:
        # gen1 is active, child of outer_span
        gen1.update(output="LLM 1 output")

    with outer_span.start_as_current_span(name="intermediate-step") as mid_span:
        # mid_span is active, also a child of outer_span
        # This demonstrates using the yielded span object to create children

        with mid_span.start_as_current_observation(as_type="generation", name="llm-step-2") as gen2:
            # gen2 is active, child of mid_span
            gen2.update(output="LLM 2 output")

        mid_span.update(output="Intermediate processing done")

    outer_span.update(output="Outer process finished")
```

If you are creating observations manually (not _as_current_), you can use the methods on the parent LangfuseSpan or LangfuseGeneration object to create children. These children will not become the current context unless their _as_current_ variants are used.
```python
from langfuse import get_client

langfuse = get_client()

parent = langfuse.start_span(name="manual-parent")

child_span = parent.start_span(name="manual-child-span")
# ... work ...
child_span.end()

child_gen = parent.start_generation(name="manual-child-generation")
# ... work ...
child_gen.end()

parent.end()
```

Updating Observations
You can update observations with new information as your code executes.
- For spans/generations created via context managers or assigned to variables: use the .update() method on the object.
- To update the currently active observation in the context (without needing a direct reference to it): use langfuse.update_current_span() or langfuse.update_current_generation().
LangfuseSpan.update() / LangfuseGeneration.update() parameters:
| Parameter | Type | Description | Applies To |
|---|---|---|---|
| input | Optional[Any] | Input data for the operation. | Both |
| output | Optional[Any] | Output data from the operation. | Both |
| metadata | Optional[Any] | Additional metadata (JSON-serializable). | Both |
| version | Optional[str] | Version identifier for the code/component. | Both |
| level | Optional[SpanLevel] | Severity: "DEBUG", "DEFAULT", "WARNING", "ERROR". | Both |
| status_message | Optional[str] | A message describing the status, especially for errors. | Both |
| completion_start_time | Optional[datetime] | Timestamp when the LLM started generating the completion (streaming). | Generation |
| model | Optional[str] | Name/identifier of the AI model used. | Generation |
| model_parameters | Optional[Dict[str, MapValue]] | Parameters used for the model call (e.g., temperature). | Generation |
| usage_details | Optional[Dict[str, int]] | Token usage (e.g., {"input_tokens": 10, "output_tokens": 20}). | Generation |
| cost_details | Optional[Dict[str, float]] | Cost information (e.g., {"total_cost": 0.0023}). | Generation |
| prompt | Optional[PromptClient] | Associated PromptClient object from Langfuse prompt management. | Generation |
```python
from langfuse import get_client

langfuse = get_client()

with langfuse.start_as_current_observation(as_type="generation", name="llm-call", model="gpt-5-mini") as gen:
    gen.update(input={"prompt": "Why is the sky blue?"})
    # ... make LLM call ...
    response_text = "Rayleigh scattering..."
    gen.update(
        output=response_text,
        usage_details={"input_tokens": 5, "output_tokens": 50},
        metadata={"confidence": 0.9}
    )

# Alternatively, update the current observation in context:
with langfuse.start_as_current_observation(as_type="span", name="data-processing"):
    # ... some processing ...
    langfuse.update_current_span(metadata={"step1_complete": True})
    # ... more processing ...
    langfuse.update_current_span(output={"result": "final_data"})
```

Setting Trace Attributes
Trace-level attributes apply to the entire trace, not just a single observation. You can set or update these using:
- The propagate_attributes context manager, which sets attributes on all observations inside its context and on the trace.
- The .update_trace() method on any LangfuseSpan or LangfuseGeneration object within that trace.
- langfuse.update_current_trace() to update the trace associated with the currently active observation.
Trace attribute parameters:
| Parameter | Type | Description | Recommended Method |
|---|---|---|---|
| name | Optional[str] | Name for the trace. | update_trace() |
| user_id | Optional[str] | ID of the user associated with this trace. | propagate_attributes() |
| session_id | Optional[str] | Session identifier for grouping related traces. | propagate_attributes() |
| version | Optional[str] | Version of your application/service for this trace. | propagate_attributes() |
| input | Optional[Any] | Overall input for the entire trace. | update_trace() |
| output | Optional[Any] | Overall output for the entire trace. | update_trace() |
| metadata | Optional[Any] | Additional metadata for the trace. | propagate_attributes() |
| tags | Optional[List[str]] | List of tags to categorize the trace. | propagate_attributes() |
| public | Optional[bool] | Whether the trace should be publicly accessible (if configured). | update_trace() |
Note: For user_id, session_id, metadata, version, and tags, consider using propagate_attributes() (see below) to ensure these attributes are applied to all spans, not just the trace object.
In the near-term future, filtering and aggregating observations by these attributes will require them to be present on all observations, and propagate_attributes is the future-proof solution.
Propagating Attributes
Certain attributes (user_id, session_id, metadata, version, tags) should be applied to all spans created within some execution scope. This is important because Langfuse aggregation queries (e.g., filtering by user_id, calculating costs by session_id) will soon operate across individual observations rather than the trace level.
Use the propagate_attributes() context manager to automatically propagate these attributes to all child observations:
```python
from langfuse import get_client, propagate_attributes

langfuse = get_client()

with langfuse.start_as_current_observation(as_type="span", name="user-workflow") as span:
    # Propagate attributes to all child observations
    with propagate_attributes(
        user_id="user_123",
        session_id="session_abc",
        metadata={"experiment": "variant_a", "env": "prod"},
        version="1.0"
    ):
        # All spans created here inherit these attributes
        with langfuse.start_as_current_observation(
            as_type="generation",
            name="llm-call",
            model="gpt-4o"
        ) as gen:
            # This generation automatically has user_id, session_id, metadata, version
            pass
```

```python
from langfuse import observe, propagate_attributes

@observe()
def my_llm_pipeline(user_id: str, session_id: str):
    # Propagate early in the trace
    with propagate_attributes(
        user_id=user_id,
        session_id=session_id,
        metadata={"pipeline": "main"}
    ):
        # All nested @observe functions inherit these attributes
        result = call_llm()
        return result

@observe()
def call_llm():
    # This automatically has user_id, session_id, metadata from parent
    pass
```

- Values must be strings ≤200 characters
- Metadata keys: Alphanumeric characters only (no whitespace or special characters)
- Call early in your trace to ensure all observations are covered. This way you make sure that all Metrics in Langfuse are accurate.
- Invalid values are dropped with a warning
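Since invalid values are dropped rather than rejected loudly, it may be worth checking metadata before handing it to propagate_attributes(). The following is a minimal sketch of such a pre-check under stated assumptions: split_valid_metadata is a hypothetical helper, and reading "alphanumeric" as ASCII letters and digits only is an interpretation, not a documented rule.

```python
import re
from typing import Dict, Tuple

MAX_VALUE_LENGTH = 200  # limit stated in the notes above
ALNUM_KEY = re.compile(r"[A-Za-z0-9]+")  # assumption: ASCII letters and digits


def split_valid_metadata(
    metadata: Dict[str, object],
) -> Tuple[Dict[str, str], Dict[str, object]]:
    """Return (accepted, rejected) entries according to the listed rules."""
    accepted: Dict[str, str] = {}
    rejected: Dict[str, object] = {}
    for key, value in metadata.items():
        key_ok = isinstance(key, str) and ALNUM_KEY.fullmatch(key) is not None
        value_ok = isinstance(value, str) and len(value) <= MAX_VALUE_LENGTH
        if key_ok and value_ok:
            accepted[key] = value
        else:
            rejected[key] = value
    return accepted, rejected


ok, bad = split_valid_metadata(
    {"experiment": "variant_a", "env name": "prod", "retries": 3, "note": "x" * 201}
)
print(ok)
print(sorted(bad))
```

Logging the rejected entries in your own code makes it easier to spot attributes that would otherwise go missing from observations.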
Cross-Service Propagation
For distributed tracing across multiple services, use the as_baggage parameter (see OpenTelemetry documentation for more details) to propagate attributes via HTTP headers:
```python
from langfuse import get_client, propagate_attributes
import requests

langfuse = get_client()

# Service A - originating service
with langfuse.start_as_current_observation(as_type="span", name="api-request"):
    with propagate_attributes(
        user_id="user_123",
        session_id="session_abc",
        as_baggage=True  # Propagate via HTTP headers
    ):
        # HTTP request to Service B
        response = requests.get("https://service-b.example.com/api")
        # user_id and session_id are now in HTTP headers

# Service B will automatically extract and apply these attributes
```

Security Warning: When as_baggage=True, attribute values are added to HTTP headers on ALL outbound requests. Only enable for non-sensitive values and when you need cross-service tracing.
Trace Input/Output Behavior
In v3, trace input and output are automatically set from the root observation (first span/generation) by default. This differs from v2 where integrations could set trace-level inputs/outputs directly.
Default Behavior
```python
from langfuse import get_client

langfuse = get_client()

with langfuse.start_as_current_observation(
    as_type="span",
    name="user-request",
    input={"query": "What is the capital of France?"}  # This becomes the trace input
) as root_span:

    with langfuse.start_as_current_observation(
        as_type="generation",
        name="llm-call",
        model="gpt-4o",
        input={"messages": [{"role": "user", "content": "What is the capital of France?"}]}
    ) as gen:
        response = "Paris is the capital of France."
        gen.update(output=response)
        # LLM generation input/output are separate from trace input/output

    root_span.update(output={"answer": "Paris"})  # This becomes the trace output
```

Override Default Behavior
If you need different trace inputs/outputs than the root observation, explicitly set them:
```python
from langfuse import get_client

langfuse = get_client()

with langfuse.start_as_current_observation(as_type="span", name="complex-pipeline") as root_span:
    # Root span has its own input/output
    root_span.update(input="Step 1 data", output="Step 1 result")

    # But trace should have different input/output (e.g., for LLM-as-a-judge)
    root_span.update_trace(
        input={"original_query": "User's actual question"},
        output={"final_answer": "Complete response", "confidence": 0.95}
    )

    # Now trace input/output are independent of root span input/output
```

Critical for LLM-as-a-Judge Features
LLM-as-a-judge and evaluation features typically rely on trace-level inputs and outputs. Make sure to set these appropriately:
```python
from langfuse import observe, get_client

langfuse = get_client()

@observe()
def process_user_query(user_question: str):
    # LLM processing...
    answer = call_llm(user_question)

    # Explicitly set trace input/output for evaluation features
    langfuse.update_current_trace(
        input={"question": user_question},
        output={"answer": answer}
    )

    return answer
```

Trace and Observation IDs
Langfuse uses W3C Trace Context compliant IDs:
- Trace IDs: 32-character lowercase hexadecimal string (16 bytes).
- Observation IDs (Span IDs): 16-character lowercase hexadecimal string (8 bytes).
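The two formats above are easy to check before passing an ID from another system into the SDK. The sketch below shows one way to do that with regular expressions; the function names are hypothetical, and the all-zero check follows the W3C Trace Context rule that all-zero IDs are invalid.

```python
import re

TRACE_ID_RE = re.compile(r"[0-9a-f]{32}")  # 16 bytes as lowercase hex
SPAN_ID_RE = re.compile(r"[0-9a-f]{16}")   # 8 bytes as lowercase hex


def is_valid_trace_id(value: object) -> bool:
    """True for a 32-char lowercase hex string that is not all zeros."""
    return (
        isinstance(value, str)
        and TRACE_ID_RE.fullmatch(value) is not None
        and value != "0" * 32
    )


def is_valid_observation_id(value: object) -> bool:
    """True for a 16-char lowercase hex string that is not all zeros."""
    return (
        isinstance(value, str)
        and SPAN_ID_RE.fullmatch(value) is not None
        and value != "0" * 16
    )


print(is_valid_trace_id("1234567890abcdef1234567890abcdef"))
print(is_valid_trace_id("1234567890ABCDEF1234567890ABCDEF"))
print(is_valid_observation_id("fedcba0987654321"))
```

IDs that fail such a check, for example uppercase hex or UUIDs with dashes, can instead be used as a seed for Langfuse.create_trace_id().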
You can retrieve these IDs:
- langfuse.get_current_trace_id(): Gets the trace ID of the currently active observation.
- langfuse.get_current_observation_id(): Gets the ID of the currently active observation.
- span_obj.trace_id and span_obj.id: Access IDs directly from a LangfuseSpan or LangfuseGeneration object.
For scenarios where you need to generate IDs outside of an active trace (e.g., to link scores to traces/observations that will be created later, or to correlate with external systems), use:
- Langfuse.create_trace_id(seed: Optional[str] = None) (static method): Generates a new trace ID. If a seed is provided, the ID is deterministic. Use the same seed to get the same ID. This is useful for correlating external IDs with Langfuse traces.
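To make the idea of a seeded, deterministic ID concrete, here is a small standalone sketch. It hashes the seed with SHA-256 and keeps the first 16 bytes, which is an assumption chosen for illustration and may differ from what Langfuse.create_trace_id() does internally; seeded_trace_id is a hypothetical name.

```python
import hashlib


def seeded_trace_id(seed: str) -> str:
    """Map an arbitrary string to a 32-char lowercase hex ID (illustrative only)."""
    digest = hashlib.sha256(seed.encode("utf-8")).digest()
    return digest[:16].hex()  # 16 bytes -> 32 hex characters


first = seeded_trace_id("req_12345")
second = seeded_trace_id("req_12345")
other = seeded_trace_id("req_67890")

print(first == second, len(first), first != other)
```

The useful property is that any service knowing the external request ID can recompute the same trace ID without coordination. In real code, prefer Langfuse.create_trace_id(seed=...) so that all parts of your system agree with the SDK.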
```python
from langfuse import get_client, Langfuse

langfuse = get_client()

# Get current IDs
with langfuse.start_as_current_observation(as_type="span", name="my-op") as current_op:
    trace_id = langfuse.get_current_trace_id()
    observation_id = langfuse.get_current_observation_id()
    print(f"Current Trace ID: {trace_id}, Current Observation ID: {observation_id}")
    print(f"From object: Trace ID: {current_op.trace_id}, Observation ID: {current_op.id}")

# Generate IDs deterministically
external_request_id = "req_12345"
deterministic_trace_id = Langfuse.create_trace_id(seed=external_request_id)
print(f"Deterministic Trace ID for {external_request_id}: {deterministic_trace_id}")
```

Linking to Existing Traces (Trace Context)
If you have a trace_id (and optionally a parent_span_id) from an external source (e.g., another service, a batch job), you can link new observations to it using the trace_context parameter. Note that OpenTelemetry offers native cross-service context propagation, so this is not necessarily required for calls between services that are instrumented with OTEL.
```python
from langfuse import get_client

langfuse = get_client()

existing_trace_id = "abcdef1234567890abcdef1234567890"  # From an upstream service
existing_parent_span_id = "fedcba0987654321"  # Optional parent span in that trace

with langfuse.start_as_current_observation(
    as_type="span",
    name="process-downstream-task",
    trace_context={
        "trace_id": existing_trace_id,
        "parent_span_id": existing_parent_span_id  # If None, this becomes a root span in the existing trace
    }
) as span:
    # This span is now part of the trace `existing_trace_id`
    # and a child of `existing_parent_span_id` if provided.
    print(f"This span's trace_id: {span.trace_id}")  # Will be existing_trace_id
    pass
```

Client Management
flush()
Manually triggers the sending of all buffered observations (spans, generations, scores, media metadata) to the Langfuse API. This is useful in short-lived scripts or before exiting an application to ensure all data is persisted.
```python
from langfuse import get_client

langfuse = get_client()

# ... create traces and observations ...

langfuse.flush()  # Ensures all pending data is sent
```

The flush() method blocks until the queued data is processed by the respective background threads.
shutdown()
Gracefully shuts down the Langfuse client. This includes:
- Flushing all buffered data (similar to flush()).
- Waiting for background threads (for data ingestion and media uploads) to finish their current tasks and terminate.
It’s crucial to call shutdown() before your application exits to prevent data loss and ensure clean resource release. The SDK automatically registers an atexit hook to call shutdown() on normal program termination, but manual invocation is recommended in scenarios like:
- Long-running daemons or services when they receive a shutdown signal.
- Applications where atexit might not reliably trigger (e.g., certain serverless environments or forceful terminations).
```python
from langfuse import get_client

langfuse = get_client()

# ... application logic ...

# Before exiting:
langfuse.shutdown()
```

Native Instrumentations
The Langfuse Python SDK has native integrations for the OpenAI and LangChain SDKs. You can also use any other OTel-based instrumentation library to automatically trace your calls in Langfuse.
OpenAI Integration
Langfuse offers a drop-in replacement for the OpenAI Python SDK to automatically trace all your OpenAI API calls. Simply change your import statement:
```diff
- import openai
+ from langfuse.openai import openai

# Your existing OpenAI code continues to work as is
# For example:
# client = openai.OpenAI()
# completion = client.chat.completions.create(...)
```

What’s automatically captured:
- Requests & Responses: All prompts/completions, including support for streaming, async operations, and function/tool calls.
- Timings: Latencies for API calls.
- Errors: API errors are captured with their details.
- Model Usage: Token counts (input, output, total).
- Cost: Estimated cost in USD (based on model and token usage).
- Media: Input audio and output audio from speech-to-text and text-to-speech endpoints.
The integration is fully interoperable with @observe and manual tracing methods (start_as_current_span, etc.). If an OpenAI call is made within an active Langfuse span, the OpenAI generation will be correctly nested under it.
Passing Langfuse arguments to OpenAI calls:
You can pass Langfuse-specific arguments directly to OpenAI client methods. These will be used to enrich the trace data.
```python
from langfuse import get_client, propagate_attributes
from langfuse.openai import openai

langfuse = get_client()
client = openai.OpenAI()

with langfuse.start_as_current_observation(as_type="span", name="qna-bot-openai") as span:
    with propagate_attributes(
        tags=["qna-bot-openai"]
    ):
        # This will be traced as a Langfuse generation
        response = client.chat.completions.create(
            name="qna-bot-openai",  # Custom name for this generation in Langfuse
            metadata={"user_tier": "premium", "request_source": "web_api"},  # will be added to the Langfuse generation
            model="gpt-4o",
            messages=[{"role": "user", "content": "What is OpenTelemetry?"}],
        )
```

Setting trace attributes via metadata:
You can set trace attributes (session_id, user_id, tags) directly on OpenAI calls using special fields in the metadata parameter:
```python
from langfuse.openai import openai

client = openai.OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
    metadata={
        "langfuse_session_id": "session_123",
        "langfuse_user_id": "user_456",
        "langfuse_tags": ["production", "chat-bot"],
        "custom_field": "additional metadata"  # Regular metadata fields work too
    }
)
```

The special metadata fields are:
- langfuse_session_id: Sets the session ID for the trace
- langfuse_user_id: Sets the user ID for the trace
- langfuse_tags: Sets tags for the trace (should be a list of strings)
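If several call sites need these special fields, a small helper can keep the key names in one place. The sketch below is only an illustration: build_langfuse_metadata is a hypothetical function, not part of the SDK, and it simply assembles a plain dict using the three keys listed above.

```python
from typing import Any, Dict, List, Optional


def build_langfuse_metadata(
    session_id: Optional[str] = None,
    user_id: Optional[str] = None,
    tags: Optional[List[str]] = None,
    **custom_fields: Any,
) -> Dict[str, Any]:
    """Assemble a metadata dict using the special langfuse_* keys."""
    metadata: Dict[str, Any] = dict(custom_fields)  # regular fields pass through
    if session_id is not None:
        metadata["langfuse_session_id"] = session_id
    if user_id is not None:
        metadata["langfuse_user_id"] = user_id
    if tags:
        metadata["langfuse_tags"] = [str(tag) for tag in tags]  # list of strings
    return metadata


metadata = build_langfuse_metadata(
    session_id="session_123",
    user_id="user_456",
    tags=["production", "chat-bot"],
    custom_field="additional metadata",
)
print(metadata)
```

The resulting dict can be passed as the metadata argument of the traced call.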
Supported Langfuse arguments: name, metadata, langfuse_prompt
Learn more in the OpenAI integration documentation.
Langchain Integration
Langfuse provides a callback handler for Langchain to trace its operations.
Setup:
Initialize the CallbackHandler and add it to your Langchain calls, either globally or per-call.
```python
from langfuse import get_client, propagate_attributes
from langfuse.langchain import CallbackHandler
from langchain_openai import ChatOpenAI  # Example LLM
from langchain_core.prompts import ChatPromptTemplate

langfuse = get_client()

# Initialize the Langfuse handler
langfuse_handler = CallbackHandler()

# Example: Using it with an LLM call
llm = ChatOpenAI(model_name="gpt-4o")
prompt = ChatPromptTemplate.from_template("Tell me a joke about {topic}")
chain = prompt | llm

with langfuse.start_as_current_observation(as_type="span", name="joke-chain") as span:
    with propagate_attributes(
        tags=["joke-chain"]
    ):
        response = chain.invoke({"topic": "cats"}, config={"callbacks": [langfuse_handler]})
        print(response)
```

Setting trace attributes via metadata:
You can set trace attributes (session_id, user_id, tags) directly during chain invocation using special fields in the metadata configuration:
```python
from langfuse.langchain import CallbackHandler
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

# Initialize the Langfuse handler
langfuse_handler = CallbackHandler()

# Create your LangChain components
llm = ChatOpenAI(model_name="gpt-4o")
prompt = ChatPromptTemplate.from_template("Tell me a joke about {topic}")
chain = prompt | llm

# Set trace attributes via metadata in chain invocation
response = chain.invoke(
    {"topic": "cats"},
    config={
        "callbacks": [langfuse_handler],
        "metadata": {
            "langfuse_session_id": "session_123",
            "langfuse_user_id": "user_456",
            "langfuse_tags": ["production", "humor-bot"],
            "custom_field": "additional metadata"  # Regular metadata fields work too
        }
    }
)
```

The special metadata fields are:
- langfuse_session_id: Sets the session ID for the trace
- langfuse_user_id: Sets the user ID for the trace
- langfuse_tags: Sets tags for the trace (should be a list of strings)
You can also pass update_trace=True to the CallbackHandler init to force a trace update with the chain’s input, output, and metadata.
What’s captured:
The callback handler maps various Langchain events to Langfuse observations:
- Chains (on_chain_start, on_chain_end, on_chain_error): Traced as spans.
- LLMs (on_llm_start, on_llm_end, on_llm_error, on_chat_model_start): Traced as generations, capturing model name, prompts, responses, and usage if available from the LLM provider.
- Tools (on_tool_start, on_tool_end, on_tool_error): Traced as spans, capturing tool input and output.
- Retrievers (on_retriever_start, on_retriever_end, on_retriever_error): Traced as spans, capturing the query and retrieved documents.
- Agents (on_agent_action, on_agent_finish): Agent actions and final finishes are captured within their parent chain/agent span.
Langfuse attempts to parse model names, usage, and other relevant details from the information provided by Langchain. The metadata argument in Langchain calls can be used to pass additional information to Langfuse, including langfuse_prompt to link with managed prompts.
Learn more in the Langchain integration documentation.
Third-party integrations
The Langfuse SDK seamlessly integrates with any third-party library that uses OpenTelemetry instrumentation. When these libraries emit spans, they are automatically captured and properly nested within your trace hierarchy. This enables unified tracing across your entire application stack without requiring any additional configuration.
For example, if you’re using OpenTelemetry-instrumented databases, HTTP clients, or other services alongside your LLM operations, all these spans will be correctly organized within your traces in Langfuse.
You can use any third-party, OTEL-based instrumentation library for Anthropic to automatically trace all your Anthropic API calls in Langfuse.
In this example, we are using the opentelemetry-instrumentation-anthropic library.
```python
from anthropic import Anthropic
from opentelemetry.instrumentation.anthropic import AnthropicInstrumentor

from langfuse import get_client

# This will automatically emit OTEL-spans for all Anthropic API calls
AnthropicInstrumentor().instrument()

langfuse = get_client()
anthropic_client = Anthropic()

with langfuse.start_as_current_observation(as_type="span", name="myspan"):
    # This will be traced as a Langfuse generation nested under the current span
    message = anthropic_client.messages.create(
        model="claude-3-7-sonnet-20250219",
        max_tokens=1024,
        messages=[{"role": "user", "content": "Hello, Claude"}],
    )

    print(message.content)

# Flush events to Langfuse in short-lived applications
langfuse.flush()
```

Learn more in the Anthropic integration documentation.
You can use the third-party, OTEL-based instrumentation library for LlamaIndex to automatically trace your LlamaIndex calls in Langfuse.
In this example, we are using the openinference-instrumentation-llama-index library.
```python
from llama_index.llms.openai import OpenAI
from openinference.instrumentation.llama_index import LlamaIndexInstrumentor

from langfuse import get_client

LlamaIndexInstrumentor().instrument()

langfuse = get_client()
llm = OpenAI(model="gpt-4o")

with langfuse.start_as_current_observation(as_type="span", name="myspan"):
    response = llm.complete("Hello, world!")
    print(response)

langfuse.flush()
```

Learn more in the LlamaIndex integration documentation.