Running agents

You can run agents via the Runner class. You have 3 options:

Runner.run(), which runs async and returns a RunResult.
Runner.run_sync(), which is a sync method and just runs .run() under the hood.
Runner.run_streamed(), which runs async and returns a RunResultStreaming. It calls the LLM in streaming mode, and streams those events to you as they are received.

    from agents import Agent, Runner

    async def main():
        agent = Agent(name="Assistant", instructions="You are a helpful assistant")

        result = await Runner.run(agent, "Write a haiku about recursion in programming.")
        print(result.final_output)
        # Code within the code,
        # Functions calling themselves,
        # Infinite loop's dance

The agent loop

When you use the run method in Runner, you pass in a starting agent and input. The input can either be a string (which is considered a user message), or a list of input items, which are the items in the OpenAI Responses API.

The runner then runs a loop:

1. We call the LLM for the current agent, with the current input.
2. The LLM produces its output.
    1. If the LLM returns a final_output, the loop ends and we return the result.
    2. If the LLM does a handoff, we update the current agent and input, and re-run the loop.
    3. If the LLM produces tool calls, we run those tool calls, append the results, and re-run the loop.
3. If we exceed the max_turns passed, we raise a MaxTurnsExceeded exception.

Note

The LLM output counts as a "final output" when it produces text output of the desired type and contains no tool calls.
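The loop described above can be sketched in plain Python. This is a simplified model for illustration, not the SDK's implementation: ToyAgent, ModelOutput, run_loop, and the fake model functions are hypothetical names, and MaxTurnsExceeded here is a local stand-in for the SDK exception of the same name.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


class MaxTurnsExceeded(Exception):
    """Local stand-in for the SDK exception of the same name."""


@dataclass
class ModelOutput:
    """What one (fake) LLM call produced."""
    text: Optional[str] = None
    tool_calls: List[str] = field(default_factory=list)
    handoff_to: Optional[str] = None


@dataclass
class ToyAgent:
    name: str
    model: Callable[[List[str]], ModelOutput]  # fake LLM: items in, output out
    tools: Dict[str, Callable[[], str]] = field(default_factory=dict)


def run_loop(agents: Dict[str, ToyAgent], start: str, user_input: str,
             max_turns: int = 10) -> str:
    current = agents[start]
    items = [f"user: {user_input}"]
    for _ in range(max_turns):
        output = current.model(items)
        # Final output: text, with no tool calls and no handoff.
        if output.text is not None and not output.tool_calls and output.handoff_to is None:
            return output.text
        # Handoff: switch the current agent and re-run the loop.
        if output.handoff_to is not None:
            items.append(f"handoff: {current.name} -> {output.handoff_to}")
            current = agents[output.handoff_to]
            continue
        # Tool calls: run them, append the results, and re-run the loop.
        for tool_name in output.tool_calls:
            items.append(f"tool {tool_name}: {current.tools[tool_name]()}")
    raise MaxTurnsExceeded(f"Max turns ({max_turns}) exceeded")


def triage_model(items: List[str]) -> ModelOutput:
    return ModelOutput(handoff_to="weather")


def weather_model(items: List[str]) -> ModelOutput:
    if not any(item.startswith("tool get_weather") for item in items):
        return ModelOutput(tool_calls=["get_weather"])
    return ModelOutput(text="Forecast: sunny")


agents = {
    "triage": ToyAgent("triage", triage_model),
    "weather": ToyAgent("weather", weather_model, {"get_weather": lambda: "sunny"}),
}
answer = run_loop(agents, "triage", "What is the weather?")
print(answer)
```

In this sketch the run takes three turns (handoff, tool call, final output), so calling run_loop with max_turns=2 raises the stand-in MaxTurnsExceeded.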

Streaming

Streaming allows you to additionally receive streaming events as the LLM runs. Once the stream is done, the RunResultStreaming will contain the complete information about the run, including all the new outputs produced. You can call .stream_events() for the streaming events. Read more in the streaming guide.
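The consume-then-inspect pattern can be illustrated with a small asyncio sketch. ToyStreamingResult is a hypothetical stand-in, not the SDK's RunResultStreaming: it only models the idea that events arrive one at a time and that the complete output is available once the stream has been exhausted.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, List, Optional


@dataclass
class ToyStreamingResult:
    """Hypothetical stand-in for a streaming run result."""
    chunks: List[str]
    final_output: Optional[str] = None
    is_complete: bool = False

    async def stream_events(self) -> AsyncIterator[str]:
        collected = []
        for chunk in self.chunks:
            await asyncio.sleep(0)  # yield control, as a network read would
            collected.append(chunk)
            yield chunk
        # Only after the last event is the full result available.
        self.final_output = "".join(collected)
        self.is_complete = True


async def consume(result: ToyStreamingResult) -> List[str]:
    seen = []
    async for delta in result.stream_events():
        seen.append(delta)  # a UI would render each delta here
    return seen


result = ToyStreamingResult(["Code ", "within ", "the code"])
deltas = asyncio.run(consume(result))
print(result.final_output)
```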

Run config

The run_config parameter lets you configure some global settings for the agent run:

model: Allows setting a global LLM model to use, irrespective of what model each Agent has.
model_provider: A model provider for looking up model names, which defaults to OpenAI.
model_settings: Overrides agent-specific settings. For example, you can set a global temperature or top_p.
session_settings: Overrides session-level defaults (for example, SessionSettings(limit=...)) when retrieving history during a run.
input_guardrails, output_guardrails: A list of input or output guardrails to include on all runs.
handoff_input_filter: A global input filter to apply to all handoffs, if the handoff doesn't already have one. The input filter allows you to edit the inputs that are sent to the new agent. See the documentation in Handoff.input_filter for more details.
nest_handoff_history: Opt-in beta that collapses the prior transcript into a single assistant message before invoking the next agent. This is disabled by default while we stabilize nested handoffs; set it to True to enable it, or leave it False to pass through the raw transcript. All Runner methods automatically create a RunConfig when you do not pass one, so the quickstarts and examples keep the default off, and any explicit Handoff.input_filter callbacks continue to override it. Individual handoffs can override this setting via Handoff.nest_handoff_history.
handoff_history_mapper: Optional callable that receives the normalized transcript (history + handoff items) whenever you opt in to nest_handoff_history. It must return the exact list of input items to forward to the next agent, allowing you to replace the built-in summary without writing a full handoff filter.
tracing_disabled: Allows you to disable tracing for the entire run.
tracing: Pass a TracingConfig to override exporters, processors, or tracing metadata for this run.
trace_include_sensitive_data: Configures whether traces will include potentially sensitive data, such as LLM and tool call inputs/outputs.
workflow_name, trace_id, group_id: Sets the tracing workflow name, trace ID and trace group ID for the run. We recommend at least setting workflow_name. The group ID is an optional field that lets you link traces across multiple runs.
trace_metadata: Metadata to include on all traces.
session_input_callback: Customize how new user input is merged with session history before each turn when using Sessions.
call_model_input_filter: Hook to edit the fully prepared model input (instructions and input items) immediately before the model call, e.g., to trim history or inject a system prompt.
tool_error_formatter: Customize the model-visible message when a tool call is rejected during approval flows.

Nested handoffs are available as an opt-in beta. Enable the collapsed-transcript behavior by passing RunConfig(nest_handoff_history=True), or set handoff(..., nest_handoff_history=True) to turn it on for a specific handoff. If you prefer to keep the raw transcript (the default), leave the flag unset or provide a handoff_input_filter (or handoff_history_mapper) that forwards the conversation exactly as you need. To change the wrapper text used in the generated summary without writing a custom mapper, call set_conversation_history_wrappers (and reset_conversation_history_wrappers to restore the defaults).
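To make the model_settings override in the list above concrete, here is a minimal sketch of how run-level settings could be layered over agent-level ones. ToySettings and resolve are hypothetical names, and the rule that only explicitly set (non-None) run-level fields win is an assumption made for this illustration, not a statement of the SDK's exact merge logic.

```python
from dataclasses import dataclass, fields, replace
from typing import Optional


@dataclass(frozen=True)
class ToySettings:
    """Hypothetical settings object; None means 'not set'."""
    temperature: Optional[float] = None
    top_p: Optional[float] = None
    max_tokens: Optional[int] = None


def resolve(agent_settings: ToySettings,
            run_settings: Optional[ToySettings]) -> ToySettings:
    """Overlay run-level settings on agent-level ones (assumed rule)."""
    if run_settings is None:
        return agent_settings
    overrides = {
        f.name: getattr(run_settings, f.name)
        for f in fields(run_settings)
        if getattr(run_settings, f.name) is not None
    }
    return replace(agent_settings, **overrides)


agent_level = ToySettings(temperature=0.9, max_tokens=256)
run_level = ToySettings(temperature=0.2)
effective = resolve(agent_level, run_level)
print(effective)
```

Under that assumption, the run-level temperature replaces the agent's value while the agent's max_tokens is kept.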

Conversations/chat threads

Calling any of the run methods can result in one or more agents running (and hence one or more LLM calls), but it represents a single logical turn in a chat conversation. For example:

User turn: user enters text
Runner run: first agent calls LLM, runs tools, does a handoff to a second agent, second agent runs more tools, and then produces an output.

At the end of the agent run, you can choose what to show to the user. For example, you might show the user every new item generated by the agents, or just the final output. Either way, the user might then ask a follow-up question, in which case you can call the run method again.

Manual conversation management

You can manually manage conversation history using the RunResultBase.to_input_list() method to get the inputs for the next turn:

    from agents import Agent, Runner, trace

    async def main():
        agent = Agent(name="Assistant", instructions="Reply very concisely.")

        thread_id = "thread_123"  # Example thread ID
        with trace(workflow_name="Conversation", group_id=thread_id):
            # First turn
            result = await Runner.run(agent, "What city is the Golden Gate Bridge in?")
            print(result.final_output)
            # San Francisco

            # Second turn: resend the previous items plus the new user message
            new_input = result.to_input_list() + [{"role": "user", "content": "What state is it in?"}]
            result = await Runner.run(agent, new_input)
            print(result.final_output)
            # California

Automatic conversation management with Sessions

For a simpler approach, you can use Sessions to automatically handle conversation history without manually calling .to_input_list():

    from agents import Agent, Runner, SQLiteSession, trace

    async def main():
        agent = Agent(name="Assistant", instructions="Reply very concisely.")

        # Create session instance
        session = SQLiteSession("conversation_123")

        thread_id = "thread_123"  # Example thread ID
        with trace(workflow_name="Conversation", group_id=thread_id):
            # First turn
            result = await Runner.run(agent, "What city is the Golden Gate Bridge in?", session=session)
            print(result.final_output)
            # San Francisco

            # Second turn - agent automatically remembers previous context
            result = await Runner.run(agent, "What state is it in?", session=session)
            print(result.final_output)
            # California

Sessions automatically:

Retrieve conversation history before each run
Store new messages after each run
Maintain separate conversations for different session IDs

See the Sessions documentation for more details.
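The three behaviors listed above can be modeled with the standard library alone. The sketch below is a hypothetical illustration, not the SDK's SQLiteSession: ToySession, run_turn, and count_messages are made-up names, and the table layout is an assumption chosen for brevity.

```python
import json
import sqlite3
from typing import Callable, Dict, List

Message = Dict[str, str]


class ToySession:
    """Hypothetical session store keyed by session ID."""

    def __init__(self, session_id: str, conn: sqlite3.Connection) -> None:
        self.session_id = session_id
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS messages ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "session_id TEXT NOT NULL, item TEXT NOT NULL)"
        )

    def get_items(self) -> List[Message]:
        rows = self.conn.execute(
            "SELECT item FROM messages WHERE session_id = ? ORDER BY id",
            (self.session_id,),
        ).fetchall()
        return [json.loads(row[0]) for row in rows]

    def add_items(self, items: List[Message]) -> None:
        self.conn.executemany(
            "INSERT INTO messages (session_id, item) VALUES (?, ?)",
            [(self.session_id, json.dumps(item)) for item in items],
        )
        self.conn.commit()


def run_turn(session: ToySession, user_text: str,
             reply_fn: Callable[[List[Message]], str]) -> str:
    history = session.get_items()                      # retrieve before the run
    new_user = {"role": "user", "content": user_text}
    reply = reply_fn(history + [new_user])             # stand-in for the agent run
    session.add_items([new_user, {"role": "assistant", "content": reply}])  # store after
    return reply


def count_messages(messages: List[Message]) -> str:
    return f"I have seen {len(messages)} message(s)."


conn = sqlite3.connect(":memory:")
first_chat = ToySession("conversation_123", conn)
second_chat = ToySession("conversation_456", conn)

first = run_turn(first_chat, "What city is the Golden Gate Bridge in?", count_messages)
second = run_turn(first_chat, "What state is it in?", count_messages)
other = run_turn(second_chat, "Hello", count_messages)
print(first, second, other, sep="\n")
```

The second turn in conversation_123 sees the two stored messages plus the new one, while conversation_456 starts from an empty history.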

Server-managed conversations

You can also let the OpenAI conversation state feature manage conversation state on the server side, instead of handling it locally with to_input_list() or Sessions. This allows you to preserve conversation history without manually resending all past messages. See the OpenAI Conversation state guide for more details.

OpenAI provides two ways to track state across turns:

1. Using `conversation_id`

You first create a conversation using the OpenAI Conversations API and then reuse its ID for every subsequent call:

    from agents import Agent, Runner
    from openai import AsyncOpenAI

    client = AsyncOpenAI()

    async def main():
        agent = Agent(name="Assistant", instructions="Reply very concisely.")

        # Create a server-managed conversation
        conversation = await client.conversations.create()
        conv_id = conversation.id

        while True:
            user_input = input("You: ")
            result = await Runner.run(agent, user_input, conversation_id=conv_id)
            print(f"Assistant: {result.final_output}")

2. Using `previous_response_id`

Another option is response chaining, where each turn links explicitly to the response ID from the previous turn.

    from agents import Agent, Runner

    async def main():
        agent = Agent(name="Assistant", instructions="Reply very concisely.")

        previous_response_id = None

        while True:
            user_input = input("You: ")

            # Setting auto_previous_response_id=True enables response chaining automatically
            # for the first turn, even when there's no actual previous response ID yet.
            result = await Runner.run(
                agent,
                user_input,
                previous_response_id=previous_response_id,
                auto_previous_response_id=True,
            )
            previous_response_id = result.last_response_id
            print(f"Assistant: {result.final_output}")

Call model input filter

Use call_model_input_filter to edit the model input right before the model call. The hook receives the current agent, context, and the combined input items (including session history when present) and returns a new ModelInputData.

    from agents import Agent, Runner, RunConfig
    from agents.run import CallModelData, ModelInputData

    def drop_old_messages(data: CallModelData[None]) -> ModelInputData:
        # Keep only the last 5 items and preserve existing instructions.
        trimmed = data.model_data.input[-5:]
        return ModelInputData(input=trimmed, instructions=data.model_data.instructions)

    agent = Agent(name="Assistant", instructions="Answer concisely.")
    result = Runner.run_sync(
        agent,
        "Explain quines",
        run_config=RunConfig(call_model_input_filter=drop_old_messages),
    )

Set the hook per run via run_config or as a default on your Runner to redact sensitive data, trim long histories, or inject additional system guidance.
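As an illustration of the redaction and trimming use cases, the sketch below shows the kind of transformation such a hook might apply to its input items. It works on plain dicts so that it runs without the SDK; redact_and_trim, EMAIL_RE, and the assumption that each item carries a string "content" field are hypothetical choices made for this example. Inside a real filter, the returned list would be passed as the input of the new ModelInputData.

```python
import re
from typing import Any, Dict, List

# Deliberately simple pattern for the example; real redaction needs more care.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def redact_and_trim(items: List[Dict[str, Any]], keep_last: int = 5) -> List[Dict[str, Any]]:
    """Keep the most recent items and mask email addresses in string content."""
    trimmed = items[-keep_last:]
    cleaned = []
    for item in trimmed:
        content = item.get("content")
        if isinstance(content, str):
            # Build a new dict so the caller's history is not mutated.
            item = {**item, "content": EMAIL_RE.sub("[redacted email]", content)}
        cleaned.append(item)
    return cleaned


history = [{"role": "user", "content": f"message {i}"} for i in range(6)]
history.append({"role": "user", "content": "Reach me at jane.doe@example.com please"})

filtered = redact_and_trim(history)
print(filtered[-1]["content"])
```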

Error handlers

All Runner entry points accept error_handlers, a dict keyed by error kind. Today, the supported key is "max_turns". Use it when you want to return a controlled final output instead of raising MaxTurnsExceeded.

    from agents import (
        Agent,
        RunErrorHandlerInput,
        RunErrorHandlerResult,
        Runner,
    )

    agent = Agent(name="Assistant", instructions="Be concise.")

    def on_max_turns(_data: RunErrorHandlerInput[None]) -> RunErrorHandlerResult:
        return RunErrorHandlerResult(
            final_output="I couldn't finish within the turn limit. Please narrow the request.",
            include_in_history=False,
        )

    result = Runner.run_sync(
        agent,
        "Analyze this long transcript",
        max_turns=3,
        error_handlers={"max_turns": on_max_turns},
    )
    print(result.final_output)

Set include_in_history=False when you do not want the fallback output appended to conversation history.

Long running agents & human-in-the-loop

For tool approval pause/resume patterns, see the dedicated Human-in-the-loop guide.

Temporal

You can use the Agents SDK Temporal integration to run durable, long-running workflows, including human-in-the-loop tasks. View a demo of Temporal and the Agents SDK completing long-running tasks in this video, and view the docs here.

Restate

You can use the Agents SDK Restate integration for lightweight, durable agents, including human approval, handoffs, and session management. The integration requires Restate's single-binary runtime as a dependency, and supports running agents as processes/containers or serverless functions. Read the overview or view the docs for more details.

DBOS

You can use the Agents SDK DBOS integration to run reliable agents that preserve progress across failures and restarts. It supports long-running agents, human-in-the-loop workflows, and handoffs, with both sync and async methods. The integration requires only a SQLite or Postgres database. View the integration repo and the docs for more details.

Exceptions

The SDK raises exceptions in certain cases. The full list is in agents.exceptions. As an overview:

AgentsException: This is the base class for all exceptions raised within the SDK. It serves as a generic type from which all other specific exceptions are derived.
MaxTurnsExceeded: This exception is raised when the agent's run exceeds the max_turns limit passed to the Runner.run, Runner.run_sync, or Runner.run_streamed methods. It indicates that the agent could not complete its task within the specified number of interaction turns.
ModelBehaviorError: This exception occurs when the underlying model (LLM) produces unexpected or invalid outputs. This can include:
- Malformed JSON: When the model provides a malformed JSON structure for tool calls or in its direct output, especially if a specific output_type is defined.
- Unexpected tool-related failures: When the model fails to use tools in an expected manner.
UserError: This exception is raised when you (the person writing code using the SDK) make an error while using the SDK. This typically results from incorrect code implementation, invalid configuration, or misuse of the SDK's API.
InputGuardrailTripwireTriggered, OutputGuardrailTripwireTriggered: These exceptions are raised when the conditions of an input guardrail or output guardrail are met, respectively. Input guardrails check incoming messages before processing, while output guardrails check the agent's final response before delivery.
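Because every exception derives from AgentsException, handlers are usually ordered from most specific to most general. The sketch below shows that ordering with local stand-in classes that mirror the hierarchy described above, so it runs without the SDK; in real code the classes would be imported from agents.exceptions. run_safely and the fallback messages are hypothetical.

```python
from typing import Callable


# Local stand-ins mirroring the hierarchy described above.
class AgentsException(Exception):
    pass


class MaxTurnsExceeded(AgentsException):
    pass


class ModelBehaviorError(AgentsException):
    pass


class UserError(AgentsException):
    pass


def run_safely(task: Callable[[], str]) -> str:
    """Run a task and map SDK-style failures to user-facing messages."""
    try:
        return task()
    except MaxTurnsExceeded:
        return "The request took too many steps. Please narrow it."
    except ModelBehaviorError:
        return "The model returned an invalid response. Please retry."
    except AgentsException as exc:
        # Catch-all for any other SDK exception, including UserError.
        return f"Agent run failed: {exc}"


def too_many_turns() -> str:
    raise MaxTurnsExceeded("Max turns (3) exceeded")


def misconfigured() -> str:
    raise UserError("invalid configuration")


print(run_safely(lambda: "done"))
print(run_safely(too_many_turns))
print(run_safely(misconfigured))
```

Placing the AgentsException clause last matters: if it came first, it would catch the more specific subclasses before their own handlers could run.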
