# ldp

Framework enabling modular interchange of language agents, environments, and optimizers.
A framework for constructing language model agents and training them on constructive tasks.
This repo models agent-environment interactions using a Partially Observable Markov Decision Process (POMDP). Inspired by POMDP, this repo's name `ldp` stands for Language Decision Processes.
To install `ldp`:

```bash
pip install -e .
```

If you plan to export Graphviz visualizations, make sure you also install the `graphviz` library into your OS via:

- Linux: `apt install graphviz`
- macOS: `brew install graphviz`
An agent is something that interacts with an environment (defined in our other GitHub repo, Future-House/aviary).

An agent uses tools in response to observations, which are natural language messages. An agent has two functions:
```python
agent_state = await agent.init_state(tools=tools)
new_action, new_agent_state, value = await agent.get_asv(agent_state, obs)
```
`get_asv(agent_state, obs)` chooses an action (`a`) conditioned on the observation messages, and returns the next agent state (`s`) and a value estimate (`v`). The first argument, `agent_state`, is a state specific to the agent. The state is kept outside of the agent so that agents are functional, enabling batching across environments. You can set the state to `None` if you aren't using it. It could contain things like memory, e.g. a list of previous observations and actions.
The `obs` are not the complete list of all prior observations, but rather the return of `env.step`. Usually the state should keep track of these.
The value is the agent's estimate of the state-action value; it can default to 0. It is used for training with reinforcement learning.
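To make this contract concrete, here is a minimal, hypothetical agent written in plain Python (it does not use `ldp` and exists only for illustration): the state lives outside the agent, so one agent instance can drive many episodes concurrently.

```python
import asyncio


class EchoAgent:
    """Toy agent showing the (action, state, value) contract; not part of ldp."""

    async def init_state(self, tools):
        return []  # state: a list of prior messages

    async def get_asv(self, agent_state, obs):
        new_state = agent_state + obs  # functional update: no mutation
        action = f"ack:{obs[-1]}"      # trivial stand-in for a tool call
        return action, new_state, 0.0  # value estimate defaults to 0


async def main():
    agent = EchoAgent()
    # The same agent instance drives two independent episodes.
    s1 = await agent.init_state(tools=[])
    s2 = await agent.init_state(tools=[])
    a1, s1, _ = await agent.get_asv(s1, ["hello"])
    a2, s2, _ = await agent.get_asv(s2, ["world"])
    return a1, s1, a2, s2


print(asyncio.run(main()))
```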
You can just emit actions directly if you want:
```python
from aviary.core import ToolCall


def get_asv(agent_state, obs):
    action = ToolCall.from_name("calculator_tool", x="3 * 2")
    return action, agent_state, 0
```
but likely you want to do something more sophisticated. Here's how our `SimpleAgent` - which just relies on a single LLM call - works (typing omitted):
```python
from ldp.agent import Agent
from ldp.graph import LLMCallOp


class AgentState:
    def __init__(self, messages, tools):
        self.messages = messages
        self.tools = tools


class SimpleAgent(Agent):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.llm_call_op = LLMCallOp()

    async def init_state(self, tools):
        return AgentState([], tools)

    async def get_asv(self, agent_state, obs):
        action = await self.llm_call_op(
            config={"name": "gpt-4o", "temperature": 0.1},
            msgs=agent_state.messages + obs,
            tools=agent_state.tools,
        )
        new_state = AgentState(
            messages=agent_state.messages + obs + [action],
            tools=agent_state.tools,
        )
        return action, new_state, 0.0
```
Notice how it's pretty simple. We have to do some bookkeeping - namely appending messages as they come and passing tools. There is no magic here.
We do have a compute graph, which helps if you want to differentiate with respect to parameters inside your agent (including possibly the LLM). If your compute graph looks like the above example, where all you do is call an LLM directly, then don't worry about this.
If you want to build more complex agents and train them, then read on. Let's start with an example compute graph:
```python
from ldp.graph import FxnOp, LLMCallOp, PromptOp, compute_graph

op_a = FxnOp(lambda x: 2 * x)

async with compute_graph():
    op_result = await op_a(3)
```
This creates a compute graph and executes it. The compute graph is silly - just doubles the input. The compute graph executions and gradients are saved in a context for later use, like training updates. For example:
```python
print(op_result.compute_grads())
```
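Conceptually, the context acts like an autodiff tape: each op call records enough information to apply the chain rule afterwards. As a rough plain-Python analogy (this is not `ldp`'s API or internals, just an illustration of the idea):

```python
# Minimal analogy of a recorded compute graph: each call logs (input, output, d_out/d_in)
tape = []


def make_op(f, dfdx):
    def call(x):
        y = f(x)
        tape.append((x, y, dfdx(x)))  # record for a later backward pass
        return y

    return call


double = make_op(lambda x: 2 * x, lambda x: 2)
result = double(3)

# "Backward pass": multiply local derivatives along the recorded chain
grad = 1.0
for _, _, local_grad in reversed(tape):
    grad *= local_grad

print(result, grad)  # 6 2.0
```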
Now, inside the `SimpleAgent` example above, you can see some of the compute graph. Let's see a more complex example for an agent that has a memory it can draw upon.
```python
@compute_graph()
async def get_asv(self, agent_state, obs):
    # Update state with new observations
    next_state = agent_state.get_next_state(obs)

    # Retrieve relevant memories
    query = await self._query_factory_op(next_state.messages)
    memories = await self._memory_op(query, matches=self.num_memories)

    # Format memories and package messages
    formatted_memories = await self._format_memory_op(self.memory_prompt, memories)
    memory_prompt = await self._prompt_op(memories=formatted_memories)
    packaged_messages = await self._package_op(
        next_state.messages,
        memory_prompt=memory_prompt,
        use_memories=bool(memories),
    )

    # Make LLM call and update state
    config = await self._config_op()
    result = await self._llm_call_op(
        config, msgs=packaged_messages, tools=next_state.tools
    )
    next_state.messages.extend([result])

    return result, next_state, 0.0
```
You can see in this example that we use differentiable ops to ensure there is a connection in the compute graph from the LLM result (action) back to things like the memory retrieval and the query used to retrieve the memory.
Why use a compute graph? Aside from a gradient, using the compute graph enables the tracking of all inputs/outputs to the ops and serialization/deserialization of the compute graph so that you can easily save/load them. The tracking of input/outputs also makes it easier to do things like fine-tuning or reinforcement learning on the underlying LLMs.
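For intuition on why the input/output tracking helps: once every op call is logged with its inputs and outputs, that log can be serialized and later mined, e.g. to build fine-tuning pairs from the LLM calls. A hypothetical sketch (the log format below is invented for illustration, not `ldp`'s):

```python
import json

# Hypothetical call log, as a compute graph context might record it
call_log = [
    {"op": "llm_call_op", "inputs": {"msgs": ["What is 3 * 2?"]}, "output": "6"},
    {"op": "fxn_op", "inputs": {"x": 3}, "output": 6},
]

# Serialize to a string (or disk)...
serialized = json.dumps(call_log)

# ...and later reload, e.g. to extract (prompt, completion) pairs from LLM calls
reloaded = json.loads(serialized)
pairs = [(c["inputs"], c["output"]) for c in reloaded if c["op"] == "llm_call_op"]
print(pairs)
```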
The `Agent` (as well as the classes in `agent.ops`) is generic, which means:

- `Agent` is designed to support arbitrary types
- Subclasses can exactly specify state types, making the code more readable

If you are new to Python generics (`typing.Generic`), please read about them in [Python typing](https://docs.python.org/3/library/typing.html).
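If generics are new to you, here is a tiny self-contained illustration of the pattern (the `Agent` below is a toy stand-in, not `ldp`'s class):

```python
import asyncio
from typing import Generic, TypeVar

TState = TypeVar("TState")


class Agent(Generic[TState]):
    """Toy stand-in for ldp's Agent, generic over its state type."""

    async def init_state(self, tools) -> TState:
        raise NotImplementedError


class ListAgent(Agent[list]):
    """Pins TState to list, so type checkers know init_state returns a list."""

    async def init_state(self, tools) -> list:
        return []


state = asyncio.run(ListAgent().init_state(tools=[]))
print(type(state).__name__)  # list
```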
Below is how to specify an agent with a custom state type.
```python
from dataclasses import dataclass, field
from datetime import datetime

from ldp.agent import Agent


@dataclass
class MyComplexState:
    vector: list[float]
    timestamp: datetime = field(default_factory=datetime.now)


class MyAgent(Agent[MyComplexState]):
    """Some agent who is now type checked to match the custom state."""
```
Here is an example of running an agent against aviary's `DummyEnv`:

```python
from ldp.agent import SimpleAgent
from aviary.env import DummyEnv

env = DummyEnv()
agent = SimpleAgent()

obs, tools = await env.reset()
agent_state = await agent.init_state(tools=tools)

done = False
while not done:
    action, agent_state, _ = await agent.get_asv(agent_state, obs)
    obs, reward, done, truncated = await env.step(action.value)
```
See a tutorial of building and running an agent for GSM8K.