Multi-agent reinforcement learning framework
blavad/marl
MARL is a high-level multi-agent reinforcement learning library, written in Python.
Project documentation: [DOC]
```bash
git clone https://github.com/blavad/marl.git
cd marl
pip install -e .
```
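Single-agent algorithms (✔️ implemented, ❌ not implemented):

| Q-learning | DQN | Actor-Critic | DDPG | TD3 |
|---|---|---|---|---|
| ✔️ | ✔️ | ✔️ | ✔️ | ❌ |

Multi-agent algorithms (✔️ implemented, ❌ not implemented):

| minimaxQ | PHC | JAL | MAAC | MADDPG |
|---|---|---|---|---|
| ✔️ | ✔️ | ❌ | ✔️ | ✔️ |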
```python
import marl

# Check available agents
print("\n| Agents\t\t", list(marl.agent.available()))

# Check available policies
print("\n| Policies\t\t", list(marl.policy.available()))

# Check available models
print("\n| Models\t\t", list(marl.model.available()))

# Check available exploration processes
print("\n| Expl. Processes\t", list(marl.exploration.available()))

# Check available experience memories
print("\n| Experience Memory\t", list(marl.experience.available()))
```
```python
import marl
from marl.agent import DQNAgent
from marl.model.nn import MlpNet

import gym

env = gym.make("LunarLander-v2")

obs_s = env.observation_space
act_s = env.action_space

# LunarLander-v2 has an 8-dimensional observation and 4 discrete actions
mlp_model = MlpNet(8, 4, hidden_size=[64, 32])

dqn_agent = DQNAgent(
    mlp_model,
    obs_s,
    act_s,
    experience="ReplayMemory-5000",
    exploration="EpsGreedy",
    lr=0.001,
    name="DQN-LunarLander",
)

# Train the agent for 100 000 timesteps
dqn_agent.learn(env, nb_timesteps=100000)

# Test the agent for 10 episodes
dqn_agent.test(env, nb_episodes=10)
```
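The DQN example selects its experience memory with the string `"ReplayMemory-5000"`. As a rough illustration of what a fixed-capacity replay memory does, here is a minimal standalone sketch. It assumes the `5000` is the buffer capacity and that sampling is uniform; the class `SimpleReplayMemory` and the `Transition` fields are hypothetical names for this illustration and are not part of the `marl` API.

```python
import random
from collections import deque, namedtuple

# Hypothetical transition record; field names are illustrative, not marl's.
Transition = namedtuple("Transition", ["obs", "action", "reward", "next_obs", "done"])


class SimpleReplayMemory:
    """Fixed-capacity FIFO buffer with uniform random sampling (illustrative only)."""

    def __init__(self, capacity):
        # A deque with maxlen evicts the oldest entry once capacity is reached
        self.buffer = deque(maxlen=capacity)

    def push(self, obs, action, reward, next_obs, done):
        self.buffer.append(Transition(obs, action, reward, next_obs, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement; needs batch_size <= len(self)
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


memory = SimpleReplayMemory(capacity=5000)

# Push more transitions than the capacity to show eviction of old entries
for t in range(6000):
    memory.push(obs=t, action=0, reward=0.0, next_obs=t + 1, done=False)

batch = memory.sample(32)
```

After the loop the buffer holds only the 5000 most recent transitions, so the agent trains on recent experience while still breaking the temporal correlation between consecutive steps.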
```python
import marl
from marl import MARL
from marl.agent import MinimaxQAgent
from marl.exploration import EpsGreedy

# Environment available at https://github.com/blavad/soccer
from soccer import DiscreteSoccerEnv

env = DiscreteSoccerEnv(nb_pl_team1=1, nb_pl_team2=1)

obs_s = env.observation_space
act_s = env.action_space

# Custom exploration process, one per agent
expl1 = EpsGreedy(eps_deb=1., eps_fin=.3)
expl2 = EpsGreedy(eps_deb=1., eps_fin=.3)

# Create two minimax-Q agents
q_agent1 = MinimaxQAgent(obs_s, act_s, act_s, exploration=expl1, gamma=0.9, lr=0.001, name="SoccerJ1")
q_agent2 = MinimaxQAgent(obs_s, act_s, act_s, exploration=expl2, gamma=0.9, lr=0.001, name="SoccerJ2")

# Create the trainable multi-agent system
mas = MARL(agents_list=[q_agent1, q_agent2])

# Assign the MAS to each agent
q_agent1.set_mas(mas)
q_agent2.set_mas(mas)

# Train the agents for 100 000 timesteps
mas.learn(env, nb_timesteps=100000)

# Test the agents for 10 episodes
mas.test(env, nb_episodes=10, time_laps=0.5)
```
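The soccer example configures exploration with `EpsGreedy(eps_deb=1., eps_fin=.3)`. The standalone sketch below shows one way an epsilon-greedy process with a start and end value can behave. It assumes `eps_deb` and `eps_fin` are the initial and final exploration rates and that the decay between them is linear; both are assumptions, and the library's actual schedule may differ. The class `LinearEpsGreedy` and its `nb_steps` parameter are hypothetical and not part of `marl`.

```python
import random


class LinearEpsGreedy:
    """Epsilon-greedy action selection with an assumed linear decay (illustrative only)."""

    def __init__(self, eps_deb=1.0, eps_fin=0.1, nb_steps=1000):
        self.eps_deb = eps_deb    # exploration rate at step 0
        self.eps_fin = eps_fin    # exploration rate from step nb_steps onward
        self.nb_steps = nb_steps  # hypothetical length of the decay phase

    def epsilon(self, t):
        # Fraction of the decay phase completed, clipped to [0, 1]
        frac = min(max(t / self.nb_steps, 0.0), 1.0)
        return self.eps_deb + frac * (self.eps_fin - self.eps_deb)

    def select(self, q_values, t):
        # Explore: pick a uniformly random action with probability epsilon(t)
        if random.random() < self.epsilon(t):
            return random.randrange(len(q_values))
        # Exploit: pick the action with the highest estimated value
        return max(range(len(q_values)), key=q_values.__getitem__)


expl = LinearEpsGreedy(eps_deb=1.0, eps_fin=0.3, nb_steps=100)
schedule = [expl.epsilon(t) for t in (0, 50, 100, 1000)]

# With epsilon fixed at 0 the selection is purely greedy
greedy = LinearEpsGreedy(eps_deb=0.0, eps_fin=0.0, nb_steps=10)
best_action = greedy.select([0.1, 0.9, 0.2], t=0)
```

Keeping `eps_fin` well above zero, as in the soccer example, means the agents keep exploring throughout training, which can help in adversarial settings where the opponent's behaviour keeps changing.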