# blavad/marl

Multi-agent reinforcement learning framework


MARL is a high-level multi-agent reinforcement learning library, written in Python.

Project doc: [DOC]

## Installation

```bash
git clone https://github.com/blavad/marl.git
cd marl
pip install -e .
```

## Implemented algorithms

### Single-agent algorithms

| Q-learning | DQN | Actor-Critic | DDPG | TD3 |
| :---: | :---: | :---: | :---: | :---: |
| ✔️ | ✔️ | ✔️ | ✔️ |  |

### Multi-agent algorithms

| minimaxQ | PHC | JAL | MAAC | MADDPG |
| :---: | :---: | :---: | :---: | :---: |
| ✔️ | ✔️ | ✔️ | ✔️ |  |
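For orientation, the tabular Q-learning listed first above boils down to a single update rule. This is an illustrative sketch in plain Python, not code from the marl library (the function name `q_update` and the dict-based table are inventions for this example):

```python
# Sketch of the tabular Q-learning update (not part of the marl library):
# Q(s, a) <- Q(s, a) + lr * (r + gamma * max_a' Q(s', a') - Q(s, a))
def q_update(Q, s, a, r, s_next, actions, gamma=0.9, lr=0.1):
    """Apply one Q-learning step to the Q-table dict in place."""
    # Greedy value of the next state over all available actions
    best_next = max(Q.get((s_next, a2), 0.0) for a2 in actions)
    td_target = r + gamma * best_next
    Q[(s, a)] = Q.get((s, a), 0.0) + lr * (td_target - Q.get((s, a), 0.0))
    return Q

Q = {}
q_update(Q, s=0, a=1, r=1.0, s_next=2, actions=[0, 1])
print(Q[(0, 1)])  # 0.1 * (1.0 + 0.9 * 0 - 0) = 0.1
```

The deep variants in the table (DQN, DDPG, TD3) replace the table `Q` with a neural network and add machinery such as replay memory and target networks on top of this same temporal-difference target.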

## Examples

### Check existing methods

```python
import marl

# Check available agents
print("\n| Agents\t\t", list(marl.agent.available()))

# Check available policies
print("\n| Policies\t\t", list(marl.policy.available()))

# Check available models
print("\n| Models\t\t", list(marl.model.available()))

# Check available exploration processes
print("\n| Expl. Processes\t", list(marl.exploration.available()))

# Check available experience memories
print("\n| Experience Memory\t", list(marl.experience.available()))
```

### Train a single agent with the DQN algorithm

```python
import gym

import marl
from marl.agent import DQNAgent
from marl.model.nn import MlpNet

env = gym.make("LunarLander-v2")
obs_s = env.observation_space
act_s = env.action_space

mlp_model = MlpNet(8, 4, hidden_size=[64, 32])

dqn_agent = DQNAgent(mlp_model, obs_s, act_s, experience="ReplayMemory-5000", exploration="EpsGreedy", lr=0.001, name="DQN-LunarLander")

# Train the agent for 100 000 timesteps
dqn_agent.learn(env, nb_timesteps=100000)

# Test the agent for 10 episodes
dqn_agent.test(env, nb_episodes=10)
```
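The `experience="ReplayMemory-5000"` string above names a bounded replay buffer of 5000 transitions. The class below is an illustrative sketch of that idea in plain Python, not the library's actual implementation (the name `ReplayMemorySketch` is invented here):

```python
# Sketch of a bounded replay memory (not the marl library's class):
# transitions are stored in a FIFO buffer and sampled uniformly at random.
import random
from collections import deque

class ReplayMemorySketch:
    def __init__(self, capacity=5000):
        # deque with maxlen silently evicts the oldest transition when full
        self.buffer = deque(maxlen=capacity)

    def push(self, transition):
        """Store one (obs, action, reward, next_obs) tuple."""
        self.buffer.append(transition)

    def sample(self, batch_size):
        """Draw a uniform random minibatch for a training step."""
        return random.sample(self.buffer, batch_size)

mem = ReplayMemorySketch(capacity=3)
for t in range(5):
    mem.push((t, "obs", "action", 0.0))
print(len(mem.buffer))  # capped at 3: the two oldest transitions were evicted
```

Sampling uniformly from such a buffer decorrelates consecutive transitions, which is what makes minibatch updates stable for DQN-style agents.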

### Train two agents with the Minimax-Q algorithm

```python
import marl
from marl import MARL
from marl.agent import MinimaxQAgent
from marl.exploration import EpsGreedy

from soccer import DiscreteSoccerEnv
# Environment available here: "https://github.com/blavad/soccer"

env = DiscreteSoccerEnv(nb_pl_team1=1, nb_pl_team2=1)
obs_s = env.observation_space
act_s = env.action_space

# Custom exploration processes
expl1 = EpsGreedy(eps_deb=1., eps_fin=.3)
expl2 = EpsGreedy(eps_deb=1., eps_fin=.3)

# Create two minimax-Q agents
q_agent1 = MinimaxQAgent(obs_s, act_s, act_s, exploration=expl1, gamma=0.9, lr=0.001, name="SoccerJ1")
q_agent2 = MinimaxQAgent(obs_s, act_s, act_s, exploration=expl2, gamma=0.9, lr=0.001, name="SoccerJ2")

# Create the trainable multi-agent system
mas = MARL(agents_list=[q_agent1, q_agent2])

# Assign the multi-agent system to each agent
q_agent1.set_mas(mas)
q_agent2.set_mas(mas)

# Train the agents for 100 000 timesteps
mas.learn(env, nb_timesteps=100000)

# Test the agents for 10 episodes
mas.test(env, nb_episodes=10, time_laps=0.5)
```
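The `EpsGreedy(eps_deb=1., eps_fin=.3)` processes above start fully exploratory and settle at 30% random actions. The sketch below illustrates one plausible reading of that behavior, linear annealing plus epsilon-greedy selection, in plain Python; it is not the library's implementation, and the function names are invented for this example:

```python
# Sketch of an epsilon-greedy exploration process (not the marl library's
# EpsGreedy class): epsilon is annealed linearly from eps_deb to eps_fin.
import random

def linear_eps(t, nb_timesteps, eps_deb=1.0, eps_fin=0.3):
    """Epsilon at timestep t, annealed linearly over nb_timesteps."""
    frac = min(t / nb_timesteps, 1.0)
    return eps_deb + frac * (eps_fin - eps_deb)

def eps_greedy_action(q_values, eps, rng=random):
    """With probability eps pick a uniform random action, else the greedy one."""
    if rng.random() < eps:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

print(linear_eps(0, 100000))  # 1.0 at the start of training
# linear_eps(100000, 100000) is eps_fin (~0.3) at the end of training
```

Using a separate exploration process per agent, as in the soccer example, lets each agent anneal its own epsilon independently.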
