OpenAI Gym environments for Chess

iamlucaswolf/gym-chess

Table of Contents

  1. Introduction
  2. Installation
  3. Chess-v0
  4. ChessAlphaZero-v0
  5. Acknowledgements

Introduction

gym-chess provides OpenAI Gym environments for the game of Chess. It comes with an implementation of the board and move encoding used in AlphaZero, yet leaves you the freedom to define your own encodings via wrappers.

Let's watch a random agent play against itself:

>>> import gym
>>> import gym_chess
>>> import random
>>> env = gym.make('Chess-v0')
>>> env.reset()
>>> print(env.render())
>>> done = False
>>> while not done:
...     action = random.choice(env.legal_moves)  # pick one legal move at random
...     _, _, done, _ = env.step(action)         # update the loop condition
...     print(env.render(mode='unicode'))
>>> env.close()

Installation

gym-chess requires Python 3.6 or later.

To install gym-chess, run:

$ pip install gym-chess

Importing gym-chess will automatically register the Chess-v0 and ChessAlphaZero-v0 envs with gym:

>>> import gym
>>> import gym_chess
>>> gym.envs.registry.all()
dict_values([...EnvSpec(Chess-v0), EnvSpec(ChessAlphaZero-v0)])

Chess-v0

gym-chess defines a basic Chess-v0 environment which represents observations and actions as objects of type chess.Board and chess.Move, respectively. These classes come from the python-chess package, which implements the game logic.

>>> import chess
>>> env = gym.make('Chess-v0')
>>> state = env.reset()
>>> type(state)
chess.Board
>>> print(env.render(mode='unicode'))
♜ ♞ ♝ ♛ ♚ ♝ ♞ ♜
♟ ♟ ♟ ♟ ♟ ♟ ♟ ♟
⭘ ⭘ ⭘ ⭘ ⭘ ⭘ ⭘ ⭘
⭘ ⭘ ⭘ ⭘ ⭘ ⭘ ⭘ ⭘
⭘ ⭘ ⭘ ⭘ ⭘ ⭘ ⭘ ⭘
⭘ ⭘ ⭘ ⭘ ⭘ ⭘ ⭘ ⭘
♙ ♙ ♙ ♙ ♙ ♙ ♙ ♙
♖ ♘ ♗ ♕ ♔ ♗ ♘ ♖
>>> move = chess.Move.from_uci('e2e4')
>>> env.step(move)
>>> print(env.render(mode='unicode'))
♜ ♞ ♝ ♛ ♚ ♝ ♞ ♜
♟ ♟ ♟ ♟ ♟ ♟ ♟ ♟
⭘ ⭘ ⭘ ⭘ ⭘ ⭘ ⭘ ⭘
⭘ ⭘ ⭘ ⭘ ⭘ ⭘ ⭘ ⭘
⭘ ⭘ ⭘ ⭘ ♙ ⭘ ⭘ ⭘
⭘ ⭘ ⭘ ⭘ ⭘ ⭘ ⭘ ⭘
♙ ♙ ♙ ♙ ⭘ ♙ ♙ ♙
♖ ♘ ♗ ♕ ♔ ♗ ♘ ♖

A list of legal moves for the current position is exposed via the legal_moves property:

>>> env.reset()
>>> env.legal_moves
[Move.from_uci('g1h3'), Move.from_uci('g1f3'), Move.from_uci('b1c3'), Move.from_uci('b1a3'),
 Move.from_uci('h2h3'), Move.from_uci('g2g3'), Move.from_uci('f2f3'), Move.from_uci('e2e3'),
 Move.from_uci('d2d3'), Move.from_uci('c2c3'), Move.from_uci('b2b3'), Move.from_uci('a2a3'),
 Move.from_uci('h2h4'), Move.from_uci('g2g4'), Move.from_uci('f2f4'), Move.from_uci('e2e4'),
 Move.from_uci('d2d4'), Move.from_uci('c2c4'), Move.from_uci('b2b4'), Move.from_uci('a2a4')]

Using ordinary Python objects (rather than NumPy arrays) as an agent interface is arguably unorthodox. An immediate consequence of this approach is that Chess-v0 has no well-defined observation_space and action_space; hence these member variables are set to None. However, this design allows us to separate the game's implementation from its representation, which is left to wrapper classes.

The agent plays for both players, black and white, by making moves for either color in turn. An episode ends when a player wins (i.e. the agent makes a move that puts the opposing player in checkmate), or the game results in a draw (e.g. by reaching a stalemate position, insufficient material, or one or more other draw conditions according to the FIDE Rules of Chess). Note that there is currently no option for the agent to let a player resign or offer a draw voluntarily.

The agent receives a reward of +1 when the white player makes a winning move, and a reward of -1 when the black player makes a winning move. All other rewards are zero.
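The reward scheme described above can be summarized in a few lines. This is a minimal sketch, not gym-chess's actual implementation: it assumes the game outcome is available as a result string in the style of python-chess's `Board.result()` (`'1-0'`, `'0-1'`, `'1/2-1/2'`, or `'*'` for an unfinished game), and `reward_from_result` is a hypothetical helper name.

```python
def reward_from_result(result: str) -> int:
    """Map a chess result string to the reward described above.

    Hypothetical helper, for illustration only.
    """
    if result == "1-0":   # White made the winning move
        return 1
    if result == "0-1":   # Black made the winning move
        return -1
    return 0              # draw ('1/2-1/2') or game still in progress ('*')


for result in ("1-0", "0-1", "1/2-1/2", "*"):
    print(result, reward_from_result(result))
```

Because the reward is always expressed from White's point of view, an agent that learns from both sides of the board may need to negate it when evaluating Black's moves.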

ChessAlphaZero-v0

gym-chess ships with an implementation of the board and move encoding proposed by AlphaZero (see Silver et al., 2017).

>>> env = gym.make('ChessAlphaZero-v0')
>>> env.observation_space
Box(8, 8, 119)
>>> env.action_space
Discrete(4672)

For a detailed description of how these encodings work, consider reading the paper or consulting the docstrings of the respective classes.
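As a quick sanity check on the sizes shown above, the arithmetic below follows the layout described in the AlphaZero paper: 14 planes per history step (6 piece types per color plus 2 repetition counters), 8 history steps, and 7 constant-valued planes for the observation, and 73 move types per source square for the actions. The helper name `alphazero_shapes` is hypothetical, and the sketch assumes gym-chess follows the paper's layout, which the printed spaces are consistent with.

```python
def alphazero_shapes(history_length: int = 8):
    """Return (observation_shape, num_actions) following the paper's layout."""
    piece_planes = 6 * 2          # 6 piece types for each of the 2 colors
    repetition_planes = 2         # repetition counters per history step
    planes_per_step = piece_planes + repetition_planes  # 14
    constant_planes = 7           # side to move, move count, 4 castling rights, no-progress count

    queen_moves = 8 * 7           # 8 directions x up to 7 squares
    knight_moves = 8              # 8 possible knight jumps
    underpromotions = 3 * 3       # knight/bishop/rook x 3 pawn move directions
    move_types = queen_moves + knight_moves + underpromotions  # 73

    observation_shape = (8, 8, planes_per_step * history_length + constant_planes)
    num_actions = 8 * 8 * move_types
    return observation_shape, num_actions


print(alphazero_shapes())  # ((8, 8, 119), 4672)
```

Under the same assumption, a shorter history of 4 steps would give 14 * 4 + 7 = 63 planes.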

In addition to legal_moves, ChessAlphaZero-v0 also exposes a list of all legal actions (i.e. encoded legal moves):

>>> env.legal_actions
[494, 501, 129, 136, 1095, 1022, 949, 876, 803, 730, 657, 584, 1096, 1023, 950, 877, 804, 731, 658, 585]

Moves can be converted to actions and vice versa with the encode and decode methods, which may facilitate debugging and experimentation:

>>> move = chess.Move.from_uci('e2e4')
>>> env.encode(move)
877
>>> env.encode(move) in env.legal_actions
True
>>> env.decode(877)
Move.from_uci('e2e4')
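The action numbers shown above are consistent with a flattened (rank, file, move type) index over an 8 x 8 x 73 array. The sketch below decomposes an action under that assumption. `decompose_action` is a hypothetical helper, and the layout is inferred from the examples in this document rather than taken from the library, so the docstring of MoveEncoding remains the authoritative reference.

```python
FILES = "abcdefgh"
MOVE_TYPES = 73  # move types per source square (assumed, per the AlphaZero paper)


def decompose_action(action: int):
    """Split a flat action index into (source square, move-type index).

    Assumes a row-major layout over (rank, file, move type).
    """
    square_index, move_type = divmod(action, MOVE_TYPES)
    rank, file = divmod(square_index, 8)
    return f"{FILES[file]}{rank + 1}", move_type


print(decompose_action(877))  # ('e2', 1)
print(decompose_action(876))  # ('e2', 0)
print(decompose_action(494))  # ('g1', 56)
```

Here 877 maps to the source square e2, matching the e2e4 move above, while 876 (also present in legal_actions) maps to the same square with the neighbouring move type, presumably the single pawn push e2e3.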

Internally, the encoding is implemented via wrapper classes (gym_chess.alphazero.BoardEncoding and gym_chess.alphazero.MoveEncoding, respectively), which can be used independently of one another. This gives you the flexibility to define your own board and move representations, and easily switch between them.

>>> import gym
>>> import gym_chess
>>> from gym_chess.alphazero import BoardEncoding
>>> env = gym.make('Chess-v0')
>>> env = BoardEncoding(env, history_length=4)
>>> env = MyEsotericMoveEncoding(env)  # your own move-encoding wrapper
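To illustrate what a custom move encoding might look like, the sketch below maps UCI move strings to integers by combining the source and target squares (64 x 64 = 4096 actions). It is deliberately simplistic and ignores promotions. The names `encode_uci` and `decode_uci` are hypothetical and not part of gym-chess. In a real wrapper, this logic would typically live in the `action()` method of a `gym.ActionWrapper` subclass, which converts the agent's integer action into a chess.Move before it reaches the underlying environment.

```python
FILES = "abcdefgh"


def square_index(name: str) -> int:
    """Convert a square name such as 'e2' to an index in 0..63 (a1 = 0, h8 = 63)."""
    return (int(name[1]) - 1) * 8 + FILES.index(name[0])


def square_name(index: int) -> str:
    """Inverse of square_index."""
    rank, file = divmod(index, 8)
    return f"{FILES[file]}{rank + 1}"


def encode_uci(uci: str) -> int:
    """Encode a promotion-free UCI move as source * 64 + target."""
    return square_index(uci[:2]) * 64 + square_index(uci[2:4])


def decode_uci(action: int) -> str:
    """Decode an integer produced by encode_uci back into a UCI string."""
    source, target = divmod(action, 64)
    return square_name(source) + square_name(target)


action = encode_uci("e2e4")
print(action, decode_uci(action))  # 796 e2e4
```

Unlike the AlphaZero encoding, this scheme spends most of its 4096 indices on moves no piece can ever make, which is one reason the paper's 73 move types per square are the more compact choice.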

Acknowledgements

Thanks to @niklasf for providing the awesome python-chess package.

