Engine Testing

From Chessprogramming wiki


The ever-optimistic Wile E. Coyote [1]

Engine Testing,
the process of eliminating bugs and measuring the performance of a chess engine. New implementations of move generation are tested with Perft, while new features and the tuning of search and evaluation are verified via SPRT testing, (historically) test-positions, and by playing matches against other engines.
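
As an illustration of the Perft idea, the following is a minimal sketch that counts the leaf nodes reachable in exactly a given number of plies. The callbacks legal_moves and make_move are hypothetical placeholders for an engine's own move generator; the toy functions below only stand in for them so the sketch runs on its own.

```python
from typing import Callable, Iterable, TypeVar

P = TypeVar("P")  # position type (assumption: whatever the engine uses)
M = TypeVar("M")  # move type


def perft(position: P, depth: int,
          legal_moves: Callable[[P], Iterable[M]],
          make_move: Callable[[P, M], P]) -> int:
    """Count the leaf nodes reachable in exactly `depth` plies."""
    if depth == 0:
        return 1
    total = 0
    for move in legal_moves(position):
        child = make_move(position, move)
        total += perft(child, depth - 1, legal_moves, make_move)
    return total


# Toy stand-in for a real move generator: every position has exactly
# three moves, so perft(d) must equal 3**d.
def toy_moves(position: str):
    return ["a", "b", "c"]


def toy_make(position: str, move: str) -> str:
    return position + move


print(perft("", 4, toy_moves, toy_make))  # prints 81
```

With a real chess move generator plugged in, the counts are compared against published reference values; for the initial position these are 20, 400 and 8902 for depths 1 to 3.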

Bug Hunting

Analyzing

Tuning

SPRT

The modern, preferred method to test strength modifications.
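
A sequential probability ratio test compares two hypotheses about the Elo difference (elo0 and elo1) and stops as soon as the log-likelihood ratio (LLR) crosses one of two bounds derived from the error rates alpha and beta. The following is a minimal sketch using one common normal approximation of the LLR from win/draw/loss counts; testing frameworks may use other models, and the function names and default values here are assumptions for illustration only.

```python
import math


def expected_score(elo: float) -> float:
    """Expected score for a given Elo difference (logistic model)."""
    return 1.0 / (1.0 + 10.0 ** (-elo / 400.0))


def sprt_llr(wins: int, draws: int, losses: int,
             elo0: float, elo1: float) -> float:
    """Approximate log-likelihood ratio of H1 (elo1) versus H0 (elo0)."""
    n = wins + draws + losses
    if n == 0:
        return 0.0
    mean = (wins + 0.5 * draws) / n
    # Per-game variance of the score (win = 1, draw = 0.5, loss = 0).
    var = (wins + 0.25 * draws) / n - mean * mean
    if var <= 0.0:
        return 0.0  # no information yet (e.g. all results identical)
    var_mean = var / n  # variance of the mean score
    s0 = expected_score(elo0)
    s1 = expected_score(elo1)
    return (s1 - s0) * (2.0 * mean - s0 - s1) / (2.0 * var_mean)


def sprt_decision(wins: int, draws: int, losses: int,
                  elo0: float = 0.0, elo1: float = 5.0,
                  alpha: float = 0.05, beta: float = 0.05) -> str:
    """Return 'accept H1', 'accept H0' or 'continue'."""
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    llr = sprt_llr(wins, draws, losses, elo0, elo1)
    if llr >= upper:
        return "accept H1"
    if llr <= lower:
        return "accept H0"
    return "continue"
```

In use, the decision function would be called after every finished game (or game pair), and the match stops at the first result other than 'continue'.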

Test-Positions

Running sets of test-positions and counting the number of solutions found within a fixed time frame is useful for checking whether something broke after program changes, or for getting hints about missing knowledge. But one should be careful about tuning engines based on test-position results, since solving (possibly tactical) test-positions does not necessarily correlate with practical playing strength in matches against other opponents.
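
Counting solutions can be sketched as follows, assuming the suite is stored as EPD records with a "bm" (best move) operation. The parser is deliberately simplified (it would break on a semicolon inside a quoted string), choose_move is a hypothetical callback standing in for the engine searched at a fixed time, and the two records are made-up placeholders, not real test positions.

```python
from typing import Callable, Iterable, Set, Tuple


def parse_epd(line: str) -> Tuple[str, Set[str], str]:
    """Split an EPD record into (position, best moves, id)."""
    fields = line.strip().split(None, 4)
    position = " ".join(fields[:4])
    operations = fields[4] if len(fields) > 4 else ""
    best_moves: Set[str] = set()
    ident = ""
    for op in operations.split(";"):
        op = op.strip()
        if op.startswith("bm "):
            best_moves = set(op[3:].split())
        elif op.startswith("id "):
            ident = op[3:].strip().strip('"')
    return position, best_moves, ident


def count_solved(epd_lines: Iterable[str],
                 choose_move: Callable[[str], str]) -> int:
    """Number of records where the chosen move is among the best moves."""
    solved = 0
    for line in epd_lines:
        position, best_moves, _ = parse_epd(line)
        if choose_move(position) in best_moves:
            solved += 1
    return solved


# Made-up demo records (placeholders only).
SUITE = [
    'rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - bm e4 d4; id "demo.1";',
    'rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq e3 bm e5; id "demo.2";',
]

# Fake "engine": a lookup table from position to the move it would play.
FAKE_ANSWERS = {
    "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq -": "e4",
    "rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq e3": "c5",
}

print(count_solved(SUITE, FAKE_ANSWERS.get), "of", len(SUITE), "solved")
```

Comparing this count before and after a program change gives the kind of regression check described above; it says little about playing strength.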

Matches

Most testing involves running different versions of a program in matches, and comparing results.

Time Controls

Generally speaking, changes that don't alter the search tree itself but only affect performance (e.g. move generation) can be tested with fixed nodes, fixed time or fixed depth. In all other cases the time management should be left to the engine to simulate real tournament conditions. On the other hand, debugging is much easier under fixed conditions, as the games become deterministic.
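
For UCI engines, the three fixed conditions correspond to the "go nodes", "go movetime" (milliseconds) and "go depth" commands. The helper below is a small sketch that builds such a command string; the function name and the mode labels are assumptions, not part of any tool.

```python
def uci_go_command(mode: str, value: int) -> str:
    """Build a UCI 'go' command for a fixed-condition search.

    mode: 'nodes', 'time' (milliseconds per move) or 'depth'.
    """
    keywords = {"nodes": "nodes", "time": "movetime", "depth": "depth"}
    if mode not in keywords:
        raise ValueError(f"unknown mode: {mode!r}")
    if value <= 0:
        raise ValueError("value must be positive")
    return f"go {keywords[mode]} {value}"


print(uci_go_command("nodes", 100000))  # go nodes 100000
print(uci_go_command("time", 500))      # go movetime 500
print(uci_go_command("depth", 12))      # go depth 12
```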

Aside from the type of time control, one also has to decide how much time should be spent per game, i.e. what the average quality of the games should be. While one can test more changes in a given amount of time at short time controls, it is also relevant how a certain change scales to different strengths. For example, should one increase the R in Null move pruning to 3 at depths > 7, this change can only be tested effectively at time controls where the new condition is triggered frequently enough, i.e. where the average search depth is far greater than seven. It is hard to generalize, but on average changes to the search functions (LMR, nullmove, futility or similar pruning, reductions and extensions) tend to be more sensitive to the time control than the tuning of evaluation parameters.
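
The null move example can be made concrete with a short sketch. The baseline reduction of 2 is an assumption (the text only states the new value of 3 for depths > 7), and the depth histograms are invented numbers used purely to show how rarely the new condition may trigger at a short time control.

```python
from typing import Dict


def null_move_reduction(depth: int, deep_threshold: int = 7) -> int:
    """Depth-dependent R: 3 above the threshold, otherwise 2 (assumed baseline)."""
    return 3 if depth > deep_threshold else 2


def trigger_share(depth_counts: Dict[int, int], deep_threshold: int = 7) -> float:
    """Fraction of null move attempts that use the new, larger reduction.

    depth_counts maps remaining depth -> number of null move attempts.
    """
    total = sum(depth_counts.values())
    if total == 0:
        return 0.0
    hits = sum(n for d, n in depth_counts.items() if d > deep_threshold)
    return hits / total


# Invented histograms, for illustration only.
short_tc = {4: 50, 6: 40, 8: 10}
long_tc = {6: 20, 9: 50, 12: 30}

print(trigger_share(short_tc))  # 0.1
print(trigger_share(long_tc))   # 0.8
```

Logging such a histogram from the engine itself is one way to check whether a chosen time control exercises the change at all before spending games on it.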

Opening

During testing the engines should ideally play the same style of openings they would play in a normal tournament, so as not to optimize them for different types of positions. One option is to use the engine's own opening book; another is to use opening suites, sets of quiet test positions. In the latter case the same opening suite is used for each tournament conducted, and furthermore each position is played a second time with colors reversed. These measures help minimize the disparity between tests caused by different openings.
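
Playing each suite position twice with colors reversed amounts to the following schedule, given here as a minimal sketch. The function name, the engine names and the opening labels are placeholders.

```python
from typing import List, Sequence, Tuple


def paired_schedule(openings: Sequence[str],
                    engine_a: str,
                    engine_b: str) -> List[Tuple[str, str, str]]:
    """Return (opening, white, black) for every game of the match.

    Each opening is played twice, the second time with colors reversed.
    """
    games = []
    for opening in openings:
        games.append((opening, engine_a, engine_b))
        games.append((opening, engine_b, engine_a))
    return games


for game in paired_schedule(["opening-1", "opening-2"], "new", "base"):
    print(game)
```

Because the suite and its order stay the same between tournaments, results from different test runs remain comparable.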

Tournament Manager

User interfaces or command line tools for running engine-engine matches with UCI and Chess Engine Communication Protocol compatible engines are mentioned under Tournament Manager.

Frameworks

Chess Server

One can also test an engine's performance by comparing it to other programs on the various internet platforms [2]. In this case the different hardware and features like different Endgame Tablebases or Opening Books have to be considered.

Statistics

The question of whether certain results actually indicate a strength increase can be answered with statistical analysis of the match results.
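
Two quantities often computed from a match result are the Elo difference implied by the score and the likelihood of superiority (LOS). The sketch below uses the logistic Elo model and the usual normal approximation for LOS, which ignores draws; the function names are assumptions, and neither value replaces a properly designed test.

```python
import math


def elo_difference(wins: int, draws: int, losses: int) -> float:
    """Elo difference implied by the match score (logistic model)."""
    n = wins + draws + losses
    if n == 0:
        raise ValueError("no games played")
    score = (wins + 0.5 * draws) / n
    if score <= 0.0 or score >= 1.0:
        raise ValueError("Elo difference is unbounded for a 0% or 100% score")
    return -400.0 * math.log10(1.0 / score - 1.0)


def likelihood_of_superiority(wins: int, losses: int) -> float:
    """Probability that the first engine is stronger (normal approximation)."""
    decisive = wins + losses
    if decisive == 0:
        return 0.5
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * decisive)))


print(round(elo_difference(120, 180, 100), 1))
print(round(likelihood_of_superiority(120, 100), 3))
```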

Ratings

Test Results

Notable Bugs

Publications

Forum Posts

1995 ...

2000 ...

2005 ...

2007

2008

2009

2010 ...

2011

2012

2013

2014

2015 ...

Re: Static evaluation test posistions by Ferdinand Mosca, CCC, November 26, 2015 » Python

2016

2017

2018

Re: Basic automated testing by Andrew Grant, CCC, September 30, 2018 » OpenBench

2019

2020 ...

2021

2022

External Links

References

  1. Hope Springs Eternal
  2. Internet chess servers from Wikipedia
  3. Defending Humanity's Honor by Tim Krabbé
  4. Regression testing from Wikipedia
  5. Testing a chess engine from the ground up from Home of the Dutch Rebel by Ed Schröder
