Alpha–beta pruning

From Wikipedia, the free encyclopedia

Search algorithm that seeks to decrease the number of nodes in the minimax algorithm search tree

For other uses, see Alphabeta (disambiguation).

Alpha–beta pruning
Class	Search algorithm
Worst-case performance	$O(b^{d})$
Best-case performance	$O\left({\sqrt {b^{d}}}\right)$

Alpha–beta pruning is a search algorithm that seeks to decrease the number of nodes that are evaluated by the minimax algorithm in its search tree. It is an adversarial search algorithm used commonly for machine playing of two-player combinatorial games (Tic-tac-toe, Chess, Connect 4, etc.). It stops evaluating a move when at least one possibility has been found that proves the move to be worse than a previously examined move. Such moves need not be evaluated further. When applied to a standard minimax tree, it returns the same move as minimax would, but prunes away branches that cannot possibly influence the final decision.^[1]

History

John McCarthy, during the Dartmouth Workshop, met Alex Bernstein of IBM, who was writing a chess program. McCarthy invented alpha–beta search and recommended it to him, but Bernstein was "unconvinced".^[2]

Allen Newell and Herbert A. Simon, who used what John McCarthy calls an "approximation"^[3] in 1958, wrote that alpha–beta "appears to have been reinvented a number of times".^[4] Arthur Samuel had an early version for a checkers simulation. Richards, Timothy Hart, Michael Levin and/or Daniel Edwards also invented alpha–beta independently in the United States.^[5] McCarthy proposed similar ideas during the Dartmouth workshop in 1956 and suggested them to a group of his students including Alan Kotok at MIT in 1961.^[6] Alexander Brudno independently conceived the alpha–beta algorithm, publishing his results in 1963.^[7] Donald Knuth and Ronald W. Moore refined the algorithm in 1975.^[8]^[9] Judea Pearl proved its optimality in terms of the expected running time for trees with randomly assigned leaf values in two papers.^[10]^[11] The optimality of the randomized version of alpha–beta was shown by Michael Saks and Avi Wigderson in 1986.^[12]

Core idea

A game tree can represent many two-player zero-sum games, such as chess, checkers, and reversi. Each node in the tree represents a possible situation in the game. Each terminal node (outcome) of a branch is assigned a numeric score that determines the value of the outcome to the player with the next move.^[13]

The algorithm maintains two values, alpha and beta, which respectively represent the minimum score that the maximizing player is assured of and the maximum score that the minimizing player is assured of. Initially, alpha is negative infinity and beta is positive infinity, i.e. both players start with their worst possible score. Whenever the maximum score that the minimizing player (the "beta" player) is assured of becomes less than the minimum score that the maximizing player (the "alpha" player) is assured of (i.e. beta < alpha), the maximizing player need not consider further descendants of this node, as they will never be reached in actual play.

To illustrate this with a real-life example, suppose somebody is playing chess, and it is their turn. Move "A" will improve the player's position. The player continues to look for moves to make sure a better one hasn't been missed. Move "B" is also a good move, but the player then realizes that it will allow the opponent to force checkmate in two moves. Thus, other outcomes from playing move B no longer need to be considered, since the opponent can force a win. The maximum score that the opponent could force after move "B" is negative infinity: a loss for the player. This is less than the minimum score that was previously found, because move "A" does not result in a forced loss in two moves.

Improvements over naive minimax

An illustration of alpha–beta pruning. The grayed-out subtrees don't need to be explored (when moves are evaluated from left to right), since it is known that the group of subtrees as a whole yields the value of an equivalent subtree or worse, and as such cannot influence the final result. The max and min levels represent the turn of the player and the adversary, respectively.

The benefit of alpha–beta pruning lies in the fact that branches of the search tree can be eliminated.^[13] This way, the search time can be limited to the 'more promising' subtree, and a deeper search can be performed in the same time. Like its predecessor, it belongs to the branch and bound class of algorithms. The optimization reduces the effective depth to slightly more than half that of simple minimax if the nodes are evaluated in an optimal or near optimal order (best choice for side on move ordered first at each node).

With an (average or constant) branching factor of b, and a search depth of d plies, the maximum number of leaf node positions evaluated (when the move ordering is pessimal) is O(b^d), the same as a simple minimax search. If the move ordering for the search is optimal (meaning the best moves are always searched first), the number of leaf node positions evaluated is about O(b×1×b×1×...×b) for odd depth and O(b×1×b×1×...×1) for even depth, or $O(b^{d/2})=O({\sqrt {b^{d}}})$. In the latter case, where the ply of a search is even, the effective branching factor is reduced to its square root, or, equivalently, the search can go twice as deep with the same amount of computation.^[14] The explanation of b×1×b×1×... is that all the first player's moves must be studied to find the best one, but for each, only the second player's best move is needed to refute all but the first (and best) first player move; alpha–beta ensures no other second player moves need be considered.

When nodes are considered in a random order (i.e., the algorithm randomizes), asymptotically, the expected number of nodes evaluated in uniform trees with binary leaf-values is $\Theta (((b-1+{\sqrt {b^{2}+14b+1}})/4)^{d})$.^[12] For the same trees, when the values are assigned to the leaves independently of each other and zero and one are equally probable, the expected number of nodes evaluated is $\Theta ((b/2)^{d})$, which is much smaller than the work done by the randomized algorithm mentioned above, and is again optimal for such random trees.^[10] When the leaf values are chosen independently of each other but from the $[0,1]$ interval uniformly at random, the expected number of nodes evaluated increases to $\Theta (b^{d/\log(d)})$ in the $d\to \infty$ limit,^[11] which is again optimal for this kind of random tree. Note that the actual work for "small" values of $d$ is better approximated using $0.925d^{0.747}$.^[11]^[10]

A chess program that searches four plies with an average of 36 branches per node evaluates more than one million terminal nodes. An optimal alpha-beta prune would eliminate all but about 2,000 terminal nodes, a reduction of 99.8%.^[13]
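The arithmetic behind these figures can be checked with a short sketch. It assumes a perfectly uniform tree and uses the exact best-case leaf count b^⌈d/2⌉ + b^⌊d/2⌋ − 1 for perfectly ordered alpha–beta, a formula that is not stated in the text above but agrees with its O(b^{d/2}) bound. The function names are hypothetical.

```python
def minimax_leaves(b: int, d: int) -> int:
    """Leaves evaluated by plain minimax on a uniform tree: b^d."""
    return b ** d


def best_case_alphabeta_leaves(b: int, d: int) -> int:
    """Leaves evaluated by alpha-beta with perfect move ordering.

    Assumed formula for uniform trees: b^ceil(d/2) + b^floor(d/2) - 1.
    """
    return b ** ((d + 1) // 2) + b ** (d // 2) - 1


full = minimax_leaves(36, 4)              # 1,679,616 terminal nodes
best = best_case_alphabeta_leaves(36, 4)  # 2,591 terminal nodes
reduction = 1 - best / full               # fraction of leaves never evaluated

print(f"minimax: {full:,}  alpha-beta (best case): {best:,}  "
      f"reduction: {reduction:.1%}")
```

Under these assumptions the count comes out in the low thousands, the same order as the quoted figure of about 2,000, and the reduction rounds to 99.8%.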

An animated pedagogical example that attempts to be human-friendly by substituting initial infinite (or arbitrarily large) values for emptiness and by avoiding using the negamax coding simplifications.

Normally during alpha–beta, the subtrees are temporarily dominated by either a first player advantage (when many first player moves are good, and at each search depth the first move checked by the first player is adequate, but all second player responses are required to try to find a refutation), or vice versa. This advantage can switch sides many times during the search if the move ordering is incorrect, each time leading to inefficiency. As the number of positions searched decreases exponentially each move nearer the current position, it is worth spending considerable effort on sorting early moves. An improved sort at any depth will exponentially reduce the total number of positions searched, but sorting all positions at depths near the root node is relatively cheap as there are so few of them. In practice, the move ordering is often determined by the results of earlier, smaller searches, such as through iterative deepening.
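The effect of move ordering can be observed on a small tree. The sketch below is illustrative only: the nested-list tree encoding and the example values are assumptions, and `reorder` sorts children by their exact minimax value, an oracle that a real program does not have (it would rely on heuristics or on earlier, shallower searches instead).

```python
import math


def minimax(node, maximizing):
    """Exact minimax value; leaves are numbers, inner nodes are lists."""
    if not isinstance(node, list):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


def reorder(node, maximizing, best_first):
    """Copy of the tree with children sorted by exact value (an oracle)."""
    if not isinstance(node, list):
        return node
    children = [reorder(c, not maximizing, best_first) for c in node]
    # The best move is the highest value for max, the lowest for min.
    descending = maximizing if best_first else not maximizing
    return sorted(children, key=lambda c: minimax(c, not maximizing),
                  reverse=descending)


def search(node, alpha, beta, maximizing, counter):
    """Alpha-beta search that counts evaluated leaves in counter[0]."""
    if not isinstance(node, list):
        counter[0] += 1
        return node
    value = -math.inf if maximizing else math.inf
    for child in node:
        result = search(child, alpha, beta, not maximizing, counter)
        if maximizing:
            value = max(value, result)
            alpha = max(alpha, value)
        else:
            value = min(value, result)
            beta = min(beta, value)
        if alpha >= beta:
            break  # remaining siblings cannot affect the result
    return value


def leaves_visited(tree):
    counter = [0]
    value = search(tree, -math.inf, math.inf, True, counter)
    return value, counter[0]


tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]  # hypothetical example, b = 3, d = 2
as_given = leaves_visited(tree)
best = leaves_visited(reorder(tree, True, best_first=True))
worst = leaves_visited(reorder(tree, True, best_first=False))
print(as_given, best, worst)
```

All three searches return the same value; only the number of evaluated leaves changes, from all b^d = 9 leaves under the pessimal ordering down to 5 under the optimal one.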

Additionally, this algorithm can be trivially modified to return an entire principal variation in addition to the score. Some more aggressive algorithms such as MTD(f) do not easily permit such a modification.
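One possible way to make that modification is sketched below. It is an assumption-laden illustration rather than a reference implementation: the tree is a nested list with numeric leaves, the line of play is reported as a list of child indices, and the function name `alphabeta_pv` is hypothetical.

```python
import math


def alphabeta_pv(node, alpha, beta, maximizing):
    """Return (value, line); line holds the child indices of the best play found."""
    if not isinstance(node, list):
        return node, []
    best_value = -math.inf if maximizing else math.inf
    best_line = []
    for index, child in enumerate(node):
        value, line = alphabeta_pv(child, alpha, beta, not maximizing)
        improved = value > best_value if maximizing else value < best_value
        if improved:
            # Only a strict improvement replaces the line kept so far.
            best_value, best_line = value, [index] + line
        if maximizing:
            alpha = max(alpha, best_value)
        else:
            beta = min(beta, best_value)
        if alpha >= beta:
            break  # cutoff: remaining siblings cannot matter
    return best_value, best_line


tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]  # hypothetical example tree
value, line = alphabeta_pv(tree, -math.inf, math.inf, True)
print(value, line)
```

With a full initial window, the line returned at the root is the sequence of moves both sides are expected to play; lines returned from subtrees that were cut off are only refutations and are discarded by the strict-improvement test.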

Pseudocode

The pseudo-code for depth limited minimax with alpha–beta pruning is as follows:^[15]

Implementations of alpha–beta pruning can often be delineated by whether they are "fail-soft," or "fail-hard". With fail-soft alpha–beta, the alphabeta function may return values (v) that exceed (v < α or v > β) the α and β bounds set by its function call arguments. In comparison, fail-hard alpha–beta limits its function return value into the inclusive range of α and β. The main difference between fail-soft and fail-hard implementations is whether α and β are updated before or after the cutoff check. If they are updated before the check, then they can exceed initial bounds and the algorithm is fail-soft.

The following pseudo-code illustrates the fail-hard variation.^[15]

function alphabeta(node, depth, α, β, maximizingPlayer) is
    if depth == 0 or node is terminal then
        return the heuristic value of node
    if maximizingPlayer then
        value := −∞
        for each child of node do
            value := max(value, alphabeta(child, depth − 1, α, β, FALSE))
            if value > β then
                break (* β cutoff *)
            α := max(α, value)
        return value
    else
        value := +∞
        for each child of node do
            value := min(value, alphabeta(child, depth − 1, α, β, TRUE))
            if value < α then
                break (* α cutoff *)
            β := min(β, value)
        return value

(* Initial call *)
alphabeta(origin, depth, −∞, +∞, TRUE)
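A minimal Python transcription of the pseudocode above may help when experimenting with it. The nested-list tree encoding, the placeholder `heuristic` function, the `visited` log, and the example tree are all assumptions added for illustration; like the pseudocode, the function returns `value` as computed.

```python
import math


def heuristic(node):
    """Hypothetical static evaluation: a leaf is its own score.

    An inner node reached at depth 0 gets a placeholder score of 0.
    """
    return node if not isinstance(node, list) else 0


def alphabeta(node, depth, alpha, beta, maximizing, visited):
    """Cutoff test first, bound update second, as in the pseudocode above."""
    if depth == 0 or not isinstance(node, list):
        visited.append(node)  # record every position that gets evaluated
        return heuristic(node)
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, depth - 1, alpha, beta,
                                         False, visited))
            if value > beta:
                break  # beta cutoff
            alpha = max(alpha, value)
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, depth - 1, alpha, beta,
                                     True, visited))
        if value < alpha:
            break  # alpha cutoff
        beta = min(beta, value)
    return value


tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]  # hypothetical example tree
visited = []
best = alphabeta(tree, 2, -math.inf, math.inf, True, visited)
print(best, visited)
```

On this tree the leaves 4 and 6 are never evaluated: once the second subtree yields 2, it cannot beat the 3 already secured by the first.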

The following pseudocode illustrates fail-soft alpha-beta.

function alphabeta(node, depth, α, β, maximizingPlayer) is
    if depth == 0 or node is terminal then
        return the heuristic value of node
    if maximizingPlayer then
        value := −∞
        for each child of node do
            value := max(value, alphabeta(child, depth − 1, α, β, FALSE))
            α := max(α, value)
            if value ≥ β then
                break (* β cutoff *)
        return value
    else
        value := +∞
        for each child of node do
            value := min(value, alphabeta(child, depth − 1, α, β, TRUE))
            β := min(β, value)
            if value ≤ α then
                break (* α cutoff *)
        return value

(* Initial call *)
alphabeta(origin, depth, −∞, +∞, TRUE)
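The fail-soft pseudocode can be transcribed to Python in the same way. This is a sketch under the same assumptions as before (nested-list trees, a placeholder `heuristic`, a hypothetical example tree). The second call uses a deliberately narrow window to show a return value that lies outside the bounds it was given.

```python
import math


def heuristic(node):
    """Hypothetical static evaluation: a leaf is its own score.

    An inner node reached at depth 0 gets a placeholder score of 0.
    """
    return node if not isinstance(node, list) else 0


def alphabeta(node, depth, alpha, beta, maximizing):
    """Bound update first, cutoff test second, as in the pseudocode above."""
    if depth == 0 or not isinstance(node, list):
        return heuristic(node)
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, depth - 1, alpha, beta, False))
            alpha = max(alpha, value)
            if value >= beta:
                break  # beta cutoff
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, depth - 1, alpha, beta, True))
        beta = min(beta, value)
        if value <= alpha:
            break  # alpha cutoff
    return value


tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]  # hypothetical example tree
full = alphabeta(tree, 2, -math.inf, math.inf, True)  # full window
narrow = alphabeta(tree, 2, 4, 5, True)               # window (4, 5)
print(full, narrow)
```

With the window (4, 5) the search returns a value below α = 4. Such a result is an upper bound on the true value rather than the exact score, which is the kind of information a later re-search can use.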

Heuristic improvements

Further improvement can be achieved without sacrificing accuracy by using ordering heuristics to search earlier parts of the tree that are likely to force alpha–beta cutoffs. For example, in chess, moves that capture pieces may be examined before moves that do not, and moves that have scored highly in earlier passes through the game-tree analysis may be evaluated before others. Another common, and very cheap, heuristic is the killer heuristic, where the last move that caused a beta-cutoff at the same tree level in the tree search is always examined first. This idea can also be generalized into a set of refutation tables.

Alpha–beta search can be made even faster by considering only a narrow search window (generally determined by guesswork based on experience). This is known as an aspiration window. In the extreme case, the search is performed with alpha and beta equal, a technique known as zero-window search, null-window search, or scout search. This is particularly useful for win/loss searches near the end of a game where the extra depth gained from the narrow window and a simple win/loss evaluation function may lead to a conclusive result. If an aspiration search fails, it is straightforward to detect whether it failed high (high edge of window was too low) or low (lower edge of window was too high). This gives information about what window values might be useful in a re-search of the position.
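A rough sketch of an aspiration search with a single re-search follows. The guess, the half-width `delta`, the nested-list tree, and the policy of opening only the side that failed are all assumptions chosen for illustration; practical programs tune these choices.

```python
import math


def alphabeta(node, alpha, beta, maximizing):
    """Fail-soft alpha-beta on a nested-list tree with numeric leaves."""
    if not isinstance(node, list):
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, value)
            if value >= beta:
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True))
        beta = min(beta, value)
        if value <= alpha:
            break
    return value


def aspiration_search(tree, guess, delta):
    """Search the window (guess - delta, guess + delta); re-search on failure."""
    alpha, beta = guess - delta, guess + delta
    value = alphabeta(tree, alpha, beta, True)
    if value <= alpha:
        # Failed low: the lower edge was too high, so open the window downward.
        return alphabeta(tree, -math.inf, beta, True), "fail-low"
    if value >= beta:
        # Failed high: the upper edge was too low, so open the window upward.
        return alphabeta(tree, alpha, math.inf, True), "fail-high"
    return value, "exact"


tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]  # hypothetical tree, true value 3
good_guess = aspiration_search(tree, 3, 1)
high_guess = aspiration_search(tree, 10, 1)
low_guess = aspiration_search(tree, 0, 1)
print(good_guess, high_guess, low_guess)
```

A good guess is answered by a single narrow search; a poor guess costs one extra search but still ends with the same value.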

Over time, other improvements have been suggested, and indeed the Falphabeta (fail-soft alpha–beta) idea of John Fishburn is nearly universal and is already incorporated above in a slightly modified form. Fishburn also suggested a combination of the killer heuristic and zero-window search under the name Lalphabeta ("last move with minimal window alpha–beta search").

Other algorithms

Since the minimax algorithm and its variants are inherently depth-first, a strategy such as iterative deepening is usually used in conjunction with alpha–beta so that a reasonably good move can be returned even if the algorithm is interrupted before it has finished execution. Another advantage of using iterative deepening is that searches at shallower depths give move-ordering hints, as well as shallow alpha and beta estimates, that both can help produce cutoffs for higher depth searches much earlier than would otherwise be possible.
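The combination can be sketched as follows. Everything specific here is an assumption made for illustration: each node is a `(static_score, children)` pair whose first field stands in for a heuristic evaluation, the loop is bounded by `max_depth` instead of a clock, and only the best root move of the previous iteration is reused as an ordering hint.

```python
import math


def alphabeta(node, depth, alpha, beta, maximizing):
    """Depth-limited fail-soft alpha-beta over (static_score, children) nodes."""
    score, children = node
    if depth == 0 or not children:
        return score
    if maximizing:
        value = -math.inf
        for child in children:
            value = max(value, alphabeta(child, depth - 1, alpha, beta, False))
            alpha = max(alpha, value)
            if value >= beta:
                break
        return value
    value = math.inf
    for child in children:
        value = min(value, alphabeta(child, depth - 1, alpha, beta, True))
        beta = min(beta, value)
        if value <= alpha:
            break
    return value


def search_root(root, depth, first_move=None):
    """Search a maximizing root, trying first_move before the other moves."""
    _, children = root
    order = list(range(len(children)))
    if first_move is not None:
        order.remove(first_move)
        order.insert(0, first_move)  # ordering hint from the shallower search
    best_value, best_move, alpha = -math.inf, None, -math.inf
    for index in order:
        value = alphabeta(children[index], depth - 1, alpha, math.inf, False)
        if value > best_value:
            best_value, best_move = value, index
        alpha = max(alpha, best_value)
    return best_value, best_move


def iterative_deepening(root, max_depth):
    """Run searches of depth 1..max_depth; each entry is a usable answer."""
    history, best_move = [], None
    for depth in range(1, max_depth + 1):
        value, best_move = search_root(root, depth, best_move)
        history.append((depth, value, best_move))
    return history


leaf = lambda score: (score, [])
root = (0, [(5, [leaf(1), leaf(6)]), (2, [leaf(4), leaf(3)])])  # hypothetical
history = iterative_deepening(root, 2)
print(history)
```

If the search were interrupted after the first iteration, move 0 would still be available as an answer; the deeper iteration then revises the choice to move 1.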

Algorithms like SSS*, on the other hand, use the best-first strategy. This can potentially make them more time-efficient, but typically at a heavy cost in space-efficiency.^[16]

References

[1] Russell & Norvig 2021, pp. 152–161.
[2] McCarthy, John (2006-10-30). "The Dartmouth Workshop--as planned and as it happened". www-formal.stanford.edu. Retrieved 2023-10-29.
[3] McCarthy, John (27 November 2006). "Human Level AI Is Harder Than It Seemed in 1955". Stanford University. Retrieved 2006-12-20.
[4] Newell, Allen; Simon, Herbert A. (1 March 1976). "Computer science as empirical inquiry: symbols and search". Communications of the ACM. 19 (3): 113–126. doi:10.1145/360018.360022.
[5] Edwards, D.J.; Hart, T.P. (4 December 1961). The Alpha–beta Heuristic (Technical report). Massachusetts Institute of Technology. hdl:1721.1/6098. AIM-030.
[6] Kotok, Alan (2004) [1962]. "A Chess Playing Program". Artificial Intelligence Project. RLE and MIT Computation Center. Memo 41. Retrieved 2006-07-01.
[7] Marsland, T.A. (May 1987). "Computer Chess Methods" (PDF). In Shapiro, S. (ed.). Encyclopedia of Artificial Intelligence. Wiley. pp. 159–171. ISBN 978-0-471-62974-0. Archived from the original (PDF) on 2008-10-30.
[8] Knuth, Donald E.; Moore, Ronald W. (1975). "An analysis of alpha-beta pruning". Artificial Intelligence. 6 (4): 293–326. doi:10.1016/0004-3702(75)90019-3. S2CID 7894372.
[9] Abramson, Bruce (1 June 1989). "Control strategies for two-player games". ACM Computing Surveys. 21 (2): 137–161. doi:10.1145/66443.66444. S2CID 11526154.
[10] Pearl, Judea (1980). "Asymptotic Properties of Minimax Trees and Game-Searching Procedures". Artificial Intelligence. 14 (2): 113–138. doi:10.1016/0004-3702(80)90037-5.
[11] Pearl, Judea (1982). "The Solution for the Branching Factor of the Alpha-Beta Pruning Algorithm and Its Optimality". Communications of the ACM. 25 (8): 559–64. doi:10.1145/358589.358616. S2CID 8296219.
[12] Saks, M.; Wigderson, A. (1986). "Probabilistic Boolean Decision Trees and the Complexity of Evaluating Game Trees". 27th Annual Symposium on Foundations of Computer Science. pp. 29–38. doi:10.1109/SFCS.1986.44. ISBN 0-8186-0740-8. S2CID 6130392.
[13] Levy, David (January 1986). "Alpha-Beta Soup". MacUser. pp. 98–102. Retrieved 2021-10-19.
[14] Russell & Norvig 2021, p. 155.
[15] Russell & Norvig 2021, p. 154.
[16] Pearl, Judea; Korf, Richard (1987), "Search techniques", Annual Review of Computer Science, 2: 451–467, doi:10.1146/annurev.cs.02.060187.002315. "Like its A* counterpart for single-player games, SSS* is optimal in terms of the average number of nodes examined; but its superior pruning power is more than offset by the substantial storage space and bookkeeping required."

Bibliography

Russell, Stuart J.; Norvig, Peter (2021). Artificial Intelligence: A Modern Approach (4th ed.). Hoboken: Pearson. ISBN 9780134610993. LCCN 20190474.
Heineman, George T.; Pollice, Gary; Selkow, Stanley (2008). "7. Path Finding in AI". Algorithms in a Nutshell. O'Reilly Media. pp. 217–223. ISBN 978-0-596-51624-6.
Pearl, Judea (1984). Heuristics: Intelligent Search Strategies for Computer Problem Solving. Addison-Wesley. ISBN 978-0-201-05594-8. OCLC 1035596197.
Fishburn, John P. (1984). "Appendix A: Some Optimizations of α-β Search". Analysis of Speedup in Distributed Algorithms (revision of 1981 PhD thesis). UMI Research Press. pp. 107–111. ISBN 0-8357-1527-2.
