Win–stay, lose–switch

From Wikipedia, the free encyclopedia

Heuristic learning strategy

In psychology, game theory, statistics, and machine learning, win–stay, lose–switch (also win–stay, lose–shift) is a heuristic learning strategy used to model learning in decision situations. It was first introduced as an improvement over randomization in bandit problems.^[1] It was later applied to the prisoner's dilemma in order to model the evolution of altruism.^[2]

The learning rule bases its decision only on the outcome of the previous play. Outcomes are divided into successes (wins) and failures (losses). If the play on the previous round resulted in a success, the agent repeats the same action on the next round. If the play resulted in a failure, the agent switches to another action.
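The rule can be illustrated with a minimal Python sketch. The names `win_stay_lose_switch` and `simulate_bandit` are hypothetical and chosen only for this example. The sketch assumes a bandit whose arms pay off with fixed success probabilities, and it assumes that after a loss the agent picks uniformly at random among the other actions; the rule itself only requires switching to another action, so with more than two actions this choice is one of several possible readings.

```python
import random


def win_stay_lose_switch(current_action, won, n_actions, rng=random):
    """Return the next action given only the previous action and its outcome."""
    if won:
        # Win: stay with the action that just succeeded.
        return current_action
    # Loss: switch. Choosing uniformly among the remaining actions is an
    # assumption of this sketch; with two actions it is simply "the other one".
    others = [a for a in range(n_actions) if a != current_action]
    return rng.choice(others)


def simulate_bandit(success_probs, n_rounds, seed=0):
    """Play a Bernoulli bandit with win-stay, lose-switch (requires >= 2 arms).

    success_probs[i] is the assumed probability that arm i yields a win.
    Returns the total number of wins and the (action, won) history.
    """
    rng = random.Random(seed)
    action = rng.randrange(len(success_probs))  # arbitrary first action
    wins = 0
    history = []
    for _ in range(n_rounds):
        won = rng.random() < success_probs[action]
        wins += won
        history.append((action, won))
        action = win_stay_lose_switch(action, won, len(success_probs), rng)
    return wins, history


if __name__ == "__main__":
    total, _ = simulate_bandit([0.8, 0.3], n_rounds=1000)
    print(f"wins in 1000 rounds: {total}")
```

Because the next action depends only on the last outcome, the agent keeps no running estimate of each arm's success rate; it tends to spend longer runs on arms that win more often simply because wins keep it in place.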

A large-scale empirical study of players of the game rock, paper, scissors shows that a variation of this strategy is adopted by real-world players of the game, instead of the Nash equilibrium strategy of choosing entirely at random between the three options.^[3]^[4]

References

^ Robbins, H. (1952). "Some aspects of the sequential design of experiments". Bulletin of the American Mathematical Society. 58 (5): 527–535. doi:10.1090/s0002-9904-1952-09620-8.
^ Nowak, M.; Sigmund, K. (July 1, 1993). "A strategy of win-stay, lose-shift that outperforms tit-for-tat in the Prisoner's Dilemma game". Nature. 364 (6432): 56–58. Bibcode:1993Natur.364...56N. doi:10.1038/364056a0. PMID 8316296. S2CID 4238908.
^ Morgan, James (2 May 2014). "How to win at rock-paper-scissors". BBC News.
^ Wang, Zhijian; Xu, Bin; Zhou, Hai-Jun (July 25, 2014). "Social cycling and conditional responses in the Rock-Paper-Scissors game". Scientific Reports. 4: 5830. doi:10.1038/srep05830. PMC 5376050. PMID 25060115.