Stanford Encyclopedia of Philosophy

Bounded Rationality

First published Fri Nov 30, 2018; substantive revision Fri Dec 13, 2024

Herbert Simon introduced the term ‘bounded rationality’ (Simon 1957b: 198; see also Klaes & Sent 2005) as shorthand for his proposal to replace the perfect rationality assumptions of homo economicus with a concept of rationality better suited to cognitively limited agents:

Broadly stated, the task is to replace the global rationality of economic man with the kind of rational behavior that is compatible with the access to information and the computational capacities that are actually possessed by organisms, including man, in the kinds of environments in which such organisms exist. (Simon 1955a: 99)

Bounded rationality now describes a wide range of descriptive, normative, and prescriptive accounts of effective behavior which depart from the assumptions of perfect rationality. This entry aims to highlight key contributions—from the decision sciences, economics, cognitive- and neuropsychology, biology, physics, computer science, and philosophy—to our current understanding of bounded rationality.


1. Homo Economicus and Expected Utility Theory

Bounded rationality has come to encompass models of effective behavior that weaken, or reject altogether, the idealized conditions of perfect rationality assumed by models of economic man. In this section we state what models of economic man are committed to and their relationship to expected utility theory. In later sections we review proposals for departing from expected utility theory.

The perfect rationality of homo economicus imagines a hypothetical agent who has complete information about the options available for choice, perfect foresight of the consequences from choosing those options, and the wherewithal to solve an optimization problem (typically of considerable complexity) that identifies an option which maximizes the agent’s personal utility.

1.1 The Evolution of Homo Economicus

The meaning of ‘economic man’ has evolved significantly over time. John Stuart Mill first described this hypothetical individual as a self-interested agent who seeks to maximize his personal utility (1844). William Jevons later formalized the notion of marginal utility to represent an economic consumer, affording a model of economic behavior (1871). Frank Knight extended Jevons’s calculator man by introducing the slot-machine man in neo-classical economics (1921), which is Jevons’s calculator man augmented with perfect foresight and clearly defined risks. The modern view of a rational economic agent is conceived in terms of Paul Samuelson’s revealed preference formulation of utility (1947; Wong 2006) which, together with von Neumann and Morgenstern’s axiomatization (1944), changed the focus of economic modeling from reasoning behavior to choice behavior, emphasizing decisions over purely logical reasoning.

Modern economic theory begins with the observation that human beings like some consequences better than others, even if they only assess those consequences hypothetically. A perfectly rational person, according to the canonical paradigm of synchronic decision making under risk, is one whose comparative assessments of a set of consequences adhere to the principle of maximizing utility. This means their decisions consistently follow a rational evaluation of potential outcomes, aiming to achieve the highest possible expected benefit based on probabilistic reasoning.

Yet this recommendation to maximize expected utility presupposes that qualitative comparative judgements of those consequences (i.e., preferences) are structured in such a way (i.e., satisfy specific axioms) as to admit a mathematical representation that places those objects of comparison on the real number line (i.e., as inequalities of mathematical expectations), ordered from worst to best. This structuring of preference through axioms to admit a numerical representation is the subject of expected utility theory.

1.2 Expected Utility Theory

We present here one such axiom system to derive expected utility theory, a simple set of axioms for the binary relation \(\succeq\), which represents the relation “is weakly preferred to”. The objects of comparison for this axiomatization are prospects, which associate probabilities to a fixed set of consequences, where both probabilities and consequences are known to the agent. To illustrate, the prospect (−€10, ½; €20, ½) concerns two consequences, losing 10 Euros and winning 20 Euros, each assigned the probability one-half. A rational agent will prefer this prospect to another with the same consequences but a greater chance of losing than winning, such as (−€10, ⅔; €20, ⅓)—assuming his aim is to maximize his financial welfare. More generally, suppose that \(X = \{x_1, x_2, \ldots, x_n\}\) is a mutually exclusive and exhaustive set of consequences and that \(p_i\) denotes the probability of \(x_i\), where each \(p_i \geq 0\) and \(\sum_{i=1}^{n} p_i = 1\). A prospect P is simply the set of consequence-probability pairs, \(P = (x_1, p_1; \ x_2, p_2; \ldots; \ x_n, p_n)\). By convention, a prospect’s consequence-probability pairs are ordered by the value of each consequence, from least favorable to most. When prospects P, Q, R are comparable under a specific preference relation, \(\succeq\), and the (ordered) set of consequences X is fixed, then prospects may be simply represented by a vector of probabilities.
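As an illustrative sketch (in Python; the encoding and the helper name are our own, not part of the entry), a prospect over a fixed consequence set can be represented as consequence-probability pairs:

```python
# A prospect over a fixed consequence set X, encoded as
# consequence-probability pairs ordered from least to most favorable.
P = [(-10, 1/2), (20, 1/2)]   # the prospect (-10 EUR, 1/2; 20 EUR, 1/2)
Q = [(-10, 2/3), (20, 1/3)]   # same consequences, worse odds

def is_prospect(pairs):
    """A well-formed prospect has non-negative probabilities summing to 1."""
    probs = [p for _, p in pairs]
    return all(p >= 0 for p in probs) and abs(sum(probs) - 1) < 1e-9

assert is_prospect(P) and is_prospect(Q)

# Since the consequence set is fixed and ordered, each prospect can
# also be represented simply by its vector of probabilities:
assert [p for _, p in P] == [1/2, 1/2]
```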

The expected utility hypothesis (Bernoulli 1738) states that rational agents ought to maximize expected utility. If your qualitative preferences \(\succeq\) over prospects satisfy the following three constraints, ordering, the Archimedean (continuity) condition, and independence, then your preferences will maximize expected utility (von Neumann & Morgenstern 1944).

A1.
Ordering. The ordering condition states that preferences are both complete and transitive. For all prospects P, Q, completeness entails that either \(P \succeq Q\), \(Q \succeq P\), or both \(P \succeq Q\) and \(Q \succeq P\), written \(P \sim Q\). For all prospects \(P, Q, R\), transitivity entails that if \(P \succeq Q\) and \(Q \succeq R\), then \(P \succeq R\).
A2.
Archimedean. For all prospects \(P, Q, R\) such that \(P \succeq Q\) and \(Q \succeq R\), there exists some \(p \in (0,1)\) such that \((P, p; \ R, (1-p)) \sim Q\), where \((P, p; \ R, (1-p))\) is the compound prospect that yields the prospect P as a consequence with probability p or yields the prospect R with probability \(1-p\).[1]
A3.
Independence. For all prospects \(P, Q, R\), if \(P \succeq Q\), then \[(P, p; \ R, (1-p)) \succeq (Q, p; \ R, (1-p))\] for all \(p\).

Specifically, if A1, A2, and A3 hold, then there is a real-valued function \(V(\cdot)\) of the form

\[\label{eq:seu} V(P) = \sum_i (p_i \cdot u(x_i))\]

where P is any prospect and \(u(\cdot)\) is a von Neumann and Morgenstern utility function defined on the set of consequences X, such that \(P \succeq Q\) if and only if \(V(P) \geq V(Q)\). In other words, if your qualitative comparative judgments of prospects at a given time satisfy A1, A2, and A3, then those qualitative judgments are representable numerically by inequalities of functions of the form \(V(\cdot)\), yielding a logical calculus on an interval scale for determining the consequences of your qualitative comparative judgments at that time.
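The functional \(V(\cdot)\) is straightforward to compute. Here is a minimal sketch (in Python; the function name and the identity utility are our own assumptions) that evaluates and compares the two prospects from above:

```python
def expected_utility(prospect, u):
    """V(P) = sum_i p_i * u(x_i), for consequence-probability pairs."""
    return sum(p * u(x) for x, p in prospect)

# With the (assumed) utility u(x) = x, the fair-coin prospect is
# weakly preferred to the one with a greater chance of losing:
P = [(-10, 1/2), (20, 1/2)]
Q = [(-10, 2/3), (20, 1/3)]
u = lambda x: x
assert expected_utility(P, u) >= expected_utility(Q, u)  # P weakly preferred to Q
```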

1.3 Axiomatic Departures from Expected Utility Theory

It is commonplace to explore alternatives to an axiomatic system, and expected utility theory is no exception. To be clear, not all departures from expected utility theory are candidates for modeling bounded rationality. Nevertheless, some misguided rhetoric over how to approach the problem of modeling bounded rationality stems from unfamiliarity with the breadth of contemporary statistical decision theory. Here we highlight some axiomatic departures from expected utility theory that are motivated by bounded rationality considerations, all framed in terms of our particular axiomatization from section 1.2.

1.3.1 Alternatives to A1

Weakening the ordering axiom introduces the possibility for an agent to forgo comparing a pair of alternatives, an idea both Keynes and Knight advocated (Keynes 1921; Knight 1921). Specifically, dropping the completeness axiom allows an agent to be in a position to neither prefer one option to another nor be indifferent between the two (Koopman 1940; Aumann 1962; Fishburn 1982). Decisiveness, which the completeness axiom encodes, is more mathematical convenience than principle of rationality. The question, which is the question that every proposed axiomatic system faces, is what logically follows from a system which allows for incomplete preferences. Led by Aumann (1962), early axiomatizations of rational incomplete preferences were suggested by Giles (1976) and Giron & Rios (1980), and later studied by Karni (1985), Bewley (2002), Walley (1991), Seidenfeld, Schervish, & Kadane (1995), Ok (2002), Nau (2006), Galaabaatar & Karni (2013), and Zaffalon & Miranda (2017). In addition to accommodating indecision (Wheeler 2022), such systems also allow you to reason about someone else’s (possibly) complete preferences when your information about that other agent’s preferences is incomplete.

Dropping transitivity limits extendability of elicited preferences (Luce & Raiffa 1957), since the omission of transitivity as an axiomatic constraint allows for cycles and preference reversals. Although violations of transitivity have long been considered both commonplace and a sign of human irrationality (May 1954; Tversky 1969), reassessments of the experimental evidence challenge this received view (Mongin 2000; Regenwetter, Dana, & Davis-Stober 2011). The axioms impose synchronic consistency constraints on preferences, whereas the experimental evidence for violations of transitivity commonly conflates dynamic and synchronic consistency (Regenwetter et al. 2011). Specifically, a person’s preferences at one moment in time that are inconsistent with his preferences at another time are no evidence for that person holding logically inconsistent preferences at a single moment in time, and thus offer no evidence for irrationality. People change their minds, after all. Arguments to limit the scope of transitivity in normative accounts of rational preference similarly point to diachronic or group preferences, which likewise do not contradict the axioms (Kyburg 1978; Anand 1987; Bar-Hillel & Margalit 1988; Schick 1986). Arguments that point to psychological processes or algorithms that admit cycles or reversals of preference over time also point to a misapplication of, rather than a counter-example to, the ordering condition. Finally, for decisions that involve explicit comparisons of options over time, violating transitivity may be rational. For example, given the goal of maximizing the rate of food gain, an organism’s current food options may reveal information about food availability in the near future by indicating that a current option may soon disappear or that a better option may soon reappear. Information about the availability of options over time can, and sometimes does, warrant non-transitive choice behavior over time that nevertheless maximizes food gain (McNamara, Trimmer, & Houston 2014; McNamara and Leimar 2020).

1.3.2 Alternatives to A2

Dropping the Archimedean axiom allows for an agent to have lexicographic preferences (Blume, Brandenburger, & Dekel 1991); that is, the omission of A2 allows the possibility for an agent to prefer one option infinitely more than another. One motivation for developing a non-Archimedean version of expected utility theory is to address a gap in the foundations of the standard subjective utility framework that prevents a full reconciliation of admissibility (i.e., the principle that one ought not select a weakly dominated option for choice) with full conditional preferences (i.e., that for any event, there is a well-defined conditional probability to represent the agent’s conditional preferences; Pedersen 2014). Specifically, the standard subjective expected utility account cannot accommodate conditioning on zero-probability events, which is of particular importance to game theory (P. Hammond 1994). Non-Archimedean variants of expected utility theory turn to techniques from nonstandard analysis (Goldblatt 1998), full conditional probabilities (Rényi 1955; Coletti & Scozzafava 2002; Dubins 1975; Popper 1959), and lexicographic probabilities (Halpern 2010; Brickhill & Horsten 2016 [Other Internet Resources]), and are all linked to imprecise probability theory (Wheeler and Cozman 2021).

Non-compensatory single-cue decision models, such as the Take-the-Best heuristic (section 5.2), appeal to lexicographically ordered cues, and admit a numerical representation in terms of non-Archimedean expectations (Arló-Costa & Pedersen 2011).
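To see how a lexicographic, non-compensatory comparison works, here is a small sketch (in Python; the cue values and names are hypothetical, not drawn from the entry). Options are compared cue by cue in order of validity, and the first discriminating cue decides; no combination of later cues can compensate:

```python
def lexicographic_compare(cues_a, cues_b):
    """Return which option the first discriminating cue favors.
    Later cues cannot overturn an earlier cue's verdict."""
    for a, b in zip(cues_a, cues_b):
        if a != b:
            return 'A' if a > b else 'B'
    return 'tie'

# A wins on the highest-validity cue, even though B wins on
# every remaining cue -- the later cues do not compensate:
assert lexicographic_compare([1, 0, 0, 0], [0, 1, 1, 1]) == 'A'
```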

1.3.3 Alternatives to A3

A1 and A2 together entail that \(V(\cdot)\) assigns a real-valued index to prospects such that \(P \succeq Q\) if and only if \(V(P) \geq V(Q)\). The independence axiom, A3, encodes a separability property for choice, one that ensures that expected utilities are linear in probabilities. Motivations for dropping the independence axiom stem from difficulties in applying expected utility theory to describe choice behavior, including an early observation that humans evaluate possible losses and possible gains differently. Although expected utility theory can represent a person who either gambles or purchases insurance, Friedman and Savage remarked in their early critique of von Neumann and Morgenstern’s axiomatization, it cannot simultaneously do both (M. Friedman & Savage 1948).

The principle of loss aversion (Kahneman & Tversky 1979; Rabin 2000) suggests that the subjective weight that we assign to potential losses is greater than that we assign to commensurate potential gains. For example, the endowment effect (Thaler 1980)—the observation that people tend to value a good more highly when it is viewed as a potential loss than when it is viewed as a potential gain—is supported by neurological evidence for gains and losses being processed by different regions of the brain (Rick 2011). However, even granting the affective differences in how we process losses and gains, those differences do not necessarily translate to a general “negativity bias” (Baumeister, Bratslavsky, & Finkenauer 2001) in choice behavior (Hochman & Yechiam 2011; Yechiam 2019). Yechiam and colleagues report experiments in which participants do not exhibit loss aversion in their choices, such as cases in which participants respond to repetitive situations that issue losses and gains and single-case decisions involving small stakes. That said, observations of risk aversion (Allais 1953) and ambiguity aversion (Ellsberg 1961) have led to alternatives to expected utility theory, all of which abandon A3. Those alternative approaches include prospect theory (section 2.4), regret theory (Bell 1982; Loomes & Sugden 1982), and rank-dependent expected utility (Quiggin 1982).
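To make loss aversion concrete, here is a sketch (in Python) of a prospect-theory-style value function; the functional form is the standard power-function one from cumulative prospect theory, and the parameter values (\(\alpha \approx 0.88\), \(\lambda \approx 2.25\)) are commonly reported estimates, used here purely for illustration:

```python
def value(x, alpha=0.88, lam=2.25):
    """Prospect-theory-style value function: concave over gains,
    convex and steeper over losses (lam > 1 encodes loss aversion)."""
    if x >= 0:
        return x ** alpha
    return -lam * (-x) ** alpha

# A loss of 10 looms larger than a commensurate gain of 10:
assert abs(value(-10)) > value(10)
```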

Most models of bounded rationality do not fit neatly into the broad axiomatic framework we’ve outlined. This is partly because bounded rationality emphasizes the processes, algorithms, and psychological mechanisms involved in decision making (section 2), aspects that Samuelson’s shift from reasoning behavior to choice behavior deliberately abstracted away, considering them beyond the scope of rational choice theory. Simon, however, saw this abstraction as precisely the problem with rational choice theory. Additionally, bounded rationality often focuses on adaptive behavior suited to an organism’s environment (section 3). Ecological modeling involves goal-directed behavior influenced by the organism’s constitution and stable features of its environment, making (synchronic) coherent comparative judgments less relevant to framing the problem.

However, it is important to avoid overgeneralizing the limited role of decision-theoretic tools in studying bounded rationality. Decision theory, broadly construed to include statistical decision theory (Berger 1985), provides a robust mathematical toolbox. While traditional decision theory has relied on psychological assumptions like “degrees of belief” and logical omniscience (section 1.4), exploring axiomatic departures from expected utility theory can both challenge Bayesian orthodoxy and expand the practical application of powerful mathematical methods.

1.4 Limits to Logical Omniscience

Most formal models of judgement and decision making assume logical omniscience, meaning that a decision maker is presumed to have complete knowledge of all that logically follows from his current commitments and the options available for choice. This assumption is both psychologically unrealistic and technically challenging to avoid (Stalnaker 1991). Any theory that recommends disbelieving a claim when the evidence is logically inconsistent, for example, will be unworkable when the belief in question is sufficiently complicated for all but logically omniscient agents. Further, this policy remains unworkable even for non-omniscient agents that have access to unlimited computational resources (Kelly and Schulte 1995).

The problem of logical omniscience is particularly acute for expected utility theory in general, and the theory of subjective probability in particular. For the postulates of subjective probability imply that an agent knows all the logical consequences of her commitments, thereby mandating logical omniscience. This limits the applicability of the theory, however. For example, it prohibits having uncertain judgments about mathematical and logical statements. In an article from 1967, “Difficulties in the theory of personal probability”, reported in Hacking 1967 and Seidenfeld, Schervish, & Kadane 2012 but misprinted in Savage 1967, Savage raises the problem of logical omniscience for the subjective theory of probability:

The analysis should be careful not to prove too much; for some departures from theory are inevitable, and some even laudable. For example, a person required to risk money on a remote digit of \(\pi\) would, in order to comply fully with the theory, have to compute that digit, though this would really be wasteful if the cost of computation were more than the prize involved. For the postulates of the theory imply that you should behave in accordance with the logical implication of all that you know. Is it possible to improve the theory in this respect, making allowances within it for the cost of thinking, or would that entail paradox, as I am inclined to believe but unable to demonstrate? (Savage 1967, excerpted from Savage’s prepublished draft; see notes in Seidenfeld et al. 2012)

Responses to Savage’s problem include a game-theoretic treatment proposed by I.J. Good (1983), which swaps the extensional variable that is necessarily true for an intensional variable representing an accomplice who knows the necessary truth but withholds enough information from you, allowing you to be (coherently) uncertain about what your accomplice knows. This trick changes the subject of your uncertainty, from a necessarily true proposition that you cannot coherently doubt to a coherent guessing game about that truth facilitated by your accomplice’s incomplete description. Another response sticks to the classical line that failures of logical omniscience are deviations from the normative standard of perfect rationality but introduces an index for incoherence to accommodate reasoning with incoherent probability assessments (Schervish, Seidenfeld, & Kadane 2012; Konek 2023). A third approach, suggested by de Finetti (1970), is to restrict possible states of affairs to observable states with a finite verifiable procedure—which may rule out theoretical states or any other that does not admit a verification protocol. Originally, what de Finetti was after was a principled way to construct a partition over possible outcomes to distinguish serious possible outcomes of an experiment from wildly implausible but logically possible outcomes, yielding a method for distinguishing between genuine doubt and mere “paper doubts” (Peirce 1955). Other proposals follow de Finetti’s line by tightening the admissibility criteria and include epistemically possible events, which are events that are logically consistent with the agent’s available information; apparently possible events, which include any event by default unless the agent has determined that it is inconsistent with his information; and pragmatically possible events, which only include events that are judged sufficiently important (Walley 1991: section 2.1).

The notion of apparently possible refers to a procedure for determining inconsistency, which is a form of bounded procedural rationality (section 2). The challenges of avoiding paradox, which Savage alludes to, are formidable. However, work on bounded fragments of Peano arithmetic (Parikh 1971) provides coherent foundations for exploring these ideas, which have been taken up specifically to formulate bounded extensions of default logic for apparent possibility (Wheeler 2004) and more generally in models of computational rationality (Lewis, Howes, & Singh 2014).

1.5 Descriptions, Prescriptions, and Normative Standards

It is common to distinguish how people actually render judgements or make decisions from how they should ideally do so. However, the study of bounded rationality suggests distinguishing three aims of inquiry rather than two: a descriptive theory that explains or predicts what judgments or decisions people actually make; a prescriptive theory that explains or recommends the judgments or decisions people ought to make; and a normative theory that specifies the normative standards to use when evaluating the effectiveness of a judgement or decision.

To illustrate these types, consider arithmetic. A descriptive theory of arithmetic might address the psychology of arithmetic reasoning, a model of numeracy in animals, or an algorithm for arbitrary-precision arithmetic in computers. The normative standard of full arithmetic is Peano’s axiomatization, which defines natural number arithmetic through succession and mathematical induction. Alternative standards include Robinson’s induction-free fragment of Peano arithmetic (Tarski, Mostowski, and Robinson 1953) and various systems of cardinal arithmetic in the hierarchy of large cardinals. A prescriptive theory references both a fixed normative standard and relevant facts about the arithmetic capabilities of the organism or machine. For example, curricula for improving arithmetic performance will differ between children and adults due to psychological differences, even though Peano arithmetic is the normative standard for both groups. Peano’s axioms, while the normative standard for full arithmetic, are not prescriptive advice for teaching arithmetic, nor are they descriptive of arithmetic reasoning. Nevertheless, a descriptive theory of arithmetic assumes Peano’s axioms implicitly, since summing two numbers presumes an intent to perform arithmetic, not to concatenate, sequence, encode, or perform any other operation on numbers.

Finally, consider developing a pedagogy for teaching cardinal arithmetic to children, based on an effective method for teaching arithmetic. A prescriptive theory might adapt successful methods from full arithmetic while anticipating that some approaches will not apply to ZFC+. Differences might arise directly from the change in normative standards or from the interplay between the new task and children’s psychological capabilities.

To be sure, there are important differences between arithmetic and rational behavior. The objects of arithmetic (numerals and the numbers they represent) are relatively clear-cut, whereas the objects of rational behavior vary even when the same theoretical framework is used. For instance, in expected utility theory an agent might be seen as choosing options to maximize personal welfare, acting as if doing so but without actual deliberation, or as playing a non-deliberative role in the population fitness of their kind.

Separating the choice of a normative standard from the evaluation or description of behavior is crucial for avoiding misunderstandings in discussions of bounded rationality. Although Peano’s axioms are not prescribed to improve or describe arithmetic reasoning, they remain necessary to both descriptive and prescriptive theories of arithmetic. While it remains an open question whether the normative standards for human rational behavior are axiomatizable, clear normative standards nevertheless help to advance our understanding of how people make judgements and decisions, and how they ought to do so.

2. The Emergence of Procedural Rationality

Simon thought the shift in focus from reasoning behavior to choice behavior was a mistake. Since, in the 1950s, little was known about the processes involved in making judgments or reaching decisions, abstracting away those features from our mathematical models was premature. Yet this ignorance raised the question of how to proceed. The answer was to attend to the costs in effort involved in operating a decision-making procedure and compare those costs to the resources available to the organism. Conversely, the performance of the organism in terms of accuracy (section 7.2) with its limited cognitive resources was compared to models with similar accuracy within those resource bounds. Managing the trade-off between decision costs and decision quality involves another type of rationality, which Simon later called procedural rationality (Simon 1976, p. 69).

Process models emphasize the cognitive and procedural aspects of decision making. By focusing on the algorithms and psychological processes involved, process models shed light on how individuals navigate complex decisions within their cognitive limitations. In this section we highlight early, key contributions to modeling procedures for boundedly rational judgment and decision making, including the origins of the accuracy-effort trade-off, Simon’s satisficing strategy, improper linear models, and the earliest effort to systematize several features of high-level, cognitive judgment and decision making: cumulative prospect theory.

2.1 Accuracy and Effort

Herbert Simon and I.J. Good were each among the first to call attention to the cognitive demands of subjective expected utility theory, although neither one in his early writings abandoned the principle of expected utility as the normative standard for rational choice. Good, for instance, referred to the recommendation to maximize expected utility as the ordinary principle of rationality, whereas Simon called the expected utility principle objective rationality and considered it the central tenet of global rationality. The rules of rational behavior are costly to operate in both time and effort, Good observed, so real agents have an interest in minimizing those costs (Good 1952, §7(i)). Efficiency dictates that one choose from available alternatives an option that yields the largest result given the resources available, which Simon emphasized is not necessarily an option that yields the largest result overall (Simon 1947, p. 79). So reasoning judged deficient without considering the associated costs may be found meritorious once all those costs are accounted for—a conclusion that a range of authors endorsed, including Amos Tversky:

It seems impossible to reach any definitive conclusions concerning human rationality in the absence of a detailed analysis of the sensitivity of the criterion and the cost involved in evaluating the alternatives. When the difficulty (or the costs) of the evaluations and the consistency (or the error) of the judgments are taken into account, a [transitivity-violating method] may prove superior. (Tversky 1969)

Balancing the quality of a decision against its costs soon became a popular conception of bounded rationality, particularly in economics (Stigler 1961), where it remains commonplace to formulate boundedly rational decision making as a constrained optimization problem. On this view boundedly rational agents are utility maximizers after all, once all the constraints are made clear (Arrow 2004). Another reason for the popularity of this conception of bounded rationality is its compatibility with Milton Friedman’s as if methodology (Friedman 1953), which licenses models of behavior that ignore the causal factors underpinning judgment and decision making. To say that an agent behaves as if he were a utility maximizer is at once to concede that he is not but that his behavior proceeds as if he were. Similarly, to say that an agent behaves as if he were a utility maximizer under certain constraints is to concede that he does not solve constrained optimization problems but nevertheless behaves as if he did.

Simon’s focus on computationally efficient methods that yield solutions that are good enough contrasts with Friedman’s as if methodology, since evaluating whether a solution is “good enough”, in Simon’s terms, involves search procedures, stopping criteria, and how information is integrated in the course of making a decision. Simon offers several examples to motivate inquiry into computationally efficient methods. Here is one. Applying the game-theoretic minimax algorithm to the game of chess calls for evaluating more chess positions than the number of molecules in the universe (Simon 1957, p. 6). Yet if the game of chess is beyond the reach of exact computation, why should we expect everyday problems to be any more tractable? Simon’s question is to explain how human beings manage to solve complicated problems in an uncertain world given their meager resources. Answering Simon’s question, as opposed to applying Friedman’s method to fit a constrained optimization model to observed behavior, is to demand a model with better predictive power concerning boundedly rational judgment and decision making. In pressing this question of how human beings solve uncertain inference problems, Simon opened two lines of inquiry that continue today, namely:

  1. How do human beings actually make decisions “in the wild”?

  2. How can the standard theories of global rationality be simplified to render them more tractable?

Simon’s initial efforts aimed to simplify global rationality. Due to the limited psychological knowledge about decision making at the time, he relied on a layman’s “acquaintance with the gross characteristics of human choice” (Simon 1955a, p. 100). He proposed replacing the complex optimization problem of maximizing expected utility with a simpler decision criterion he called satisficing. This approach, along with models with better predictive power, sought to more accurately reflect how decisions are made in practice.

2.2 Satisficing

Satisficing is the strategy of considering the options available to you for choice until you find one that meets or exceeds a predefined threshold—your aspiration level—for a minimally acceptable outcome. Although Simon originally thought of procedural rationality as a poor approximation of global rationality, and thus viewed the study of bounded rationality to concern “the behavior of human beings who satisfice because they have not the wits to maximize” (Simon 1957a: xxiv), there are a range of applications of satisficing models to sequential choice problems, aggregation problems, and high-dimensional optimization problems, which are increasingly common in machine learning and economics.

Given a specification of what will count as a good-enough outcome, satisficing replaces the optimization objective from expected utility theory of selecting an undominated outcome with the objective of picking an option that meets your aspirations. The model has since been applied to business (Bazerman & Moore 2008; Long, Sim, and Zhou 2022), mate selection (Todd & Miller 1999), survey methodology (Roberts et al. 2019), and other practical sequential-choice problems, including selecting a parking spot (Hutchinson, Fanselow, et al. 2012). Ignoring the procedural aspects of Simon’s original formulation of satisficing, if one has a fixed aspiration level for a given decision problem, then admissible choices from satisficing can be captured by so-called \(\epsilon\)-efficiency methods (Loridan 1984; White 1986; Yongacoglu, et al. 2023).
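The satisficing rule itself is easy to sketch. The following Python fragment (the function name and the parking example's numbers are our own illustration, not Simon's) searches options in the order encountered and stops at the first that meets the aspiration level:

```python
def satisfice(options, utility, aspiration):
    """Return the first option whose utility meets the aspiration
    level; return None if no option is good enough."""
    for option in options:
        if utility(option) >= aspiration:
            return option
    return None

# Parking-spot example: distances (in meters) to the destination,
# encountered in sequence; aspire to any spot within 100m.
spots = [450, 300, 90, 40]
choice = satisfice(spots, lambda d: -d, aspiration=-100)
assert choice == 90   # search stops at the first good-enough spot
```

Note that the satisficer takes the 90m spot and never sees the better 40m spot: good enough, not optimal.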

Hybrid optimization-satisficing techniques are used in machine learning when many metrics are available but no sound or practical method is available for combining them into a single value. Instead, hybrid optimization-satisficing methods select one metric to optimize and satisfice the remainder. For example, a machine learning classifier might optimize accuracy (i.e., maximize the proportion of examples for which the model yields the correct output; see section 7.2) but set aspiration levels for the false positive rate, coverage, and runtime.
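A minimal sketch of such a hybrid scheme, with hypothetical classifier metrics invented for illustration: the secondary metrics are satisficed by aspiration bounds, and accuracy alone is optimized over the survivors.

```python
def select_model(candidates, bounds):
    """Hybrid optimization-satisficing: discard candidates that miss
    any aspiration bound, then maximize accuracy over the rest."""
    feasible = [c for c in candidates
                if all(c[metric] <= bound for metric, bound in bounds.items())]
    return max(feasible, key=lambda c: c["accuracy"], default=None)

# Hypothetical classifiers and their measured metrics.
candidates = [
    {"name": "A", "accuracy": 0.94, "fpr": 0.12, "runtime_ms": 40},
    {"name": "B", "accuracy": 0.91, "fpr": 0.04, "runtime_ms": 35},
    {"name": "C", "accuracy": 0.89, "fpr": 0.03, "runtime_ms": 90},
]
# Satisfice false-positive rate and runtime; optimize accuracy.
best = select_model(candidates, bounds={"fpr": 0.05, "runtime_ms": 50})
# best["name"] == "B": A is more accurate but violates the fpr bound
```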

Selten’s aspiration adaptation theory models decision tasks as problems with multiple incomparable goals that resist aggregation into a complete preference order over all alternatives (Selten 1998). Instead, the decision-maker will have a vector of goal variables, where vectors are comparable by weak dominance. If vector A and vector B are possible assignments for my goals, then A dominates vector B if there is no goal in the sequence on which A assigns a value that is strictly less than B, and there is some goal for which A assigns a value strictly greater than B. Selten’s model imagines an aspiration level for each goal, which itself can be adjusted upward or downward depending on the set of feasible (admissible) options. Aspiration adaptation theory is a highly procedural and local account in the tradition of Newell and Simon’s approach to human problem solving (Newell & Simon 1972), although it was not initially offered as a psychological process model.
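Weak dominance between goal vectors, as used here, is easy to state precisely in code (a sketch; higher values are assumed better on every goal):

```python
def weakly_dominates(a, b):
    """True if goal vector a is at least as good as b on every goal
    and strictly better on at least one (higher = better)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

# A dominates B: equal on two goals, strictly better on the second.
print(weakly_dominates((3, 2, 5), (3, 1, 5)))   # True
# Incomparable vectors: neither dominates the other.
print(weakly_dominates((3, 2, 5), (2, 4, 5)))   # False
```

Incomparability is the typical case, which is why aspiration levels, rather than a complete ordering, drive the choice.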

Level-k thinking is a variant of satisficing adapted to iterative and strategic decisions, with agents forming expectations based on a finite number of iterative steps (Camerer 2003). Within this model, agents are categorized by the number of iterations they perform: level-0 agents use a simple naive strategy without considering other agents’ potential decisions, level-1 agents best respond to level-0 agents, level-2 agents best respond to a strategy of level-1 agents, and so on. By limiting the number of reasoning steps, level-k thinking is offered as a more realistic portrayal of human strategic decision-making based on agents’ limited expectations of others’ behavior (Chong, Camerer, and Ho 2005), including cases where backward induction is infeasible (Ho and Su 2024).
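The flavor of level-k reasoning can be illustrated with the familiar p-beauty contest, in which players pick a number in [0, 100] and the winner is whoever comes closest to p times the average guess. The sketch below assumes, as is standard in this literature, a level-0 player who guesses 50; each higher level best responds to a population one level below.

```python
def level_k_guess(k, p=2/3, level0_guess=50.0):
    """Guess of a level-k player in the p-beauty contest, assuming all
    opponents reason at level k-1."""
    guess = level0_guess
    for _ in range(k):
        guess *= p  # best response to a population guessing `guess`
    return guess

# level-0 guesses 50; level-1 guesses ~33.3; level-2 guesses ~22.2.
guesses = [round(level_k_guess(k), 1) for k in range(3)]
```

As k grows the guesses shrink toward 0, the Nash equilibrium, but empirically most subjects behave like level-1 or level-2 players.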

2.3 Proper and Improper Linear Models

Proper linear models represent another important class of optimization models. A proper linear model is one where predictor variables are assigned weights, which are selected so that the linear combination of those weighted predictor variables optimally predicts a target variable of interest. For example, linear regression is a proper linear model that selects weights such that the squared “distance” between the model’s predicted value of the target variable and the actual value (given in the data set) is minimized.
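The optimal weights of such a least-squares model can be computed directly; the simulated data below stand in for a real prediction task.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated task: the target is a noisy linear function of two predictors.
X = rng.normal(size=(200, 2))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

# A proper linear model: choose weights minimizing squared error.
design = np.column_stack([X, np.ones(len(X))])   # add an intercept column
weights, *_ = np.linalg.lstsq(design, y, rcond=None)
# weights approximately recover the generating coefficients [2, -1, 0]
```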

Paul Meehl’s review in the 1950s of psychological studies using statistical methods versus clinical judgment cemented the statistical turn in psychology (Meehl 1954). Meehl’s review found that predicting a numerical target variable from numerical predictors is better done by a proper linear model than by the intuitive judgment of clinicians. Concurrently, the psychologist Kenneth Hammond formulated Brunswik’s lens model (section 3.2) as a composition of proper linear models to model the differences between clinical versus statistical predictions (K. Hammond 1955). Proper linear models have since become a workhorse in cognitive psychology in areas that include decision analysis (Keeney & Raiffa 1976; Kaufmann & Wittmann 2016), causal inference (Waldmann, Holyoak, & Fratianne 1995; Spirtes 2010), and response times to choice (Brown & Heathcote 2008; Turner, Rodriguez, et al. 2016).

Robin Dawes, returning to Meehl’s question about statistical versus clinical predictions, found that even improper linear models perform better than clinical intuition (Dawes 1979). The distinguishing feature of improper linear models is that the weights of a linear model are selected by some non-optimal method. For instance, each predictor variable might be assigned an equal weight, or a unit weight, such as 1 or −1, to tally features supporting a positive or negative prediction, respectively. As an example, Dawes proposed an improper model to predict subjective ratings of marital happiness by couples based on the difference between their rates of lovemaking and fighting. The results? Among the thirty happily married couples, two argued more than they had intercourse. Yet all twelve unhappy couples fought more frequently. And those results replicated in other laboratories studying human sexuality in the 1970s. Both equal-weight regression and unit-weight tallying have since been found to commonly outperform proper linear models on small data sets. Although no simple improper linear model performs well across all common benchmark datasets, for almost every data set in the benchmark there is some simple improper model that performs well in predictive accuracy (Lichtenberg & Simsek 2016). This observation, and many others in the heuristics literature, points to biases of simplified models that can lead to better predictions when used in the right circumstances (section 4).
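Dawes’s marital-happiness rule is essentially a one-line improper model with unit weights; the rates below are hypothetical, not data from his study.

```python
def predict_happy(lovemaking_rate, fighting_rate):
    """Dawes's improper linear model with unit weights +1 and -1:
    predict a happy marriage when the couple makes love more often
    than they fight."""
    return lovemaking_rate - fighting_rate > 0

# Hypothetical weekly rates for two couples.
print(predict_happy(3, 1))   # True
print(predict_happy(1, 2))   # False
```

No weights are estimated from data at all, which is precisely what makes the model “improper”, and precisely why it cannot overfit a small sample.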

Dawes’s original point was not that improper linear models outperform proper linear models in terms of accuracy, but rather that they are more efficient and (often) close approximations of proper linear models. “The statistical model may integrate the information in an optimal manner”, Dawes observed, “but it is always the individual … who chooses variables” (Dawes 1979: 573). Moreover, Dawes argued that it takes human judgment to know the direction of influence between predictor variables and target variables, which includes the knowledge of how to numerically code those variables to make this direction clear. Advances in machine learning chip away at Dawes’s claims about the unique role of human judgment, and results from Gigerenzer’s ABC Group about unit-weight tallying outperforming linear regression in out-of-sample prediction tasks with small samples are an instance of improper linear models outperforming proper linear models (Czerlinski, Gigerenzer, & Goldstein 1999). Nevertheless, Dawes’s general observation about the relative importance of variable selection over variable weighting stands (Katsikopoulos, Schooler, & Hertwig 2010).

2.4 Cumulative Prospect Theory

If both satisficing and improper linear models are examples addressing Simon’s second question at the start of this section—namely, how to simplify existing models to render them both tractable and effective—then Daniel Kahneman and Amos Tversky’s cumulative prospect theory addressed the first question by directly incorporating knowledge about how humans actually make decisions.

In our discussion in section 1.2 about alternatives to the Independence Axiom, (A3), we mentioned several observed features of human choice behavior that stand at odds with the prescriptions of expected utility theory. Kahneman and Tversky developed prospect theory around four of those observations about human decision-making (Kahneman & Tversky 1979; Wakker 2010).

  1. Reference Dependence. Rather than make decisions by comparing the absolute magnitudes of welfare, as prescribed by expected utility theory, people instead tend to value prospects by their change in welfare with respect to a reference point. This reference point can be a person’s current state of wealth, an aspiration level, or a hypothetical point of reference from which to evaluate options. The intuition behind reference dependence is that our sensory organs have evolved to detect changes in sensory stimuli rather than store and compare absolute values of stimuli. Therefore, the argument goes, we should expect the cognitive mechanisms involved in decision-making to inherit this sensitivity to changes in perceptual attribute values.

    In prospect theory, reference dependence is reflected by utility changing sign at the origin of the valuation curve \(v(\cdot)\) in Figure 1(a). The x-axis represents gains (right side) and losses (left side) in euros, and the y-axis plots the value placed on relative gains and losses by a valuation function \(v(\cdot)\), which is fit to experimental data on people’s choice behavior.

  2. Loss Aversion. People are more sensitive to losses than gains of the same magnitude; the thrill of victory does not measure up to the agony of defeat. So, Kahneman and Tversky maintained, people will prefer an option that does not incur a loss to an alternative option that yields a commensurate gain. The disparity in how potential gains and losses are evaluated also accounts for the endowment effect (Thaler 1980), which is the tendency for people to value a good that they own more than a comparatively valued substitute (Bruner, Calegari, and Handfield 2020).

    In prospect theory, loss aversion appears in Figure 1(a) in the steeper slope of \(v(\cdot)\) to the left of the origin, representing losses relative to the subject’s reference point, than the slope of \(v(\cdot)\) for gains on the right side of the reference point. Thus, for the same magnitude of change in reward x from the reference point, the corresponding magnitude of the consequence of gaining x is less than the magnitude of the consequence of losing x.

    Tversky and Kahneman (1992) originally estimated the loss-aversion coefficient λ to be 2.25, suggesting that a loss looms more than twice as painful as the pleasure from a corresponding gain. Any value of λ > 1 indicates a loss-aversion effect, and subsequent empirical results strongly confirm a loss-aversion effect. However, a meta-analysis reveals that the effect is weaker than initially assumed (section 1.3), with a mean λ of 1.97 and a median of 1.69 (Brown, Imai, Vieider, and Camerer 2024). Note also that differences in affective attitudes toward, and the neurological processes responsible for processing, losses and gains do not necessarily translate to differences in people’s choice behavior (Yechiam and Hochman 2014).

  3. Diminishing Returns for both Gains and Losses. Given a fixed reference point, people’s sensitivity to changes in asset values (x in Figure 1a) diminishes the further one moves from that reference point, both in the domain of losses and the domain of gains. This is inconsistent with expected utility theory, even when the theory is modified to accommodate diminishing marginal utility (M. Friedman & Savage 1948).

    In prospect theory, the valuation function \(v(\cdot)\) is concave for gains and convex for losses, representing a diminishing sensitivity to both gains and losses. Expected utility theory can be made to accommodate sensitivity effects, but the utility function is typically either strictly concave or strictly convex, not both.

  4. Probability Weighting. Finally, for known exogenous probabilities, people do not calibrate their subjective probabilities by direct inference (Levi 1977), but instead systematically underweight high-probability events and overweight low-probability events, with a cross-over point of approximately one-third (Figure 1b). Thus, changes in very small or very large probabilities have greater impact on the evaluation of prospects than they would under expected utility theory. People are willing to pay more to reduce the number of bullets in the chamber of a gun from 1 to 0 than from 4 to 3 in a hypothetical game of Russian roulette.

    Figure 1(b) plots the median values for the probability weighting function \(w(\cdot)\) that takes the exogenous probability p associated with prospects, as reported in Tversky & Kahneman 1992. Roughly, below probability values of one-third people overestimate the probability of an outcome (consequence), and above probability one-third people tend to underestimate the probability of an outcome occurring. Traditionally, overweighting is thought to concern the systematic miscalibration of people’s subjective estimates of outcomes against a known exogenous probability, p, serving as the reference standard. In support of this view, miscalibration appears to disappear when people learn a distribution through sampling instead of learning identical statistics by description (Hertwig, Barron, Weber, & Erev 2004). Miscalibration in this context ought to be distinguished from overestimating or underestimating subjective probabilities when the relevant statistics are not supplied as part of the decision task. For example, televised images of the aftermath of airplane crashes lead to an overestimation of the low-probability event of commercial airplanes crashing. Even though a person’s subjective probability of the risk of a commercial airline crash would be too high given the statistics, the mechanism responsible is different: here the recency or availability of images from the evening news is to blame for scaring the person out of their wits, not the sober fumbling of a statistics table. An alternative view maintains that people understand that their weighted probabilities are different from the exogenous probability but nevertheless prefer to act as if the exogenous probability were so weighted (Wakker 2010). On this view, probability weighting is not a (mistaken) belief but a preference.


Figure 1: (a) plots the value function \(v(\cdot)\) applied to consequences of a prospect; (b) plots the median value of the probability weighting function \(w(\cdot)\) applied to positive prospects of the form \((x, p; 0, 1-p)\) with probability p. [An extended description of this figure is in the supplement.]

Prospect theory incorporates these components into models of human choice under risk by first identifying a reference point that either refers to the status quo or some other aspiration level. The consequences of the options under consideration are then framed in terms of deviations from this reference point. Extreme probabilities are simplified by rounding off, which yields miscalibration of the given, exogenous probabilities. Dominance reasoning is then applied, where dominated alternatives are eliminated from choice, along with additional editing steps in which riskless components of options are separated out, probabilities associated with the same outcome are combined, and a version of eliminating irrelevant alternatives is applied (Kahneman & Tversky 1979: 284–285).

Nevertheless, prospect theory comes with problems. For example, a shift of probability from less favorable outcomes to more favorable outcomes ought to yield a better prospect, all things considered, but the original prospect theory violates this principle of stochastic dominance. Cumulative prospect theory satisfies stochastic dominance, however, by appealing to a rank-dependent method for transforming probabilities (Quiggin 1982). For a review of the differences between prospect theory and cumulative prospect theory, along with an axiomatization of cumulative prospect theory, see Fennema & Wakker 1997.
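These components can be sketched in code, restricted to gains-only prospects for simplicity, using the functional forms and median parameter estimates reported by Tversky and Kahneman (1992); the example lottery at the end is invented for illustration.

```python
def value(x, alpha=0.88, lam=2.25):
    """TK-1992 value function: concave for gains, convex and steeper
    (loss aversion, lambda > 1) for losses."""
    return x ** alpha if x >= 0 else -lam * (-x) ** alpha

def weight(p, gamma=0.61):
    """TK-1992 weighting function: overweights small probabilities and
    underweights large ones, crossing over near one-third."""
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)

def cpt_value(prospect, gamma=0.61):
    """Cumulative prospect theory value of a gains-only prospect given
    as [(outcome, probability), ...]. Outcomes are ranked from best to
    worst, and each receives the *difference* of cumulative weighted
    probabilities as its decision weight; this rank-dependent step is
    what restores stochastic dominance."""
    ranked = sorted(prospect, key=lambda pair: pair[0], reverse=True)
    total, cumulative = 0.0, 0.0
    for outcome, p in ranked:
        decision_weight = weight(cumulative + p, gamma) - weight(cumulative, gamma)
        total += decision_weight * value(outcome)
        cumulative += p
    return total

# A rare gain is overweighted: w(0.05) is noticeably larger than 0.05.
lottery = cpt_value([(100, 0.05), (0, 0.95)])
```

Because the decision weights are differences of a cumulative transform, they always sum to one over the prospect, which is why shifting probability toward better outcomes can never lower the prospect's value.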

3. The Emergence of Ecological Rationality

Imagine an area of plants teeming with insects but with few insects in flight. This environment favors a bird that gleans over one that hawks. In a similar fashion, a decision-making environment might be more favorable for one decision-making strategy than another. Just as it would be “irrational” for a bird to hawk rather than glean, given the choice, so too what may be an irrational decision strategy in one environment may be entirely rational in another.

If procedural rationality attaches a cost to the making of a decision, then ecological rationality locates that procedure in the world. The questions ecological rationality asks are what features of an environment help or hinder decision making, and how we should model judgment or decision-making ecologies (Neisser 1976; Shamay-Tsoory and Mendelsohn 2019; Osborne-Crowley 2020). For example, people make causal inferences about patterns of covariation they observe—especially children, who then perform experiments testing their causal hypotheses (Glymour 2001). Unsurprisingly, people who draw the correct inferences about the true causal model do better than those who infer the wrong causal model (Meder, Mayrhofer, and Waldmann 2014). More surprising, Meder and his colleagues found that those making correct causal judgments do better than subjects who make no causal judgments at all. And perhaps most surprising of all is that those with true causal knowledge also beat the benchmark standards in the literature which ignore causal structure entirely; the benchmarks encode, spuriously, the assumption that the best we can do is to make no causal judgments at all.

In this section, after reviewing Simon’s proposal for distinguishing between behavioral constraints and environmental structure, we turn to three historically important contributions: the lens model, rational analysis, and cultural adaptation. Finally, we review the bias-variance trade-off, which is frequently mentioned in the Fast and Frugal Heuristics literature (section 5.2).

3.1 Behavioral Constraints and Environmental Structure

Simon thought that both behavioral constraints and environmental structure ought to figure in a theory of bounded rationality, yet he cautioned against identifying behavioral and environmental properties with features of an organism and features of its physical environment, respectively:

what we call “the environment” may lie, in part, within the skin of the biological organisms. That is, some of the constraints that must be taken as givens in an optimization problem may be physiological and psychological limitations of the organism (biologically defined) itself. For example, the maximum speed at which an organism can move establishes a boundary on the set of its available behavior alternatives. Similarly, limits on computational capacity may be important constraints entering into the definition of rational choice under particular circumstances. (Simon 1955a: 101)

That said, what is classified as a behavioral constraint rather than an environmental affordance varies across disciplines and the theoretical tools put to use. For example, one computational approach to bounded rationality, computational rationality theory (Lewis, Howes, and Singh 2014), classifies the cost to an organism of executing an optimal program as a behavioral constraint, classifies limits on memory as an environmental constraint, and treats the costs associated with searching for an optimal program to execute as exogenous. Anderson and Schooler’s study and computational modeling of human memory (Anderson and Schooler 1991) within the ACT-R framework, on the other hand, views the limits on memory and search costs as behavioral constraints which are adaptive responses to the structure of the environment. Still another broad class of computational approaches is found in statistical signal processing, such as adaptive filters (Haykin 2013), which are commonplace in engineering and vision (Marr 1982; Ballard and Brown 1982). Signal processing methods typically presume the sharp distinction between device and world that Simon cautioned against adopting, however. Still others have challenged the distinction between behavioral constraints and environmental structure by arguing that there is no clear way to separate organisms from the environments they inhabit (Gibson 1979), or by arguing that features of cognition which appear body-bound may not be necessarily so (Clark & Chalmers 1998).

Bearing in mind the different ways the distinction between behavior and environment has been drawn, and challenges to what precisely follows from drawing such a distinction, ecological approaches to rationality all endorse the thesis that the ways in which an organism manages structural features of its environment are essential to understanding how deliberation occurs and effective behavior arises. In doing so, theories of bounded rationality have traditionally focused on at least some of the following features, under this rough classification:

  • Behavioral Constraints—may refer to bounds on computation, such as the cost of searching for the best algorithm to run, an appropriate rule to apply, or a satisficing option to choose; the cost of executing an optimal algorithm, appropriate rule, or satisficing choice; and the costs of storing the data structure of an algorithm, the constitutive elements of a rule, or the objects of a decision problem.

  • Ecological Structure—may refer to statistical, topological, or other perceptible invariances of the task environment that an organism is adapted to; or to architectural features or biological features of the computational processes or cognitive mechanisms responsible for effective behavior, respectively.

3.2 Brunswik’s Lens Model

Egon Brunswik was a pioneer in applying probability and statistics to the study of human perception and emphasized the role of ecology in the generalizability of psychological findings. Brunswik thought psychology ought to aim for statistical descriptions of adaptive behavior (Brunswik 1943). Instead of isolating a few independent variables to systematically manipulate and observe their effects on a dependent variable, he argued that psychological experiments should assess how an organism adapts to its environment. Experimental subjects should represent the population, and the experimental situations should represent the subjects’ natural environment (Brunswik 1955). Thus, Brunswik advocated for a representative design in psychological experiments to preserve the causal structure of the subjects’ natural environment. For a review of the development of representative design and its use in the study of judgment and decision making, see Dhami, Hertwig, & Hoffrage 2004.

Brunswik’s lens model centers on how behavioral and environmental conditions influence organisms in perceiving proximal cues to infer some distal features of their “natural-cultural habitat” (Brunswik 1955: 198). For instance, an organism may detect the color markings (distal object) of a potential mate through light contrasts reflected on its retina (proximal cues). Some proximal cues provide more information about distal objects, which Brunswik understood as differences in the “objective” correlations between proximal cues and the target distal objects. The ecological validity of proximal cues refers to their capacity to provide useful information about distal objects within a particular environment. Performance assessments of an organism thus compare its actual use of cue information to the cue’s information capacity.

Kenneth Hammond and colleagues (K. Hammond, Hursch, & Todd 1964) formulated Brunswik’s lens model as a system of linear bivariate correlations, as depicted in Figure 2 (Hogarth & Karelaia 2007). Informally, Figure 2 says that the accuracy of a subject’s judgment (response), \(Y_s\), about a numerical target criterion, \(Y_e\), given some informative cues (features) \(X_1, \ldots, X_n\), is determined by the correlation between the subject’s response and the target. More specifically, the linear lens model imagines two large linear systems, one for the environment, e, and another for the subject, s, which both share a set of cues, \(X_1, \ldots, X_n\). Note that cues may be associated with one another, i.e., it is possible that \(\rho(X_i,X_j) \neq 0\) for indices \(i\neq j\) from 1 to n.

The accuracy of the subject’s judgment \(Y_s\) about the target criterion value \(Y_e\) is measured by an achievement index, \(r_a\), which is computed by Pearson’s correlation coefficient \(\rho\) of \(Y_e\) and \(Y_s\). The subject’s predicted response \(\hat{Y}_s\) to the cues is determined by the weights \(\beta_{s_i}\) the subject assigns to each cue \(X_i\), and the linearity of the subject’s response, \(R_s\), measures the noise in the system, \(\epsilon_s\). Thus, the subject’s response is conceived to be a weighted linear sum of subject-weighted cues plus noise. The analogue of response linearity in the environment is environmental predictability, \(R_e\). The environment, on this model, is thought to be probabilistic—or “chancy” as some say. Finally, the environment-weighted sum of cues, \(\hat{Y}_e\), is compared to the subject-weighted sum of cues, \(\hat{Y}_s\), by a matching index, G.
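On this formulation, the lens model’s indices are just correlations, and they can be computed from simulated data; the cue weights and noise levels below are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500

# Cues shared by environment and subject (independent here, though in
# general cues may be correlated with one another).
X = rng.normal(size=(n, 3))

# Environment: criterion = environment-weighted cues + noise.
Y_e = X @ np.array([0.8, 0.5, 0.1]) + rng.normal(scale=1.0, size=n)
# Subject: judgment = subject-weighted cues + noise.
Y_s = X @ np.array([0.9, 0.2, 0.0]) + rng.normal(scale=1.5, size=n)

def fitted(y):
    """Best linear prediction of y from the cues."""
    return X @ np.linalg.lstsq(X, y, rcond=None)[0]

def corr(a, b):
    return np.corrcoef(a, b)[0, 1]

r_a = corr(Y_e, Y_s)                  # achievement index
R_e = corr(Y_e, fitted(Y_e))          # environmental predictability
R_s = corr(Y_s, fitted(Y_s))          # response linearity
G = corr(fitted(Y_e), fitted(Y_s))    # matching index
# With independent noise terms, r_a is approximately G * R_e * R_s:
# achievement is capped by how predictable the environment is, how
# consistent the subject is, and how well the two weightings match.
```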


Figure 2: Brunswik’s Lens Model
[An extended description of this figure is in the supplement.]

With this formulation of the lens model in mind, return to Simon’s remarks above concerning the classification of environmental affordance versus behavioral constraint. The lens model’s conception as a linear model is influenced by signal detection theory, originally developed to improve the accuracy of early radar systems. Thus, the model inherits from engineering a clean division between subject and environment. However, suppose for a moment that both the environmental mechanism producing the criterion value and the subject’s predicted response are linear. Now consider the error term, \(\epsilon_s\). That term may refer to biological constraints that are responses to adaptive pressures on the whole organism. If so, ought \(\epsilon_s\) be classified as an environmental constraint rather than a behavioral constraint? The answer will depend on what follows from the reclassification, which will depend on the model and the goal of inquiry (section 7). If we were using the lens model to understand the ecological validity of an organism’s judgment, then reclassifying \(\epsilon_s\) as an environmental constraint would only introduce confusion; if instead our focus was to distinguish between behavior that is subject to choice and behavior that is precluded from choice, then the proposed reclassification may herald clarity—but then we would surely abandon the lens model for something else, or in any case would no longer be referring to the parameter \(\epsilon_s\) in Figure 2.

Finally, it should be noted that the lens model, like nearly all linear models used to represent human judgment and decision-making, does not scale well as a descriptive model. In multi-cue decision-making tasks involving more than three cues, people often turn to simplifying heuristics due to the complications involved in performing the necessary calculations (section 2.1; see also section 4). More generally, as we remarked in section 2.3, linear models involve calculating trade-offs that are difficult for people to perform. Lastly, the supposition that the environment is linear is a strong modeling assumption. Quite apart from the difficulties that arise for humans to execute the necessary computations, it becomes theoretically more difficult to justify model selection decisions as the number of features increases. The matching index G is a goodness-of-fit measure, but goodness-of-fit tests and residual analysis begin to lead to misleading conclusions for models with five or more dimensions. Machine learning techniques for supervised learning get around this limitation by focusing on analogues of the achievement index, constructing predictive hypotheses purely instrumentally, and dispensing with matching altogether (Wheeler 2017).

3.3 Rational Analysis

Rational analysis is a methodology applied in cognitive science and biology to explain why a cognitive system or organism engages in a particular behavior by appealing to the presumed goals of the organism, the adaptive pressures of its environment, and the organism’s computational limitations. Once an organism’s goals are identified, the adaptive pressures of its environment specified, and the computational limitations accounted for, an optimal solution under those conditions is sought to explain why a behavior that is otherwise ineffective may nevertheless be effective in achieving that goal under those conditions (Marr 1982; Anderson 1991; Oaksford and Chater 1994; Palmer 1999). Rational analysis therefore assumes that evolution and learning have optimally adapted the human mind to its environment, an assumption supported by empirical evidence showing near-optimal human performance in perception (Knill and Richards 1996), motor control (Körding and Wolpert 2004), and statistical learning (Fiser and Aslin 2002).

One theme to emerge from the rational analysis literature that has influenced bounded rationality is the study of memory (Anderson and Schooler 1991; Chater and Oaksford 1999). For instance, given the statistical features of our environment, and the sorts of goals we typically pursue, forgetting is an advantage rather than a liability (Schooler and Hertwig 2005). Memory traces vary in their likelihood of being used, so the memory system will try to make readily available those memories which are most likely to be useful. This is a rational analysis style argument, which is a common feature of the Bayesian turn in cognitive psychology (Oaksford and Chater 2007; Friston 2010). More generally, spatial arrangements of objects in the environment can simplify perception, choice, and the internal computation necessary for producing an effective solution (Kirsch 1995). Compare this view to the discussion of recency or availability effects distorting subjective probability estimates in section 2.4.

Rational analyses are typically formulated independently of the cognitive processes or biological mechanisms that explain how an organism realizes a behavior. Thus, when an organism’s observed behavior in an environment does not agree with the behavior prescribed by a rational analysis for that environment, there are traditionally three responses. One strategy is to change the specifications of the problem, by introducing an intermediate step or changing the goal altogether, or altering the environmental constraints, et cetera (Anderson & Schooler 1991; Oaksford & Chater 1994). Another strategy is to argue that mechanisms matter after all, so details of human psychology are incorporated into an alternative account (Newell & Simon 1972; Gigerenzer, Todd, et al. 1999; Todd, Gigerenzer, et al. 2012). A third option is to enrich rational analysis by incorporating computational mechanisms directly into the model (Russell & Subramanian 1995; Chater 2014; Lieder and Griffiths 2020). Lewis, Howes, and Singh, for instance, propose to construct theories of rationality from (i) structural features of the task environment; (ii) the bounded machine the decision process will run on, about which they consider four different classes of computational resources that may be available to an agent; and (iii) a utility function to specify the goal, numerically, so as to supply an objective function against which to score outcomes (Lewis et al. 2014).

3.4 Cultural Adaptation

So far we have considered theories and models which emphasize an individual organism and its surrounding environment, which is typically understood to be either the physical environment or, if social, modeled as if those social structures were the physical environment. And we considered whether some features commonly understood to be behavioral constraints ought instead to be classified as environmental affordances.

Yet people and their responses to the world are also part of each person’s environment. Boyd and Richerson argue that human societies ought to be viewed as an adaptive environment, which in turn has consequences for how individual behavior is evaluated. Human societies contain a large reservoir of information that is preserved through generations and expanded upon, despite limited, imperfect learning by the members of human societies. Imitation, which is a common strategy in humans, including preverbal infants (Gergely, Bekkering, and Király 2002), is central to cultural transmission (Boyd and Richerson 2005) and the emergence of social norms (Bicchieri and Muldoon 2014). In our environment, only a few individuals with an interest in improving on the folklore are necessary to nudge the culture to be adaptive. The main advantage that human societies have over other groups of social animals, this argument runs, is that cultural adaptation is much faster than genetic adaptation (Bowles and Gintis 2011). On this view, human psychology evolved to facilitate speedy adaptation. Natural selection did not equip our large-brained ancestors with rigid behavior, but instead selected for brains that allowed them to modify their behavior adaptively in response to their environment (Barkow, Cosmides, and Tooby 1992).

But if human psychology evolved to facilitate fast social learning, it comes at the cost of human credulity. The price of speedy adaptation through imitation of social norms and human behavior is the risk of adopting maladaptive norms or stupid behavior.

3.5 The Bias-Variance Trade-off

The bias-variance trade-off refers to a particular decomposition of overall prediction error for an estimator into its central tendency (bias) and dispersion (variance). Sometimes overall error can be reduced by increasing bias in order to reduce variance, or vice versa, effectively trading an increase in one type of error to afford a comparatively larger reduction in the other. To give an intuitive example, suppose your goal is to minimize your score with respect to the following targets.

Figure 3
[An extended description of this figure is in the supplement.]

Ideally, you would prefer a procedure for delivering your “shots” that had both low bias and low variance. Given the choice between a low-bias, high-variance procedure and a high-bias, low-variance procedure, you would presumably prefer the latter if it returned a lower overall error. Although a decision maker’s learning algorithm ideally will have low bias and low variance, in practice, reducing one type of error often increases the other.

The bias-variance decomposition highlights a trade-off between two extreme approaches to making a prediction (see the supplement on the bias-variance decomposition). At one extreme, you might use a constant function as an estimator, which always gives the same prediction regardless of the data. For example, if your estimator always predicts \(h(X) = 7\), the variance would be zero since the prediction never changes. However, unless your deterministic estimator happens by chance to align with a data generator that only produces the value 7, the bias of your estimate would be very high, massively underfitting your data.

At the other extreme, aiming for zero bias means making the predicted value \(\hat{Y}\) match the actual value \(Y\) perfectly for every sample \((x_i, y_i)\). Since you do not know the true function \(r(X)\) and only have a sample data set \(\mathcal{D}\), you will aspire to construct an estimator that ideally generalizes to new, unseen data. However, if you fit the estimator perfectly to \(\mathcal{D}\), the variance will be very high because different data sets \(\mathcal{D}^*\) from the true model will not be identical to \(\mathcal{D}\). The variation between these data sets is the variance or irreducible error of the data generated by the true model. This leads to overfitting your data.

The bias-variance trade-off therefore concerns the question of how complex a model ought to be to make reasonably accurate predictions on unseen or out-of-sample examples. The problem is to strike a balance between an underfitting model, which erroneously ignores available information about the true function \(r\), and an overfitting model, which erroneously includes information that is noise and thus gives misleading information about the true function \(r\).
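The two extremes can be made concrete with a small simulation. The sketch below is an illustrative setup, not from the text: a linear true function with Gaussian noise, the constant estimator \(h(X) = 7\) standing in for the high-bias extreme, and a 1-nearest-neighbour rule standing in for a low-bias, high-variance learner. Bias and variance are estimated empirically by refitting on many resampled training sets.

```python
import random
import statistics

random.seed(0)

# Illustrative true regression function and noisy data generator.
def r(x):
    return 3 * x + 2

def sample_dataset(n=20):
    xs = [random.random() for _ in range(n)]
    ys = [r(x) + random.gauss(0, 1) for x in xs]
    return xs, ys

X0, TRIALS = 0.5, 2000  # test point; number of resampled training sets

const_preds, nn_preds = [], []
for _ in range(TRIALS):
    xs, ys = sample_dataset()
    # Estimator A: the constant function h(X) = 7 (zero variance, large bias).
    const_preds.append(7.0)
    # Estimator B: 1-nearest neighbour (near-zero bias, variance ~ the noise).
    i = min(range(len(xs)), key=lambda j: abs(xs[j] - X0))
    nn_preds.append(ys[i])

def bias_sq(preds):
    return (statistics.mean(preds) - r(X0)) ** 2

const_bias2, const_var = bias_sq(const_preds), statistics.pvariance(const_preds)
nn_bias2, nn_var = bias_sq(nn_preds), statistics.pvariance(nn_preds)
print(const_bias2, const_var)  # 12.25 0.0: all bias, no variance
print(nn_bias2, nn_var)        # ~0 and ~1: almost no bias, noise-sized variance
```

The constant estimator never moves, so all of its error is bias; the memorizing estimator tracks whichever noisy point landed nearest the query, so its error is almost entirely variance.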

One thing that human cognitive systems do very well is to generalize from a limited number of examples. The difference between humans and machines is particularly striking when we compare how humans learn a complicated skill, such as driving a car, with how a machine learning system learns the same task. As harrowing an experience as it is to teach a teenager how to drive a car, they do not need to crash into a utility pole thousands of times to learn that utility poles are not traversable. What teenagers learn as children about the world through play and observing other people drive lends them an understanding that utility poles are to be steered around, a piece of common sense that our current machine learning systems do not have but must learn from scratch on a case-by-case basis. Human beings have a remarkable capacity to transfer what we learn from one domain to another, a capacity fueled in part by our curiosity (Kidd & Hayden 2015).

Viewed from the perspective of the bias-variance trade-off, the ability to make accurate predictions from sparse data suggests that variance is the dominant source of error but that our cognitive system often manages to keep these errors within reasonable limits (Gigerenzer & Brighton 2009). Indeed, Gigerenzer and Brighton make a stronger argument, stating that “the bias-variance dilemma shows formally why a mind can be better off with an adaptive toolbox of biased, specialized heuristics” (Gigerenzer & Brighton 2009: 120); see also section 5.2. However, the bias-variance decomposition (see the bias-variance supplement) is a decomposition of squared loss, which means that the decomposition above depends on how total error (loss) is measured. There are many loss functions, depending on the type of inference one is making along with the stakes in making it. If one were to use a 0-1 loss function, for example, where all non-zero errors are treated equally—meaning that “a miss is as good as a mile”—the decomposition above breaks down. In fact, for 0-1 loss, bias and variance combine multiplicatively (J. Friedman 1997)! A generalization of the bias-variance decomposition that applies to a variety of loss functions \(\mathrm{L}(\cdot)\), including 0-1 loss, has been offered by Domingos (2000),

\[\mathrm{L}(h)\ = \ \textrm{Bias}(h)^2 \ + \ \beta_1\textrm{Var}(h) \ + \ \beta_2\mathrm{N}\]

where the original bias-variance decomposition, Equation 4 (in the bias-variance supplement), appears as a special case, namely when \(\mathrm{L}(h) = \textrm{MSE}(h)\) and \(\beta_1 = \beta_2 = 1\).
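The squared-loss special case can be checked numerically. The sketch below uses an illustrative setup of my own choosing, not from the text: a deliberately shrunken (hence biased) sample mean estimating a known constant, evaluated against fresh noisy draws. Out-of-sample mean squared error should approximately equal bias squared plus variance plus irreducible noise, i.e., the case \(\beta_1 = \beta_2 = 1\).

```python
import random
import statistics

random.seed(1)

# Estimate a known constant THETA with a deliberately shrunken (biased)
# sample mean, then check MSE ~ bias^2 + variance + noise out of sample.
THETA, SIGMA, N, TRIALS = 2.0, 1.0, 5, 20000

preds, sq_errors = [], []
for _ in range(TRIALS):
    sample = [THETA + random.gauss(0, SIGMA) for _ in range(N)]
    h = 0.5 * statistics.mean(sample)       # shrinkage introduces bias
    y_new = THETA + random.gauss(0, SIGMA)  # a fresh draw to predict
    preds.append(h)
    sq_errors.append((y_new - h) ** 2)

mse = statistics.mean(sq_errors)
bias2 = (statistics.mean(preds) - THETA) ** 2
var = statistics.pvariance(preds)
noise = SIGMA ** 2
print(round(mse, 2), round(bias2 + var + noise, 2))  # the two nearly coincide
```

Swapping squared loss for 0-1 loss would break this additive bookkeeping, which is Domingos’s motivation for the generalized decomposition.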

One point to observe is that the interpretation of the bias-variance trade-off as a tension between two sources of error originates from the frequentist school of statistics, which remains dominant in the judgment and decision-making community (see the bias-variance supplement). Within Bayesian statistics, however, bias and variance are integrated into a broader understanding of uncertainty, where bias—introduced through priors—is deliberately incorporated and viewed as beneficial rather than detrimental. For a Bayesian, the emphasis is on minimizing overall uncertainty and improving the quality of inference or prediction, rather than minimizing bias for its own sake (Gelman et al. 2013). Additionally, note that the term ‘bias’ is ambiguous, as it may be used as a descriptive term or a pejorative term, especially in discussions of “algorithmic bias” (Danks and London 2017) and heuristics (section 5.2).

4. Better with Bounds

Our discussion of improper linear models (section 2.3) highlighted a model that often approximates a proper linear model surprisingly well, and our examination of the bias-variance trade-off (section 3.5) considered how cognitive systems might make accurate predictions with minimal data. In this section, we review examples of models that deviate from normative standards of global rationality yet yield markedly improved outcomes—achieving results impossible under global rationality conditions. We will explore examples from the statistics of small samples and game theory to demonstrate some advantages of deviating from global rationality. Then, we turn to the irreducible thermodynamic cost of information processing: since information processing is inherently a physical process, it is bound by physical laws, implying that all rationality may be viewed as inherently bounded.

4.1 Homo Statisticus and Small Samples

In a review of experimental results assessing human statistical reasoning published in the late 1960s, which took stock of research conducted after psychology’s full embrace of statistical research methods (section 2.3), Peterson and Beach argued that the normative standards of probability theory and statistical optimization methods were “a good first approximation for a psychological theory of inference” (Peterson & Beach 1967: 42). Peterson and Beach’s view that humans were intuitive statisticians who closely approximate the ideal standards of homo statisticus fit into a broader consensus at that time about the close fit between the normative standards of logic and intelligent behavior (Newell & Simon 1956, 1976). The assumption that human judgment and decision making closely approximate normative theories of probability and logic was later challenged by experimental results from Kahneman and Tversky, and the biases and heuristics program more generally (section 7.1).

Among Kahneman and Tversky’s earliest findings was that people tend to make statistical inferences from samples that are too small, even when given the opportunity to control the sampling procedure. Kahneman and Tversky attributed this effect to a systematic failure of people to appreciate the biases inherent in small samples. Hertwig and colleagues provided further insight into this phenomenon by investigating how people make decisions from experience versus decisions from description, offering evidence that the samples people draw from a single population are close to the known limits of working memory (Hertwig, Barron et al. 2004). They found that when individuals make decisions from experience, they often rely on relatively small samples of information, partly because working memory can handle only a limited amount of information at a time. The tendency to make inferences from small samples can therefore be seen as a reflection of these cognitive constraints, rather than a mere oversight.

Overconfidence can be understood as an artifact of small samples. The Naïve Sampling Model (Juslin, Winman, & Hansson 2007) assumes that agents base judgments on a small sample retrieved from long-term memory at the moment a judgment is called for, even when there are a variety of other methods available to the agent. This model presumes that people are naïve statisticians (Fiedler & Juslin 2006) who assume, sometimes falsely, that samples are representative of the target population of interest and that sample properties can be used directly to yield accurate estimates of a population. The idea is that when sample properties are uncritically taken as estimators of population parameters, a reasonably accurate probability judgment can be made with overconfidence, even if the samples are unbiased, accurately represented, and correctly processed by the cognitive mechanisms of the agent. When sample sizes are restricted, these effects are amplified.

However, sometimes effective behavior is aided by inaccurate judgments or cognitively adaptive illusions (Howe 2011). The statistical properties of small samples are a case in point. One feature of small samples is that correlations are amplified, making them easier to detect (Kareev 1995). This fact about small samples, when combined with the known limits to human short-term memory, suggests that our working-memory limits may be an adaptive response to our environment that we exploit at different stages in our lives. Adult short-term working memory is widely believed to be limited to seven items, plus or minus two (Miller 1956), although more recent work suggests the limit is closer to four items (Cowan 2001). For correlations of 0.5 and higher, Kareev demonstrates that sample sizes between five and nine are most likely to yield a sample correlation that is greater than the true correlation in the population (Kareev 2000), making those correlations nevertheless easier to detect. Furthermore, children’s short-term memories are even more restricted than adults’, thus making correlations in the environment that much easier to detect. Of course, there is no free lunch: this small-sample effect comes at the cost of inflating estimates of the true correlation coefficients and admitting a higher rate of false positives (Juslin & Olsson 2005). However, in many contexts, including child development, the cost of error arising from under-sampling may be more than compensated by the benefits from simplifying choice (Hertwig & Pleskac 2008) and accelerating learning. In the spirit of Brunswik’s argument for representative experimental design (section 3.2), a body of literature cautions that the bulk of experiments on adaptive decision-making are performed in highly simplified environments that differ in important respects from the natural world in which human beings make decisions (Fawcett et al. 2014). In response, Houston, McNamara, and colleagues argue, we should incorporate more environmental complexity into our models.
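Kareev’s amplification effect is easy to reproduce by simulation. A minimal sketch, under illustrative assumptions of my own (a bivariate normal population with true correlation 0.5, and samples of size seven, echoing Miller’s magic number), estimates how often the sample correlation overshoots the population value:

```python
import math
import random

random.seed(42)

# With true correlation 0.5 and samples of size 7, a majority of
# sample correlations overshoot the population value.
RHO, N, TRIALS = 0.5, 7, 20000

def sample_correlation(n, rho):
    # n draws from a bivariate normal with population correlation rho
    xs = [random.gauss(0, 1) for _ in range(n)]
    ys = [rho * x + math.sqrt(1 - rho ** 2) * random.gauss(0, 1) for x in xs]
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

exceeds = sum(sample_correlation(N, RHO) > RHO for _ in range(TRIALS))
print(exceeds / TRIALS)  # a small majority of small samples overshoot rho
```

The skewed sampling distribution of the correlation coefficient at small \(n\) is what makes a genuine association more likely to look stronger in the sample than it is in the population.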

4.2 Game Theory

Pro-social behavior, such as cooperation, is challenging to explain through traditional evolutionary game theory, which predicts that individuals will forgo public goods in favor of personal utility maximization. Although behavior consistent with personal utility maximization is often observed in economic experiments, cooperative behavior is pervasive in broader society (Bowles & Gintis 2011). Exploring this discrepancy between the global rationality of utility maximization and pro-social behavior has led to a body of work demonstrating how bounded rationality can better account for observed cooperative behaviors. This research suggests that cognitive and environmental constraints, such as limited computational capacity, social network structures, and evolutionary dynamics, play crucial roles in fostering cooperation and other pro-social behaviors.

Traditional evolutionary explanations of human cooperation in terms of reputation, reciprocation, and retribution (Trivers 1971; R. Alexander 1987) do not fully explain why cooperation is stable. If a group punishes individuals for failing to perform a behavior, and the punishment costs exceed the benefit of doing that behavior, then this behavior will become stable regardless of its social benefits. Anti-social norms arguably take root through a similar mechanism (Bicchieri & Muldoon 2014). Although reputation, reciprocation, and retribution may explain how large-scale cooperation is sustained in human societies, they do not explain how the behavior emerged (Boyd & Richerson 2005). Additionally, cooperation is observed in microorganisms (Damore & Gore 2012), suggesting that simpler mechanisms are sufficient for cooperative behavior.

Our understanding of the advantages of deviating from global rationality that accrue to individual players or a group of players has advanced since the 1970s, when it was first widely recognized that improper models often yield good enough approximations of corresponding proper models (section 2.3). The 1980s and 1990s witnessed a series of results involving improper models yielding performance that was strictly better than what was prescribed by the corresponding proper model. In the early 1980s Robert Axelrod held a tournament to empirically test which among a collection of strategies for playing iterations of the prisoner’s dilemma performed best in a round-robin competition. The winner was a simple reciprocal altruism strategy called tit-for-tat (Rapoport & Chammah 1965), which simply starts off each game cooperating and then, on each successive round, copies the strategy the opposing player played in the previous round. So, if your opponent cooperated in this round, then you will cooperate in the next round; and if your opponent defected this round, then you will defect in the next. Subsequent tournaments have shown that tit-for-tat is remarkably robust against much more sophisticated alternatives (Axelrod 1984). For example, even a rational utility-maximizing player facing an opponent who only plays tit-for-tat (i.e., will play tit-for-tat no matter whom he faces) must adapt and play tit-for-tat—or a strategy very close to it (Kreps, Milgrom, et al. 1982).
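The structure of such a tournament can be sketched in a few lines. The roster below is an illustrative toy, far smaller than Axelrod’s field: tit-for-tat, a grim trigger, and the two unconditional strategies, with the standard payoffs T=5, R=3, P=1, S=0 and self-play included, as in Axelrod’s round-robin design. Even in this tiny field, tit-for-tat finishes at (or tied for) the top.

```python
# A minimal round-robin iterated prisoner's dilemma in the spirit of
# Axelrod's tournament. Roster and round count are illustrative.
PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

def tit_for_tat(mine, theirs):
    return theirs[-1] if theirs else 'C'

def grim_trigger(mine, theirs):
    return 'D' if 'D' in theirs else 'C'

def always_defect(mine, theirs):
    return 'D'

def always_cooperate(mine, theirs):
    return 'C'

def play(s1, s2, rounds=200):
    h1, h2, total1, total2 = [], [], 0, 0
    for _ in range(rounds):
        m1, m2 = s1(h1, h2), s2(h2, h1)
        p1, p2 = PAYOFF[(m1, m2)]
        h1.append(m1); h2.append(m2)
        total1 += p1; total2 += p2
    return total1, total2

strategies = {'tit_for_tat': tit_for_tat, 'grim_trigger': grim_trigger,
              'always_defect': always_defect, 'always_cooperate': always_cooperate}
totals = dict.fromkeys(strategies, 0)
names = list(strategies)
for i, a in enumerate(names):
    for b in names[i:]:
        sa, sb = play(strategies[a], strategies[b])
        if a == b:
            totals[a] += sa  # self-play: score one copy against its twin
        else:
            totals[a] += sa
            totals[b] += sb
print(sorted(totals.items(), key=lambda kv: -kv[1]))
```

Always-defect exploits unconditional cooperators but earns little against the reciprocators, which is why it does not win the round-robin despite never being beaten head-to-head.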

Tit-for-tat’s simplicity allows us to explore rationality emerging in boundedly rational agents and to observe those bounds contributing to pro-social norms. For instance, Rubinstein (1986) studied finite automata in repeated prisoner’s dilemmas with the objective of maximizing average payoff while minimizing the number of states of a machine. Rubinstein’s results showed that optimal solutions involve each player’s machine being optimal at every stage of the game. However, contrast this with Neyman’s study, where players of repeated games use finite automata with a fixed number of states, allowing traits like reputation to arise (Neyman 1985). Further, while cooperation is impossible for perfectly rational players in finitely repeated prisoner’s dilemmas, a cooperative equilibrium exists in finitely repeated dilemmas for finite-automata players whose number of states is less than exponential in the number of rounds (Papadimitriou & Yannakakis 1994; Ho 1996). The memory demands of these strategies may exceed human psychological capacities, however, even for simple strategies like tit-for-tat among a moderately sized group of players (Stevens, Volstorf, et al. 2011). So, while insightful, these theoretical models showing a number of simple paths to pro-social behavior may not, on their own, be simple enough to offer a plausible process model for human cooperation.

Attention then shifted to environmental constraints. Nowak and May’s study of spatial distribution in iterated prisoner’s dilemmas found that cooperation could emerge among players without memory or strategic foresight (Nowak & May 1992). This work led to the study of network topology as a factor in social behavior (Jackson 2010), including social norms (Bicchieri 2005; J. Alexander 2007), signaling (Skyrms 2003), and wisdom-of-crowd effects (Golub & Jackson 2010). When social ties in a network follow a scale-free distribution, the resulting diversity in the number and size of public-goods games is found to promote cooperation, which contributes to explaining the emergence of cooperation in communities without reputation or punishment mechanisms (F. Santos, M. Santos, & Pacheco 2008).
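Nowak and May’s setup can be illustrated with a compact cellular model. The sketch below uses their simplified payoffs (mutual cooperation 1, exploitation b, everything else 0, with b = 1.85) on a small wraparound grid where every memoryless cell imitates its best-scoring neighbour each generation; grid size, seed, and step count are arbitrary choices for illustration.

```python
import random

random.seed(3)

# A minimal spatial prisoner's dilemma in the style of Nowak & May (1992):
# memoryless players on a grid copy their best-scoring neighbour each step.
SIZE, B, STEPS = 20, 1.85, 50  # B is the temptation payoff for defection

def neighbours(i, j):
    # Moore neighbourhood with wraparound, including the cell itself.
    return [((i + di) % SIZE, (j + dj) % SIZE)
            for di in (-1, 0, 1) for dj in (-1, 0, 1)]

def payoff(me, other):
    # Simplified payoffs: mutual cooperation 1, exploitation B, else 0.
    if me == 'C':
        return 1.0 if other == 'C' else 0.0
    return B if other == 'C' else 0.0

grid = [[random.choice('CD') for _ in range(SIZE)] for _ in range(SIZE)]

for _ in range(STEPS):
    scores = [[sum(payoff(grid[i][j], grid[x][y]) for x, y in neighbours(i, j))
               for j in range(SIZE)] for i in range(SIZE)]
    new_grid = []
    for i in range(SIZE):
        row = []
        for j in range(SIZE):
            bi, bj = max(neighbours(i, j), key=lambda p: scores[p[0]][p[1]])
            row.append(grid[bi][bj])
        new_grid.append(row)
    grid = new_grid

frac_coop = sum(row.count('C') for row in grid) / SIZE ** 2
print(frac_coop)  # for this range of B, cooperators typically persist in patches
```

The interesting regime Nowak and May reported lies roughly between b = 1.8 and b = 2, where shifting spatial clusters let cooperators survive indefinitely despite the absence of memory or foresight.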

Perhaps the simplest case for bounded rationality involves agents achieving desirable goals without deliberation. Insects, flowers, and bacteria exhibit evolutionarily stable strategies (Maynard Smith 1982), effectively arriving at Nash equilibria in strategic normal-form games. For example, honey bees (Apis mellifera) and flowers interact based on genetic endowments, without foresight or choice, leading to evolutionary dynamics—a form of bounded rationality without foresight.

Nevertheless, improper models can misfire, as seen in the ultimatum game (Güth, Schmittberger, & Schwarze 1982). The ultimatum game is a two-player game in which one player, endowed with a sum of money, is given the task of splitting the sum with another player, who may either accept the offer—in which case the pot is accordingly split between the two players—or reject it, in which case both players receive nothing. People receiving offers of 30 percent or less of the pot are often observed rejecting the offer, even when players are anonymous and therefore would not suffer from a negative reputation signal associated with accepting a very low offer. In such cases, one might reasonably argue that no proposed split is worse than the status quo of zero, so people ought to accept whatever amount they are offered.

4.3 Less is More Effects

Simon’s remark that people satisfice when they haven’t the wits to maximize (Simon 1957a: xxiv) points to a common assumption, that there is a trade-off between effort and accuracy (section 2.1). Because the rules of global rationality are expensive to operate (Good 1952: 7(i)), people will trade a loss in accuracy for gains in cognitive efficiency (Payne, Bettman, & Johnson 1988). The methodology of rational analysis (section 3.3) likewise appeals to this trade-off.

The results surveyed in Section 4.2 caution against blindly endorsing the accuracy-effort trade-off as universal, a point that has been pressed in the defense of heuristics as reasonable models for decision-making (Katsikopoulos 2010; Hogarth 2012). Simple heuristics like Tallying, which is a type of improper linear model (section 2.3), and Take-the-best (section 5.2) have both been found to outperform linear regression on out-of-sample prediction tasks across many data sets, particularly when the training-sample size is low (Czerlinski et al. 1999; Rieskamp & Dieckmann 2012).

A fundamental principle of Bayesian rationality, called Good’s principle, compels decision makers not to turn down free information. The principle—a part of Bayesian lore remarked on by Ramsey (1931), argued for by Savage (1972, 107), partially formalized by Raiffa and Schlaifer (1961), asserted in textbooks starting with Lindley (1965), then succinctly formalized by Good (1967)—recommends delaying a terminal decision between alternative courses of action if there is an opportunity to learn, at zero cost, the outcome of an experiment relevant to the decision (Pedersen and Wheeler 2015). At issue is the value of information to rational decision making. Good’s principle assumes that, all else equal, a person facing a decision problem cannot be harmed and may only be helped by acquiring additional information, assuming the expected cost of acquisition does not exceed the expected benefits. Hence, if information is free, you can only be helped (in expectation) by acquiring it. However, the conditions under which this maxim is true are narrower than first believed. Good’s principle does not apply in strategic interactions (Osborne 2003, 281): market failures from adverse selection provide a prominent counterexample to Good’s principle (Akerlof 1970). Good’s principle also fails for decision problems under Knightian uncertainty (Knight 1921) involving dilating indeterminate or imprecise probabilities (Wheeler 2020).
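In the simplest non-strategic case, Good’s principle can be verified directly. The numbers below (two states, two actions, a noisy but free signal) are purely illustrative; the expected utility of deciding after a free Bayesian update can never fall below the expected utility of deciding immediately.

```python
# Two states, two actions, a uniform prior, and a noisy but free signal.
# All numbers are illustrative.
prior = {'s1': 0.5, 's2': 0.5}
utility = {('a1', 's1'): 10, ('a1', 's2'): 0,
           ('a2', 's1'): 0, ('a2', 's2'): 8}
likelihood = {('e1', 's1'): 0.8, ('e1', 's2'): 0.2,  # P(signal | state)
              ('e2', 's1'): 0.2, ('e2', 's2'): 0.8}

def expected_utility(action, belief):
    return sum(belief[s] * utility[(action, s)] for s in belief)

# Deciding now: choose the act that is best under the prior.
eu_now = max(expected_utility(a, prior) for a in ('a1', 'a2'))

# Deciding after the free signal: Bayes-update on each outcome, then choose.
eu_with_info = 0.0
for e in ('e1', 'e2'):
    p_e = sum(likelihood[(e, s)] * prior[s] for s in prior)
    posterior = {s: likelihood[(e, s)] * prior[s] / p_e for s in prior}
    eu_with_info += p_e * max(expected_utility(a, posterior) for a in ('a1', 'a2'))

print(eu_now, eu_with_info)  # 5.0 7.2: free evidence cannot hurt in expectation
```

The inequality holds here because the expectation of a maximum is at least the maximum of an expectation; the counterexamples cited above (adverse selection, dilation) arise precisely where the tidy single-agent, precise-probability setup breaks down.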

4.5 All Rationality is Bounded

Our discussion has tacitly accepted a difference between bounded rationality and unbounded rationality. However, information processing of any kind incurs a thermodynamic cost, suggesting that all rationality is inherently bounded by physical laws.

Here is the argument. Moving from the revealed preferences of “economic man” (section 1.1) to the cognitive and procedural mechanisms of human decision making introduces bounds due to natural human limitations; on this reading, it is the switch to procedural rationality (section 2) and the study of human limitations that necessitates bounds on rational decision making. What if instead any decision-making process is inherently bounded? Then it is the switch from formal coherence criteria (section 7.1) to ratiocination itself that entails bounds, not merely the inquiry into human decision making. This is the implication of the irreducible thermodynamic cost of information processing.

The foundation for the thermodynamic cost of computation is Landauer’s Principle (Landauer 1961), which states that any logically irreversible operation, such as erasing a bit of information, necessarily incurs a minimum loss of heat energy. Landauer’s principle has since been generalized to include the minimum energy costs any computational process incurs (Wolpert 2019). The idea is that computational processes necessitate the physical manipulation of information-bearing particles, which inevitably involves energy dissipation. This generalized Landauer-Wolpert principle establishes that information processing is rooted in physical thermodynamic laws, revealing intrinsic physical limitations of computation.
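Landauer’s bound is concrete enough to compute. A minimal sketch (room temperature is an illustrative choice) gives the minimum heat dissipated by erasing a single bit, \(k_B T \ln 2\):

```python
import math

# Landauer's bound: erasing one bit dissipates at least k_B * T * ln 2.
k_B = 1.380649e-23  # Boltzmann constant in J/K (exact in the 2019 SI)
T = 300.0           # an illustrative room temperature, in kelvin

e_min_per_bit = k_B * T * math.log(2)
print(e_min_per_bit)  # on the order of 3e-21 joules per erased bit
```

The number is minuscule per bit, but it is strictly positive, which is all the argument requires: no physically realized reasoner computes for free.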

Energy costs therefore impose practical limits on the efficiency and feasibility of computation. These limits concern both energy resources and heat dissipation, which pose natural physical constraints on biological systems and microprocessors alike. The upshot is that any abstract computational model ought to be grounded in physical reality. Kolchinsky and Wolpert (2020) illustrate this point by discussing how thermodynamic constraints apply even to theoretical constructs like Turing machines. They argue that every computation, regardless of its abstraction, involves physical processes subject to thermodynamic laws. This means that the energy required for computation and the resulting heat dissipation are fundamental constraints that cannot be ignored. It follows that models of rationality are subject to the same energy constraints.

5. Two Schools of Heuristics

Heuristics are simple rules of thumb for rendering a judgment or making a decision. Some examples that we have seen thus far include Simon’s satisficing, Dawes’s improper linear models, Rapoport’s tit-for-tat, imitation, and several effects observed by Kahneman and Tversky in our discussion of prospect theory.

There are two primary perspectives on heuristics, corresponding to the research traditions of Kahneman and Tversky’s biases and heuristics program and Gigerenzer’s fast and frugal heuristics program, respectively. A central dispute between these two research programs is the appropriate normative standard for judging human behavior (Vranas 2000). According to Gigerenzer, the biases and heuristics program mistakenly classifies all biases as errors (Gigerenzer, Todd, et al. 1999; Gigerenzer & Brighton 2009) despite evidence pointing to some biases in human psychology being adaptive. In contrast, in a rare exchange with a critic, Kahneman and Tversky maintain that the dispute is merely terminological (Kahneman & Tversky 1996; Gigerenzer 1996).

In this section, we briefly survey each of these two schools. Our aim is to give a characterization of each research program rather than provide an exhaustive overview.

5.1 Biases and Heuristics

Beginning in the 1970s, Kahneman and Tversky conducted a series of experiments showing various ways that human participants’ responses to decision tasks deviate from answers purportedly derived from the appropriate normative standards (sections 2.4 and 5.1). These deviations were given names, such as availability (Tversky & Kahneman 1973), representativeness, and anchoring (Tversky & Kahneman 1974). The set of cognitive biases now numbers into the hundreds, although some are minor variants of other well-known effects, such as “The IKEA effect” (Norton, Mochon, & Ariely 2012) being a version of the well-known endowment effect (section 1.2). Nevertheless, core effects studied by the biases and heuristics program, particularly those underpinning prospect theory (section 2.4), are entrenched in cognitive psychology (Kahneman, Slovic, & Tversky 1982).

An example of a probability judgment task is Kahneman and Tversky’s Taxi-cab problem, which purports to show that subjects neglect base rates:

A cab was involved in a hit-and-run accident at night. Two cab companies, the Green and the Blue, operate in the city. You are given the following data:

  • 85% of the cabs in the city are Green and 15% are Blue.

  • A witness identified the cab as a Blue cab. The court tested his ability to identify cabs under the appropriate visibility conditions. When presented with a sample of cabs (half of which were Blue and half of which were Green) the witness made correct identifications in 80% of the cases and erred in 20% of the cases.

Question: What is the probability that the cab involved in the accident was Blue rather than Green? (Tversky & Kahneman 1977: 3–3).

Continuing, Kahneman and Tversky report that several hundred subjects have been given slight variations of this question, and for all versions the modal and median responses were 0.8, instead of the correct answer of \(\frac{12}{29}\) (\(\approx 0.41\)).

Thus, the intuitive judgment of probability coincides with the credibility of the witness and ignores the relevant base rate, i.e., the relative frequency of Green and Blue cabs. (Tversky & Kahneman 1977: 3–3)
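The correct answer is a routine application of Bayes’ theorem, with the base rates as priors and the witness’s tested accuracy as the likelihood (variable names are mine):

```python
# Priors from the base rates; likelihoods from the witness's tested accuracy.
p_blue, p_green = 0.15, 0.85
p_id_blue_given_blue = 0.80   # witness says "Blue" and the cab is Blue
p_id_blue_given_green = 0.20  # witness says "Blue" but the cab is Green

numerator = p_id_blue_given_blue * p_blue                  # 0.12
denominator = numerator + p_id_blue_given_green * p_green  # 0.12 + 0.17 = 0.29
posterior = numerator / denominator

print(round(posterior, 3))  # 0.414, i.e., 12/29, not the modal answer of 0.8
```

The witness’s many chances to misidentify one of the numerous Green cabs nearly offset his reliability on the rare Blue ones, which is exactly the base-rate information the modal response discards.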

Critical responses to results of this kind fall into three broad categories. The first type of reply is to argue that the experimenters, rather than the subjects, are in error (Cohen 1981). In the Taxi-cab problem, arguably Bayes sides with the folk (Levi 1983) or, alternatively, is inconclusive because both the normative standard of the experimenter and the presumed normative standard of the subject require a theory of witness testimony, neither of which is specified (Birnbaum 1979). Other cognitive biases have been ensnared in the replication crises (Camerer, et al. 2018; Wiggins and Christopherson 2019), such as implicit bias (Oswald, Mitchell, et al. 2013; Forscher, Lai et al. 2017) and social priming (Doyen, Klein, et al. 2012; Kahneman 2017 [Other Internet Resources]).

The second response is to argue that there is an important difference between identifying a normative standard for combining probabilistic information and applying it across a range of cases (section 8.2), and that it is difficult in practice to determine that a decision-maker is representing the task in the manner that the experimenters intend (Koehler 1996). Observed behavior that appears to be boundedly rational or even irrational may result from a difference between the intended specification of a problem and the actual problem subjects face.

For example, consider the systematic biases in people’s perception of randomness reported in some of Kahneman and Tversky’s earliest work (Kahneman & Tversky 1972). For sequences of flips of a fair coin, people expect to see, even in small samples, a roughly equal number of heads and tails, and alternation rates between heads and tails that are slightly higher than long-run averages (Bar-Hillel & Wagenaar 1991). This effect is thought to explain the gambler’s fallacy, the false belief that a run of heads from an i.i.d. sequence of fair coin tosses will make the next flip more likely to land tails. Hahn and Warren argue that the limited nature of people’s experiences with random sequences is a better explanation than viewing these expectations as cognitive deficiencies. Specifically, people only ever experience finite sequences of outputs from a randomizer, such as a sequence of fair coin tosses, and the limits to their memory (section 5.1) of past outcomes in a sequence mean that not all possible sequences of a given length will appear to them with equal probability. Therefore, there is a psychologically plausible interpretation of the question, “is it more likely to see HHHT than HHHH from flips of a fair coin?”, for which the correct answer is “Yes” (Hahn & Warren 2009). If the gambler’s fallacy boils down to a failure to distinguish between sampling with and without replacement, Hahn and Warren’s point is that our intuitive statistical abilities, acquired through experience alone, are unable to make the distinction between these two sampling methods. Analytical reasoning is necessary.
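The point about finite windows can be checked exactly by enumeration. The sketch below (window length 16 is an illustrative choice) counts how many equiprobable coin sequences contain each pattern at least once; both patterns have the same expected number of occurrences, but HHHH overlaps itself and so its occurrences clump, leaving it present in fewer sequences.

```python
from itertools import product

# Exhaustively enumerate all fair-coin sequences of length 16 and count
# how many contain HHHT versus HHHH at least once.
N = 16
count_hhht = count_hhhh = 0
for bits in product('HT', repeat=N):
    s = ''.join(bits)
    count_hhht += 'HHHT' in s
    count_hhhh += 'HHHH' in s

total = 2 ** N
print(count_hhht / total, count_hhhh / total)
# HHHT shows up in strictly more sequences than HHHH.
```

An experience-based learner exposed only to such finite windows would therefore be right, in this sense, to treat HHHT as the more likely sight.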

Consider also the risky-choice framing effect that was mentioned briefly in section 2.4. An example is the Asian disease example:

  • If program A is adopted, 200 people will be saved.

  • If program B is adopted, there is a ⅓ probability that 600 people will be saved, and a ⅔ probability that no people will be saved (Tversky & Kahneman 1981: 453).

Tversky and Kahneman report that a majority of respondents (72 percent) chose program A, whereas a majority of respondents (78 percent) shown an equivalent reformulation of the problem in terms of the number of people who would die rather than survive chose program B. A meta-analysis of subsequent experiments has shown that the framing condition accounts for most of the variance, but it also reveals that no linear combination of the formally specified predictors used in prospect theory, cumulative prospect theory, and Markowitz’s utility theory suffices to capture this framing effect (Kühberger, Schulte-Mecklenbeck, & Perner 1999). Furthermore, the use of an indicative conditional in this and other experiments to express the consequences is also not adequately understood. Experimental evidence collected about how people’s judgments change when learning an indicative conditional, while straightforward and intuitive, cannot be accommodated by existing theoretical frameworks for conditionals (Collins, Krzyżanowska, Hartmann, Wheeler, and Hahn 2020).

The point of this second line of criticism is not that people’s responses are at variance with the correct normative standard, but rather that the explanation for why they are at variance matters, not only for assessing the rationality of people but also for determining what prescriptive interventions ought to be taken to counter the error. It is rash to conclude that people, rather than the peculiarities of the task or the theoretical tools available to us at the moment, are in error.

Lastly, the third type of response is to accept the experimental results but challenge the claim that they are generalizable. In a controlled replication of Kahneman and Tversky’s lawyer-engineer example (Tversky & Kahneman 1977), for example, a crucial assumption is whether the descriptions of the individuals were drawn at random, which was tested by having subjects draw blindly from an urn (Gigerenzer, Hell, & Blank 1988). Under these conditions, base-rate neglect disappeared. In response to the Linda example (Tversky & Kahneman 1983), rephrasing the example in terms of which alternative is more frequent rather than which alternative is more probable reduces occurrences of the conjunction fallacy among subjects from 77% to 27% (Fiedler 1988). More generally, a majority of people presented with the Linda example appear to interpret ‘probability’ non-mathematically but switch to a mathematical interpretation when asked for frequency judgments (Hertwig & Gigerenzer 1999). Ralph Hertwig and colleagues have since observed that a variety of other effects involving probability judgments diminish or disappear when subjects are permitted to learn the probabilities through sampling, suggesting that people are better adapted to making a decision by experience of the relevant probabilities than to making a decision by their description (Hertwig, Barron et al. 2004; Appelhoff, Hertwig, and Spitzer 2023).

5.2 Fast and Frugal Heuristics

The Fast and Frugal school and the Biases and Heuristics school both agree that heuristics are biased. Where they disagree, and disagree sharply, is whether those biases are necessarily a sign of irrationality. For the Fast and Frugal program the question is under what environmental conditions, if any, a particular heuristic performs effectively. If the heuristic’s structural bias is well-suited to the task environment, then the bias of that heuristic may be an advantage for making accurate judgments rather than a liability (section 3.5). We saw this adaptive strategy before in our discussion of Brunswik’s lens model (section 3.2), although there the bias in the model was to assume that both the environment and the subject’s responses were linear. The aim of the Fast and Frugal program is to adapt this Brunswikian strategy to a variety of improper models.

This general goal of the Fast and Frugal program leads to a second difference between the two schools. Because the Fast and Frugal program aims to specify the conditions under which a heuristic will lead to better outcomes than competing models, heuristics are treated as algorithmic models of decision-making rather than descriptions of errant effects; heuristics are themselves objects of study. To that end, all heuristics in the fast and frugal tradition are conceived to have three components: (i) a search rule, (ii) a stopping rule, and (iii) a decision rule. For example, Take-the-Best (Gigerenzer & Goldstein 1996) is a heuristic applied to binary, forced-choice problems. Specifically, the task is to pick the correct option according to an external criterion, such as correctly picking which of a pair of cities has a larger population, based on cue information that is available to the decision-maker, such as whether she has heard of one city but not the other, whether one city is known to have a football franchise in the professional league, et cetera. One can then compute the predictive validity of different cues, and thus derive their weights. Take-the-Best then has the following structure:

  Search rule: Look up the cue with the highest cue-validity.

  Stopping rule: If the pair of objects have different cue values, that is, one is positive and the other negative, stop the search. If the cue values are the same, continue searching down the cue-order.

  Decision rule: Predict that the alternative with the positive cue value has the higher target-criterion value. If all cues fail to discriminate, that is, if all cue values are the same, then predict the alternative randomly by a coin flip.

The bias of Take-the-Best is that it ignores relevant cues. Another example is tallying, which is a type of improper linear model (section 2.3). Tallying has the following structure for a binary, forced-choice task:

  Search rule: Look up cues in a random order.

  Stopping rule: After some exogenously determined \(m\) \((1 < m \leq N)\) of the \(N\) available cues are evaluated, stop the search.

  Decision rule: Predict that the alternative with the higher number of positive cue values has the higher target-criterion value.

The bias in tallying is that it ignores cue weights. One can see then how models are compared to one another by how they process cues, and how their performance is evaluated with respect to a specified criterion for success, such as the number of correct answers to the city population task.
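The two heuristics just described can be sketched in code. This is a minimal illustration under simplifying assumptions, not the authors’ implementation; the cue profiles and validities below are hypothetical:

```python
import random

def take_the_best(cues_a, cues_b, validities):
    """Search cues in descending validity; stop at the first cue that
    discriminates; decide for the object with the positive cue value."""
    order = sorted(range(len(validities)), key=lambda i: -validities[i])
    for i in order:                               # search rule
        if cues_a[i] != cues_b[i]:                # stopping rule
            return 'A' if cues_a[i] else 'B'      # decision rule
    return random.choice(['A', 'B'])              # nothing discriminates: guess

def tallying(cues_a, cues_b, m):
    """Inspect m cues in random order, ignore cue weights, and pick the
    object with more positive cue values."""
    idx = random.sample(range(len(cues_a)), m)    # search rule: random order
    a = sum(cues_a[i] for i in idx)               # stop after m cues
    b = sum(cues_b[i] for i in idx)
    if a == b:
        return random.choice(['A', 'B'])          # tie: guess
    return 'A' if a > b else 'B'                  # decision rule

# Hypothetical example: three cues with validities 0.9, 0.7, 0.6.
a = [1, 0, 1]   # cue profile of city A
b = [1, 1, 0]   # cue profile of city B
print(take_the_best(a, b, [0.9, 0.7, 0.6]))  # 'B': the second cue discriminates
```

Note that Take-the-Best ignores the third cue entirely once the second discriminates, while tallying would count it but ignore all three validities.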

Because Fast and Frugal heuristics are computational models, this leads to a third difference between the two schools. Kahneman endorses the System I and System II theory of cognition (Stanovich & West 2000). Furthermore, Kahneman classifies heuristics as fast, intuitive, and non-deliberative System I thinking. Gigerenzer, by contrast, does not endorse the System I and System II hypothesis, and thus rejects classifying heuristics as, necessarily, non-deliberative cognitive processes. Because heuristics are computational models in the Fast and Frugal program, in principle each may be used deliberatively by a decision-maker or used by a decision-modeler to explain or predict a decision-maker’s non-deliberative behavior. The Linear Optical Trajectory (LOT) heuristic (McBeath, Shaffer, & Kaiser 1995) that baseball players use intuitively, without deliberation, to catch fly balls, and which some animals appear to use to intercept prey, is the same heuristic that the “Miracle on the Hudson” airline pilots used deliberatively to infer that they could not reach an airport runway, leading them instead to land their crippled plane in the Hudson river.

Here is a list of heuristics studied in the Fast and Frugal program (Gigerenzer, Hertwig, & Pachur 2011), along with an informal description of each and historical and selected contemporary references.

  1. Imitation. People have a strong tendency to imitate the successful members of their communities (Henrich & Gil-White 2001).

    If some one man in a tribe … invented a new snare or weapon, or other means of attack or defense, the plainest self-interest, without the assistance of much reasoning power, would prompt other members to imitate him. (Darwin 1871, 155)

    Imitation is presumed to be fundamental to the speed of cultural adaptation including the adoption of social norms (section 3.4).

  2. Preferential Attachment. When given the choice to form a new connection to someone, pick the individual with the most connections to others (Yule 1925; Barabási & Albert 1999; Simon 1955b).

  3. Default rules. If there is an applicable default rule, and no apparent reason for you to do otherwise, follow the rule (Fisher 1936; Reiter 1980; Thaler & Sunstein 2008; Wheeler 2004).

  4. Satisficing. Search available options and choose the first one that exceeds your aspiration level (Simon 1955a; Hutchinson et al. 2012).

  5. Tallying. To estimate a target criterion, rather than estimate the weights of available cues, instead count the number of positive instances (Dawes 1979; Dana & Dawes 2004).

  6. One-bounce Rule (Hey’s Rule B). Have at least two searches for an option. Stop if a price quote is larger than the previous quote. The one-bounce rule plays “winning-streaks” by continuing search while you keep receiving a series of lower and lower quotes, but stops as soon as your luck runs out (Hey 1982; Charness & Kuhn 2011).

  7. Tit-for-tat. Begin by cooperating, then respond in kind to your opponent: if your opponent cooperates, then cooperate; if your opponent defects, then defect (Axelrod 1984; Rapaport, Seale, & Colman 2015).

  8. Linear Optical Trajectory (LOT). To intersect with another moving object, adjust your speed so that your angle of gaze remains constant (McBeath et al. 1995; Gigerenzer 2007).

  9. Take-the-best. To decide which of two alternatives has a higher value on a specific criterion, (i) first search the cues in order of their predictive validity; (ii) next, stop search when a cue is found which discriminates between the alternatives; (iii) then, choose the alternative selected by the discriminating cue. (iv) If all cues fail to discriminate between the two alternatives, then choose an alternative by chance (Einhorn 1970; Gigerenzer & Goldstein 1996).

  10. Recognition. To decide which of two alternatives has a higher value on a specific criterion, if only one of the two alternatives is recognized, choose the alternative that is recognized (Goldstein & Gigerenzer 2002; Davis-Stober, Dana, & Budescu 2010; Pachur, Todd, et al. 2012).

  11. Fluency. To decide which of two alternatives has a higher value on a specific criterion, if both alternatives are recognized but one is recognized faster, choose the alternative that is recognized faster (Schooler & Hertwig 2005; Herzog & Hertwig 2013).

  12. \(\frac{1}{N}\) Rule. For \(N\) feasible options, invest resources equally across all \(N\) options (Hertwig, Davis, & Sulloway 2002; DeMiguel, Garlappi, & Uppal 2009).
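To make the algorithmic character of these rules concrete, Simon’s satisficing heuristic (item 4 above) can be sketched in a few lines. The options, scores, and aspiration level here are hypothetical:

```python
def satisfice(options, value, aspiration):
    """Return the first option whose value meets the aspiration level,
    or None if the search is exhausted without a satisfactory option."""
    for option in options:                 # search options in encounter order
        if value(option) >= aspiration:    # stop at the first 'good enough' one
            return option
    return None

# Hypothetical example: apartments scored by an index of desirability.
apartments = [('A', 55), ('B', 72), ('C', 90), ('D', 74)]
choice = satisfice(apartments, value=lambda o: o[1], aspiration=70)
print(choice)  # ('B', 72): B is chosen even though C scores higher
```

The sketch makes the contrast with maximization vivid: the search terminates at B without ever evaluating C.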

There are three lines of response to the Fast and Frugal program to mention. Take-the-Best is an example of a non-compensatory decision rule, which means that the first discriminating cue cannot be “compensated” for by the cue-information remaining down the order. This condition, when it holds, is thought to warrant taking a decision on the first discriminating cue and ignoring the remaining cue-information. The computational efficiency of Take-the-Best is supposed to come from only evaluating a few cues, which number less than 3 on average in benchmark tests (Czerlinski et al. 1999). However, all of the cue validities need to be known by the decision-maker and sorted before initiating the search. So, Take-the-Best by design treats a portion of the necessary computational costs of executing the heuristic as exogenous. Although the lower bound for sorting cues by comparison is \(O(n \log n)\), there is little evidence to suggest that humans sort cues by the most efficient sorting algorithms in this class. On the contrary, such operations are precisely of the kind that qualitative probability judgments demand (section 1.2). Furthermore, in addition to the costs of ranking cue validities, there is the cost of acquiring the cues and of determining that the agent’s estimates are non-compensatory. Although the exact accounting of the cognitive effort presupposed is unknown, and argued to be lower than critics suggest (Katsikopoulos et al. 2010), these necessary steps nevertheless threaten to render Take-the-Best frugal in execution but not in the preparation required before the model can be executed.

A second line of criticism concerns the cognitive plausibility of Take-the-Best (Chater, Oaksford, Nakisa, & Redington 2003). Nearly all of the empirical data on the performance characteristics of Take-the-Best come from computer simulations, and those original competitions pitted Take-the-Best against standard statistical models (Czerlinski et al. 1999) but omitted standard machine learning algorithms, which Chater, Oaksford, and colleagues found performed just as well as Take-the-Best. Since these initial studies, the focus has shifted to machine learning, and includes variants of Take-the-Best, such as “greedy cue permutation”, that perform provably better than the original and are guaranteed to always find accurate solutions when they exist (Schmitt & Martignon 2006). Setting aside criticisms targeting the comparative performance advantages of Take-the-Best qua decision model, others have questioned the plausibility of using Take-the-Best as a cognitive model. For example, Take-the-Best presumes that cue-information is processed serially, but the speed advantages of the model translate to an advantage in human decision-making only if humans process cue information on a serial architecture. If instead people process cue information on a parallel cognitive architecture, then the comparative speed advantages of Take-the-Best would become moot (Chater et al. 2003).

The third line of objection concerns whether the Fast-and-Frugal program truly mounts a challenge to the normative standards of optimization, dominance-reasoning, and consistency, as advertised. Take-the-Best is an algorithm for decision-making that does not comport with the axioms of expected utility theory. For one thing, its lexicographic structure violates the Archimedean axiom (section 1.3, A2). For another, it is presumed to violate the transitivity condition of the Ordering axiom (A1). Further still, the “less-is-more” effects appear to violate Good’s principle (Good 1967), a central pillar of Bayesian decision theory, which recommends delaying a terminal decision between alternative options if the opportunity arises to acquire free information. In other words, according to canonical Bayesianism, free advice is a bore but no one ought to turn down free information (Pedersen & Wheeler 2014; 2015). If noncompensatory decision rules like Take-the-Best violate Good’s principle, then perhaps the whole Bayesian machinery ought to go (Gigerenzer & Brighton 2009).

But these points merely tell us that attempts to formulate Take-the-Best in terms of an ordering of prospects on a real-valued index won’t do, not that ordering and numerical indices have all got to go. As we saw in section 1.2, there is a long and sizable literature on lexicographic probabilities and non-standard analysis, including early work specifically addressing non-compensatory nonlinear models (Einhorn 1970). Second, Gigerenzer argues that “cognitive algorithms…need to meet more important constraints than internal consistency” (Gigerenzer & Goldstein 1996), which includes transitivity, and elsewhere advocates abandoning coherence as a normative standard (Arkes, Gigerenzer, & Hertwig 2016). However, Take-the-Best presupposes that cues are ordered by cue validity, which naturally entails transitivity; otherwise Take-the-Best could neither be coherently specified nor effectively executed. More generally, the Fast and Frugal school’s commitment to formulating heuristics algorithmically and implementing them as computational models commits them to the normative standards of optimization, dominance reasoning, and logical consistency.

Finally, Good’s principle states that a decision-maker facing a single-person decision-problem cannot be made worse off (in expectation) by receiving free information. Exceptions are known in game theory (Osborne 2003: 283), however, that involve asymmetric information among two or more decision-makers. But there is also an exception for single-person decision-problems involving indeterminate or imprecise probabilities (Wheeler 2020). The point is that Good’s principle is not a fundamental principle of probabilistic methods, but instead is a specific result that holds for the canonical theory of single-person decision-making with determinate probabilities.

6. Aumann’s Five Arguments for Bounded Rationality

In this section we summarize Aumann’s arguments in favor of bounded rationality. We have postponed this review in order to develop the origins of each argument organically, and to connect each to the various facets of bounded rationality covered above.

Aumann (1997) advanced five arguments for bounded rationality, which we paraphrase here.

  1. Even in very simple decision problems, most economic agents are not (deliberate) maximizers. People do not scan the choice set and consciously pick a maximal element from it.

  2. Even if economic agents aspired to pick a maximal element from a choice set, performing such maximizations is typically difficult, and most people are unable to do so in practice.

  3. Experiments indicate that people fail to satisfy the basic assumptions of rational decision theory.

  4. Experiments indicate that the conclusions of rational analysis (broadly construed to include rational decision theory) do not match observed behavior.

  5. Some conclusions of rational analysis appear normatively unreasonable.

In the previous sections we covered the origins of each of Aumann’s arguments. Here we briefly review each, highlighting material from other sections in this context.

The first argument, that people are not deliberate maximizers, was a working hypothesis of Simon’s, who maintained that people tend to satisfice rather than maximize (section 2.2). Kahneman and Tversky gathered evidence for the reflection effect in estimating the value of options, which is the reason for reference points in prospect theory (section 2.4) and analogous properties within rank-dependent utility theory more generally (sections 1.3 and 2.4). Gigerenzer’s and Hertwig’s groups at the Max Planck Institute for Human Development both study the algorithmic structure of simple heuristics and the adaptive psychological mechanisms which explain their adoption and effectiveness; both of their research programs start from the assumption that expected utility theory is not the right basis for a descriptive theory of judgment and decision-making (sections 3, 4.3, and 5.2).

The second argument, that people are often unable to maximize even if they aspire to, was made by Simon and Good, among others, and later by Kahneman and Tversky. Simon’s remarks about the complexity of \(\Gamma\)-maxmin reasoning in working out the end-game moves in chess (section 2.2) are one of many examples he used over the span of his career, starting before his seminal papers on bounded rationality in the 1950s. The biases and heuristics program spurred by Tversky and Kahneman’s work in the late 1960s and 1970s (section 5.1) launched the systematic study of when and why people’s judgments deviate from the normative standards of expected utility theory and logical consistency.

The third argument, that experiments indicate that people fail to satisfy the basic assumptions of expected utility theory, was known from early on and emphasized by the very authors who formulated and refined the homo economicus hypothesis (section 1) and whose names are associated with its mathematical foundations. We highlighted an extended quote from Savage in section 1.4, but could mention as well a discussion of the theory’s limitations by de Finetti and Savage (1962), and even a closer reading of the canonical monographs of each, namely Savage 1954 and de Finetti 1970. A further consideration, which we discussed in section 1.4, is the demand of logical omniscience in expected utility theory and nearly all axiomatic variants.

The fourth argument, regarding the differences between the predictions of rational analysis and observed behavior, we addressed in discussions of Brunswik’s notion of ecological validity (section 3.2) and the traditional responses to these observations by rational analysis (section 3.3). The fifth argument, that some of the conclusions of rational analysis do not agree with a reasonable normative standard, was touched on in sections 1.3 and 1.4, and was the subject of section 4.

Implicit in Aumann’s first four arguments is the notion that global rationality (section 2) is a reasonable normative standard but problematic for descriptive theories of human judgment and decision-making (section 7). Even the literature standing behind Aumann’s fifth argument, namely that there are problems with expected utility theory as a normative standard, nevertheless typically addresses those shortcomings through modifications to, or extensions of, the underlying mathematical theory (section 1.3). This broad commitment to optimization methods, dominance reasoning, and logical consistency as bedrock normative principles is behind approaches that view bounded rationality as optimization under constraints:

Boundedly rational procedures are in fact fully optimal procedures when one takes account of the cost of computation in addition to the benefits and costs inherent in the problem as originally posed (Arrow 2004).

For a majority of researchers across disciplines, bounded rationality is identified with some form of optimization problem under constraints.

Gerd Gigerenzer is among the most prominent and vocal critics of the role that optimization methods and logical consistency play in commonplace normative standards for human rationality (Gigerenzer & Brighton 2009), especially the role those standards play in Kahneman and Tversky’s biases and heuristics program (Kahneman & Tversky 1996; Gigerenzer 1996). We discuss this debate in section 5.

7. Appraising Human Rationality

The rules of logic, the axioms of probability, the principles of utility theory—humans flout them all, and do so as a matter of course. But are we irrational to do so? That depends on what being rational amounts to. For a Bayesian, any qualitative comparative judgment that does not abide by the axioms of probability is, by definition, irrational. For a baker, any recipe for bread that includes equal parts salt and flour is irrational, even if coherent. Yet Bayesians do not war with bakers. Why? Because bakers are satisfied with the term ‘inedible’ and do not aspire to commandeer ‘irrational’.

The two schools of heuristics (section 5) reach sharply different conclusions about human rationality. Unlike bakers, their disagreement involves the meaning of ‘rationality’ and how we ought to appraise human judgment and decision making. The “rationality wars” are not the result of “rhetorical flourishes” concealing a broad consensus (cf. Samuels, Stich, & Bishop 2002), but rather substantive disagreements (section 7.2) that are obscured by ambiguous use of terms like ‘bias’ (section 3.5) and ‘rationality’.

In this section, we first distinguish seven different notions of rationality, highlighting the differences in aim, scope, standards of assessment, and objects of evaluation. We then consider two importantly different normative standards used in bounded rationality, followed by an example, the perception-cognition gap, illustrating how slight variations of classical experimental designs in the biases and heuristics literature change both the results and the normative standards used to evaluate those results.

7.1 Rationality

While Aristotle is credited with saying that humans are rational, Bertrand Russell later confessed to searching a lifetime in vain for evidence in Aristotle’s favor. Yet ‘rationality’ is what Marvin Minsky called a suitcase word, a term that needs to be unpacked before getting anywhere.

One conception of rationality, central to decision theory, is coherence, which is merely the requirement that your commitments not be self-defeating. The subjective Bayesian representation of rational preference over options as inequalities in subjective expected utility delivers coherence by applying a dominance principle to (suitably structured) preferences. A closely related application of dominance reasoning is the minimization of expected loss (or maximization of expected gain) according to a suitable loss function (suitable index of welfare), which may even be asymmetric (Elliott, Komunjer, & Timmermann 2005) or applied to radically restricted agents, such as finite automata (Rubinstein 1986). Coherence and dominance reasoning underpin expected utility theory (section 1.1), too.

A second meaning of rationality refers to an interpretive stance or disposition we take to understand the beliefs, desires, and actions of another person (Dennett 1971) or anything they might say in a shared language (Davidson 1974). On this view, rationality refers to a bundle of assumptions we grant to another person in order to understand their behavior, including speech. When we offer a reasons-giving explanation for another person’s behavior, we take such a stance. If I say “the driver laughed because she made a joke” you would not get far in understanding me without granting to me, and even this imaginary driver and woman, a lot. So, in contrast to the lofty normative standards of coherence that few if any mortals meet, the standards of rationality associated with an interpretive stance are afforded to practically everyone.

A third meaning of rationality, due to Hume (1738), applies to your beliefs, appraising them by how well they are calibrated with your experience. If in your experience the presence of one thing is invariably followed by an experience of another, then believing that the latter follows the former is rational. We might even go so far as to say that your expectation of the latter given your experience of the former is rational. This view of rationality is an evaluation of a person’s commitments, like coherence standards; but unlike coherence, Hume’s notion of rationality seeks to tie the rational standing of a belief directly to evidence from the world. Much of contemporary epistemology endorses this concept of rationality while attempting to specify the conditions under which to correctly attribute knowledge to someone.

A fourth meaning of rationality, called substantive rationality by Max Weber (1905), applies to the evaluation of your aims of inquiry. Substantive rationality invokes a Kantian distinction between the worthiness of a goal, on the one hand, and how well you perform instrumentally in achieving that goal, on the other. Aiming to count the blades of grass in your lawn is arguably not a substantively rational end to pursue, even if you were to use the instruments of rationality flawlessly to arrive at the correct count.

A fifth meaning of rationality, due to Charles Peirce (1955) and taken up by the American pragmatists, applies to the process of changing a belief rather than the Humean appraisal of a currently held belief. On Peirce’s view, people are plagued by doubt rather than by belief; we don’t expend effort testing the sturdiness of our beliefs, but instead focus on our commitments that come into doubt. Since inquiry is pursued to remove the doubts we have, not certify the stable beliefs we already possess, principles of rationality ought to apply to the methods for removing doubt (Dewey 1960). On this view, questions of what is or is not substantively rational will be answered by the inquirer: for an agronomist interested in grass cover sufficient to crowd out an invasive weed, obtaining the grass-blade count of a lawn might be a substantively rational aim to pursue.

A sixth meaning of rationality appeals to an organism’s capacities to assimilate and exploit complex information and revise or modify it when it is no longer suited to task. The object of rationality according to this notion is effective behavior. Jonathan Bennett discusses this notion of rationality in his case study of bees:

All our prima facie cases of rationality or intelligence were based on the observation that some creature’s behaviour was in certain dependable ways successful or appropriate or apt, relative to its presumed wants or needs. … There are canons of appropriateness whereby we can ask whether an apian act is appropriate not to that which is particular and present to the bee but rather to that which is particular and past or to that which is not particular at all but universal. (Bennett 1964: 85)

Like Hume’s conception, Bennett’s view ties rationality to successful interactions with the world. Further, like the pragmatists, Bennett includes for appraisal the dynamic process rather than simply the synchronic state of one’s commitments or the current merits of a goal. But unlike the pragmatists, Bennett conceives of rationality as applying to a wider range of behavior than the logic of deliberation, inquiry, and belief change.

A seventh meaning of rationality resembles the notion of coherence by defining rationality as the absence of a defect. For Bayesians, sure-loss is the epitome of irrationality and coherence is simply its absence. Sorensen has suggested a generalization of this strategy, one where rationality is conceived as the absence of irrationality tout court, just as cleanliness is the absence of dirt. Yet, owing to the long and varied ways that irrationality can arise, a consequence of this view is that there would then be no unified notion of rationality to capture the idea of thinking as one ought to think (Sorensen 1991).

These seven accounts of rationality are neither exhaustive nor complete. For instance, we have focused on the rationality of individuals rather than groups, which raises additional complications (Hahn 2022). However, they suffice to illustrate the range of differences among rationality concepts, from the objects of evaluation and the standards of assessment, to the roles, if any, that rationality is conceived to play in reasoning, planning, deliberation, explanation, prediction, signaling, and interpretation. One consequence of this hodgepodge of rationality concepts is a pliancy in the attribution of irrationality that resembles Victorian methods for diagnosing the vapors. The time may have come to retire talk of rationality altogether, or at the very least to demand a specification of the objects of evaluation, the normative standards to be used for assessment, and ample attention to the implications that follow from those commitments.

7.2 Normative Standards in Bounded Rationality

What are the standards against which our judgments and decisions ought to be evaluated? A property like systematic bias may be viewed as a fault or an advantage depending on how outcomes are scored (section 4). A full reckoning of the costs of operating a decision procedure may tip the balance in favor of a model that is sub-optimal when costs are no constraint, even when there is agreement about how outcomes are to be scored (section 2.1). Desirable behavior, such as prosocial norms, may be impossible within an idealized model but commonplace in several different types of non-idealized models (section 5.2).

Accounts of bounded rationality typically invoke one of two types of normative standards, a coherence standard or an accuracy standard. Among the most important insights from the study of boundedly rational judgment and decision making is that, not only is it possible to meet one standard without meeting the other, but meeting one standard may inhibit meeting the other.

Coherence standards in bounded rationality typically appeal to probability, statistical decision theory, or propositional logic. The “standard picture” of rational reasoning, according to Edward Stein,

is to reason in accordance with principles of reasoning that are based on rules of logic, probability theory, and so forth. If the standard picture of reasoning is right, principles of reasoning that are based on such rules are normative principles of reasoning, namely they are principles we ought to reason in accordance with. (Stein 1996: 1.2)

Logic and probability coherence standards are usually invoked when there are experimental results pointing to violations of those standards, particularly in the heuristics and biases literature (section 5.1). However, little is said about how and when our reasoning ought to be in accordance with these standards, or even what, precisely, the normative standards of logic and probability amount to. Stein discusses the logical rule of And-Elimination and a normative principle for belief that it supports, one where believing the conjunction birds sing and bees waggle commits you rationally to believing each conjunct. Yet Stein switches to probability to discuss what principle ought to govern conjoining two beliefs. Why?

Propositional logic and probability are very different formalisms (Haenni, Romeijn, Wheeler, & Williamson 2011). For one thing, the truth-functional semantics of logic is compositional whereas probability is not compositional, except when events are probabilistically independent. Why then is the elimination rule from logic and the introduction rule from probability the standard, rather than the elimination rule from probability (marginalization) and the introduction rule from logic (adjunction)? Answering this question requires a positive account of what “based on”, “anchored in”, or other metaphorical relationships amount to. By way of comparison, there is typically no analog to the representation theorems of expected utility theory (section 1.1) specifying the relationship between qualitative judgment and quantitative representation, and no accounting for the conditions under which that relationship holds (Wheeler 2021).

The second type of normative standard assesses the accuracy of a judgment or decision making process, where the focus is getting the correct answer. Consider the accuracy of a categorical judgment, such as predicting whether a credit-card transaction is fraudulent (\(Y = 1\)) or legitimate (\(Y = 0\)). Classification accuracy is the number of correct predictions out of all predictions made, which is often expressed as a ratio. But classification accuracy can yield a misleading assessment. For example, a method that always reported transactions as legitimate, \(\hat{Y} = 0\), would in fact yield a very high accuracy score (>97%) due to the very low rate (<3%) of fraudulent credit card transactions. The problem here is that classification accuracy is a poor metric for problems that involve imbalanced classes with few positive instances (i.e., few cases where \(Y=1\)). More generally, a model with no predictive power can have high accuracy, and a model with comparatively lower accuracy can have greater predictive power. This observation is referred to as the accuracy paradox.
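The accuracy paradox is easy to reproduce. A minimal sketch, assuming a hypothetical sample of 1,000 transactions with a 3% fraud rate:

```python
# 970 legitimate (Y = 0) and 30 fraudulent (Y = 1) transactions.
actual = [0] * 970 + [1] * 30

# A classifier with no predictive power: it reports every single
# transaction as legitimate.
predicted = [0] * 1000

correct = sum(1 for y, y_hat in zip(actual, predicted) if y == y_hat)
accuracy = correct / len(actual)  # 0.97: high accuracy, zero power
```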

The accuracy paradox is one motivation for introducing other measures of predictive performance. For our fraud detection problem there are two ways your prediction can be correct and two ways it can be wrong. A prediction can be correct by predicting \(\hat{Y}=1\) when in fact a transaction is fraudulent (a true positive) or predicting \(\hat{Y}=0\) when in fact a transaction is legitimate (a true negative). Correspondingly, one may err by either predicting \(\hat{Y}=1\) when in fact \(Y=0\) (a false positive) or predicting \(\hat{Y}=0\) when in fact a transaction is fraudulent (a false negative). These four possibilities are presented in the following two-by-two contingency table, which is referred to as a confusion matrix:

                          Actual Class \(Y\)
                          1                 0
  Predicted   1     true positive     false positive
  Class       0     false negative    true negative

For a binary classification problem involving N examples, each prediction will fall into one of these four categories. The performance of your classifier with respect to those N examples can then be assessed. A perfectly inaccurate classifier will have all zeros in the diagonal; a perfectly accurate classifier will have all zeros in the counterdiagonal. The precision of your classifier is the ratio of true positives to all positive predictions, that is true positives / (true positives + false positives). The recall of your classifier is the ratio of true positives to all actually positive instances, that is true positives / (true positives + false negatives).
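Both ratios can be computed directly from paired labels. A sketch with hypothetical data (the label lists are illustrative, not from any study):

```python
def precision_recall(actual, predicted):
    """Precision and recall for binary labels (1 = positive class)."""
    tp = sum(1 for y, p in zip(actual, predicted) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(actual, predicted) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(actual, predicted) if y == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Ten transactions, four fraudulent; the classifier flags three,
# two of them correctly.
actual    = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]
precision, recall = precision_recall(actual, predicted)
# precision = 2/3 (two of the three flags are correct)
# recall    = 1/2 (two of the four frauds are caught)
```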

There are two key points to note. First, there is typically a trade-off between precision and recall, and the costs associated with each will vary from one problem to another. A trade-off of precision and recall that suits detecting credit card fraud may not suit detecting cancer, even if the frequencies of positive instances are identical. Second, the purpose of training a classifier on known data is to make predictions on out-of-sample instances. Therefore, tuning your classifier to yield a suitable trade-off between precision and recall on your training data does not guarantee that this trade-off will generalize.
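The trade-off can be made concrete by varying the decision threshold applied to a classifier's scores; the scores and labels below are hypothetical:

```python
# Hypothetical fraud scores and true labels for eight transactions.
scores = [0.95, 0.90, 0.80, 0.40, 0.30, 0.20, 0.15, 0.10]
labels = [1,    1,    0,    1,    0,    0,    0,    0]

def metrics_at(threshold):
    """Precision and recall when flagging every score >= threshold."""
    pred = [int(s >= threshold) for s in scores]
    tp = sum(p * y for p, y in zip(pred, labels))
    fp = sum(p * (1 - y) for p, y in zip(pred, labels))
    fn = sum((1 - p) * y for p, y in zip(pred, labels))
    return tp / (tp + fp), tp / (tp + fn)

low  = metrics_at(0.30)  # (0.6, 1.0): all frauds caught, more false alarms
high = metrics_at(0.85)  # (1.0, 2/3): no false alarms, one fraud missed
```

Which threshold is "right" depends on the relative costs of false positives and false negatives, which is exactly the point about fraud versus cancer above.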

The moral is that to evaluate the performance of your classifier it is necessary to specify the purpose of the classification. Even then, good performance on your training data may not generalize. None of this contradicts coherence reasoning per se, as we are making comparative judgments and reasoning by dominance. However, framing the argument in terms of coherence changes the objects of evaluation, shifting from the first-person perspective of the decision maker to the third-person perspective of the decision modeler.

7.3 The Perception-Cognition Gap

Do human beings systematically violate the norms of probability and statistics? Peterson and Beach (1967) thought not. In their view, human beings are intuitive statisticians (section 5.1), making probability theory and statistics a good first approximation of human judgment and decision making. However, just as their optimistic review appeared to cement a consensus view about human rationality, Amos Tversky and Daniel Kahneman began their work to undo it. According to the heuristics and biases program (section 7.1), people are particularly bad at probability and statistics, indicating that probability theory, statistics, and even logic do not offer a good approximation of human decision making. One controversy over these negative findings concerns the causes of those effects—whether the observed responses point to minor flaws in otherwise adaptive human behavior or something much more troubling about our habits and constitution.

In contrast to this poor showing on cognitive tasks, people are generally thought to be optimal or near-optimal in performing low-level motor control and perception tasks. Planning goal-directed movement, like pressing an elevator button with your finger or placing your foot on a slippery river stone, requires your motor control system to pick one among a dizzying number of possible movement strategies to achieve your goal while minimizing biomechanical costs (Trommershäuser, Maloney, & Landy 2003). The loss function that our motor control system appears to use increases approximately quadratically with error for small errors but significantly less for large errors, suggesting that our motor control system is also robust to outliers (Körding & Wolpert 2004). What is more, advances in machine learning have been guided by treating human performance errors for a range of perception tasks as proxies for Bayes error, yielding an observable, near-perfect normative standard. Unlike cognitive decisions, there is very little controversy concerning the overall optimality of our motor-perceptual decisions. This difference between high-level and low-level decisions is called the perception-cognition gap.
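A loss with this profile can be illustrated by the Huber loss, a standard robust loss in statistics; it is offered here only as an illustrative stand-in with that shape, not as the function Körding and Wolpert actually fitted:

```python
# Huber loss: quadratic for |error| <= delta, linear beyond, which
# limits the influence of large errors (outliers).
def huber(error, delta=1.0):
    a = abs(error)
    if a <= delta:
        return 0.5 * a ** 2
    return delta * (a - 0.5 * delta)

small = huber(0.2)   # quadratic regime: 0.5 * 0.2**2 = 0.02
large = huber(10.0)  # linear regime: 9.5, versus 50.0 if purely quadratic
```

A purely quadratic loss would let one large error dominate the total; the sub-quadratic growth for large errors is what the text means by robustness to outliers.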

Some view the perception-cognition gap as evidence for the claim that people use fundamentally different strategies for each type of task (section 7.2). An approximation of an optimal method is not necessarily an optimal approximation of that method, and the study of cognitive judgments and deliberative decision-making is led astray by assuming otherwise (Mongin 2000). Another view of the perception-cognition gap is that it is largely an artifact of methodological differences across studies rather than a robust feature of human behavior. We review evidence for this second argument here.

Classical studies of decision-making present choice problems to subjects where probabilities are described. For example, you might be asked to choose between the prospect of winning €300 with probability 0.25 and the prospect of winning €400 with probability 0.2. Here, subjects are given a numerical description of probabilities, are typically asked to make one-shot decisions without feedback, and their responses are found to deviate from the expected utility hypothesis. However, in motor control tasks, subjects have to use internal, implicit estimates of probabilities, often learned with feedback, and these internal estimates are near optimal. Are perceptual-motor control decisions better because they provide feedback whereas classical decision tasks do not, or are perceptual-motor control decisions better because they are non-cognitive?
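For the record, the two described prospects differ in expected value; a quick check (amounts in euros):

```python
# Expected values of the two prospects described above.
ev_first  = 0.25 * 300   # €300 with probability 0.25 -> 75.0
ev_second = 0.20 * 400   # €400 with probability 0.20 -> 80.0
# The second prospect is higher in expectation, so an expected-value
# maximizer would choose it.
```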

Jarvstad et al. (2013) explored the robustness of the perception-cognition gap by designing (a) a finger-pointing task that involved varying target sizes on a touch-screen computer display; (b) an arithmetic learning task involving summing four numbers and accepting or rejecting a proposed answer with a target tolerance, where the tolerance range varied from problem to problem, analogous to the width of the target in the motor-control task; and (c) a standard classical probability judgment task that involved computing the expected value of two prospects. The probability information across the tasks was in three formats: low-level, high-level, and classical, respectively.

Once confounding factors across the three types of tasks are controlled for, Jarvstad et al.’s results suggest that (i) the perception-cognition gap is largely explained by differences in how performance is assessed; (ii) the decisions by experience vs decisions by description gap (Hertwig, Barron et al. 2004) is due to assuming that exogenous objective probabilities and subjective probabilities match; (iii) people’s ability to make high-level decisions is better than the biases and heuristics literature suggests (section 7.1); and (iv) differences between subjects are more important for predicting performance than differences between the choice tasks (Jarvstad et al. 2013).

The upshot, then, is that once the methodological differences are controlled for, the perception-cognition gap appears to be an artifact of two different normative standards applied to the tasks. If the standards applied to assessing perceptual-motor tasks are applied to classical cognitive decision-making tasks, then both appear to perform well. If instead the standards used for assessing the classical cognitive tasks are applied to perceptual-motor tasks, then both will appear to perform poorly.

Bibliography

  • Akerlof, George A., 1970, “The Market for‘Lemons’: Quality, Uncertainty and the MarketMechanism”,The Quarterly Journal of Economics, 84(3):488–500.
  • Alexander, Jason McKenzie, 2007,The Structural Evolution ofMorality, New York: Cambridge University Press.doi:10.1017/CBO9780511550997
  • Alexander, Richard D., 1987,The Biology of MoralSystems, London: Routledge.
  • Allais, Maurice, 1953, “Le Comportement de L’hommeRationnel Devant Le Risque: Critique Des Postulats et Axiomes deL’école Américaine”,Econometrica,21(4): 503–546. doi:10.2307/1907921
  • Anand, Paul, 1987, “Are the Preference Axioms ReallyRational?”Theory and Decision, 23(2): 189–214.doi:10.1007/BF00126305
  • Anderson, John R., 1991, “The Adaptive Nature of HumanCategorization”,Psychological Review, 98(3):409–429. doi:10.1037/0033-295X.98.3.409
  • Anderson, John R. and Lael J. Schooler, 1991, “Reflectionsof the Environment in Memory”,Psychological Science,2(6): 396–408. doi:10.1111/j.1467-9280.1991.tb00174.x
  • Appelhoff, Stefan, Ralph Hertwig, and Bernhard Spitzer, 2023, “Control over Sampling Boosts Numerical Evidence Processing in Human Decisions from Experience”, Cerebral Cortex, 33: 207–221.
  • Arkes, Hal R., Gerd Gigerenzer, and Ralph Hertwig, 2016,“How Bad Is Incoherence?”,Decision, 3(1):20–39. doi:10.1037/dec0000043
  • Arló-Costa, Horacio and Arthur Paul Pedersen, 2011,“Bounded Rationality: Models for Some Fast and FrugalHeuristics”, in A. Gupta, Johan van Benthem, & Eric Pacuit(eds.),Games, Norms and Reasons: Logic at the Crossroads,Dordrecht: Springer Netherlands. doi:10.1007/978-94-007-0714-6_1
  • Arrow, Kenneth, 2004, “Is Bounded Rationality UnboundedlyRational? Some Ruminations”, in M. Augier & J. G. March(eds.),Models of a Man: Essays in Memory of Herbert A.Simon, Cambridge, MA: MIT Press, pp. 47–55.
  • Aumann, Robert J., 1962, “Utility Theory without theCompleteness Axiom”,Econometrica, 30(3):445–462. doi:10.2307/1909888
  • –––, 1997, “Rationality and BoundedRationality”,Games and Economic Behavior,21(1–2): 2–17. doi:10.1006/game.1997.0585
  • Axelrod, Robert, 1984,The Evolution of Cooperation, NewYork: Basic Books.
  • Ballard, Dana H. and Christopher M. Brown, 1982,ComputerVision, Englewood Cliffs, NJ: Prentice Hall.
  • Bar-Hillel, Maya and Avishai Margalit, 1988, “How ViciousAre Cycles of Intransitive Choice?”Theory andDecision, 24(2): 119–145. doi:10.1007/BF00132458
  • Bar-Hillel, Maya and Willem A Wagenaar, 1991, “ThePerception of Randomness”,Advances in AppliedMathematics, 12(4): 428–454.doi:10.1016/0196-8858(91)90029-I
  • Barabási, Albert-László and Reka Albert,1999, “Emergence of Scaling in Random Networks”,Science, 286(5439): 509–512.doi:10.1126/science.286.5439.509
  • Barkow, Jerome, Leda Cosmides, and John Tooby (eds.), 1992,The Adapted Mind: Evolutionary Psychology and the Generation ofCulture, New York: Oxford University Press.
  • Baumeister, Roy F., Ellen Bratslavsky, Catrin Finkenauer, andKathleen D. Vohs, 2001, “Bad Is Stronger than Good.”,Review of General Psychology, 5(4): 323–370.doi:10.1037/1089-2680.5.4.323
  • Bazerman, Max H. and Don A. Moore, 2008, Judgment in Managerial Decision Making, seventh edition, New York: Wiley.
  • Bell, David E., 1982, “Regret in Decision Making UnderUncertainty”,Operations Research, 30(5):961–981. doi:10.1287/opre.30.5.961
  • Bennett, Jonathan, 1964,Rationality: An Essay Towards anAnalysis, London: Routledge.
  • Berger, James O., 1985,Statistical Decision Theory andBayesian Analysis, second edition, New York: Springer.doi:10.1007/978-1-4757-4286-2
  • Bernoulli, Daniel, 1738 [1954], “Exposition of a New Theory on the Measurement of Risk”, Econometrica, 22(1): 23–36. doi:10.2307/1909829
  • Bewley, Truman S., 2002, “Knightian Decision Theory: PartI”,Decisions in Economics and Finance, 25(2):79–110. doi:10.1007/s102030200006
  • Bicchieri, Cristina, 2005,The Grammar of Society: The Natureand Dynamics of Social Norms, Cambridge: Cambridge UniversityPress. doi:10.1017/CBO9780511616037
  • Bicchieri, Cristina and Ryan Muldoon, 2014, “SocialNorms”, inThe Stanford Encyclopedia of Philosophy,(Spring 2014), Edward N. Zalta (ed.), URL = <https://plato.stanford.edu/archives/spr2014/entries/social-norms/>
  • Birnbaum, Michael H., 1983, “Base Rates in BayesianInference: Signal Detection Analysis of the Cab Problem”,The American Journal of Psychology, 96(1): 85–94.doi:10.2307/1422211
  • Bishop, Christopher M., 2006,Pattern Recognition and MachineLearning, New York: Springer.
  • Blume, Lawrence, Adam Brandenburger, and Eddie Dekel, 1991,“Lexicographic Probabilities and Choice UnderUncertainty”,Econometrica, 58(1): 61–78.doi:10.2307/2938240
  • Bowles, Samuel and Herbert Gintis, 2011,A CooperativeSpecies: Human Reciprocity and Its Evolution, Princeton, NJ:Princeton University Press.doi:10.23943/princeton/9780691151250.001.0001
  • Boyd, Robert and Peter J. Richerson, 2005,The Origin andEvolution of Cultures, New York: Oxford University Press.
  • Brown, Alexander L., Taisuke Imai, Ferdinand M. Vieider, and ColinF. Camerer, 2024, “Meta-Analysis of Empirical Estimates ofLoss-Aversion”,Journal of Economic Literature, 62(2):485–516.
  • Brown, Scott D. and Andrew Heathcote, 2008, “The SimplestComplete Model of Choice Response Time: Linear BallisticAccumulation”,Cognitive Psychology, 57(3):153–178. doi:10.1016/j.cogpsych.2007.12.002
  • Brunswik, Egon, 1943, “Organismic Achievement andEnvironmental Probability”,Psychological Review,50(3): 255–272. doi:10.1037/h0060889
  • –––, 1955, “Representative Design andProbabilistic Theory in a Functional Psychology”,Psychological Review, 62(3): 193–217.doi:10.1037/h0047470
  • Camerer, Colin F., 2003,Behavioral Game Theory: Experimentsin Strategic Interaction, Princeton University Press.
  • Camerer, Colin F., Anna Dreber, Felix Holzmeister, Teck-Hua Ho, Jürgen Huber, Magnus Johannesson, Michael Kirchler, Gideon Nave, Brian A. Nosek, Thomas Pfeiffer, Adam Altmejd, Nick Buttrick, Taizan Chan, Yiling Chen, Eskil Forsell, Anup Gampa, Emma Heikensten, Lily Hummer, Taisuke Imai, Siri Isaksson, Dylan Manfredi, Julia Rose, Eric-Jan Wagenmakers, and Hang Wu, 2018, “Evaluating the Replicability of Social Science Experiments in Nature and Science Between 2010 and 2015”, Nature Human Behaviour, 2(9): 637–644.
  • Charness, Gary and Peter J. Kuhn, 2011, “Lab Labor: What CanLabor Economists Learn from the Lab?” inHandbook of LaborEconomics, Vol. 4, pp. 229–330, Elsevier.doi:10.1016/S0169-7218(11)00409-6
  • Chater, Nick, 2014, “Cognitive Science as an InterfaceBetween Rational and Mechanistic Explanation”,Topics inCognitive Science, 6(2): 331–337.doi:10.1111/tops.12087
  • Chater, Nick and Mike Oaksford, 1999, “Ten Years of theRational Analysis of Cognition”,Trends in CognitiveScience, 3(2): 57–65.
  • Chater, Nick, Mike Oaksford, Ramin Nakisa, and Martin Redington,2003, “Fast, Frugal, and Rational: How Rational Norms ExplainBehavior”,Organizational Behavior and Human DecisionProcesses, 90(1): 63–86.doi:10.1016/S0749-5978(02)00508-3
  • Clark, Andy and David Chalmers, 1998, “The ExtendedMind”,Analysis, 58(1): 7–19.doi:10.1093/analys/58.1.7
  • Cohen, L. Jonathan, 1981, “Can Human Irrationality BeExperimentally Demonstrated?”Behavioral and BrainSciences, 4(3): 317–331. doi:10.1017/S0140525X00009092
  • Coletti, Giulianella and Romano Scozzafava, 2002, Probabilistic Logic in a Coherent Setting, (Trends in Logic, 15), Dordrecht: Springer Netherlands. doi:10.1007/978-94-010-0474-9
  • Collins, Peter, Karolina Krzyżanowska, Stephan Hartmann,Gregory Wheeler and Ulrike Hahn, 2020, “Conditionals andTestimony”,Cognitive Psychology, 122: 101329.
  • Cowan, Nelson, 2001, “The Magical Number 4 in Short-TermMemory: A Reconsideration of Mental Storage Capacity”,Behavioral and Brain Sciences, 24(1): 87–114.
  • Czerlinski, Jean, Gerd Gigerenzer, and Daniel G. Goldstein, 1999,“How Good Are Simple Heuristics?” in Gigerenzer et al.1999: 97–118.
  • Damore, James A. and Jeff Gore, 2012, “UnderstandingMicrobial Cooperation”,Journal of Theoretical Biology,299: 31–41. doi:10.1016/j.jtbi.2011.03.008
  • Dana, Jason and Robin M. Dawes, 2004, “The Superiority ofSimple Alternatives to Regression for Social SciencePredictions”,Journal of Educational and BehavioralStatistics, 29(3): 317–331.doi:10.3102/10769986029003317
  • Danks, David and Alex John London, 2017, “Algorithmic Biasin Autonomous Systems”,Proceedings of the Twenty-SixthInternational Joint Conference on Artificial Intelligence,IJCAI-17, Melbourne: 4691–4697. doi:10.24963/ijcai.2017/654.
  • Darwin, Charles, 1871,The Descent of Man, New York: D.Appleton and Company.
  • Davidson, Donald, 1974, “Belief and the Basis ofMeaning”,Synthese, 27(3–4): 309–323.doi:10.1007/BF00484597
  • Davis-Stober, Clintin P., Jason Dana, and David V. Budescu, 2010,“Why Recognition Is Rational: Optimality Results onSingle-Variable Decision Rules”,Judgment and DecisionMaking, 5(4): 216–229.
  • Dawes, Robin M., 1979, “The Robust Beauty of Improper LinearModels in Decision Making”,American Psychologist,34(7): 571–582. doi:10.1037/0003-066X.34.7.571
  • de Finetti, Bruno, 1970 [1974],Teoria Delle Probabilita,Italy: Giulio Einaudi. Translated asTheory of Probability: ACritical Introductory Treatment. Vol. 1 and 2, AntonioMachí and Adrian Smith (trans.), Chichester: Wiley.doi:10.1002/9781119286387
  • de Finetti, Bruno and Leonard J. Savage, 1962, “Sul Modo DiScegliere Le Probabilità Iniziali”,Biblioteca DelMetron, Serie C, 1: 81–154.
  • DeMiguel, Victor, Lorenzo Garlappi, and Raman Uppal, 2009,“Optimal Versus Naive Diversification: How Inefficient Is the\(\frac{1}{N}\) Portfolio Strategy?”,Review of FinancialStudies, 22(5): 1915–1953. doi:10.1093/rfs/hhm075
  • Dennett, Daniel C., 1971, “Intentional Systems”,Journal of Philosophy, 68(4): 87–106.doi:10.2307/2025382
  • Dewey, John, 1960,The Quest for Certainty, New York:Capricorn Books.
  • Dhami, Mandeep K., Ralph Hertwig, and Ulrich Hoffrage, 2004,“The Role of Representative Design in an Ecological Approach toCognition”,Psychological Bulletin, 130(6):959–988. doi:10.1037/0033-2909.130.6.959
  • Domingos, Pedro, 2000, “A Unified Bias-VarianceDecomposition and Its Applications”, inProceedings of the17th International Conference on Machine Learning, MorganKaufmann, pp. 231–238.
  • Doyen, Stéphane, Olivier Klein, Cora-Lise Pichton, and AxelCleeremans, 2012, “Behavioral Priming: It’s All in theMind, but Whose Mind?”PLoS One, 7(1): e29081.doi:10.1371/journal.pone.0029081
  • Dubins, Lester E., 1975, “Finitely Additive ConditionalProbability, Conglomerability, and Disintegrations”,Annalsof Probability, 3(1): 89–99.doi:10.1214/aop/1176996451
  • Einhorn, Hillel J., 1970, “The Use of Nonlinear,Noncompensatory Models in Decision Making”,PsychologicalBulletin, 73(3): 221–230. doi:10.1037/h0028695
  • Elliott, Graham, Ivana Komunjer, and Allan Timmermann, 2005,“Estimation and Testing of Forecast Rationality under FlexibleLoss”,Review of Economic Studies, 72(4):1107–1125. doi:10.1111/0034-6527.00363
  • Ellsberg, Daniel, 1961, “Risk, Ambiguity and the SavageAxioms”,Quarterly Journal of Economics, 75(4):643–669. doi:10.2307/1884324
  • Fawcett, Tim W., Benja Fallenstein, Andrew D. Higginson, AlasdairI. Houston, Dave E.W. Mallpress, Pete C. Trimmer, and John M.McNamara, 2014, “The Evolution of Decision Rules in ComplexEnvironments”,Trends in Cognitive Sciences, 18(3):153–161. doi:10.1016/j.tics.2013.12.012
  • Fennema, Hein and Peter Wakker, 1997, “Original andCumulative Prospect Theory: A Discussion of EmpiricalDifferences”,Journal of Behavioral Decision Making,10(1): 53–64.doi:10.1002/(SICI)1099-0771(199703)10:1<53::AID-BDM245>3.0.CO;2-1
  • Fiedler, Klaus, 1988, “The Dependence of the ConjunctionFallacy on Subtle Linguistic Factors”,PsychologicalResearch, 50(2): 123–129. doi:10.1007/BF00309212
  • Fiedler, Klaus and Peter Juslin (eds.), 2006,InformationSampling and Adaptive Cognition, Cambridge: Cambridge UniversityPress. doi:10.1017/CBO9780511614576
  • Fishburn, Peter C., 1982,The Foundations of ExpectedUtility, Dordrecht: D. Reidel. doi:10.1007/978-94-017-3329-8
  • Fisher, Ronald Aylmer, 1936, “Uncertain Inference”,Proceedings of the American Academy of Arts and Sciences,71(4): 245–258. doi:10.2307/20023225
  • Friedman, Jerome, 1997, “On Bias, Variance, 0-1 Loss and theCurse of Dimensionality”,Journal of Data Mining andKnowledge Discovery, 1(1): 55–77.doi:10.1023/A:1009778005914
  • Friedman, Milton, 1953, “The Methodology of PositiveEconomics”, inEssays in Positive Economics, Chicago:University of Chicago Press, pp. 3–43.
  • Friedman, Milton and Leonard J. Savage, 1948, “The UtilityAnalysis of Choices Involving Risk”,Journal of PoliticalEconomy, 56(4): 279–304. doi:10.1086/256692
  • Friston, Karl, 2010, “The Free-Energy Principle: A UnifiedBrain Theory?”,Nature Reviews Neuroscience, 11(2):127–138. doi:10.1038/nrn2787
  • Galaabaatar, Tsogbadral and Edi Karni, 2013, “SubjectiveExpected Utility with Incomplete Preferences”,Econometrica, 81(1): 255–284. doi:10.3982/ECTA9621
  • Gelman, Andrew, John B. Carlin, Hal S. Stern, David B. Dunson, AkiVehrari, and Donald B. Rubin, 2013,Bayesian Data Analysis,3rd edition, Boca Raton, FL: CRC Press.
  • Gergely, György, Harold Bekkering, and IldikóKirály, 2002, “Developmental Psychology: RationalImitation in Preverbal Infants”,Nature, 415(6873):755–755. doi:10.1038/415755a
  • Gibson, James J., 1979,The Ecological Approach to VisualPerception, Boston: Houghton Mifflin.
  • Gigerenzer, Gerd, 1996, “On Narrow Norms and VagueHeuristics: A Reply to Kahneman and Tversky”,PsychologicalReview, 103(3): 592–596.doi:10.1037/0033-295X.103.3.592
  • –––, 2007,Gut Feelings: The Intelligence ofthe Unconscious, New York: Viking Press.
  • Gigerenzer, Gerd and Henry Brighton, 2009, “HomoHeuristicus: Why Biased Minds Make Better Inferences”,Topics in Cognitive Science, 1(1): 107–143.doi:10.1111/j.1756-8765.2008.01006.x
  • Gigerenzer, Gerd and Daniel G. Goldstein, 1996, “Reasoningthe Fast and Frugal Way: Models of Bounded Rationality”,Psychological Review, 103(4): 650–669.doi:10.1037/0033-295X.103.4.650
  • Gigerenzer, Gerd, Wolfgang Hell, and Hartmut Blank, 1988,“Presentation and Content: The Use of Base Rates as a ContinuousVariable.”,Journal of Experimental Psychology: HumanPerception and Performance, 14(3): 513–525.doi:10.1037/0096-1523.14.3.513
  • Gigerenzer, Gerd, Ralph Hertwig, and Thorsten Pachur (eds), 2011,Heuristics: The Foundations of Adaptive Behavior, OxfordUniversity Press. doi:10.1093/acprof:oso/9780199744282.001.0001
  • Gigerenzer, Gerd, Peter M. Todd, and the ABC Group (eds.), 1999,Simple Heuristics That Make Us Smart, Oxford: OxfordUniversity Press.
  • Giles, Robin, 1976, “A Logic for Subjective Belief”,inFoundations of Probability Theory, Statistical Inference, andStatistical Theories of Science, William L. Harper and CliffordAlan Hooker (eds.), Dordrecht: Springer Netherlands, vol. I:41–72. doi:10.1007/978-94-010-1853-1_4
  • Giron, F. J., and S. Rios, 1980, “Quasi-Bayesian Behavior: AMore Realistic Approach to Decision Making?”Trabajos deEstadistica Y de Investigacion Operativa, 31(1): 17–38.doi:10.1007/BF02888345
  • Glymour, Clark, 2001,The Mind’s Arrows, Cambridge,MA: MIT Press.
  • Goldblatt, Robert, 1998,Lectures on the Hyperreals: AnIntroduction to Nonstandard Analysis, (Graduate Texts inMathematics 188), New York: Springer New York.doi:10.1007/978-1-4612-0615-6
  • Goldstein, Daniel G. and Gerd Gigerenzer, 2002, “Models ofEcological Rationality: The Recognition Heuristic.”,Psychological Review, 109(1): 75–90.doi:10.1037/0033-295X.109.1.75
  • Golub, Benjamin and Matthew O Jackson, 2010, “NaïveLearning in Social Networks and the Wisdom of Crowds”,American Economic Journal: Microeconomics, 2(1):112–149. doi:10.1257/mic.2.1.112
  • Good, Irving J., 1952, “Rational Decisions”,Journal of the Royal Statistical Society. Series B, 14(1):107–114. Reprinted in Good 1983: 3–14
  • –––, 1967, “On the Principle of TotalEvidence”,The British Journal for the Philosophy ofScience, 17(4): 319–321. Reprinted in Good 1983:178–180. doi:10.1093/bjps/17.4.319
  • –––, 1971 [1983], “Twenty-Seven Principlesof Rationality”,Foundations of Statistical Inference,V. P. Godambe and D. A. Sprott (eds), Toronto: Holt, Rinehart &Winston. Reprinted in Good 1983: 15–20.
  • –––, 1983,Good Thinking: The Foundations ofProbability and Its Applications, Minneapolis: University ofMinnesota Press.
  • Güth, Werner, Rolf Schmittberger, and Bernd Schwarze, 1982,“An Experimental Analysis of Ultimatum Bargaining”,Journal of Economic Behavior and Organization, 3(4):367–388. doi:10.1016/0167-2681(82)90011-7
  • Hacking, Ian, 1967, “Slightly More Realistic PersonalProbability”,Philosophy of Science, 34(4):311–325. doi:10.1086/288169
  • Haenni, Rolf, Jan-Willem Romeijn, Gregory Wheeler, and JonWilliamson, 2011,Probabilistic Logics and ProbabilisticNetworks, Dordrecht: Springer Netherlands.doi:10.1007/978-94-007-0008-6
  • Hahn, Ulrike, 2022, “Collective and EpistemicRationality”,Topics in Cognitive Science, 14(3):602–620.
  • Hahn, Ulrike and Paul A. Warren, 2009, “Perceptions ofRandomness: Why Three Heads Are Better Than Four”,Psychological Review, 116(2): 454–461.doi:10.1037/a0015241
  • Halpern, Joseph Y., 2010, “Lexicographic Probability,Conditional Probability, and Nonstandard Probability”,Gamesand Economic Behavior, 68(1): 155–179.doi:10.1016/j.geb.2009.03.013
  • Hammond, Kenneth R., 1955, “Probabilistic Functioning andthe Clinical Method”,Psychological Review, 62(4):255–262. doi:10.1037/h0046845
  • Hammond, Kenneth R., Carolyn J. Hursch, and Frederick J. Todd,1964, “Analyzing the Components of Clinical Inference”,Psychological Review, 71(6): 438–456.doi:10.1037/h0040736
  • Hammond, Peter J., 1994, “Elementary Non-ArchimedeanRepresentations of Probability for Decision Theory and Games”,in Paul Humphreys (ed.),Patrick Suppes: ScientificPhilosopher, Vol. 1: Probability and Probabilistic Causality,Dordrecht, The Netherlands: Kluwer, pp. 25–59.doi:10.1007/978-94-011-0774-7_2
  • Haykin, Simon O., 2013,Adaptive Filter Theory, fifthedition, London: Pearson.
  • Henrich, Joseph and Francisco J Gil-White, 2001, “TheEvolution of Prestige: Freely Conferred Deference as a Mechanism forEnhancing the Benefits of Cultural Transmission”,Evolutionand Human Behavior, 22(3): 165–196.doi:10.1016/S1090-5138(00)00071-4
  • Hertwig, Ralph and Gerd Gigerenzer, 1999, “The‘Conjunction Fallacy’ Revisited: How IntelligentInferences Look Like Reasoning Errors”,Journal ofBehavioral Decision Making, 12(4): 275–305.doi:10.1002/(SICI)1099-0771(199912)12:4<275::AID-BDM323>3.0.CO;2-M
  • Hertwig, Ralph and Timothy J. Pleskac, 2008, “The Game ofLife: How Small Samples Render Choice Simpler”, inTheProbabilistic Mind: Prospects for Bayesian Cognitive Science,Nick Chater and Mike Oaksford (eds.), Oxford: Oxford University Press,209–236. doi:10.1093/acprof:oso/9780199216093.003.0010
  • Hertwig, Ralph, Greg Barron, Elke U. Weber, and Ido Erev, 2004,“Decisions from Experience and the Effect of Rare Events inRisky Choice”,Psychological Science, 15(8):534–539. doi:10.1111/j.0956-7976.2004.00715.x
  • Hertwig, Ralph, Jennifer Nerissa Davis, and Frank J. Sulloway,2002, “Parental Investment: How an Equity Motive Can ProduceInequality”,Psychological Bulletin, 128(5):728–745. doi:10.1037/0033-2909.128.5.728
  • Herzog, Stefan M. and Ralph Hertwig, 2013, “The EcologicalValidity of Fluency”, in Christian Unkelbach & RainerGreifeneder (eds.),The Experience of Thinking: How Fluency ofMental Processes Influences Cognition and Behavior, PsychologyPress, pp. 190–219.
  • Hey, John D., 1982, “Search for Rules for Search”,Journal of Economic Behavior and Organization, 3(1):65–81. doi:10.1016/0167-2681(82)90004-X
  • Ho, Teck-Hua, 1996, “Finite Automata Play RepeatedPrisoner’s Dilemma with Information Processing Costs”,Journal of Economic Dynamics and Control, 20(1–3):173–207. doi:10.1016/0165-1889(94)00848-1
  • Ho, Teck-Hua and Xuanming Su, 2013, “A Dynamic Level-k Model in Sequential Games”, Management Science, 59(2): 452–469.
  • Hochman, Guy and Eldad Yechiam, 2011, “Loss Aversion in theEye and in the Heart: The Autonomic Nervous System’s Responsesto Losses”,Journal of Behavioral Decision Making,24(2): 140–156. doi:10.1002/bdm.692
  • Hogarth, Robin M., 2012, “When Simple Is Hard toAccept”, in Todd et al. 2012: 61–79.doi:10.1093/acprof:oso/9780195315448.003.0024
  • Hogarth, Robin M. and Natalia Karelaia, 2007, “Heuristic andLinear Models of Judgment: Matching Rules and Environments”,Psychological Review, 114(3): 733–758.doi:10.1037/0033-295X.114.3.733
  • Howe, Mark L., 2011, “The Adaptive Nature of Memory and ItsIllusions”,Current Directions in PsychologicalScience, 20(5): 312–315. doi:10.1177/0963721411416571
  • Hume, David, 1738 [2008],A Treatise of Human Nature,Jonathan Bennett (ed.), www.earlymoderntexts.com, 2008. [Hume 1738 available online]
  • Hutchinson, John M., Carola Fanselow, and Peter M. Todd, 2012,“Car Parking as a Game Between Simple Heuristics”, in Toddet al. 2012: 454–484.doi:10.1093/acprof:oso/9780195315448.003.0133
  • Jackson, Matthew O., 2010,Social and Economic Networks,Princeton, NJ: Princeton University Press.
  • Jarvstad, Andreas, Ulrike Hahn, Simon K. Rushton, and Paul A.Warren, 2013, “Perceptuo-Motor, Cognitive, and Description-BasedDecision-Making Seem Equally Good”,Proceedings of theNational Academy of Sciences, 110(40): 16271–16276.doi:10.1073/pnas.1300239110
  • Jevons, William Stanley, 1871, The Theory of Political Economy, London: Macmillan and Company.
  • Juslin, Peter and Henrik Olsson, 2005, “Capacity Limitationsand the Detection of Correlations: Comment on Kareev”,Psychological Review, 112(1): 256–267.doi:10.1037/0033-295X.112.1.256
  • Juslin, Peter, Anders Winman, and Patrik Hansson, 2007, “TheNaïve Intuitive Statistician: A Naïve Sampling Model ofIntuitive Confidence Intervals.”,Psychological Review,114(3): 678–703. doi:10.1037/0033-295X.114.3.678
  • Kahneman, Daniel and Amos Tversky, 1972, “SubjectiveProbability: A Judgment of Representativeness”,CognitivePsychology, 3(3): 430–454.doi:10.1016/0010-0285(72)90016-3
  • –––, 1979, “Prospect Theory: An Analysisof Decision Under Risk”,Econometrica, 47(2):263–291. doi:10.2307/1914185
  • –––, 1996, “On the Reality of CognitiveIllusions”,Psychological Review, 103(3):582–591. doi:10.1037/0033-295X.103.3.582
  • Kahneman, Daniel, Baruch Slovic, and Amos Tversky (eds.), 1982,Judgment Under Uncertainty: Heuristics and Biases, Cambridge:Cambridge University Press. doi:10.1017/CBO9780511809477
  • Kareev, Yaakov, 1995, “Through a Narrow Window: WorkingMemory Capacity and the Detection of Covariation”,Cognition, 56(3): 263–269.doi:10.1016/0010-0277(95)92814-G
  • –––, 2000, “Seven (Indeed, Plus or MinusTwo) and the Detection of Correlations”,PsychologicalReview, 107(2): 397–402.doi:10.1037/0033-295X.107.2.397
  • Karni, Edi, 1985,Decision Making Under Uncertainty: The Caseof State-Dependent Preferences, Cambridge, MA: HarvardUniversity.
  • Katsikopoulos, Konstantinos V., 2010, “The Less-Is-MoreEffect: Predictions and Tests”,Judgment and DecisionMaking, 5(4): 244–257.
  • Katsikopoulos, Konstantinos V., Lael J. Schooler, and RalphHertwig, 2010, “The Robust Beauty of OrdinaryInformation”,Psychological Review, 117(4):1259–1266. doi:10.1037/a0020418
  • Kaufmann, Esther and Werner W. Wittmann, 2016, “The Successof Linear Bootstrapping Models: Decision Domain-, Expertise-, andCriterion-Specific Meta-Analysis”,PLoS One, 11(6):e0157914. doi:10.1371/journal.pone.0157914
  • Keeney, Ralph L. and Howard Raiffa, 1976,Decisions withMultiple Objectives: Preferences and Value Trade-Offs, New York:Wiley.
  • Kelly, Kevin T. and Oliver Schulte, 1995, “The ComputableTestability of Theories Making Uncomputable Predictions”,Erkenntnis, 43(1): 29–66. doi:10.1007/BF01131839
  • Keynes, John Maynard, 1921,A Treatise on Probability,London: Macmillan.
  • Kidd, Celeste and Benjamin Y. Hayden, 2015, “The Psychologyand Neuroscience of Curiosity”,Neuron, 88(3):449–460. doi:10.1016/j.neuron.2015.09.010
  • Kirsh, David, 1995, “The Intelligent Use of Space”,Artificial Intelligence, 73(1–2): 31–68.doi:10.1016/0004-3702(94)00017-U
  • Klaes, Matthias and Esther-Mirjam Sent, 2005, “A ConceptualHistory of the Emergence of Bounded Rationality”,History ofPolitical Economy, 37(1): 27–59.doi:10.1215/00182702-37-1-27
  • Knight, Frank H., 1921,Risk, Uncertainty and Profit,Boston: Houghton Mifflin.
  • Knill, David C. and Whitman Richards, 1996,Perception asBayesian Inference, Cambridge: Cambridge University Press.
  • Koehler, Jonathan J., 1996, “The Base Rate FallacyReconsidered: Descriptive, Normative, and MethodologicalChallenges”,Behavioral and Brain Sciences, 19(1):1–53. doi:10.1017/S0140525X00041157
  • Kolchinsky, Artemy and David H. Wolpert, 2020,“Thermodynamic Costs of Turing Machines”,PhysicalReview Research, 2: 033312.
  • Konek, Jason, 2023, “Degrees of Incoherence, DutchBookability, and Guidance Value”,PhilosophicalStudies, 180: 395–428.
  • Koopman, Bernard O., 1940, “The Axioms and Algebra ofIntuitive Probability”,Annals of Mathematics, 41(2):269–292. doi:10.2307/1969003
  • Körding, Konrad Paul and Daniel M. Wolpert, 2004, “TheLoss Function of Sensorimotor Learning”,Proceedings of the National Academy of Sciences, 101(26): 9839–9842.doi:10.1073/pnas.0308394101
  • Kreps, David M., Paul Milgrom, John Roberts, and Robert Wilson,1982, “Rational Cooperation in the Finitely RepeatedPrisoners’ Dilemma”,Journal of Economic Theory,27(2): 245–252. doi:10.1016/0022-0531(82)90029-1
  • Kühberger, Anton, Michael Schulte-Mecklenbeck, and JosefPerner, 1999, “The Effects of Framing, Reflection, Probability,and Payoff on Risk Preference in Choice Tasks”,Organizational Behavior and Human Decision Processes, 78(3):204–231. doi:10.1006/obhd.1999.2830
  • Kyburg, Henry E., Jr., 1978, “Subjective Probability:Criticisms, Reflections, and Problems”,Journal ofPhilosophical Logic, 7(1): 157–180.doi:10.1007/BF00245926
  • Landauer, Rolf, 1961, “Irreversibility and Heat Generationin the Computing Process”,IBM Journal of Research andDevelopment, 5(3): 183–191.
  • Levi, Isaac, 1977, “Direct Inference”,Journal ofPhilosophy, 74: 5–29. doi:10.2307/2025732
  • –––, 1983, “Who Commits the Base RateFallacy?”,Behavioral and Brain Sciences, 6(3):502–506. doi:10.1017/S0140525X00017209
  • Lewis, Richard L., Andrew Howes, and Satinder Singh, 2014,“Computational Rationality: Linking Mechanism and BehaviorThrough Bounded Utility Maximization”,Topics in CognitiveScience, 6(2): 279–311. doi:10.1111/tops.12086
  • Lichtenberg, Jan Malte and Özgür Simsek, 2016,“Simple Regression Models”,Proceedings of MachineLearning Research, 58: 13–25.
  • Loomes, Graham and Robert Sugden, 1982, “Regret Theory: AnAlternative Theory of Rational Choice Under Uncertainty”,Economic Journal, 92(368): 805–824.doi:10.2307/2232669
  • Lieder, Falk and Thomas L. Griffiths, 2020,“Resource-rational analysis: Understanding human cognition asthe optimal use of limited computational resources”,Behavioral and Brain Sciences, 43(e1): 1–16.
  • Lindley, Dennis Victor, 1965,Introduction to Probability andStatistics, Cambridge: Cambridge University Press.
  • Long, Daniel Zhuoyu, Melvyn Sim, and Minglong Zhou, 2022,“Robust Satisficing”,Operations Research, 71(2):61–82.
  • Loridan, P., 1984, “\(\epsilon\)-Solutions in VectorMinimization Problems”,Journal of Optimization Theory andApplications, 43(2): 265–276. doi:10.1007/BF00936165
  • Luce, R. Duncan and Howard Raiffa, 1957,Games and Decisions:Introduction and Critical Survey, New York: Dover.
  • Marr, D. C., 1982,Vision, New York: Freeman.
  • May, Kenneth O., 1954, “Intransitivity, Utility, and theAggregation of Preference Patterns”,Econometrica,22(1): 1–13. doi:10.2307/1909827
  • Maynard Smith, John, 1982,Evolution and the Theory ofGames, Cambridge: Cambridge University Press.doi:10.1017/CBO9780511806292
  • McBeath, Michael K., Dennis M. Shaffer, and Mary K. Kaiser, 1995,“How Baseball Outfielders Determine Where to Run to Catch FlyBalls”,Science, 268(5210): 569–573.doi:10.1126/science.7725104
  • McNamara, John M., Pete C. Trimmer, and Alasdair I. Houston, 2014,“Natural Selection Can Favour ‘Irrational’Behavior”,Biology Letters, 10(1): 20130935.doi:10.1098/rsbl.2013.0935
  • Meder, Björn, Ralf Mayrhofer, and Michael R. Waldmann, 2014,“Structure Induction in Diagnostic Causal Reasoning”,Psychological Review, 121(3): 277–301.doi:10.1037/a0035944
  • Meehl, Paul, 1954,Clinical Versus Statistical Prediction: ATheoretical Analysis and a Review of the Evidence, Minneapolis:Minnesota Press.
  • Mill, John Stuart, 1844, “On the Definition of PoliticalEconomy”, reprinted in John M. Robson (ed.),The CollectedWorks of John Stuart Mill, Vol. IV, Toronto: University ofToronto Press.
  • Miller, George A., 1956, “The Magical Number Seven, Plus orMinus Two”,Psychological Review, 63(2):81–97.
  • Mongin, Philippe, 2000, “Does Optimization ImplyRationality”,Synthese, 124(1–2): 73–111.doi:10.1023/A:1005150001309
  • Nau, Robert, 2006, “The Shape of IncompletePreferences”,The Annals of Statistics, 34(5):2430–2448. doi:10.1214/009053606000000740
  • Neumann, John von and Oskar Morgenstern, 1944,Theory of Gamesand Economic Behavior, Princeton, NJ: Princeton UniversityPress.
  • Newell, Allen and Herbert A. Simon, 1956,The Logic TheoryMachine: A Complex Information Processing System (No. P-868),Santa Monica, CA: The Rand Corporation.
  • –––, 1972,Human Problem Solving,Englewood Cliffs, NJ: Prentice-Hall.
  • –––, 1976, “Computer Science as EmpiricalInquiry: Symbols and Search”,Communications of theACM, 19(3): 113–126. doi:10.1145/360018.360022
  • Neyman, Abraham, 1985, “Bounded Complexity JustifiesCooperation in the Finitely Repeated Prisoners’ Dilemma”,Economics Letters, 19(3): 227–229.doi:10.1016/0165-1765(85)90026-6
  • Norton, Michael I., Daniel Mochon, and Dan Ariely, 2012,“The IKEA Effect: When Labor Leads to Love”,Journalof Consumer Psychology, 22(3): 453–460.doi:10.1016/j.jcps.2011.08.002
  • Nowak, Martin A. and Robert M. May, 1992, “EvolutionaryGames and Spatial Chaos”,Nature, 359(6398):826–829. doi:10.1038/359826a0
  • Oaksford, Mike and Nick Chater, 1994, “A Rational Analysisof the Selection Task as Optimal Data Selection”,Psychological Review, 101(4): 608–631.doi:10.1037/0033-295X.101.4.608
  • –––, 2007,Bayesian Rationality,Oxford: Oxford University Press.doi:10.1093/acprof:oso/9780198524496.001.0001
  • Ok, Efe A., 2002, “Utility Representation of an IncompletePreference Relation”,Journal of Economic Theory,104(2): 429–449. doi:10.1006/jeth.2001.2814
  • Osborne, Martin J., 2003,An Introduction to Game Theory,Oxford: Oxford University Press.
  • Osborne-Crawley, Katherine, 2020, “Social Cognition in theReal World: Reconnecting the Study of Social Cognition with SocialReality”,Review of General Psychology, 24(2):144–158.
  • Oswald, Frederick L., Gregory Mitchell, Hart Blanton, JamesJaccard, and Philip E. Tetlock, 2013, “Predicting Ethnic andRacial Discrimination: A Meta-Analysis of IAT CriterionStudies.”,Journal of Personality and SocialPsychology, 105(2): 171–192. doi:10.1037/a0032734
  • Pachur, Thorsten, Peter M. Todd, Gerd Gigerenzer, Lael J.Schooler, and Daniel Goldstein, 2012, “When Is the RecognitionHeuristic an Adaptive Tool?” in Todd et al. 2012: 113–143.doi:10.1093/acprof:oso/9780195315448.003.0035
  • Palmer, Stephen E., 1999,Vision Science, Cambridge, MA:MIT Press.
  • Papadimitriou, Christos H. and Mihalis Yannakakis, 1994, “OnComplexity as Bounded Rationality (Extended Abstract)”, inProceedings of the Twenty-Sixth Annual ACM Symposium on Theory ofComputing - STOC ’94, Montreal: ACM Press, 726–733.doi:10.1145/195058.195445
  • Parikh, Rohit, 1971, “Existence and Feasibility inArithmetic”,Journal of Symbolic Logic, 36(3):494–508. doi:10.2307/2269958
  • Payne, John W., James R. Bettman, and Eric J. Johnson, 1988,“Adaptive Strategy Selection in Decision Making.”,Journal of Experimental Psychology: Learning, Memory, andCognition, 14(3): 534–552.doi:10.1037/0278-7393.14.3.534
  • Pedersen, Arthur Paul, 2014, “ComparativeExpectations”,Studia Logica, 102(4): 811–848.doi:10.1007/s11225-013-9539-7
  • Pedersen, Arthur Paul and Gregory Wheeler, 2014,“Demystifying Dilation”,Erkenntnis, 79(6):1305–1342. doi:10.1007/s10670-013-9531-7
  • –––, 2015, “Dilation, Disintegrations, andDelayed Decisions”, inProceedings of the 9th Symposium onImprecise Probabilities and Their Applications (ISIPTA), Pescara,Italy, pp. 227–236.
  • Peirce, Charles S., 1955,Philosophical Writings ofPeirce, Justus Buchler (ed.), New York: Dover.
  • Peterson, Cameron R. and Lee Roy Beach, 1967, “Man as anIntuitive Statistician”,Psychological Bulletin, 68(1):29–46. doi:10.1037/h0024722
  • Popper, Karl R., 1959,The Logic of Scientific Discovery,London: Routledge.
  • Quiggin, John, 1982, “A Theory of AnticipatedUtility”,Journal of Economic Behavior andOrganization, 3(4): 323–343.doi:10.1016/0167-2681(82)90008-7
  • Rabin, Matthew, 2000, “Risk Aversion and Expected-UtilityTheory: A Calibration Theorem”,Econometrica, 68(5):1281–1292. doi:10.1111/1468-0262.00158
  • Raiffa, Howard and Robert Schlaifer, 1961,Applied StatisticalDecision Theory (Studies in Managerial Economics: Volume 1),Cambridge, MA: Division of Research, Graduate School of BusinessAdministration, Harvard University.
  • Ramsey, Frank, 1931,The Foundations of Mathematics and OtherEssays (Volume 1), New York: Humanities Press.
  • Rapoport, Amnon, Darryl A. Seale, and Andrew M. Colman, 2015,“Is Tit-for-Tat the Answer? On the Conclusions Drawn fromAxelrod’s Tournaments”,PLoS One, 10(7):e0134128. doi:10.1371/journal.pone.0134128
  • Rapoport, Anatol and A.M. Chammah, 1965,Prisoner’sDilemma: A Study in Conflict and Cooperation, Ann Arbor:University of Michigan Press.
  • Regenwetter, Michel, Jason Dana, and Clintin P. Davis-Stober,2011, “Transitivity of Preferences”,PsychologicalReview, 118(1): 42–56. doi:10.1037/a0021150
  • Reiter, Ray, 1980, “A Logic for Default Reasoning”,Artificial Intelligence, 13(1–2): 81–132.doi:10.1016/0004-3702(80)90014-4
  • Rényi, Alfréd, 1955, “On a New AxiomaticTheory of Probability”,Acta Mathematica AcademiaeScientiarum Hungaricae, 6(3–4): 285–335.doi:10.1007/BF02024393
  • Rick, Scott, 2011, “Losses, Gains, and Brains:Neuroeconomics Can Help to Answer Open Questions About LossAversion”,Journal of Consumer Psychology, 21(4):453–463. doi:10.1016/j.jcps.2010.04.004
  • Rieskamp, Jörg and Anja Dieckmann, 2012, “Redundancy:Environment Structure That Simple Heuristics Can Exploit”, inTodd et al. 2012: 187–215.doi:10.1093/acprof:oso/9780195315448.003.0056
  • Roberts, Caroline, Emily Gilbert, Nick Allum, andLéïla Eisner, 2019, “Research Synthesis: Satisficingin Surveys: A Systematic Review of the Literature”,PublicOpinion Quarterly, 83(3): 598–626.
  • Rubinstein, Ariel, 1986, “Finite Automata Play the RepeatedPrisoner’s Dilemma”,Journal of Economic Theory,39(1): 83–96. doi:10.1016/0022-0531(86)90021-9
  • Russell, Stuart J. and Devika Subramanian, 1995, “ProvablyBounded-Optimal Agents”,Journal of Artificial IntelligenceResearch, 2(1): 575–609. doi:10.1613/jair.133
  • Samuels, Richard, Stephen Stich, and Michael Bishop, 2002,“Ending the Rationality Wars: How to Make Disputes About HumanRationality Disappear”, inCommon Sense, Reasoning, andRationality, Renee Elio (ed.), New York: Oxford University Press,236–268. doi:10.1093/0195147669.003.0011
  • Samuelson, Paul, 1947,Foundations of Economic Analysis,Cambridge, MA: Harvard University Press.
  • Santos, Francisco C., Marta D. Santos, and Jorge M. Pacheco, 2008,“Social Diversity Promotes the Emergence of Cooperation inPublic Goods Games”,Nature, 454(7201): 213–216.doi:10.1038/nature06940
  • Savage, Leonard J., 1954,Foundations of Statistics, NewYork: Wiley.
  • –––, 1967, “Difficulties in the Theory ofPersonal Probability”,Philosophy of Science, 34(4):305–310. doi:10.1086/288168
  • –––, 1972,Foundations of Statistics,2nd edition, New York: Dover.
  • Schervish, Mark J., Teddy Seidenfeld, and Joseph B. Kadane, 2012,“Measures of Incoherence: How Not to Gamble If You Must, withDiscussion”, in José Bernardo, A. Phlip Dawid, James O.Berger, Mike West, David Heckerman, M.J. Bayarri, & Adrian F. M.Smith (eds.),Bayesian Statistics 7: Proceedings of the 7thValencia International Meeting, Oxford: Clarendon Press, pp.385–402.
  • Schick, Frederic, 1986, “Dutch Bookies and MoneyPumps”,Journal of Philosophy, 83(2): 112–119.doi:10.2307/2026054
  • Schmitt, Michael, and Laura Martignon, 2006, “On theComplexity of Learning Lexicographic Strategies”,Journal ofMachine Learning Research, 7(Jan): 55–83. [Schmitt & Martignon 2006 available online]
  • Schooler, Lael J. and Ralph Hertwig, 2005, “How ForgettingAids Heuristic Inference”,Psychological Review,112(3): 610–628. doi:10.1037/0033-295X.112.3.610
  • Seidenfeld, Teddy, Mark J. Schervish, and Joseph B. Kadane, 1995,“A Representation of Partially Ordered Preferences”,The Annals of Statistics, 23(6): 2168–2217.doi:10.1214/aos/1034713653
  • –––, 2012, “What Kind of Uncertainty IsThat? Using Personal Probability for Expressing One’s Thinkingabout Logical and Mathematical Propositions”,Journal ofPhilosophy, 109(8–9): 516–533.doi:10.5840/jphil20121098/925
  • Selten, Reinhard, 1998, “Aspiration AdaptationTheory”,Journal of Mathematical Psychology,42(2–3): 191–214. doi:10.1006/jmps.1997.1205
  • Shamay-Tsoory, Simone G. and Avi Mendelsohn, 2019,“Real-Life Neuroscience: An Ecological Approach to Brain andBehavior Research”,Perspectives on PsychologicalScience, 14(5): 841–859.
  • Simon, Herbert A., 1947,Administrative Behavior: A Study ofDecision-Making Processes in Administrative Organization, firstedition, New York: Macmillan.
  • –––, 1955a, “A Behavioral Model ofRational Choice”,Quarterly Journal of Economics,69(1): 99–118. doi:10.2307/1884852
  • –––, 1955b, “On a Class of SkewDistribution Functions”,Biometrika, 42(3–4):425–440. doi:10.1093/biomet/42.3-4.425
  • –––, 1957a,Administrative Behavior: A Studyof Decision-Making Processes in Administrative Organization,second edition, New York: Macmillan.
  • –––, 1957b,Models of Man, New York:John Wiley.
  • –––, 1976, “From Substantive to ProceduralRationality”, in25 Years of Economic Theory, T. J.Kastelein, S. K. Kuipers, W. A. Nijenhuis, and G. R. Wagenaar (eds.),Boston, MA: Springer US, 65–86.doi:10.1007/978-1-4613-4367-7_6
  • Skyrms, Brian, 2003,The Stag Hunt and the Evolution of SocialStructure, Cambridge: Cambridge University Press.doi:10.1017/CBO9781139165228
  • Sorensen, Roy A., 1991, “Rationality as an AbsoluteConcept”,Philosophy, 66(258): 473–486.doi:10.1017/S0031819100065128
  • Spirtes, Peter, 2010, “Introduction to CausalInference”,Journal of Machine Learning Research,11(May): 1643–1662. [Spirtes 2010 available online]
  • Stalnaker, Robert, 1991, “The Problem of LogicalOmniscience, I”,Synthese, 89(3): 425–440.doi:10.1007/BF00413506
  • Stanovich, Keith E. and Richard F. West, 2000, “IndividualDifferences in Reasoning: Implications for the RationalityDebate?”,Behavioral and Brain Sciences, 23(5):645–65.
  • Stein, Edward, 1996,Without Good Reason: The RationalityDebate in Philosophy and Cognitive Science, Oxford: ClarendonPress. doi:10.1093/acprof:oso/9780198237730.001.0001
  • Stevens, Jeffrey R., Jenny Volstorf, Lael J. Schooler, andJörg Rieskamp, 2011, “Forgetting Constrains the Emergenceof Cooperative Decision Strategies”,Frontiers inPsychology, 1: article 235. doi:10.3389/fpsyg.2010.00235
  • Stigler, George J., 1961, “The Economics ofInformation”,Journal of Political Economy, 69(3):213–225. doi:10.1086/258464
  • Tarski, Alfred, Andrzej Mostowski, and Raphael M. Robinson, 1953,Undecidable Theories, Amsterdam: North-Holland PublishingCo.
  • Thaler, Richard H., 1980, “Toward a Positive Theory ofConsumer Choice”,Journal of Economic Behavior andOrganization, 1(1): 39–60.doi:10.1016/0167-2681(80)90051-7
  • Thaler, Richard H. and Cass R. Sunstein, 2008,Nudge: ImprovingDecisions About Health, Wealth, and Happiness, New Haven: YaleUniversity Press.
  • Todd, Peter M. and Geoffrey F. Miller, 1999, “From Pride andPrejudice to Persuasion: Satisficing in Mate Search”, inGigerenzer et al. 1999: 287–308.
  • Todd, Peter M., Gerd Gigerenzer, and ABC Research Group (eds.),2012,Ecological Rationality: Intelligence in the World, NewYork: Oxford University Press.doi:10.1093/acprof:oso/9780195315448.001.0001
  • Trivers, Robert L., 1971, “The Evolution of ReciprocalAltruism”,The Quarterly Review of Biology, 46(1):35–57. doi:10.1086/406755
  • Trommershäuser, Julia, Laurence T. Maloney, and Michael S.Landy, 2003, “Statistical Decision Theory and Trade-Offs in theControl of Motor Response”,Spatial Vision,16(3–4): 255–275. doi:10.1163/156856803322467527
  • Turner, Brandon M., Christian A. Rodriguez, Tony M. Norcia, SamuelM. McClure, and Mark Steyvers, 2016, “Why More Is Better:Simultaneous Modeling of EEG, FMRI, and Behavioral Data”,NeuroImage, 128(March): 96–115.doi:10.1016/j.neuroimage.2015.12.030
  • Tversky, Amos, 1969, “Intransitivity of Preferences”,Psychological Review, 76(1): 31–48.doi:10.1037/h0026750
  • Tversky, Amos and Daniel Kahneman, 1973, “Availability: AHeuristic for Judging Frequency and Probability”,CognitivePsychology, 5(2): 207–232.doi:10.1016/0010-0285(73)90033-9
  • –––, 1974, “Judgment Under Uncertainty:Heuristics and Biases”,Science, 185(4157):1124–1131. doi:10.1126/science.185.4157.1124
  • –––, 1977,Causal Schemata in JudgmentsUnder Uncertainty (No. TR-1060-77-10), Defense Advanced ResearchProjects Agency (DARPA).
  • –––, 1981, “The Framing of Decisions andthe Psychology of Choice”,Science, 211(4481):453–458. doi:10.1126/science.7455683
  • –––, 1983, “Extensional Versus IntuitiveReasoning: The Conjunction Fallacy in Probability Judgment”,Psychological Review, 90(4): 293–315.doi:10.1037/0033-295X.90.4.293
  • –––, 1992, “Advances in Prospect Theory:Cumulative Representation of Uncertainty”,Journal of Riskand Uncertainty, 5(4): 297–323. doi:10.1007/BF00122574
  • Vranas, Peter B.M., 2000, “Gigerenzer’s NormativeCritique of Kahneman and Tversky”,Cognition, 76(3):179–193. doi:10.1016/S0010-0277(99)00084-0
  • Wakker, Peter P., 2010,Prospect Theory: For Risk andAmbiguity, Cambridge: Cambridge University Press.doi:10.1017/CBO9780511779329
  • Waldmann, Michael R., Keith J. Holyoak, and Angela Fratianne,1995, “Causal Models and the Acquisition of CategoryStructure.”,Journal of Experimental Psychology:General, 124(2): 181–206.doi:10.1037/0096-3445.124.2.181
  • Walley, Peter, 1991,Statistical Reasoning with ImpreciseProbabilities, London: Chapman and Hall.
  • Weber, Max, 1905,The Protestant Ethic and the Spirit ofCapitalism, London: Allen and Unwin.
  • Wheeler, Gregory, 2004, “A Resource Bounded DefaultLogic”, in James Delgrande & Torsten Schaub (eds.),10thInternational Workshop on Non-Monotonic Reasoning (Nmr 2004),Whistler, Canada, pp. 416–422.
  • –––, 2017, “Machine Epistemology and BigData”, in Lee McIntyre & Alex Rosenberg (eds.),TheRoutledge Companion to Philosophy of Social Science, Routledge,pp. 321–329.
  • –––, 2020, “Less is More for Bayesians,too”, in Riccardo Viale (ed.),The Routledge Handbook onBounded Rationality, London: Routledge, pp. 471–483.
  • –––, 2021, “Moving Beyond Sets ofProbabilities”,Statistical Science, 36(2):201–204.
  • –––, 2022, “A Gentle Approach to ImpreciseProbabilities”, in Thomas Augustin, Fabio Cozman and GregoryWheeler (eds.),Reflections on the Foundations of Statistics:Essays in Honor of Teddy Seidenfeld (Theory and Decision LibraryA), Cham: Springer, pp. 37–67.
  • Wheeler, Gregory and Fabio Cozman, 2021, “On the Imprecisionof Full Conditional Probabilities”,Synthese, 199:3761–3782. doi:10.1007/s11229-020-02954-z
  • White, D. J., 1986, “Epsilon Efficiency”,Journalof Optimization Theory and Applications, 49(2): 319–337.doi:10.1007/BF00940762
  • Wiggins, Bradford and Cody Christopherson, 2023, “Thereplication crisis in psychology: An overview for theoretical andphilosophical psychology”,Journal of Theoretical andPhilosophical Psychology, 39(4): 202–217.
  • Wolpert, David H., 2019, “The Stochastic Thermodynamics ofComputation”,Journal of Physics A: Mathematical andTheoretical, 52: 193001. doi: 10.1088/1751-8121/ab0850.
  • Wong, Stanley, 2006,Foundations of Paul Samuelson’sRevealed Preference Theory, London: Routledge.
  • Yechiam, Eldad, 2019, “Acceptable Losses: The DebatableOrigins of Loss Aversion”,Psychological Research, 83:1327–1339.
  • ––– and Guy Hochman, 2014, “Loss Attentionin a Dual-Task Setting”,Psychological Science, 25(2):494–502. doi:10.1177/0956797613510725.
  • Yongacoglu, Bora, Gürdal Arslan and Serdar Yüksel, 2023,“Satisficing Paths and Independent Multiagent ReinforcementLearning in Stochastic Games”,SIAM Journal on Mathematics of Data Science,5(3): 745–773.
  • Yule, G. Udny, 1925, “A Mathematical Theory of Evolution,Based on the Conclusions of Dr. J. C. Willis, F.R.S.”,Philosophical Transactions of the Royal Society of London. SeriesB, Containing Papers of a Biological Character,213(402–410): 21–87. doi:10.1098/rstb.1925.0002
  • Zaffalon, Marco and Enrique Miranda, 2017, “AxiomatisingIncomplete Preferences through Sets of Desirable Gambles”,Journal of Artificial Intelligence Research, 60(December):1057–1126. doi:10.1613/jair.5230

Other Internet Resources

Acknowledgments

Thanks to Sebastian Ebert, Ulrike Hahn, Ralph Hertwig, KonstantinosKatsikopoulos, Jan Nagler, Christine Tiefensee, Conor Mayo-Wilson, andan anonymous referee for helpful comments on earlier drafts of thisarticle.

Copyright © 2024 by
Gregory Wheeler <g.wheeler@fs.de>


The Stanford Encyclopedia of Philosophy is copyright © 2024 by The Metaphysics Research Lab, Department of Philosophy, Stanford University

Library of Congress Catalog Data: ISSN 1095-5054
