[Editor’s Note: The following new entry by Roman Frigg and Charlotte Werndl replaces the former entry on this topic by the previous author.]
Statistical Mechanics is the third pillar of modern physics, next to quantum theory and relativity theory. Its aim is to account for the macroscopic behaviour of physical systems in terms of dynamical laws governing the microscopic constituents of these systems and probabilistic assumptions. Like other theories in physics, statistical mechanics raises a number of foundational and philosophical issues. But philosophical discussions in statistical mechanics face an immediate difficulty because, unlike other theories, statistical mechanics has not yet found a generally accepted theoretical framework or a canonical formalism. In this entry we introduce the different theoretical approaches to statistical mechanics and the philosophical questions that attach to them.
Statistical Mechanics (SM) is the third pillar of modern physics, next to quantum theory and relativity theory. Its aim is to account for the macroscopic behaviour of physical systems in terms of dynamical laws governing the microscopic constituents of these systems and the probabilistic assumptions made about them. One aspect of that behaviour is the focal point of SM: equilibrium. Much of SM investigates questions concerning equilibrium, and philosophical discussions about SM focus on the foundational assumptions that are employed in answers to these questions.
Let us illustrate the core questions concerning equilibrium with a standard example. Consider a gas confined to the left half of a container with a dividing wall (see Figure 1a). The gas is in equilibrium and there is no manifest change in any of its macro properties like pressure, temperature, and volume. Now you suddenly remove the dividing wall (see Figure 1b), and, as a result, the gas starts spreading through the entire available volume. The gas is now no longer in equilibrium (see Figure 1c). The spreading of the gas comes to an end when the entire available space is filled evenly (see Figure 1d). At this point, the gas has reached a new equilibrium. Since the process of spreading culminates in a new equilibrium, this process is an approach to equilibrium. A key characteristic of the approach to equilibrium is that it seems to be irreversible: systems move from non-equilibrium to equilibrium, but not vice versa; gases spread to fill the container evenly, but they do not spontaneously concentrate in the left half of the container. Since an irreversible approach to equilibrium is often associated with thermodynamics, this is referred to as thermodynamic behaviour. Characterising the state of equilibrium and accounting for why, and how, a system approaches equilibrium is the core task for SM. Sometimes these two problems are assigned to separate theories (or separate parts of a larger theory), which are then referred to as equilibrium SM and non-equilibrium SM, respectively.
While equilibrium occupies centre stage, SM of course also deals with other issues such as phase transitions, the entropy costs of computation, and the process of mixing substances, and in philosophical contexts SM has also been employed to shed light on the nature of the direction of time, the interpretation of probabilities in deterministic theories, the state of the universe shortly after the big bang, and the possibility of knowledge about the past. We will touch on all these below, but in keeping with the centrality of equilibrium in SM, the bulk of this entry is concerned with an analysis of the conceptual underpinnings of both equilibrium and non-equilibrium SM.
Sometimes the aim of SM is said to be to provide a reduction of the laws of thermodynamics (TD): the laws of TD provide a correct description of the macroscopic behaviour of systems, and the aim of SM is to account for these laws in microscopic terms. We avoid this way of framing the aims of SM. Both the nature of reduction itself and the question whether SM can provide a reduction of TD (in some specifiable sense) are matters of controversy, and we will come back to them in Section 7.5.
Philosophical discussions in SM face an immediate difficulty. Philosophical projects in many areas of physics can take an accepted theory and its formalism as their point of departure. Philosophical discussions of quantum mechanics, for instance, can begin with the Hilbert space formulation of the theory and develop their arguments with reference to it. The situation in SM is different. Unlike theories such as quantum mechanics, SM has not yet found a generally accepted theoretical framework or a canonical formalism. What we encounter in SM is a plurality of different approaches and schools of thought, each with its own mathematical apparatus and foundational assumptions. For this reason, a review of the philosophy of SM cannot simply start with a statement of the theory’s basic principles and then move on to different interpretations of the theory. Our task is to first classify different approaches and then discuss how each works; a further question then concerns the relation between them.
Classifying and labelling approaches raises its own issues, and different routes are possible. However, SM’s theoretical plurality notwithstanding, most of the approaches one finds in it can be brought under one of three broad theoretical umbrellas. These are known as “Boltzmannian SM” (BSM), the “Boltzmann Equation” (BE), and “Gibbsian SM” (GSM). The label “BSM” is somewhat unfortunate because it might suggest that Boltzmann only (or primarily) championed this particular approach, whereas he has in fact contributed to the development of many different theoretical positions (for an overview of his contributions to SM see the entry on Boltzmann’s work in statistical physics; for detailed discussions see Cercignani (1998), Darrigol (2018), and Uffink (2007)). These labels have, however, become customary and so we stick with “BSM” despite its historical infelicity. We will now discuss the theoretical backdrop against which these positions are formulated, namely dynamical systems, and then introduce the positions in §4, §5, and §6, respectively. Extensive synoptic discussion of SM can also be found in Frigg (2008b), Shenker (2017a, 2017b), Sklar (1993), and Uffink (2007).
Before delving into the discussion of SM, some attention needs to be paid to the “M” in SM. The mechanical background theory against which SM is formulated can be either classical mechanics or quantum mechanics, resulting in either classical SM or quantum SM. Foundational debates are by and large conducted in the context of classical SM. We follow this practice in the current entry, but we briefly draw attention to problems and issues that occur when moving from a classical to a quantum framework (§4.8). From the point of view of classical mechanics, the systems of interest in SM have the structure of a dynamical system, a triple \((X,\) \(\phi,\) \(\mu)\). \(X\) is the state space of the system (and from a mathematical point of view is a set). In the case of a gas with \(n\) molecules this space has \(6n\) dimensions: three coordinates specifying the position and three coordinates specifying the momentum of each molecule. \(\phi\) is the time evolution function, which specifies how a system’s state changes over time, and we write \(\phi_{t}(x)\) to denote the state into which \(x\) evolves after time \(t\). If the dynamics of the system is specified by an equation of motion like Newton’s or Hamilton’s, then \(\phi\) is the solution of that equation. If we let time evolve, \(\phi_{t}(x)\) draws a “line” in \(X\) that represents the time evolution of a system that was initially in state \(x\); this “line” is called a trajectory. Finally, \(\mu\) is a measure on \(X\), roughly a means to say how large a part of \(X\) is. This is illustrated schematically in Figure 2. For a more extensive introductory discussion of dynamical systems see the entry on the ergodic hierarchy, section on dynamical systems, and for mathematical discussions see, for instance, Arnold and Avez (1967 [1968]) and Katok and Hasselblatt (1995).
Figure 2 [An extended description of figure 2 is in the supplement.]
It is standard to assume that \(\phi\) is deterministic, meaning that every state \(x\) has exactly one past and exactly one future, or, in geometrical terms, that trajectories cannot intersect (for a discussion of determinism see Earman (1986)). The systems studied in BSM are such that the volume of “blobs” in the state space is conserved: if we follow the time evolution of a “blob” in state space, this blob can change its shape but not its volume. From a mathematical point of view, this amounts to saying that the dynamics is measure-preserving: \(\mu(A) = \mu(\phi_{t}(A))\) for all subsets \(A\) of \(X\) and for all times \(t\). Systems in SM are often assumed to be governed by Hamilton’s equations of motion, and it is a consequence of Liouville’s theorem that the time evolution of a Hamiltonian system is measure-preserving.
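The idea of measure preservation can be illustrated with a small computational sketch (our toy example, not part of the standard presentation). On a discretised torus, a map given by an integer matrix with determinant 1, such as the Arnold cat map, is a bijection, so a “blob” of grid cells keeps its size under the counting measure, a discrete analogue of \(\mu(A) = \mu(\phi_{t}(A))\):

```python
# Discrete Arnold cat map on an N x N torus: (x, y) -> (2x + y, x + y) mod N.
# The matrix [[2, 1], [1, 1]] has determinant 1, so the map is a bijection
# and preserves the counting measure. (Toy illustration; N is arbitrary.)
N = 100

def phi(x, y):
    return (2 * x + y) % N, (x + y) % N

# A "blob" A: a 20 x 20 square of cells.
A = {(x, y) for x in range(20) for y in range(20)}
image = {phi(x, y) for (x, y) in A}

print(len(A), len(image))  # 400 400: the blob changes shape, not size
```

The blob is typically stretched and folded into a very different shape, but its measure is exactly conserved, which is what Liouville’s theorem guarantees for Hamiltonian flows.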
In the current debate, “BSM” denotes a family of positions that take as their starting point the approach that was first introduced by Boltzmann in his 1877 paper and then presented in a streamlined manner by Ehrenfest and Ehrenfest-Afanassjewa in their 1911 [1959] review. In this section we discuss different contemporary articulations of BSM along with the challenges they face.
To articulate the framework of BSM, we distinguish between micro-states and macro-states; for a discussion of this framework see, for instance, Albert (2000), Frigg (2008b), Goldstein (2001), and Sklar (1993). The micro-state of a system at time \(t\) is the state \(x \in X\) in which the system is at time \(t\). This state specifies the exact mechanical state of every micro-constituent of the system. As we have seen in the previous section, in the case of a gas \(x\) specifies the positions and momenta of every molecule in the gas. Intuitively, the macro-state \(M\) of a system at time \(t\) specifies the macro-constitution of the system at \(t\) in terms of variables like volume, temperature and other properties measurable, loosely speaking, at human scales, although, as we will see in Section 4.8, reference to thermodynamic variables in this context must be taken with a grain of salt. The configurations shown in Figure 1 are macro-states in this sense.
The core posit of BSM is that macro-states supervene on micro-states, meaning that any change in the system’s macro-state must be accompanied by a change in the system’s micro-state: every micro-state \(x\) has exactly one corresponding macro-state \(M\). This rules out that, say, the pressure of a gas can change while the positions and momenta of each of its molecules remain the same (see the entry on supervenience). Let \(M(x)\) be the unique macro-state that corresponds to micro-state \(x\). The correspondence between micro-states and macro-states typically is not one-to-one and macro-states are multiply realisable. If, for instance, we swap the positions and momenta of two molecules, the gas’ macro-state does not change. It is therefore natural to group together all micro-states \(x\) that correspond to the same macro-state \(M\):
\[X_{M} = \{ x \in X \text{ such that } M(x) = M\}.\] \(X_{M}\) is the macro-region of \(M\).
Now consider a complete set of macro-states (i.e., a set that contains every macro-state that the system can be in), and assume that there are exactly \(m\) such states. This complete set is \(\{M_{1},\ldots,M_{m}\}\). It is then the case that the corresponding set of macro-regions, \(\{ X_{M_{1}},\ldots,X_{M_{m}}\}\), forms a partition of \(X\), meaning that the elements of the set do not overlap and jointly cover \(X\). This is illustrated in Figure 3.
Figure 3 [An extended description of figure 3 is in the supplement.]
The figure also indicates that if the system under study is a gas, then the macro-states correspond to different states of the gas we have seen in Section 1. Specifically, one of the macro-states corresponds to the initial state of the gas, and another one corresponds to its final equilibrium state.
This raises two fundamental questions that occupy centre stage in discussions about BSM. First, what are macro-states and how is the equilibrium state identified? That is, where do we get the set \(\{M_{1},\ldots,M_{m}\}\) from and how do we single out one member of the set as the equilibrium macro-state? Second, as already illustrated in Figure 3, an approach to equilibrium takes place if the time evolution of the system is such that a micro-state \(x\) in a non-equilibrium macro-region evolves such that \(\phi_{t}(x)\) lies in the equilibrium macro-region at a later point in time. Ideally one would want this to happen for all \(x\) in any non-equilibrium macro-region, because this would mean that all non-equilibrium states would eventually approach equilibrium. The question now is whether this is indeed the case, and, if not, what “portion” of states evolves differently.
Before turning to these questions, let us introduce the Boltzmann entropy \(S_{B}\), which is a property of a macro-state defined through the measure of the macro-state’s macro-region:
\[S_{B}(M_{i}) = k\log\lbrack\mu(X_{M_{i}})\rbrack\] for all \(i = 1,\ldots, m\), where \(k\) is the so-called Boltzmann constant. Since the logarithm is a monotonic function, the larger the measure \(\mu\) of a macro-region, the larger the entropy of the corresponding macro-state.
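As a toy illustration of these definitions (our own example, with invented numbers): take \(n\) labelled particles, each of which can sit in the left or the right half of a container, let the macro-state be the number of particles in the left half, and take \(\mu\) to be the counting measure. The macro-region of the macro-state “\(j\) particles on the left” then contains \(\binom{n}{j}\) micro-states, and the Boltzmann entropy is maximal for the evenly spread macro-state:

```python
import math

k = 1.380649e-23  # Boltzmann constant in J/K
n = 20            # toy gas: n labelled particles, each in the left or right half

# Macro-state M_j: exactly j particles in the left half.
# Its macro-region contains C(n, j) micro-states (counting measure),
# so S_B(M_j) = k * log C(n, j).
S_B = {j: k * math.log(math.comb(n, j)) for j in range(n + 1)}

equilibrium = max(S_B, key=S_B.get)
print(equilibrium)  # 10: the evenly spread macro-state has maximal entropy
```

The macro-regions \(X_{M_0},\ldots,X_{M_n}\) partition the \(2^{n}\) micro-states, and the largest macro-region, \(j = n/2\), carries the largest Boltzmann entropy, in line with the monotonicity remark above.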
This framework is the backbone of positions that self-identify as “Boltzmannian”. Differences appear in how the elements of this framework are articulated and in how difficulties are resolved.
An influential way of defining equilibrium goes back to Boltzmann (1877); for contemporary discussion of the argument see, for instance, Albert (2000), Frigg (2008b), and Uffink (2007). The approach first focusses on the state space of one particle of the system, which in the case of a gas has six dimensions (three for the particle’s positions in each spatial dimension and a further three for the corresponding momenta). We then introduce a grid on this space—an operation known as coarse-graining—and say that two particles have the same coarse-grained micro-state if they are in the same grid cell. The state of the entire gas is then represented by an arrangement, a specification of \(n\) points on this space (one for each particle in the gas). But for the gas’ macro-properties it is irrelevant which particle is in which state, meaning that the gas’ macro-state must be unaffected by a permutation of the particles. All that the macro-state depends on is the distribution of particles, a specification of how many particles are in each grid cell.
The core idea of the approach is to determine how many arrangements are compatible with a given distribution, and to define the equilibrium state as the one for which this number is maximal. Making the strong (and unrealistic) assumption that the particles in the gas are non-interacting (which also means that they never collide) and that the energy of the gas is preserved, Boltzmann offered a solution to this problem and showed that the distribution for which the number of arrangements is maximal is the so-called discrete Maxwell-Boltzmann distribution:
\[n_{i} = \alpha\exp\left({-\beta} E_{i} \right),\] where \(n_{i}\) is the number of particles in cell \(i\) of the coarse-graining, \(E_{i}\) is the energy of a particle in that cell, and \(\alpha\) and \(\beta\) are constants that depend on the number of particles and the temperature of the system (Tolman 1938 [1979]: Ch. 4). From a mathematical point of view, deriving this distribution is a problem in combinatorics, which is why the approach is now known as the combinatorial argument.
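The combinatorial argument can be made concrete with a small worked example (ours, with invented numbers: 20 non-interacting particles, three energy cells with energies 0, 1, and 2, and total energy fixed at 10). Enumerating all admissible distributions and counting the arrangements compatible with each via the multinomial coefficient singles out the equilibrium distribution:

```python
import math

n, E_total = 20, 10
energies = [0, 1, 2]  # energy of a particle in cells 0, 1, 2 (toy values)

best, best_count = None, 0
# A distribution (n0, n1, n2) must satisfy n0 + n1 + n2 = n
# and n1 * 1 + n2 * 2 = E_total (energy conservation).
for n2 in range(E_total // 2 + 1):
    n1 = E_total - 2 * n2
    n0 = n - n1 - n2
    if n0 < 0:
        continue
    # Number of arrangements compatible with this distribution: n!/(n0! n1! n2!)
    count = (math.factorial(n)
             // (math.factorial(n0) * math.factorial(n1) * math.factorial(n2)))
    if count > best_count:
        best, best_count = (n0, n1, n2), count

print(best)  # (12, 6, 2): occupation numbers fall off with energy
```

Already at these small particle numbers the maximising distribution has occupation numbers that decrease with energy, the qualitative signature of the Maxwell-Boltzmann distribution; the exponential form emerges in the limit of large particle numbers.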
As Paul and Tatiana Ehrenfest pointed out in their 1911 [1959] review, the mathematical structure of the argument also shows that if we now return to the state space \(X\) of the entire system (which, recall, has \(6n\) dimensions), the macro-region of the equilibrium state thus defined is the largest of all macro-regions. Hence, the equilibrium macro-state is the macro-state with the largest macro-region. In contemporary discussions this is customarily glossed as the equilibrium macro-region not only being larger than any other macro-region, but as being enormously larger and in fact taking up most of \(X\) (see, for instance, Goldstein 2001). However, as Lavis (2008) points out, the formalism only shows that the equilibrium macro-region is larger than any other macro-region and it is not a general truism that it takes up most of the state space; there are in fact systems in which the non-equilibrium macro-regions taken together are larger than the equilibrium macro-region.
Since, as we have seen, the Boltzmann entropy is a monotonic function of the measure of a macro-region, this implies that the equilibrium macro-state is also the macro-state with the largest Boltzmann entropy, and the approach to equilibrium is a process that can be characterised by an increase of entropy.
Two questions arise: first, is this a tenable general definition of equilibrium, and, second, how does it explain the approach to equilibrium? As regards the first question, Uffink (2007) highlights that the combinatorial argument assumes particles to be non-interacting. The result can therefore be seen as a good approximation for dilute gases, but it fails to describe (even approximately) interacting systems like liquids and solids. But important applications of SM are to systems that are not dilute gases and so this is a significant limitation. Furthermore, from a conceptual point of view, the problem is that a definition of equilibrium in terms of the number of arrangements compatible with a distribution makes no contact with the thermodynamic notion of equilibrium, where equilibrium is defined as the state to which an isolated system converges when left to itself (Werndl & Frigg 2015b). Finally, this definition of equilibrium is completely disconnected from the system’s dynamics, which has the odd consequence that it would still provide an equilibrium state even if the system’s time evolution was the identity function (and hence nothing ever changed and no approach to equilibrium took place). And even if one were to set thermodynamics aside, there is nothing truly macro about the definition, which in fact directly constructs a macro-region without ever specifying a macro-state.
A further problem (still as regards the first question) is the justification of coarse-graining. The combinatorial argument does not get off the ground without coarse-grained micro-states, and so the question is what legitimises the use of such states. The problem is accentuated by the facts that the procedure only works for a particular kind of coarse-graining (namely if the grid is parallel to the position and momentum axes) and that the grid cannot be eliminated by taking a limit which lets the grid size tend toward zero. A number of justificatory strategies have been proposed but none is entirely satisfactory. A similar problem arises with coarse-graining in Gibbsian SM, and we refer the reader to Section 6.5 for a discussion.
As regards the second question, the combinatorial argument itself is silent about why and how systems approach equilibrium and additional ingredients must be added to the account to provide such an explanation. Before discussing some of these ingredients (which is the topic of much of the remainder of this section), let us discuss two challenges that every explanation of the approach to equilibrium must address: the reversibility problem and the recurrence problem.
In Section 2 we have seen that at bottom the physical systems of BSM have the structure of a dynamical system \((X,\) \(\phi,\) \(\mu)\) where \(\phi\) is deterministic and measure-preserving. Systems of this kind have two features that pose a challenge for an understanding of the approach to equilibrium.
The first feature is what is known as time-reversal invariance. Intuitively you can think of the time-reversal of a process as what you get when you play a movie of a process backwards. The dynamics of a system is time-reversal invariant if every process that is allowed to happen in one direction of time is also allowed to happen in the reverse direction of time. That is, for every process that is allowed by the theory it is the case that if you capture the process in a movie, then the process that you see when you play the movie backwards is also allowed by the theory; for detailed and more technical discussions see, for instance, Earman (2002), Malament (2004), Roberts (2022), and Uffink (2001).
Hamiltonian systems are time-reversal invariant and so the most common systems studied in SM have this property. A look at Figure 3 makes the consequences of this for an understanding of the approach to equilibrium clear. We consider a system whose micro-state initially lies in a non-equilibrium macro-region and then evolves into a micro-state that lies in the equilibrium macro-region. Obviously, this process ought to be allowed by the theory. But this means that the reverse process—a process that starts in the equilibrium macro-region and moves back into the initial non-equilibrium macro-region—must be allowed too. In Section 1 we have seen that the approach to equilibrium is expected to be irreversible, prohibiting systems like gases from spontaneously leaving equilibrium and evolving into a non-equilibrium state. But we are now faced with a contradiction: if the dynamics of the system is time-reversal invariant, then the approach to equilibrium cannot be irreversible because the evolution from the equilibrium state to a non-equilibrium state is allowed. This observation is known as Loschmidt’s reversibility objection because it was first put forward by Loschmidt (1876); for a historical discussion of this objection, see Darrigol (2021).
The second feature that poses a challenge is Poincaré recurrence. The systems of interest in BSM are both measure-preserving and spatially bounded: they are gases in a box, liquids in a container and crystals on a laboratory table. This means that the system’s micro-state can only access a finite region in \(X\). Poincaré showed that dynamical systems of this kind must, at some point, return arbitrarily close to their initial state, and, indeed, do so infinitely many times. The time that it takes the system to return close to its initial condition is called the recurrence time. Like time-reversal invariance, Poincaré recurrence contradicts the supposed irreversibility of the approach to equilibrium: it implies that systems will return to non-equilibrium states at some point. One just has to wait for long enough. This is known as Zermelo’s recurrence objection because it was first put forward by Zermelo (1896); for a historical discussion see Uffink (2007).
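Poincaré recurrence can be observed directly in a toy model (our illustration, with an arbitrary grid size): any bijection on a finite set brings every point back to itself after finitely many steps, so iterating a discrete measure-preserving map from an initial cell eventually returns there.

```python
N = 50  # grid size of a discretised torus (arbitrary toy choice)

def phi(state):
    # Discrete Arnold cat map, a measure-preserving bijection on the torus.
    x, y = state
    return (2 * x + y) % N, (x + y) % N

x0 = (1, 0)
state, t = phi(x0), 1
while state != x0:       # guaranteed to terminate: every point of a
    state = phi(state)   # bijection on a finite set is periodic
    t += 1
print("recurrence time:", t)
```

In genuine SM systems the recurrence time is astronomically large, but the toy model shows why boundedness plus measure preservation makes recurrence unavoidable in principle.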
Any explanation of the approach to equilibrium has to address these two objections.
A classical explanation of the approach to equilibrium is given within ergodic theory. A system is ergodic iff, in the long run (i.e., in the limit of time \(t \rightarrow \infty\)), for almost all initial conditions it is the case that the fraction of time that the system’s trajectory spends in a region \(R\) of \(X\) is equal to the fraction that \(R\) occupies in \(X\) (Arnold & Avez 1967 [1968]). For instance, if \(\mu(R)/\mu(X) = 1/3,\) then an ergodic system will, in the long run, spend 1/3 of its time in \(R\) (for a more extensive discussion of ergodicity see the entry on the ergodic hierarchy).
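A standard elementary example of an ergodic system is an irrational rotation of the circle. The following sketch (our illustration; the rotation angle and the region are arbitrary choices) checks numerically that the fraction of time a trajectory spends in a region approaches the region’s measure:

```python
# Irrational rotation of the unit circle: theta -> theta + alpha (mod 1).
# For irrational alpha this system is ergodic, so the fraction of time
# spent in R = [0, 1/3) should approach mu(R)/mu(X) = 1/3.
alpha = 0.6180339887498949  # (sqrt(5) - 1) / 2, an irrational angle
steps = 100_000

theta, hits = 0.0, 0
for _ in range(steps):
    theta = (theta + alpha) % 1.0
    if theta < 1 / 3:
        hits += 1

print(hits / steps)  # close to 1/3, as ergodicity demands
```

The same time-average-equals-space-average behaviour, applied to the (much larger) equilibrium macro-region, is what drives the ergodic explanation sketched in the next paragraph.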
In Section 4.2 we have seen that if the equilibrium macro-region is constructed with the combinatorial argument, then it occupies the largest portion of \(X\). If we now also assume that the system is ergodic, it follows immediately that the system spends the largest portion of time in equilibrium. This is then often given a probabilistic gloss by associating the time that a system spends in a certain part of \(X\) with the probability of finding the system in that part of \(X\), and so we get that we are overwhelmingly likely to find the system in equilibrium; for a discussion of this approach to probabilities see Frigg (2010) and references therein.
The ergodic approach faces a number of problems. First, being ergodic is a stringent condition that many systems fail to meet. This is a problem because among those systems are many to which SM is successfully applied. For instance, in a solid the molecules oscillate around fixed positions in a lattice, and as a result the phase point of the system can only access a small part of the energy hypersurface (Uffink 2007: 1017). The Kac Ring model and a system of anharmonic oscillators behave thermodynamically but fail to be ergodic (Bricmont 2001). And even the ideal gas—supposedly the paradigm system of SM—is not ergodic (Uffink 1996b: 381). But if core systems of SM are not ergodic, then ergodicity cannot provide an explanation for the approach to equilibrium, at least not one that is applicable across the board (Earman & Rédei 1996; van Lith 2001). Attempts have been made to improve the situation through the notion of epsilon-ergodicity, where a system is epsilon-ergodic if it is ergodic only on a subset \(Y \subset X\) where \(\mu(Y) \geq 1 - \varepsilon\), for a small positive real number \(\varepsilon\) (Vranas 1998). While this approach deals successfully with some systems (Frigg & Werndl 2011), it is still not universally applicable and hence remains silent about large classes of SM systems.
The ergodic approach accommodates Loschmidt’s and Zermelo’s objections by rejecting the requirement of strict irreversibility. The approach insists that systems can, and actually do, move away from equilibrium. What SM should explain is not strict irreversibility, but the fact that systems spend most of the time in equilibrium. The ergodic approach does this by construction, and only allows for brief and infrequent episodes of non-thermodynamic behaviour (when the system moves out of equilibrium). This response is in line with Callender (2001), who argues that we should not take thermodynamics “too seriously” and see its strictly irreversible approach to equilibrium as an idealisation that is not empirically accurate because physical systems turn out to exhibit equilibrium fluctuations.
A more technical worry is what is known as the measure zero problem. As we have seen, ergodicity says that “almost all initial conditions” are such that the fraction of time spent in \(R\) is equal to the fraction \(R\) occupies in \(X\). In technical terms this means that the set of initial conditions for which this is not the case has measure zero (with respect to \(\mu\)). Intuitively this would seem to suggest that these conditions are negligible. However, as Sklar (1993: 182–88) points out, sets of measure zero can be rather large (remember that the set of rational numbers has measure zero in the real numbers), and the problem is to justify why a set of measure zero really is negligible.
An alternative account explains the approach to equilibrium in terms of typicality. Intuitively something is typical if it happens in the “vast majority” of cases: typical lottery tickets are blanks, and in a typical series of a thousand coin tosses the ratio of the number of heads and the number of tails is approximately one. The leading idea of a typicality-based account of SM is to show that thermodynamic behaviour is typical and is therefore to be expected. The typicality account comes in different versions, which disagree on how exactly typicality reasoning is put to use; different versions have been formulated, among others, by Goldstein (2001), Goldstein and Lebowitz (2004), Goldstein, Lebowitz, Tumulka, and Zanghì (2006), Lebowitz (1993a, 1993b), and Volchan (2007). In its paradigmatic version, the account builds on the observation (discussed in Section 4.2) that the equilibrium macro-region is so large that \(X\) consists almost entirely of equilibrium micro-states, which means that equilibrium micro-states are typical in \(X\). The account submits that, for this reason, a system that starts its time-evolution in a non-equilibrium state can simply not avoid evolving into a typical state—i.e., an equilibrium state—and staying there for a very long time, which explains the approach to equilibrium.
Frigg (2009, 2011) and Uffink (2007) argue that from the point of view of dynamical systems theory this is unjustified because there is no reason to assume that micro-states in an atypical set have to evolve into a typical set without there being any further dynamical assumptions in place. To get around this problem Frigg and Werndl (2012) formulate a version of the account that takes the dynamics of the system into account. Lazarovici and Reichert (2015) disagree that such additions are necessary. For further discussions of the use of typicality in SM, see Badino (2020), Bricmont (2022), Chibbaro, Rondoni and Vulpiani (2022), Crane and Wilhelm (2020), Goldstein (2012), Hemmo and Shenker (2015), Luczak (2016), Maudlin (2020), Reichert (forthcoming), and Wilhelm (2022). As far as Loschmidt’s and Zermelo’s objections are concerned, the typicality approach has to make the same move as the ergodic approach and reject strict irreversibility as a requirement.
An altogether different approach has been formulated by Albert (2000). This approach focusses on the internal structure of macro-regions and aims to explain the approach to equilibrium by showing that the probability for a system in a non-equilibrium macro-state to evolve toward a macro-state of higher Boltzmann entropy is high. The basis for this discussion is the so-called statistical postulate. Consider a particular macro-state \(M\) with macro-region \(X_{M}\) and assume that the system is in macro-state \(M\). The postulate then says that for any subset \(A\) of \(X_{M}\) the probability of finding the system’s micro-state in \(A\) is \(\mu(A)/\mu(X_{M})\). We can now separate the micro-states in \(X_{M}\) into those that evolve into a higher entropy macro-state and those that move toward macro-states of lower entropy. Let’s call these sets \(X_{M}^{+}\) and \(X_{M}^{-}\). The statistical postulate then says that the probability of a system in \(M\) evolving toward a higher entropy macro-state is \(\mu(X_{M}^{+})/\mu(X_{M})\).
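In a toy setting the statistical postulate amounts to uniform sampling over a macro-region. A minimal sketch (our example; the macro-region and the subset \(A\) are invented for illustration): take the macro-region to be the unit square with the Lebesgue measure and estimate the probability of a subset by Monte Carlo:

```python
import random

random.seed(1)  # fixed seed so the estimate is reproducible
samples = 100_000

# Toy macro-region X_M: the unit square. Subset A: micro-states with x < 0.25.
# The statistical postulate assigns A the probability mu(A)/mu(X_M) = 0.25.
inside = 0
for _ in range(samples):
    x, y = random.random(), random.random()
    if x < 0.25:
        inside += 1

print(inside / samples)  # approximately 0.25
```

If \(A\) is taken to be the set \(X_{M}^{+}\) of entropy-increasing micro-states, the same uniform measure yields the probability of a higher entropy future discussed in the next paragraph.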
For it to be likely that a system approaches equilibrium this probability would have to be high. It now turns out that, for purely mathematical reasons, if the system is highly likely to evolve toward a macro-state of higher entropy, then it is also highly likely to have evolved into the current macro-state \(M\) from a macro-state of higher entropy. In other words, if the entropy is highly likely to increase in the future, it is also highly likely to have decreased in the past. Albert suggests solving this problem by regarding the entire universe as the system being studied and then conditionalizing on the Past-Hypothesis, which is the assumption that
the world first came into being in whatever particular low-entropy highly condensed big-bang sort of macrocondition it is that the normal inferential procedures of cosmology will eventually present to us. (2000: 96)
Let \(M_{p}\) be the past state, the state in which the world first came into being according to the Past-Hypothesis, and let \(I_{t} = \phi_{t}(X_{M_{p}}) \cap X_{M}\) be the intersection of the time-evolved macro-region of the past state and the current macro-region. The probability of a high entropy future is then \(\mu(I_{t} \cap X_{M}^{+})/\mu(I_{t})\). If we further assume that “abnormal” states with low entropy futures are scattered all over \(X_{M}\), then a high entropy future can be highly likely without a high entropy past also being highly likely.
This approach to SM is based on three core elements: the deterministic time evolution of the system given by \(\phi_{t}\), the Past-Hypothesis, and the statistical postulate. Together they result in the assignment of a probability to propositions about the history of a system. Albert (2015) calls this assignment the Mentaculus. Albert regards the Mentaculus not only as an account of thermodynamic phenomena, but as the backbone of a complete scientific theory of the universe because the Mentaculus assigns probabilities to propositions in all sciences. This raises all kinds of issues about the nature of laws, reduction, and the status of the special sciences, which are discussed, for instance, in Frisch (2011), Hemmo and Shenker (2021) and Myrvold and others (2016).
Like the ergodic approach, the Mentaculus must accommodate Loschmidt’s and Zermelo’s objections by rejecting the requirement of strict irreversibility. Transitions from higher to lower entropy are still allowed, but they are rendered unlikely, and recurrence can be tamed by noting that the recurrence time for a typical SM system is larger than the age of the universe, which means that we won’t observe recurrence (Bricmont 1995; Callender 1999). Yet this amounts to admitting that entropy increase is not universal and that the formalism is compatible with there being periods of decreasing entropy at some later point in the history of the universe.
A crucial ingredient of the Mentaculus is the Past-Hypothesis. The idea of grounding thermodynamic behaviour in a cosmic low-entropy past can be traced back to Boltzmann (Uffink 2007: 990) and has since been advocated by prominent physicists like Feynman (1965: Ch. 5) and R. Penrose (2004: Ch. 27). This raises two questions: first, can the Past-Hypothesis be given a precise formulation that serves the purpose of SM, and, second, what status does the Past-Hypothesis have, and does the fact that the universe started in this particular state require an explanation?
As regards the first question, Earman has cast the damning verdict that the Past-Hypothesis is “not even false” (2006) because in cosmologies described by general relativity there is no well-defined sense in which the Boltzmann entropy has a low value. A further problem is that in the Mentaculus the Boltzmann entropy is a global quantity characterising the entire universe. But, as Winsberg points out, the fact that this quantity is low does not imply that the entropy of a particular small subsystem of interest is also low, and, worse, just because the overall entropy of the universe increases it need not be the case that the entropy of a small subsystem also increases (2004a). The source of these difficulties is that the Mentaculus takes the entire universe to be the relevant system, and so one might try to get around them by reverting to where we started: laboratory systems like gases in boxes. One can then take the past state simply to be the state in which such a gas is prepared at the beginning of a process (say, in the left half of the container). This leads to the so-called branch systems approach, because a system is seen as “branching off” from the rest of the universe when it is isolated from its environment and prepared in a non-equilibrium state (Davies 1974; Sklar 1993: 318–32). Albert (2000) dismisses this option for a number of reasons, chief among them that it is not clear why one should regard the statistical postulate as valid for such a state (see Winsberg (2004b) for a discussion).
As regards the second question, Chen (forthcoming), Goldstein (2001), and Loewer (2001) argue that the Past-Hypothesis has the status of a fundamental law of nature. Albert seems to regard it as something like a Kantian regulative principle in that its truth must be assumed in order to make knowledge of the past possible at all. By contrast, Callender, Price, and Wald regard the Past-Hypothesis as a contingent matter of fact, but they disagree on whether this fact stands in need of an explanation. Price (1996, 2004) argues that it does because the crucial question in SM is not why entropy increases, but rather why it ever got to be low in the first place. Callender (1998, 2004a, 2004b) disagrees: the Past-Hypothesis simply specifies initial conditions of a process, and initial conditions are not the kind of thing that needs to be explained (see also Sklar (1993: 309–18)). Parker (2005) argues that conditionalising on the initial state of the universe lacks the power to explain irreversible behaviour. Baras and Shenker (2020) and Farr (2022) analyse the notion of explanation that is involved in this debate and argue that different questions are in play that require different answers.
The long-run residence time account offers a different perspective both on the definition of equilibrium and on the approach to it (Werndl & Frigg 2015a, 2015b). Rather than first defining equilibrium through combinatorial considerations (as in §4.2) and then asking why systems approach equilibrium thus defined (as do the accounts discussed in §§4.4–4.6), the long-run residence time account defines equilibrium through thermodynamic behaviour. The account begins by characterising the macro-states in the set \(\{ M_{1},\ldots,M_{n}\}\) in purely macroscopic terms, i.e., through thermodynamic variables like pressure and temperature, and then identifies the state in which a system resides most of the time as the equilibrium state: among the \(M_{i}\), the equilibrium macro-state is by definition the state in which a system spends most of its time in the long run (which gives the account its name).
This definition requires no assumption about the size of the equilibrium macro-region, but one can then show that it is a property of the equilibrium macro-state that its macro-region is large. This result is fully general in that it does not depend on assumptions like particles being non-interacting (which makes it applicable to all systems, including liquids and solids), and it does not depend on combinatorial considerations at the micro-level. The approach to equilibrium is built into the definition in the sense that if there is no macro-state in which the system spends most of its time, then the system simply has no equilibrium. This raises the question of the circumstances under which an equilibrium exists. The account answers this question by providing a general existence theorem which furnishes criteria for the existence of an equilibrium state (Werndl & Frigg forthcoming-b). Intuitively, the existence theorem says that there is an equilibrium just in case the system’s state space is split up into invariant regions on which the motion is ergodic and the equilibrium macro-state is largest in size relative to the other macro-states on each such region.
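The definitional idea can be made vivid with a toy simulation. The following Python sketch uses the Ehrenfest urn model as a simple stand-in for a gas (the model, the parameters, and the macro-state bin are our illustrative choices, not part of the account): \(N\) balls are distributed over two urns, and at each step a randomly chosen ball switches urn. Starting far from equilibrium, the system spends the overwhelming majority of the long run in the macro-state where the occupation numbers are nearly equal, which this account identifies as the equilibrium state.

```python
import random

# Ehrenfest urn model: N balls in two urns; at each step a random ball
# switches urns. Macro-states are coarse bins of the occupation number n
# of the left urn. This is an illustrative toy model, not an example
# taken from the long-run residence time literature.
random.seed(1)
N = 100
n = N                      # start far from equilibrium: all balls on the left
steps = 200000
time_near_half = 0
for _ in range(steps):
    if random.random() < n / N:
        n -= 1             # a left-urn ball moves right
    else:
        n += 1             # a right-urn ball moves left
    if abs(n - N / 2) <= 10:   # macro-state: n within 10 of N/2
        time_near_half += 1

residence_fraction = time_near_half / steps
```

Even though the run starts with all balls in one urn, the fraction of time spent in the near-equal macro-state comes out above ninety percent, illustrating why this macro-state counts as the equilibrium state on the long-run residence time definition.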
Like the account previously discussed, the long-run residence time account accommodates Loschmidt’s and Zermelo’s objections by rejecting the requirement of strict irreversibility: it insists that being in equilibrium most of the time is as much as one can reasonably ask for, because actual physical systems show equilibrium fluctuations and equilibrium is not the dead and immovable state that thermodynamics says it is.
BSM enjoys great popularity in foundational debates due to its clear and intuitive theoretical structure. Nevertheless, BSM faces a number of problems and limitations.
The first problem is that BSM only deals with closed systems that evolve under their own internal dynamics. As we will see in Section 6, GSM successfully deals with systems that can exchange energy and even particles with their environments, and systems of this kind play an important role in SM. Those who think that SM only deals with the entire universe can set this problem aside because the universe (arguably) is a closed system. However, those who think that the objects of study in SM are laboratory-size systems like gases and crystals will have to address the issue of how BSM can accommodate interactions between systems and their environments, which is a largely ignored problem.
A second problem is that even though macro-states are ubiquitous in discussions about BSM, little attention is paid to a precise articulation of what these states are. There is loose talk about how a system looks from a macroscopic perspective, or there is a vague appeal to thermodynamic variables. However, by the lights of thermodynamics, variables like pressure and temperature are defined only in equilibrium, and it remains unclear how non-equilibrium states, and with them the approach to equilibrium, should be characterised in terms of thermodynamic variables. Frigg and Werndl (forthcoming-a) suggest solving this problem by defining macro-states in terms of local field-variables, but the issue needs further attention.
A third problem is that current formulations of BSM are closely tied to deterministic classical systems (§3). Some versions of BSM can be formulated based on classical stochastic systems (Werndl & Frigg 2017). But the crucial question is whether, and if so how, a quantum version of BSM can be formulated (for a discussion see the entry on quantum mechanics). Dizadji-Bahmani (2011) discusses how a result due to Linden and others (2009) can be used to construct an argument for the conclusion that an arbitrarily small subsystem of a large quantum system typically tends toward equilibrium. Chen (forthcoming) formulates a quantum version of the Mentaculus, which he calls the Wentaculus (see also his 2022). Goldstein, Lebowitz, Tumulka, and Zanghì (2020) describe a quantum analogue of the Boltzmann entropy and argue that the Boltzmannian conception of equilibrium is vindicated also in quantum mechanics by recent work on the thermalization of closed quantum systems. These early steps have not yet resulted in a comprehensive and widely accepted formulation of a quantum version of BSM, and the formulation of such a version remains an understudied topic. Albert (2000: Ch. 7) suggested that the spontaneous collapses of the so-called GRW theory (for an introduction see the entry on collapse theories), a particular approach to quantum mechanics, could be responsible for the emergence of thermodynamic irreversibility. Te Vrugt, Tóth, and Wittkowski (2021) put this proposal to the test in computer simulations and found that for initial conditions leading to anti-thermodynamic behaviour GRW collapses do not lead to thermodynamic behaviour, and that therefore the GRW theory does not induce irreversible behaviour.
Finally, there is no way around recognising that BSM is mostly used in foundational debates, but it is GSM that is the practitioner’s workhorse. When physicists have to carry out calculations and solve problems, they usually turn to GSM, which offers user-friendly strategies that are absent in BSM. So either BSM has to be extended with practical prescriptions, or it has to be connected to GSM so that it can benefit from its computational methods (for a discussion of the latter option see §6.7).
A different approach to the problem is taken by Boltzmann in his famous (1872 [1966 Brush translation]) paper, which contains two results that are now known as the Boltzmann Equation and the H-theorem. As before, consider a gas, now described through a distribution function \(f_{t}(\vec{v})\), which specifies what fraction of the molecules in the gas has a certain velocity \(\vec{v}\) at time \(t\). This distribution can change over time, and Boltzmann’s aim was to show that, as time passes, this distribution function changes so that it approximates the Maxwell-Boltzmann distribution, which, as we have seen in Section 4.2, is the equilibrium distribution for a gas.
To this end, Boltzmann derived an equation describing the time evolution of \(f_{t}(\vec{v})\). The derivation assumes that the gas consists of particles of diameter \(D\) that interact like hard spheres (i.e., they interact only when they collide); that all collisions are elastic (i.e., no energy is lost); that the number of particles is so large that their distribution, which in reality is discrete, can be well approximated by a continuous and differentiable function \(f_{t}(\vec{v})\); and that the density of the gas is so low that only two-particle collisions play a role in the evolution of \(f_{t}(\vec{v})\).
The crucial assumption in the argument is the so-called “Stosszahlansatz”, which specifies how many collisions of a certain type take place in a certain interval of time (the German “Stosszahlansatz” literally means something like “collision number assumption”). Assume the gas has \(N\) molecules per unit volume and the molecules are equally distributed in space. The type of collision we are focussing on is the one between a particle with velocity \(\vec{v}_{1}\) and one with velocity \(\vec{v}_{2}\), and we want to know the number \(N(\vec{v}_{1}, \vec{v}_{2})\) of such collisions during a small interval of time \(\Delta t\). To solve this problem, we begin by focussing on one molecule with velocity \(\vec{v}_{1}\). The relative velocity of this molecule and a molecule moving with \(\vec{v}_{2}\) is \(\vec{v}_{2} - \vec{v}_{1}\), and the absolute value of that relative velocity is \(\left\| \vec{v}_{2} - \vec{v}_{1} \right\|\). Molecules of diameter \(D\) only collide if their centres come closer than \(D\). So let us look at a cylinder with radius \(D\) and height \(\left\| \vec{v}_{2} - \vec{v}_{1} \right\|\Delta t\), which is the volume in space in which molecules with velocity \(\vec{v}_{2}\) would collide with our molecule during \(\Delta t\). The volume of this cylinder is
\[\pi D^{2}\left\| \vec{v}_{2} - \vec{v}_{1} \right\|\Delta t .\]If we now make the strong assumption that the initial velocities of colliding particles are independent, it follows that the number of molecules with velocity \(\vec{v}_{2}\) in a unit volume of the gas at time \(t\) is \(Nf_{t}(\vec{v}_{2})\), and hence the number of such molecules in our cylinder is
\[ N f_{t} (\vec{v}_{2}) \pi D^{2} \left\| \vec{v}_{2} - \vec{v}_{1} \right\| \Delta t.\]This is the number of collisions that the molecule we are focussing on can be expected to undergo during \(\Delta t\). But there is nothing special about this molecule, and we are interested in the number of all collisions between particles with velocities \(\vec{v}_{1}\) and \(\vec{v}_{2}\). To get to that number, note that the number of molecules with velocity \(\vec{v}_{1}\) in a unit volume of gas at time \(t\) is \(Nf_{t}(\vec{v}_{1})\). That is, there are \(Nf_{t}(\vec{v}_{1})\) molecules like the one we were focussing on. It is then clear that the total number of collisions can be expected to be the product of the number of collisions for each molecule with \(\vec{v}_{1}\) times the number of molecules with \(\vec{v}_{1}\):
\[ N\left( \vec{v}_{1}, \vec{v}_{2} \right) = N^{2} f_{t}(\vec{v}_{1}) f_{t}(\vec{v}_{2}) \left\| \vec{v}_{2} - \vec{v}_{1} \right\| \pi D^{2}\Delta t. \]This is the Stosszahlansatz. For ease of presentation, we have made the mathematical simplification of treating \(f_{t}(\vec{v})\) as a fraction rather than as a density in our discussion of the Stosszahlansatz; for a statement of the Stosszahlansatz for densities see, for instance, Uffink (2007). Based on the Stosszahlansatz, Boltzmann derived what is now known as the Boltzmann Equation:
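The combinatorial reasoning behind the Stosszahlansatz can be checked with elementary arithmetic. The following Python sketch steps through the derivation above with illustrative numbers of our own choosing (the values of \(N\), \(D\), \(\Delta t\), the velocities, and the fractions are all hypothetical):

```python
import math

# Illustrative (hypothetical) numbers: N molecules per unit volume,
# molecular diameter D, time interval dt, and the fractions f(v1), f(v2)
# of molecules with the two velocities under consideration.
N = 1.0e3
D = 1.0e-2
dt = 1.0e-3
f_v1, f_v2 = 0.2, 0.1
v1, v2 = (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)   # sample velocity vectors

# Absolute value of the relative velocity ||v2 - v1||
rel_speed = math.sqrt(sum((b - a) ** 2 for a, b in zip(v1, v2)))

# Volume of the collision cylinder: pi * D^2 * ||v2 - v1|| * dt
cylinder_volume = math.pi * D ** 2 * rel_speed * dt

# Expected collisions of ONE v1-molecule with v2-molecules during dt
collisions_per_molecule = N * f_v2 * cylinder_volume

# Stosszahlansatz: total number of (v1, v2) collisions during dt
N_collisions = N * f_v1 * collisions_per_molecule
```

Multiplying the per-molecule collision number by the number of \(\vec{v}_{1}\)-molecules reproduces the displayed formula \(N^{2} f_{t}(\vec{v}_{1}) f_{t}(\vec{v}_{2}) \left\| \vec{v}_{2} - \vec{v}_{1} \right\| \pi D^{2}\Delta t\).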
\[\frac{\partial f_{t}(\vec{v}_{1})}{\partial t} = \pi D^{2} N^{2} \int d^{3}\vec{v}_{2}\, \left\| \vec{v}_{2} - \vec{v}_{1} \right\| \left( f_{t} (\vec{v}_{1}^{*}) f_{t} (\vec{v}_{2}^{*}) - f_{t} (\vec{v}_{1}) f_{t} (\vec{v}_{2}) \right),\]where \(\vec{v}_{1}^{*}\) and \(\vec{v}_{2}^{*}\) are the velocities of the particles after the collision. The integration is over all possible velocities \(\vec{v}_{2}\) of the collision partners. This is a so-called integro-differential equation. The details of this equation need not concern us (and the mathematics of such equations is rather tricky). What matters is the overall structure, which says that the way the density \(f_{t}(\vec{v})\) changes over time depends on the difference between the products of the densities of the outgoing and of the incoming particles. Boltzmann then introduced the quantity \(H\),
\[H \left\lbrack f_{t}(\vec{v}) \right\rbrack = \int d^{3}\vec{v}\, f_{t}(\vec{v}) \ln \left(f_{t}(\vec{v})\right), \]and proved that \(H\) decreases monotonically in time,
\[\frac{dH\left\lbrack f_{t}(\vec{v}) \right\rbrack}{dt} \leq 0,\]and that \(H\) is stationary (i.e., \(dH\lbrack f_{t}(\vec{v})\rbrack/dt = 0\)) iff \(f_{t}(\vec{v})\) is the Maxwell-Boltzmann distribution. These two results are the H-Theorem.
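The qualitative content of the H-theorem, that \(H\) decreases as the distribution relaxes towards equilibrium, can be illustrated numerically. The following Python sketch is our illustration, not Boltzmann's derivation: instead of the full collision term it uses a simple exponential (BGK-style) relaxation of a discrete velocity distribution towards a Gaussian stand-in for the Maxwell-Boltzmann distribution, and it computes \(H\lbrack f\rbrack = \sum f \ln(f)\, dv\) along the way.

```python
import math

# Discrete velocity grid on [-5, 5] and a toy exponential relaxation of a
# non-equilibrium distribution f0 toward a Gaussian equilibrium f_eq.
# This BGK-style interpolation is a stand-in for the Boltzmann collision
# term, used only to illustrate the decrease of H.
dv = 0.1
vs = [i * dv for i in range(-50, 51)]

def normalise(f):
    s = sum(f) * dv
    return [x / s for x in f]

f_eq = normalise([math.exp(-v * v / 2) for v in vs])                 # equilibrium
f0 = normalise([1.0 if 1.0 <= v <= 2.0 else 1e-12 for v in vs])      # far from it

def H(f):
    # Discrete version of H[f] = integral of f ln(f) dv
    return sum(x * math.log(x) for x in f) * dv

# Relaxation: f_t = f_eq + (f0 - f_eq) * exp(-t / tau)
tau = 1.0
H_values = []
for step in range(20):
    t = 0.25 * step
    s = math.exp(-t / tau)
    f_t = [e + (a - e) * s for a, e in zip(f0, f_eq)]
    H_values.append(H(f_t))
# H falls from H[f0] toward (approximately) its equilibrium value H[f_eq].
```

In this toy setting \(H\) drops from its non-equilibrium value to approximately its equilibrium value, mirroring the behaviour the H-theorem establishes for the genuine Boltzmann dynamics.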
The definition of \(H\) bears formal similarities both to the expression for the Boltzmann entropy in the combinatorial argument (§4.3) and, as we will see, to the Gibbs entropy (§6.3); in fact, \(H\) looks like a negative entropy. For this reason the H-theorem is often paraphrased as showing that entropy increases monotonically until the system reaches the equilibrium distribution, which would provide a justification of thermodynamic behaviour based on purely mechanical assumptions. Indeed, in his 1872 paper, Boltzmann himself regarded it as a rigorous general proof of the Second Law of thermodynamics (Uffink 2007: 965; Klein 1973: 73).
The crucial conceptual questions at this point are: what exactly did Boltzmann prove with the H-theorem? Under which conditions is the Boltzmann Equation valid? And what role do the assumptions, in particular the Stosszahlansatz, play in deriving it? The discussion of these questions started four years after the paper was published, when Loschmidt put forward his reversibility objection (§4.3). This objection implies that \(H\) must be able to increase as well as decrease. Boltzmann’s own response to Loschmidt’s challenge and the question of the scope of the H-theorem are a matter of much debate; for discussions see, for instance, Brown, Myrvold, and Uffink (2009), Cercignani (1998), Brush (1976), and Uffink (2007). We cannot pursue this matter here, but the gist of Boltzmann’s reply would seem to have been that he admitted that there exist initial states for which \(H\) decreases, but that these rarely, if ever, occur in nature. This leads to what is now known as a statistical reading of the H-theorem: the H-theorem shows entropy increase to be likely rather than universal.
A century later, Lanford published a string of papers (1973, 1975, 1976, 1981) culminating in what is now known as Lanford’s theorem, which provides rigorous results concerning the validity of the Boltzmann Equation. Lanford’s starting point is the question whether, and if so in what sense, the Boltzmann equation is consistent with the underlying Hamiltonian dynamics. To this end, note that every point \(x\) in the state space \(X\) of a gas has a distribution \(f_{x}(\vec{r}, \vec{v})\) associated with it, where \(\vec{r}\) and \(\vec{v}\) are, respectively, the location and velocity of one particle (recall from §3 that \(X\) contains the positions and momenta of all molecules). For a finite number of particles \(f_{x}(\vec{r}, \vec{v})\) is not continuous, let alone differentiable. So as a first step, Lanford developed a way to obtain a differentiable distribution function \(f^{(x)}(\vec{r}, \vec{v})\), which involves taking the so-called Boltzmann-Grad limit. He then evolved this distribution forward in time both under the fundamental Hamiltonian dynamics, which yields \(f_{\text{Ht}}^{(x)}(\vec{r}, \vec{v})\), and under the Boltzmann Equation, which yields \(f_{\text{Bt}}^{(x)}(\vec{r}, \vec{v})\). Lanford’s theorem compares these two distributions and essentially says that for most points \(x\) in \(X\), \(f_{\text{Ht}}^{(x)}(\vec{r}, \vec{v})\) and \(f_{\text{Bt}}^{(x)}(\vec{r}, \vec{v})\) are close to each other for times in the interval \(\left\lbrack 0, t^{*} \right\rbrack\), where \(t^{*}\) is a cut-off time (where “most” is judged by the so-called microcanonical measure on the phase space; for a discussion of this measure see §6.1). For rigorous statements and further discussions of the theorem see Ardourel (2017), Uffink and Valente (2015), and Valente (2014).
Lanford's theorem is a remarkable achievement because it shows that a statistical and approximate version of the Boltzmann Equation can be derived from Hamiltonian mechanics and most initial conditions in the Boltzmann-Grad limit for a finite amount of time. In this sense it can be seen as a vindication of Boltzmann’s statistical version of the H-theorem. At the same time, the theorem also highlights the limitations of the approach. The relevant distributions are close to each other only up to time \(t^{*}\), and it turns out that \(t^{*}\) is roughly two fifths of the mean time a particle moves freely between two collisions. But this is a very short time! During the interval \(\left\lbrack 0, t^{*} \right\rbrack\), which for a gas like air at room temperature is of the order of microseconds, on average 40% of the molecules in the gas will have been involved in one collision and the other 60% will have moved freely. This is patently too short to understand macroscopic phenomena like the one that we described at the beginning of this article, which take place on a longer timescale and involve many collisions for all particles. And like Boltzmann's original results, Lanford's theorem also depends on strong assumptions, in particular a measure-theoretic version of the Stosszahlansatz (cf. Uffink & Valente 2015).
Finally, one of the main conceptual problems concerning Lanford’s theorem is where the apparent irreversibility comes from. Various opinions have been expressed on this issue. Lanford himself first argued that irreversibility results from passing to the Boltzmann-Grad limit (Lanford 1975: 110), but later changed his mind and argued that the Stosszahlansatz for incoming collision points is responsible for the irreversible behaviour (1976, 1981). Cercignani, Illner, and Pulvirenti (1994) and Cercignani (2008) claim that irreversibility arises as a consequence of assuming a hard-sphere dynamics. Valente (2014) and Uffink and Valente (2015) argue that there is no genuine irreversibility in the theorem because the theorem is time-reversal invariant. For further discussions of the role of irreversibility in Lanford’s theorem, see also Lebowitz (1983), Spohn (1980, 1991), and Weaver (2021, 2022).
Gibbsian Statistical Mechanics (GSM) is an umbrella term covering a number of positions that take Gibbs’ (1902 [1981]) work as their point of departure. In this section, we introduce the framework and discuss different articulations of it, along with the issues they face.
Like BSM, GSM starts with the dynamical system \((X,\) \(\phi,\) \(\mu)\) introduced in Section 3 (although, as we will see below, it readily generalises to quantum mechanics). But this is where the commonalities end. Rather than partitioning \(X\) into macro-regions, GSM puts a probability density function \(\rho(x)\) on \(X\), often referred to as a “distribution”. This distribution evolves under the dynamics of the system through the law
\[\rho_{t}(x) = \rho_{0}(\phi_{- t}(x)),\]where \(\rho_{0}\) is the distribution at the initial time \(t_{0}\) and \(\phi_{- t}(x)\) is the micro-state that evolves into \(x\) during \(t\). A distribution is called stationary if it does not change over time, i.e., \(\rho_{t}(x)= \rho_{0}(x)\) for all \(t\). If the distribution is stationary, Gibbs says that the system is in “statistical equilibrium”.
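The transport law can be illustrated with the simplest measure-preserving flow. In the following Python sketch (our illustration, not an example from Gibbs) the state space is the unit circle and \(\phi_{t}\) is a rigid rotation; the density value is simply carried along by the flow, so that \(\rho_{t}(\phi_{t}(x_{0})) = \rho_{0}(x_{0})\).

```python
import math

# rho_t(x) = rho_0(phi_{-t}(x)) for the simplest measure-preserving flow:
# rotation of the unit circle, phi_t(x) = (x + omega*t) mod 1.
omega = 0.3

def phi(t, x):
    return (x + omega * t) % 1.0

def rho0(x):
    # A non-uniform (but normalised) initial density on [0, 1)
    return 1.0 + 0.5 * math.cos(2 * math.pi * x)

def rho(t, x):
    # Transport of the density by the flow
    return rho0(phi(-t, x))

# The density is carried along: its value at phi_t(x0) stays rho0(x0).
x0, t = 0.2, 1.7
value_at_transported_point = rho(t, phi(t, x0))
```

A uniform density on the circle would be stationary under this flow, which is the Gibbsian condition of statistical equilibrium.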
At the macro-level, a system is characterised by macro-variables, which are functions \(f:X\rightarrow \mathbb{R}\), where \(\mathbb{R}\) is the set of real numbers. With the exception of entropy and temperature (to which we turn below), GSM takes all physical quantities to be represented by such functions. The so-called phase average of \(f\) is
\[\left\langle f \right\rangle = \int_{X} f(x) \rho(x) dx.\]The question now is how to interpret this formalism. The standard interpretation is in terms of what is known as an ensemble. An ensemble is an infinite collection of systems of the same kind that differ in their state. Crucially, this is a collection of copies of the entire system and not a collection of molecules. For this reason, Schrödinger characterised an ensemble as a collection of “mental copies of the one system under consideration” (1952 [1989: 3]). Hence the members of an ensemble do not interact with each other; an ensemble is not a physical object; and ensembles have no spatiotemporal existence. The distribution can then be interpreted as specifying “how many” systems in the ensemble have their state in a certain region \(R\) of \(X\) at time \(t\). More precisely, \(\rho_{t}(x)\) is interpreted as giving the probability of finding a system in \(R\) at \(t\) when drawing a system randomly from the ensemble, in much the same way in which one draws a ball from an urn:
\[p_{t}(R) = \int_{R} {\rho_{t}(x)} dx.\]What is the right distribution for a given physical situation? Gibbs discusses this problem at length and formulates three distributions which are still used today: the microcanonical distribution for isolated systems, the canonical distribution for systems with fluctuating energy, and the grand-canonical distribution for systems with both fluctuating energy and fluctuating particle number. For a discussion of the formal aspects of these distributions see, for instance, Tolman (1938 [1979]), and for philosophical discussions see Davey (2008, 2009) and Myrvold (2016).
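Both the phase average and the probability of a region are ordinary integrals and are easy to evaluate numerically. The following Python sketch uses a purely illustrative one-dimensional "state space" \(X = [0,1]\) with a density and macro-variable of our own choosing (not a mechanical system):

```python
# Midpoint-rule evaluation of the Gibbsian formulas on X = [0, 1]:
# a normalised density rho, a macro-variable f, the phase average <f>,
# and the probability p(R) of finding the state in a region R.
n = 100000
dx = 1.0 / n
xs = [(i + 0.5) * dx for i in range(n)]     # midpoint grid on [0, 1]

rho = lambda x: 2.0 * x                     # a normalised density on [0, 1]
f = lambda x: x                             # a sample macro-variable

phase_average = sum(f(x) * rho(x) for x in xs) * dx    # integral of 2x^2 = 2/3
p_R = sum(rho(x) for x in xs if x < 0.5) * dx          # R = [0, 0.5): 1/4
```

The numerical values agree with the exact integrals \(\int_{0}^{1} 2x^{2}\, dx = 2/3\) and \(\int_{0}^{1/2} 2x\, dx = 1/4\).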
Gibbs’ statistical equilibrium is a condition on an ensemble being in equilibrium, which is different from an individual system being in equilibrium (as introduced in §1). The question is how the two relate, and what an experimenter who measures a physical quantity on a system observes. A standard answer one finds in SM textbooks appeals to the averaging principle: when measuring the quantity \(f\) on a system in thermal equilibrium, the observed equilibrium value of the property is the ensemble average \(\langle f\rangle\) of an ensemble in ensemble-equilibrium. The practice of applying this principle is often called phase averaging. One of the core challenges for GSM is to justify this principle.
The standard justification of phase averaging that one finds in many textbooks is based on the notion of ergodicity that we have already encountered in Section 4.4. In the current context, we consider the infinite time average \(f^{*}\) of the function \(f\). It is a mathematical fact that ergodicity as defined earlier is equivalent to it being the case that \(f^{*} = \langle f \rangle\) for almost all initial states. This is said to provide a justification for phase averaging as follows. Assume we carry out a measurement of the physical quantity represented by \(f\). It will take some time to carry out the measurement, and so what the measurement device registers is the time average over the duration of the measurement. Since the time needed to make the measurement is long compared to the time scale on which typical molecular processes take place, the measured result is approximately equal to the infinite time average \(f^{*}\). By ergodicity, \(f^{*}\) is equal to \(\langle f\rangle\), which justifies the averaging principle.
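The equality \(f^{*} = \langle f \rangle\) for ergodic systems can be checked numerically in a simple case. The following Python sketch (our illustration) uses rotation of the unit circle by an irrational angle, a standard example of an ergodic map, and compares a long finite-time average along one orbit with the phase average taken with respect to the uniform invariant measure:

```python
import math

# Time average vs. phase average for an ergodic map: rotation of the unit
# circle by an irrational angle. The ergodic theorem gives f* = <f> for
# almost all initial states; here we check one orbit numerically.
alpha = math.sqrt(2) - 1.0                       # irrational rotation angle
f = lambda x: math.cos(2 * math.pi * x) ** 2     # a sample macro-variable

x = 0.123                                        # initial state
n_steps = 200000
total = 0.0
for _ in range(n_steps):
    total += f(x)
    x = (x + alpha) % 1.0
time_average = total / n_steps                   # finite-time estimate of f*

# Phase average of cos^2(2 pi x) with respect to the uniform measure: 1/2.
phase_average = 0.5
```

The finite-time average converges to the phase average; the criticisms rehearsed below target not this mathematical fact but the claim that it justifies equating measurement outcomes with phase averages.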
This argument fails for several reasons (Malament & Zabell 1980; Sklar 1993: 176–9). First, from the fact that measurements take time it does not follow that what is measured are time averages; and even if one could argue that measurement devices output time averages, these would be finite time averages, and equating these finite time averages with infinite time averages is problematic because finite and infinite averages can assume very different values, even if the duration of the finite measurement is very long. Second, this account makes a mystery of how we observe change. As we have seen in Section 1, we do observe how systems approach equilibrium, and in doing so we observe macro-variables changing their values. If measurements produced infinite time averages, then no change would ever be observed because these averages are constant. Third, as we already noted earlier, ergodicity is a stringent condition and many systems to which SM is successfully applied are not ergodic (Earman & Rédei 1996), which makes equating time averages and phase averages wrong.
A number of approaches have been designed to either solve or circumvent these problems. Malament and Zabell (1980) suggest a method of justifying phase averaging that still invokes ergodicity but avoids an appeal to time averages. Vranas (1998) offers a reformulation of this argument for systems that are epsilon-ergodic (see §4.4). This accounts for systems that are “almost” ergodic, but remains silent about systems that are far from being ergodic. Khinchin (1949) restricts attention to systems with a large number of degrees of freedom and so-called sum functions (i.e., functions that are a sum of one-particle functions), and shows that for such systems \(f^{*} = \langle f\rangle\) holds on the largest part of \(X\); for a discussion of this approach see Batterman (1998) and Badino (2006). However, as Khinchin himself notes, the focus on sum functions is too restrictive to cover realistic systems, and the approach also has to revert to the implausible posit that observations yield infinite time averages. This led to a research programme now known as the “thermodynamic limit”, aiming to prove “Khinchin-like” results under more realistic assumptions. Classic statements are Ruelle (1969, 2004); for a survey and further references see Uffink (2007: 1020–8).
A different approach to the problem insists that one should take the status of \(\rho(x)\) as a probability seriously and seek a justification of averaging in statistical terms. In this vein, Wallace (2015) insists that the quantitative content of statistical mechanics is exhausted by the statistics of observables (their expectation values, variances, and so on), and McCoy (2020) submits that \(\rho(x)\) is the complete physical state of an individual statistical mechanical system. Such a view renounces the association of measurement outcomes with phase averages and insists that measurements are “an instantaneous act, like taking a snapshot” (O. Penrose 1970: 17–18): if a measurement of the quantity associated with \(f\) is performed on a system at time \(t\) and the system’s micro-state at time \(t\) is \(x(t)\), then the measurement outcome at time \(t\) will be \(f(x(t))\). An obvious consequence of this definition is that measurements at different times can have different outcomes, and the values of macro-variables can change over time. One can then look at how these values change over time. One way of doing this is to look at fluctuations away from the average:
\[\Delta(t) = f\left( x(t) \right) - \left\langle f \right\rangle,\]where \(\Delta(t)\) is the fluctuation away from the average at time \(t\). One can then expect that the outcome of a measurement will be \(\langle f\rangle\) if fluctuations turn out to be small and infrequent. Although this would not seem to be the received textbook position, something like it can be identified in some texts, for instance Hill (1956 [1987]) and Schrödinger (1952 [1989]). A precise articulation will have to use \(\rho\) to calculate the probability of fluctuations of a certain size, and this requires the system to meet stringent dynamical conditions, namely either the masking condition or the f-independence condition (Frigg & Werndl 2021).
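Why fluctuations can be expected to be small for macroscopic systems is, at bottom, a law-of-large-numbers effect when \(f\) is an average over many micro-components. The following Python sketch is our illustration, not an argument from the literature cited above: the "micro-state" is a configuration of \(N\) independent two-valued components, \(f\) is their mean (so that \(\langle f\rangle = 0\)), and the fluctuation \(\Delta\) shrinks like \(1/\sqrt{N}\).

```python
import random

# Snapshot measurements and fluctuations: for a macro-variable that is an
# average over many independent micro-components, Delta = f(x) - <f>
# scales like 1/sqrt(N). Toy micro-state: N independent +1/-1 "spins";
# f is their mean, so <f> = 0 and the mean itself is the fluctuation.
random.seed(0)

def fluctuation(N):
    spins = [random.choice((-1.0, 1.0)) for _ in range(N)]
    return sum(spins) / N      # = f(x) - <f>, since <f> = 0

small_system = abs(fluctuation(10))        # fluctuations of order 1/sqrt(10)
large_system = abs(fluctuation(100000))    # fluctuations of order 0.003
```

For the large system a snapshot of \(f\) lies very close to \(\langle f\rangle\), which is the situation in which the snapshot view and the averaging principle agree in practice.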
As discussed so far, GSM is an equilibrium theory, and this is also how it is mostly used in applications. Nevertheless, a comprehensive theory of SM must also account for the approach to equilibrium. To discuss the approach to equilibrium, it is common to introduce the Gibbs entropy
\[S_{G} = - k\int_{X} \rho(x)\log\lbrack\rho(x)\rbrack dx.\]The Gibbs entropy is a property of an ensemble characterised by a distribution \(\rho\). One might then try to characterise the approach to equilibrium as a process in which \(S_{G}\) increases monotonically to finally reach a maximum in equilibrium. But this idea is undercut immediately by a mathematical theorem saying that \(S_{G}\) is a constant of motion:
\[S_{G}\lbrack\rho_{t}(x)\rbrack = S_{G}\lbrack\rho_{0}(x)\rbrack\]for all times \(t\). So not only does \(S_{G}\) fail to increase monotonically; it does not change at all! This precludes a characterisation of the approach to equilibrium in terms of increasing Gibbs entropy. Hence, either such a characterisation has to be abandoned, or the formalism has to be modified to allow \(S_{G}\) to increase.
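The constancy of \(S_{G}\) can be checked numerically for a simple measure-preserving evolution. In the following Python sketch (our illustration) the density is transported by a rotation of the unit circle, in accordance with \(\rho_{t}(x) = \rho_{0}(\phi_{-t}(x))\), and the Gibbs entropy (with \(k = 1\)) of the transported density agrees with that of the initial density:

```python
import math

# S_G is a constant of motion: a measure-preserving evolution merely
# transports the density, so the integral of rho*ln(rho) is unchanged.
# Illustration with the circle rotation phi_t(x) = (x + t) mod 1.
n = 10000
dx = 1.0 / n
xs = [(i + 0.5) * dx for i in range(n)]

rho0 = lambda x: 1.0 + 0.5 * math.cos(2 * math.pi * x)
t = 0.37
rho_t = lambda x: rho0((x - t) % 1.0)       # rho_t(x) = rho_0(phi_{-t}(x))

def S_G(rho):
    # Gibbs entropy with k = 1, evaluated by a midpoint sum
    return -sum(rho(x) * math.log(rho(x)) for x in xs) * dx

entropy_change = abs(S_G(rho_t) - S_G(rho0))   # zero up to numerical error
```

The rotated density is just the initial density shifted along the circle, so the entropy integral is identical, which is the point of the theorem.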
A second problem is a consequence of the Gibbsian definition of statistical equilibrium. As we have seen in §6.1, a system is in statistical equilibrium if \(\rho\) is stationary. A system away from equilibrium would then have to be associated with a non-stationary distribution and eventually evolve into the stationary equilibrium distribution. But this is mathematically impossible. It is a consequence of the formalism of GSM that a distribution that is stationary at some point in time has to be stationary at all times (past and future), and that a distribution that is non-stationary at some point in time will always be non-stationary. So an ensemble cannot evolve from a non-stationary distribution to a stationary one. This requires either a change in the definition of equilibrium, or a change in the formalism that would allow distributions to change in the requisite way.
In what follows we discuss the main attempts to address these problems. For alternative approaches that we cannot cover here see Frigg (2008b: 166–68) and references therein.
Gibbs was aware of the problems with the approach to equilibrium and proposed coarse-graining as a solution (Gibbs 1902 [1981]: Ch. 12). This notion has since been endorsed by many practitioners (see, for instance, Farquhar 1964 and O. Penrose 1970). We have already encountered coarse-graining in §4.2. The use of it here is different, though, because we are now putting a grid on the full state space \(X\) and not just on the one-particle space. One can then define a coarse-grained density \(\bar{\rho}\) by saying that at every point \(x\) in \(X\) the value of \(\bar{\rho}\) is the average of \(\rho\) over the grid cell in which \(x\) lies. The advantage of coarse-graining is that the coarse-grained distribution is not subject to the same limitations as the original distribution. Specifically, let us call the Gibbs entropy that is calculated with the coarse-grained distribution the coarse-grained Gibbs entropy. It turns out that the coarse-grained Gibbs entropy is not a constant of motion and that it is possible for this entropy to increase. This re-opens the avenue of understanding the approach to equilibrium in terms of an increase of entropy. It is also possible for the coarse-grained distribution to evolve so that it is spread out evenly over the entire available space and thereby comes to look like a micro-canonical equilibrium distribution. Such a distribution is also known as the quasi-equilibrium distribution (Blatt 1959; Ridderbos 2002).
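The effect of coarse-graining can be illustrated numerically. The following sketch (our own illustration, not drawn from the literature cited above) evolves an ensemble of points under the baker's map, a standard volume-preserving and mixing transformation of the unit square, and computes the Gibbs entropy of the coarse-grained distribution on a 16×16 grid (with \(k = 1\)): while the fine-grained entropy is constant, the coarse-grained entropy rises from zero toward its maximum \(\log 256 \approx 5.55\).

```python
import numpy as np

def bakers_map(x, y):
    """One step of the baker's map, a volume-preserving mixing map on the unit square."""
    return (2 * x) % 1.0, (y + np.floor(2 * x)) / 2.0

def coarse_grained_entropy(x, y, n_cells=16):
    """Entropy of the cell-occupation frequencies on an n_cells x n_cells grid (k = 1)."""
    hist, _, _ = np.histogram2d(x, y, bins=n_cells, range=[[0, 1], [0, 1]])
    p = hist.ravel() / hist.sum()
    p = p[p > 0]
    return -np.sum(p * np.log(p))

rng = np.random.default_rng(0)
# ensemble initially concentrated in a single corner cell, far from equilibrium
x = rng.uniform(0.0, 0.0625, 100_000)
y = rng.uniform(0.0, 0.0625, 100_000)

entropies = [coarse_grained_entropy(x, y)]
for _ in range(10):
    x, y = bakers_map(x, y)
    entropies.append(coarse_grained_entropy(x, y))

# entropy grows from 0 toward log(16 * 16) ≈ 5.55
print(entropies[0], entropies[-1])
```

Note that the increase depends on the grid: the fine-grained distribution remains concentrated on a set of the original (small) volume, but that set is stretched into thin filaments that intersect every grid cell.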
Coarse-graining raises two questions. First, the coarse-grained entropy can increase and the system can approach a coarse-grained equilibrium, but under what circumstances will it actually do so? Second, is it legitimate to replace standard equilibrium by quasi-equilibrium?
As regards the first question, the standard answer (which also goes back to Gibbs) is that the system has to be mixing. Intuitively speaking, a system is mixing if every subset of \(X\) ends up being spread out evenly over the entire state space in the long run (for a more detailed account of mixing see the entry on the ergodic hierarchy). The problem is that mixing is a very demanding condition. In fact, being mixing implies being ergodic (because mixing is strictly stronger than ergodicity). As we have already noticed, many relevant systems are not ergodic, and hence a fortiori not mixing. Even if a system is mixing, the mixed state is only achieved in the limit for \(t \rightarrow \infty\), but real physical systems reach equilibrium in finite time (indeed, in most cases rather quickly).
As regards the second question, the first point to note is that a silent shift has occurred: Gibbs initially defined equilibrium through stationarity, while the above argument defines it through uniformity. This needs further justification, but in principle there would seem to be nothing to stop us from redefining equilibrium in this way.
The motivation for adopting quasi-equilibrium is that \(\bar{\rho}\) and \(\rho\) are empirically indistinguishable. If the size of the grid is below the measurement precision, no measurement will be able to tell the difference between the two, and phase averages calculated with the two distributions agree. Hence there is no reason to prefer \(\rho\) to \(\bar{\rho}\).
This premise has been challenged. Blatt (1959) and Ridderbos and Redhead (1998) argue that it is wrong because the spin-echo experiment (Hahn 1950) makes it possible to empirically discern between \(\rho\) and \(\bar{\rho}\). The weight of this experiment continues to be discussed controversially, with some authors insisting that it invalidates the coarse-graining approach (Ridderbos 2002) and others insisting that coarse-graining can still be defended (Ainsworth 2005; Lavis 2004; Robertson 2020). For further discussion see Myrvold (2020b).
The approaches we discussed so far assume that systems are isolated. This is an idealising assumption because real physical systems are not perfectly isolated from their environment. This is the starting point for the interventionist programme, which is based on the idea that real systems are constantly subject to outside perturbations, and that it is exactly these perturbations that drive the system into equilibrium. In other words, it is these interventions from outside the system that are responsible for its approach to equilibrium, which is what earns the position the name interventionism. This position has been formulated by Blatt (1959) and further developed by Ridderbos and Redhead (1998). The key insight behind the approach is that the two challenges introduced in Section 6.3 vanish once the system is no longer assumed to be isolated: the entropy can increase, and a non-stationary distribution can be pushed toward a distribution that is stationary in the future.
This approach accepts that isolated systems do not approach equilibrium, and critics wonder why this would be the case. If one places a gas like the one we discussed in Section 1 somewhere in interstellar space where it is isolated from outside influences, will it really sit there confined to the left half of the container and not spread? And even if this were the case, would adding just any environment resolve the issue? Interventionists sometimes seem to suggest that this is the case, but in an unqualified form this claim cannot be right. Environments can be of very different kinds, and there is no general theorem that says that any environment drives a system to equilibrium. Indeed, there are reasons to assume that there is no such theorem because, while environments do drive systems, they need not drive them to equilibrium. So it remains an unresolved question under what conditions environments drive systems to equilibrium.
Another challenge for interventionism is that one is always free to consider a larger system, consisting of our original system plus its environment. For instance, we can consider the “gas + box” system. This system would then also approach equilibrium because of outside influences, and we can then again form an even larger system. So we get into a regress that only ends once the system under study is the entire universe. But the universe has no environment that could serve as a source of perturbations, which, so the criticism goes, shows that the programme fails.
Whether one sees this criticism as decisive depends on one’s views of laws of nature. The argument relies on the premise that the underlying theory is a universal theory, i.e., one that applies to everything that there is without restrictions. The reader can find an extensive discussion in the entry on laws of nature. At this point we just note that while universality is widely held, some have argued against it because laws are always tested in highly artificial situations. Claiming that they apply equally outside these settings involves an inductive leap that is problematic; see for instance Cartwright (1999) for a discussion of such a view. This, if true, undercuts the above argument against interventionism.
The epistemic account urges a radical reconceptualization of SM. The account goes back to Tolman (1938 [1979]) and has been brought to prominence by Jaynes in a string of publications between 1955 and 1980, most of which are gathered in Jaynes (1983). On this approach, SM is about our knowledge of the world and not about the world itself, and the probability distributions in GSM represent our state of knowledge about a system and not some matter of fact. The centrepiece of this interpretation is the fact that the Gibbs entropy is formally identical to the Shannon entropy in information theory, which is a measure of the lack of information about a system: the higher the entropy, the less we know (for a discussion of the Shannon entropy see the entry on information, §4.2). The Gibbs entropy can therefore be seen as quantifying our lack of information about a system. This has the advantage that ensembles are no longer needed in the statement of GSM. On the epistemic account, there is only one system, the one on which we are performing our experiments, and \(\rho\) describes what we know about it. This also offers a natural criterion for identifying equilibrium distributions: they are the distributions with the highest entropy consistent with the external constraints on the system, because such distributions are the least committal. This explains why we expect equilibrium to be associated with maximum entropy. This is known as Jaynes’ maximum entropy principle (MEP).
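MEP can be given a concrete computational form. The following sketch is our toy illustration (the function name and the three-level system are our assumptions, not Jaynes’ own example): it finds the distribution over a discrete set of energy levels that maximises the Shannon entropy subject to a fixed mean energy. By a standard Lagrange-multiplier argument the maximiser is the exponential (canonical) distribution \(p_{i} \propto e^{-\beta E_{i}}\), and the multiplier \(\beta\) can be found by bisection, since the mean energy is a decreasing function of \(\beta\).

```python
import math

def max_entropy_distribution(energies, mean_energy, tol=1e-12):
    """Return (beta, p) with p_i proportional to exp(-beta * E_i), the unique
    maximiser of the Shannon entropy -sum p_i log p_i subject to <E> = mean_energy."""
    def mean_at(beta):
        weights = [math.exp(-beta * e) for e in energies]
        z = sum(weights)
        return sum(w * e for w, e in zip(weights, energies)) / z

    lo, hi = -50.0, 50.0  # assumes the solution lies in this bracket
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_at(mid) > mean_energy:
            lo = mid  # mean energy too high -> need larger beta
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    return beta, [w / z for w in weights]

# toy three-level system with a prescribed mean energy
beta, p = max_entropy_distribution([0.0, 1.0, 2.0], mean_energy=0.5)
print(beta, p)
```

In the unconstrained case the same reasoning yields the uniform distribution, which is why equilibrium distributions come out as the “least committal” ones consistent with the constraints.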
MEP has been discussed controversially, and, to date, there is no consensus on its significance, or even cogency. For discussions see, for instance, Denbigh and Denbigh (1985), Howson and Urbach (2006), Lavis (1977), Lavis and Milligan (1985), Seidenfeld (1986), Shimony (1985), Uffink (1995, 1996a), and Williamson (2010). The epistemic approach also assumes that experimental outcomes correspond to phase averages, but as we have seen, this is a problematic assumption (§6.1). A further concern is that the system’s own dynamics plays no role in the epistemic approach. This is problematic because if the dynamics has invariant quantities, a system cannot access certain parts of the state space even though \(\rho\) may assign a non-zero probability to them (Sklar 1993: 193–4).
The epistemic account’s explanation of the approach to equilibrium relies on making repeated measurements and conditionalizing on each measurement result; for a discussion see Sklar (1993: 255–257). This gets around the problem that the Gibbs entropy is constant, because the value assignments now depend not only on the system’s internal dynamics, but also on the action of an experimenter. The problem with this solution is that, depending on how exactly the calculations are done, either the entropy increase fails to be monotonic (indeed entropy decreases are possible) or the entropy curve becomes dependent on the sequence of instants of time chosen to carry out measurements (Lavis & Milligan 1985).
However, the most fundamental worry about the epistemic approach is that it fails to realise the fundamental aim of SM, namely to explain how and why processes in nature take place, because these processes cannot possibly depend on what we know about them. Surely, so the argument goes, the boiling of kettles or the spreading of gases has something to do with how the molecules constituting these systems behave and not with what we happen (or fail) to know about them (Redhead 1995; Albert 2000; Loewer 2001). For further discussions of the epistemic approach see Anta (forthcoming-a, forthcoming-b), Shenker (2020), and Uffink (2011).
A pressing and yet understudied question in the philosophy of SM concerns the relation between GSM and BSM. GSM provides the tools and methods to carry out a wide range of equilibrium calculations, and it is the approach predominantly used by practitioners in the field. Without it, the discipline of SM would not be able to operate (Wallace 2020). BSM is conceptually neat and is preferred by philosophers when they give foundational accounts of SM. So what we’re facing is a schism whereby the day-to-day work of physicists is done in one framework while foundational accounts and explanations are given in another (Anta 2021a). This would not be worrisome if the frameworks were equivalent, or at least inter-translatable in a relatively clear way. As the discussion in the previous sections has made clear, this is not the case. What is more, in some contexts the formalisms do not even give empirically equivalent predictions (Werndl & Frigg 2020b). This raises the question of how exactly the two approaches are related. Lavis (2005) proposes a reconciliation of the two frameworks through giving up on the binary property of the system being or not being in equilibrium, which should be replaced by the continuous property of commonness. Wallace (2020) argues that GSM is a more general framework in which the Boltzmannian approach may be understood as a special case. Frigg and Werndl suggest that BSM is a fundamental theory and GSM is an effective theory that offers means to calculate values defined in BSM (Frigg & Werndl 2019; Werndl & Frigg 2020a). Goldstein (2019) plays down the difference and argues that the conflict between them is not as great as often imagined. Finally, Goldstein, Lebowitz, Tumulka, and Zanghì (2020) compare the Boltzmann entropy and the Gibbs entropy and argue that the two notions yield the same (leading order) values for the entropy of a macroscopic system in thermal equilibrium.
So far we have focussed on the questions that arise in the articulation of the theory itself. In this section we discuss some further issues that arise in connection with SM, explicitly excluding a discussion of the direction of time and other temporal asymmetries, which have their own entry in this encyclopedia (see the entry on thermodynamic asymmetry in time).
How to interpret probabilities is a problem with a long philosophical tradition (for a survey of different views see the entry on interpretations of probability). Since SM introduces probabilities, there is a question of how these probabilities should be interpreted. This problem is particularly pressing in SM because, as we have seen, the underlying mechanical laws are deterministic. This is not a problem so long as the probabilities are interpreted epistemically, as in Jaynes’ account (§6.6). But, as we have seen, a subjective interpretation seems to clash with the realist intuition that SM is a physical theory that tells us how things are independently of what we happen to know about them. This requires probabilities to be objective.
Approaches to SM that rely on ergodic theory tend to interpret probabilities as time-averages, which is natural because ergodicity provides such averages. However, long-run time averages are not a good indicator for how a system behaves because, as we have seen, they are constant and so do not indicate how a system behaves out of equilibrium. Furthermore, interpreting long-run time averages as probabilities is motivated by the fact that these averages seem to be close cousins of long-run relative frequencies. But this association is problematic for a number of reasons (Emch 2005; Guttmann 1999; van Lith 2003; von Plato 1981, 1982, 1988, 1994). An alternative is to interpret SM probabilities as propensities, but many regard this as problematic because propensities would ultimately seem to be incompatible with a deterministic underlying micro theory (Clark 2001).
Loewer (2001) suggested that we interpret SM probabilities as Humean objective chances in Lewis’ sense (1980) because the Mentaculus (see §4.6) is a best system in Lewis’ sense. Frigg (2008a) identifies some problems with this interpretation, and Frigg and Hoefer (2015) formulate an alternative Humean account that is designed to overcome these issues. For further discussion of Humean chances in SM, see Beisbart (2014), Dardashti, Glynn, Thébault, and Frisch (2014), Hemmo and Shenker (2022), Hoefer (2019), and Myrvold (2016, 2021).
Consider the following scenario, which originates in a letter that Maxwell wrote in 1867 (see Knott 1911). Recall the vessel with a partition wall that we encountered in Section 1, but vary the setup slightly: rather than having one side empty, the two sides of the vessel are filled with gases of different temperatures. Additionally, there is now a shutter in the wall which is operated by a demon. The demon carefully observes all the molecules. Whenever a particle in the cooler side moves towards the shutter the demon checks its velocity, and if the velocity of the particle is greater than the mean velocity of the particles on the hotter side of the vessel he opens the shutter and lets the particle pass through to the hotter side. The net effect of the demon’s actions is that the hotter gas becomes even hotter and the colder gas becomes even colder. This means that there is a heat transfer from the cooler to the hotter gas without any work being done, because the heat transfer is solely due to the demon’s skill and intelligence in sorting the molecules. Yet, according to the Second Law of thermodynamics, this sort of heat transfer is not allowed. So we arrive at the conclusion that the demon’s actions result in a violation of the Second Law of thermodynamics.
Maxwell interpreted this scenario as a thought experiment that showed that the Second Law of thermodynamics is not an exceptionless law and that it has only “statistical certainty” (see Knott 1911; Hemmo & Shenker 2010). Maxwell’s demon has given rise to a vast literature, some of it in prestigious physics journals. Much of this literature has focused on exorcising the demon, i.e., on showing that a demon would not be physically possible. Broadly speaking, there are two approaches. The first approach is commonly attributed to Szilard (1929 [1990]), but also goes back to von Neumann (1932 [1955]) and Brillouin (1951 [1990]). The core idea of this approach is that gaining information that allows us to distinguish between \(n\) equally likely states comes at a necessary minimum cost in thermodynamic entropy of \(k \log(n)\), which is the entropy dissipated by the system that gains information. Since the demon has to gain information to decide whether to open the shutter, the Second Law of thermodynamics is not violated. The second approach is based on what is now called Landauer’s principle, which states that in erasing information that can discern between \(n\) states, a minimum thermodynamic entropy of \(k \log(n)\) is dissipated (Landauer 1961 [1990]). Proponents of the principle argue that because a demon has to erase information on memory devices, Landauer’s principle prohibits a violation of the Second Law of thermodynamics.
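Both exorcisms turn on the quantity \(k \log(n)\). It is worth putting numbers to it: at temperature \(T\), an entropy cost of \(k \log(n)\) corresponds to a minimum dissipated heat of \(kT \log(n)\). The following sketch (our illustration; the constant is the exact SI value of Boltzmann's constant) computes the bound for the erasure of a single bit (\(n = 2\)) at room temperature.

```python
import math

k_B = 1.380649e-23  # Boltzmann constant in J/K (exact by the 2019 SI definition)

def landauer_entropy(n):
    """Minimum thermodynamic entropy (J/K) dissipated when erasing a record
    that can discern between n equally likely states: k * log(n)."""
    return k_B * math.log(n)

def landauer_heat(n, temperature):
    """Corresponding minimum heat (J) dissipated at the given temperature."""
    return temperature * landauer_entropy(n)

# erasing one bit (n = 2) at room temperature (300 K)
print(landauer_entropy(2))      # ~9.57e-24 J/K
print(landauer_heat(2, 300.0))  # ~2.87e-21 J
```

The smallness of these numbers is one reason why the bound is so hard to probe experimentally, and why the debate discussed below turns on principle rather than on measurement.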
In two influential articles, Earman and Norton (1998, 1999) lament that, from the point of view of philosophy of science, the literature on exorcising the demon lacks rigour and reflection on what the goals of the enterprise are, and that the demon has been discussed from various different perspectives, often leading to confusion. Earman and Norton argue that the appeal to information theory has not resulted in a decisive exorcism of Maxwell’s demon. They pose a dilemma for the proponent of an information-theoretic exorcism of Maxwell’s demon. Either the combined system of the vessel and the demon is already assumed to be subject to the Second Law of thermodynamics, in which case it is trivial that the demon will fail. Or, if this is not assumed, then proponents of the information-theoretic exorcism have to supply new physical principles that guarantee the failure of the demon, and they have to give independent grounds for these principles. Yet, in Earman and Norton’s view, such independent grounds have not been convincingly established.
Bub (2001) and Bennett (2003) responded to Earman and Norton that if one assumes that the demon is subject to the Second Law of thermodynamics, the merit of Landauer’s principle is that it shows where the thermodynamic costs arise. Norton (2005, 2017) replies that no precise general principle has been stated that says how erasure and the merging of computational paths necessarily lead to an increase in thermodynamic entropy. He concludes that the literature on Landauer’s principle is too fragile and too tied to a few specific examples to sustain general claims about the failure of Maxwell’s demons. Maroney (2005) argues that thermodynamic entropy and information-theoretic entropy are conceptually different, and that two widespread generalisations of Landauer’s principle fail. Maroney (2009) then formulates what he regards as a more precise generalisation of Landauer’s principle, which he argues does not fail.
The discussions around Maxwell’s demon are now so extensive that they defy documentation in an introductory survey of SM. Classical papers on the matter are collected in Leff and Rex (1990). For more recent discussion see, for instance, Anta (2021b), Hemmo and Shenker (2012; 2019), Ladyman and Robertson (2013, 2014), Leff and Rex (1994), Myrvold (forthcoming), Norton (2013), and references therein.
So far, we have considered how one gas evolves. Now let’s look at what happens when we mix two gases. Again, consider a container with a partition wall in the middle, but now imagine that there are two different gases on the left and on the right (for instance helium and hydrogen), where both gases have the same temperature. We now remove the wall, and the gases start spreading and get mixed. If we then calculate the entropy of the initial and the final state of the two gases, we find that the entropy of the mixture is greater than the entropy of the gases in their initial compartments. This is the result that we expect. The paradox arises from the fact that the calculations do not depend on the fact that the gases are different: if we assume that we have air of the same temperature on both sides of the barrier, the calculations still yield an increase in entropy when the barrier is removed. This seems wrong because it would imply that the entropy of a gas depends on its history and cannot be a function of its thermodynamic state alone (as thermodynamics requires). This is known as the Gibbs Paradox.
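The calculation behind the paradox is simple enough to state explicitly. For two ideal gases of \(N\) particles each, initially occupying volumes \(V\) at the same temperature and pressure, removing the barrier lets each gas expand into the total volume \(2V\). Since the entropy of an ideal gas depends on its volume as \(Nk\log V\) (plus terms that do not change here), each gas gains \(Nk\log 2\), so
\[\Delta S = Nk\log\frac{2V}{V} + Nk\log\frac{2V}{V} = 2Nk\log 2 > 0.\]Nothing in this calculation refers to the gases being different: run naively for two samples of the same gas, it yields the same positive \(\Delta S\), even though removing a barrier between two equilibrium samples of the same gas should change nothing.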
The standard textbook resolution of the paradox is that classical SM gets the entropy wrong because it counts states that differ only by a permutation of two indistinguishable particles as distinct, which is a mistake (Huang 1963). So the problem is rooted in the notion of individuality, which is seen as inherent to classical mechanics. Therefore, so the argument goes, the problem is resolved by quantum mechanics, which treats indistinguishable particles in the right way. This argument raises a number of questions concerning the nature of individuality in classical and quantum mechanics, the way of counting states in both the Boltzmann and the Gibbs approach, and the relation of SM to thermodynamics. Classical discussions include Denbigh and Denbigh (1985: Ch. 4), Denbigh and Redhead (1989), Jaynes (1992), Landé (1965), Rosen (1964), and van Kampen (1984). For more recent discussions, see, for instance, Huggett (1999), Saunders (2006), and Wills (forthcoming), as well as the contributions to Dieks and Saunders (2018) and references therein.
Increasingly, the methods of SM are used to address problems outside physics. Costantini and Garibaldi (2004) present a generalised version of the Ehrenfest flea model and show that it can be used to describe a wide class of stochastic processes, including problems in population genetics and macroeconomics. Colombo and Palacios (2021) discuss the application of the free energy principle in biology. The most prolific applications of SM methods outside physics are in economics and finance, where an entire field is named after them, namely econophysics. For discussions of different aspects of econophysics see Jhun, Palacios, and Weatherall (2018), Kutner et al. (2019), Rickles (2007, 2011), Schinckus (2018), Thébault, Bradley, and Reutlinger (2017), and Voit (2005).
In the introduction we said that SM had to account for the thermodynamic behaviour of physical systems like gases in a box. This is a minimal aim that everybody can agree on. But many would go further and say that there must be stronger reductive relations between SM and other parts of science, or, indeed, as in the Mentaculus, that all layers of reality in the entire universe reduce to SM. For a general discussion of reductionism and inter-theory relations see the relevant entries in this encyclopaedia (scientific reduction and intertheory relations in physics).
A point where the issue of reduction has come to a head is the discussion of phase transitions. Consider again a container full of gas and imagine that the container is hotter than 100°C and that the gas in it is water vapour. Now you start cooling down the container. Once the temperature of the gas falls below 100°C, the gas condenses and turns into liquid water, and once it falls below 0°C, liquid water turns into solid ice. These are examples of phase transitions, namely, first from the gaseous phase to the liquid phase and then from the liquid to the solid phase. Thermodynamics characterises phase transitions as a discontinuity in a thermodynamic potential such as the free energy, and a phase transition is said to occur when this potential shows a discontinuity (Callender 2001). It now turns out that SM can reproduce this only in the so-called thermodynamic limit, which takes the particle number in the system toward infinity (while keeping the number of particles per volume constant). This would seem to imply that phase transitions can only occur in infinite systems. In this vein, physicist David Ruelle notes that phase transitions can occur only in systems that are “idealized to be actually infinite” (Ruelle 2004: 2). This has sparked a heated debate over whether, and if so in what sense, thermodynamics can be reduced to SM, because in nature phase transitions clearly do occur in finite systems (the water in a puddle freezes!). Batterman (2002) draws the conclusion that phase transitions are emergent phenomena that are irreducible to the underlying micro-theory, namely SM. Others push back against this view and argue that no actual infinity is needed and that therefore limiting behaviour is neither emergent nor indicative of a failure of reduction.
Norton (2012) reaches this conclusion by demoting the limit to a mere approximation, Callender (2001) by counselling against “taking thermodynamics too seriously”, and Butterfield by developing a view that reconciles reduction and emergence (2011a, 2011b, 2014). For further discussions of phase transitions and their role in understanding whether, and if so how, thermodynamics can be reduced to SM see Ardourel (2018), Bangu (2009), Butterfield and Bouatta (2012), Franklin (2018), Liu (2001), Palacios (2018, 2019, forthcoming), and Menon and Callender (2013); for a discussion of infinite idealisations in general see Shech (2018).
Similar questions arise when we aim to reduce the thermodynamic entropy to one of the SM entropies (Callender 1999; Dizadji-Bahmani et al. 2010; Myrvold 2011), when we focus on the peculiar nature of quasi-static processes in thermodynamics (Robertson forthcoming), when we study equilibration (Myrvold 2020a), and when we give up the idealisation that SM systems are isolated and take gravity into account (Callender 2011). There is also a question about what exactly reduction is expected to achieve. Radical reductionists seem to expect that once the fundamental level is sorted out, the rest of science follows from it as a corollary (Weinberg 1992), a vision that also seems to drive the Mentaculus. Others are more cautious. Lavis, Kühn, and Frigg (2021) and Yi (2003) note that even if reduction is successful, thermodynamics remains in place as an independent theory: SM requires the framework of thermodynamics, which serves as recipient of information from SM but without itself being derivable from SM.
ergodic hierarchy | information | information processing: and thermodynamic entropy | laws of nature | physics: intertheory relations in | probability, interpretations of | quantum mechanics | quantum mechanics: collapse theories | reduction, scientific | statistical physics: Boltzmann’s work in | supervenience | time: thermodynamic asymmetry in
The Stanford Encyclopedia of Philosophy iscopyright © 2024 byThe Metaphysics Research Lab, Department of Philosophy, Stanford University
Library of Congress Catalog Data: ISSN 1095-5054