Cellular automata (henceforth: CA) are discrete, abstract computational systems that have proved useful both as general models of complexity and as more specific representations of non-linear dynamics in a variety of scientific fields. Firstly, CA are (typically) spatially and temporally discrete: they are composed of a finite or denumerable set of homogeneous, simple units, the atoms or cells. At each time unit, the cells instantiate one of a finite set of states. They evolve in parallel at discrete time steps, following state update functions or dynamical transition rules: the update of a cell state obtains by taking into account the states of cells in its local neighborhood (there are, therefore, no actions at a distance). Secondly, CA are abstract: they can be specified in purely mathematical terms, and physical structures can implement them. Thirdly, CA are computational systems: they can compute functions and solve algorithmic problems. Despite functioning in a different way from traditional, Turing machine-like devices, CA with suitable rules can emulate a universal Turing machine (see entry), and therefore compute, given Turing’s thesis (see entry on the Church-Turing thesis), anything computable.
The mark of CA is in their displaying complex emergent behavior, starting from simple atoms following simple local rules. Because of this, CA attract a growing number of researchers from the cognitive and natural sciences willing to study pattern formation and complexity in a pure, abstract setting. This entry provides an introduction to CA and focuses on some of their philosophical applications: these range from the philosophy of computation and information processing, to accounts of reduction and emergence in metaphysics and cognition, to debates around the foundations of physics.
We will proceed as follows. In the introductory Section 1, CA are first explained via an example: Section 1.1 describes a simple one-dimensional automaton displaying an intuitively manifest behavior. Sections 1.2–1.3 provide a short survey of the history and main applications of CA.
In Section 2, the general theory of CA is explained, together with a selection of computational and complexity-theoretic results in the field. Section 2.1 provides a fourfold schematic definition of CA. Sections 2.2–2.3 explain the classification of one-dimensional CA proposed by Stephen Wolfram. Section 2.4 introduces the Edge of Chaos hypothesis, a key CA-related conjecture in complexity theory. Sections 2.5–2.7 generalize to automata occupying more than one spatial dimension, and/or relaxing some parameters in the definition of 2.1. We focus on the Game of Life—possibly the most popular CA—and its computational capabilities.
Section 3 describes four main uses of CA in philosophical investigation. Firstly, since CA display complex behavioral patterns emerging from simple local rules, they have been naturally linked to emergence: this topic is dealt with in Section 3.1, where different notions of emergence are considered. Secondly, Section 3.2 explores how CA have been put to work, both by philosophers and by scientists, to address the traditional philosophical problems of free will and determinism. Thirdly, Section 3.3 describes the impact of CA theories on the philosophy of computation. Finally, Section 3.4 addresses ontological issues ranging from the sense in which CA count as modelling portions of reality, to the bold philosophical conjecture of some scientists, who claim that the physical world itself may be, at its bottom, a discrete, digital automaton.
We introduce CA using a simple example. Think of an automaton as a one-dimensional grid of simple elements (the cells). Each of them can only instantiate one of two states; let us say that each cell can be turned on or off. The evolution of the system is determined by a transition rule, to be thought of as implemented in each cell. At each time step, each cell updates its status in response to what happens to its neighboring cells, following the rule.

Fig. 1
Although CA are abstract, having a concrete instance in mind can help in the beginning. So think of Fig. 1 as representing the front row of a high school classroom. Each box stands for a student wearing (black) or not wearing (white) a hat. Let us make the two following assumptions:
Hat rule: a student will wear the hat in the following class if one or the other—but not both—of the two classmates sitting immediately on her left and on her right has the hat in the current class (if nobody wears a hat, a hat is out of fashion; but if both neighbors wear it, a hat is now too popular to be trendy).
Initial class: during the first class in the morning, only one student in the middle shows up with a hat (see Fig. 2).

Fig. 2
Fig. 3 shows what happens as time goes by. Consecutive rows represent the evolution in time through subsequent classes.

Fig. 3
Fig. 3 may be surprising. The evolutionary pattern displayed contrasts with the simplicity of the underlying law (the “Hat rule”) and ontology (for in terms of objects and properties, we only need to take into account simple cells and two states). The global, emergent behavior of the system supervenes upon its local, simple features, at least in the following sense: the scale at which the decision to wear the hat is made (immediate neighbors) is not the scale at which the interesting patterns become manifest.
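The Hat rule is, in effect, an exclusive-or of the two neighbors’ states, and the triangular pattern of Fig. 3 can be reproduced in a few lines. A minimal simulation sketch (function names are ours; students beyond the edges of the row are treated as permanently hatless):

```python
def step(row):
    """One synchronous update of the Hat rule: a cell is on (1) at the
    next step just in case exactly one of its two neighbors is on now."""
    padded = [0] + row + [0]            # cells beyond the edges stay off
    return [padded[i - 1] ^ padded[i + 1] for i in range(1, len(padded) - 1)]

# One student in the middle of the front row wears a hat.
row = [0] * 15
row[7] = 1
for _ in range(8):
    print("".join("#" if cell else "." for cell in row))
    row = step(row)
```

Printing each successive row one below the other reproduces the nested triangles of Fig. 3.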
This example is a paradigmatic illustration of what makes CA appealing to a vast range of researchers:
even perfect knowledge of individual decision rules does not always allow us to predict macroscopic structure. We get macro-surprises despite complete micro-knowledge. (Epstein 1999: 48)
Since the notion of emergence and the micro-macro interplay have such an important role in science and philosophy (see the entries on supervenience and emergent properties; for a sample of scientific applications, see Mitchell 2009: 2–13; Gell-Mann 1994: Ch. 9), it has been suggested that many scientific as well as conceptual puzzles can be addressed by adopting the CA perspective. Stephen Wolfram has gone as far as claiming that CA may help us to solve longstanding issues in philosophy:
Among them [the fundamental issues philosophers address] are questions about the ultimate limits of knowledge, free will, the uniqueness of the human condition and the inevitability of mathematics. Much has been said over the course of philosophical history about each of these. Yet inevitably it has been informed only by current intuitions about how things are supposed to work. But my discoveries in this book [A New Kind of Science] lead to radically new intuitions. (Wolfram 2002: 10)
These are very bold claims. In order to assess them, let us take acloser look at the field.
The surprising patterns in the aforementioned classroom example were generated by boxes in a line with just two states and a simple rule. One may wonder how many variations are possible on such a basic framework. To address this issue, let us begin by considering how Andrew Ilachinski, in his review of the literature, narrows down CA applications to four main areas, which will be referred to in the rest of this entry (Ilachinski 2001: 7):
(CA1) emphasizes that CA perform computations. Just like Turing machines, they can be specified in mathematical terms, and implemented in different physical systems. However, CA are peculiar in two important ways. First, unlike Turing machines and von Neumann-architecture conventional computers, CA compute in a parallel, distributed fashion. Second, computation is pretty much “in the eye of the beholder”: there is no tape, but the evolution of the cells’ states can often be interpreted as a meaningful computational procedure (e.g., bits can be encoded using the white/black cell states). Computational hardware inspired by CA can help solve important technological problems (see Ilachinski 2001: 8), but apart from engineering issues, (CA1) also points to major conceptual questions, such as how exactly a universal Turing machine and an automaton can be rigorously compared (see Beraldo-de-Araújo & Baravalle forthcoming) and what are, if any, the philosophical implications of this comparison (see Wolfram 2002: Ch. 12).
(CA2) comprises scientific applications of CA to the modelling of specific problems—to mention just a few: urban evolution (Batty 2005), Ising models (Creutz 1986), neural networks (Franceschetti et al. 1992: 124–128), lattice fluids (Barberousse & Imbert 2013), security (Ray et al. 2023 [Other Internet Resources]), bioinformatics (Xiao et al. 2011), and even turbulence phenomena (Chen et al. 1983). As Ilachinski remarks, for instance, discrete models of turbulence show that
very simple finite dynamical implementations of local conservation laws are capable of exactly reproducing continuum system behavior on the macroscale. (Ilachinski 2001: 8)
(CA3) and (CA4) enter very directly into the philosophical arena. As for (CA3), Daniel Dennett has resorted to a famous automaton we describe below, Conway’s Game of Life, to make his point on determinism and the attribution of high-level concepts to emergent patterns (Dennett 1991, 2003). As for (CA4), CA can provide an account of microphysical dynamics by representing discrete counterparts of quantum field theories (see entry on Quantum Field Theory), alternative to the standard continuous frames. But the more philosophical, and rather bolder, claim in this area is that nature itself may be a CA: Edward Fredkin, for instance, has advanced his “Finite Nature” hypothesis that our universe is an automaton which, at each time step, digitally and locally processes its state for the next time step (see Fredkin 1993). Apart from the interest generated by Fredkin’s claim, entertaining the hypothesis raises a number of questions at the crossroads of physics and metaphysics (what is a natural law?), epistemology (what are the limits of the predictability of physical systems?) and the philosophy of information (what is the role of information in the physical world?). We will address each of these questions in the third Section of this entry.
The father of CA is John von Neumann (von Neumann 1951). Working on self-replication and attempting to provide a reductionist theory of biological development, von Neumann was trying to conceive a system capable of producing exact copies of itself. Now biology prima facie appears to be the realm of fluidity and continuous dynamics. But following a suggestion of his colleague Stanislaw Ulam, von Neumann decided to focus on a discrete, two-dimensional system. Instead of just black-or-white cells, von Neumann’s automaton used 29 different states and rather complicated dynamics, and was capable of self-reproduction. Von Neumann’s CA was also the first discrete parallel computational model in history formally shown to be a universal computer, i.e., capable of emulating a universal Turing machine and computing all recursive functions (see entry).
In the early Sixties, E.F. Moore (1962) and Myhill (1963) proved the Garden-of-Eden theorems, stating conditions for the existence of so-called Gardens of Eden, i.e., patterns that cannot appear on the lattice of a CA except as initial conditions. Gustav Hedlund (1969) investigated cellular automata within the framework of symbolic dynamics. In 1970 the mathematician John Conway introduced his aforementioned Life game (Berlekamp, Conway, & Guy 1982), arguably the most popular automaton ever, and one of the simplest computational models ever proved to be a universal computer. In 1977, Tommaso Toffoli used cellular automata to directly model physical laws, laying the foundations for the study of reversible CA (Toffoli 1977).
Stephen Wolfram’s works in the 1980s contributed to putting the growing community of CA followers on the scientific map. In a series of papers, Wolfram extensively explored one-dimensional CA, providing the first qualitative taxonomy of their behavior and laying the groundwork for further research. A particular transition rule for one-dimensional CA, known as Rule 110, was conjectured to be universal by Wolfram. Some twenty years after the conjecture, Matthew Cook proved that Rule 110 is capable of universal computation (Cook 2004; Wolfram 2002 also contains a sketch of the proof).
We now take a closer look at CA, focusing on models and results of philosophical interest. Although the variety of systems to be found in the CA literature is vast, one can generate virtually all CA by tuning the four parameters that define their structure: the discrete lattice of cells (its dimension and geometry), the finite set of states each cell can instantiate, the neighborhood each cell takes into account, and the local transition rule, applied in parallel at discrete time steps.
One can exhaustively describe, for instance, the automaton of our classroom example: a one-dimensional lattice of cells, each instantiating one of two states (hat on, hat off), with a neighborhood consisting of the two immediately adjacent cells, and the Hat rule as transition rule.
A rule for a CA can be expressed as a conditional instruction: “If the neighborhood is this-and-this, then turn to state s”. One can write the general form of the rule for one-dimensional CA:
\[\tag{Rule1D}\sigma_i (t + 1) = \phi(\sigma_{i - r} (t), \sigma_{i - r + 1} (t), \ldots, \sigma_{i + r - 1} (t), \sigma_{i + r} (t))\]
where \(\sigma_{i}(t) \in \Sigma = \{0, 1, \ldots, k - 1\}\) is the state of cell number i at time step t; r specifies the range, that is, how many cells on either side count as neighbors for a given cell; and \(\phi\) is defined explicitly by assigning values in \(\Sigma\) to each of the \(k^{2r+1}\) \((2r + 1)\)-tuples representing all the possible neighborhood configurations. For example, with \(r = 1\), \(\Sigma = \{1, 0\}\), a possible transition rule \(\phi\) can be expressed as in Fig. 4 (with 1 being represented as black, 0 as white):
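In code, \(\phi\) is just a finite lookup table from \((2r+1)\)-tuples of states to states. A sketch of (Rule1D), with periodic boundaries as a simplifying assumption of ours, and function names of our own choosing:

```python
def make_update(phi, k, r):
    """Turn an explicit table phi — a dict mapping each (2r+1)-tuple
    over {0, ..., k-1} to a state — into a global update function."""
    assert len(phi) == k ** (2 * r + 1)   # phi must cover every neighborhood
    def update(config):
        n = len(config)
        # sigma_i(t+1) = phi(sigma_{i-r}(t), ..., sigma_{i+r}(t)),
        # with the row wrapping around at the edges.
        return [phi[tuple(config[(i + d) % n] for d in range(-r, r + 1))]
                for i in range(n)]
    return update
```

For instance, with \(k = 2\) and \(r = 1\), the table sending \((a, b, c)\) to \(a \oplus c\) yields the classroom automaton.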

Fig. 4
For a given cell, each triple at the top represents a possible neighborhood configuration at t, the cell at issue being the one in the middle: for each configuration, the square at the bottom specifies the cell’s state at \(t + 1\). This is our classroom example: you will have a black cell just in case precisely one of the neighbors was black.
This simple representation is also at the core of the widely adopted Wolfram code (Wolfram 1983), assigning to each rule a number: with black = 1 and white = 0, the bottom row can be read as a binary number (01011010); converting to decimal gives the rule’s name (in this case, Rule 90). Since rules for CA with \(r = 1\) and \(k = 2\) differ just in the bottom row of the diagram, this encoding scheme effectively identifies each possible rule in the class. One-dimensional CA with \(r = 1\) and \(k = 2\) are among the simplest CA one can define, yet their behavior is at times quite interesting. When Stephen Wolfram started to explore this field in the Eighties, that class seemed a perfect fit. With \(r = 1\), there are 8 possible neighborhood configurations (see Fig. 4 above) to be mapped to \(\{1, 0\}\), giving a total of \(2^8 = 256\) rules. Starting with random initial conditions, Wolfram went on to observe the behavior of each rule in many simulations. As a result, he was able to classify the qualitative behavior of each rule into one of four distinct classes. Repeating the original experiment, we simulated the evolution of two rules for each class of Wolfram’s scheme.
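The coding scheme is easy to implement: bit \(n\) of the rule number gives the output for the neighborhood whose three cells, read as a binary numeral, denote \(n\). A sketch (function names are ours):

```python
def rule_table(number):
    """Decode a Wolfram rule number (0-255) into the table of an
    r = 1, k = 2 CA: (left, center, right) -> new state."""
    return {((n >> 2) & 1, (n >> 1) & 1, n & 1): (number >> n) & 1
            for n in range(8)}

def rule_number(table):
    """Inverse: read the bottom row of the diagram as a binary number."""
    return sum(out << (4 * left + 2 * center + right)
               for (left, center, right), out in table.items())
```

Decoding 90 indeed yields the classroom rule: each neighborhood maps to the exclusive-or of its two outer cells.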
Class 1 rules, leading to homogeneous states, with all cells stably ending up with the same value:
Rule 250

Rule 254

Class 2 rules, leading to stable structures or simple periodic patterns:
Rule 4

Rule 108

Class 3 rules, leading to seemingly chaotic, non-periodic behavior:
Rule 30

Rule 90

Class 4 rules, leading to complex patterns and structures propagating locally in the lattice:
Rule 54

Rule 110

Class 1 comprises the rules that quickly produce uniform configurations. Rules in Class 2 produce a uniform final pattern, or a cycling between final patterns, depending on the initial configurations. The configurations produced by members of Class 3 are pretty much random-looking, although some regular patterns and structures may be present.
Class 4 deserves special attention. If we observe the universe generated by Rule 110 we see regular patterns (although not as regular as in Rule 108) as well as some chaotic behavior (although not as noisy as in Rule 90). Now the basic feature a CA needs to perform computations is the capacity of its transition rule to produce “particle-like persistent propagating patterns” (Ilachinski 2001: 89), that is, localized, stable, but non-periodic configurations of groups of cells, sometimes called solitons in the literature, that can preserve their shape. These configurations can be seen as encoding packets of information, preserving them through time, and moving them from one place to another: information can propagate in time and space without undergoing significant decay. The amount of unpredictability in the behavior of Class 4 rules also hints at computationally interesting features: by the Halting Theorem (see section in entry on Turing machines), it is a key feature of universal computation that one cannot in principle predict whether a given computation will halt given a certain input. These insights led Wolfram to conjecture that Class 4 CA were (the only ones) capable of universal computation. Intuitively, if we interpret the initial configuration of a Class 4 CA as its input data, a universal Class 4 CA can evaluate any effectively computable function and emulate a universal Turing machine. As we mentioned above, Rule 110 was indeed proved to be computationally universal.
(See the supplementary document The 256 Rules.)
The intermediate nature of Class 4 rules is connected to the idea that interesting complexity, such as that displayed by biological entities and their dynamics, lies in a middle area between the two extremes of boring regularities and noisy chaos:
Perhaps the most exciting implication [of CA representation of biological phenomena] is the possibility that life had its origin in the vicinity of a phase transition and that evolution reflects the process by which life has gained local control over a successively greater number of environmental parameters affecting its ability to maintain itself at a critical balance point between order and chaos. (Langton 1990: 13)
CA provided not just the intuition, but a formal framework to investigate the hypothesis. In the late Eighties the “Edge of Chaos” picture received considerable interest from CA practitioners. Packard 1988 and Langton 1990 were the first studies to give the Edge of Chaos a now well-known interpretation in the CA context. As Miller and Page put it, “these early experiments suggested that systems poised at the edge of chaos had the capacity for emergent computation” (Miller & Page 2007: 129). The idea is simple enough: what happens if we take a rule like Rule 110 and introduce a small perturbation? If we are to believe the Edge of Chaos hypothesis, we should expect the rules obtained by small changes in Rule 110 to exhibit either simple or chaotic behavior. Let us consider a single switch from 1 to 0 or 0 to 1 in the characteristic mapping of Rule 110. The results are the following eight neighboring rules, each differing from Rule 110 by a single bit (the flipped bits lie along the diagonal of the array):
| | 110 | 111 | 108 | 106 | 102 | 126 | 78 | 46 | 228 |
|---|---|---|---|---|---|---|---|---|---|
| 000 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 001 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 1 |
| 010 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 |
| 011 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 |
| 100 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
| 101 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 |
| 110 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 |
| 111 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| Class | 4 | 2 | 2 | 3 | 3 | 3 | 1 | 2 | 1 |
At a first approximation, the Edge of Chaos hypothesis is confirmed: three of the eight neighbors are Class 3, three are Class 2, two are Class 1: Rule 110 is the only Class 4 rule in the table. To generalize these findings to the entire class of rules for one-dimensional CA, Langton introduced a parameter, \(\lambda\), that applies to each \(\phi\): for \(k = 2\), \(r = 1\) (binary-state, unary-range) CA, \(\lambda(\phi)\) can be computed as the fraction of entries of the transition rule table that are mapped to a non-zero output (see Langton 1990: 14 for the general definition). In our case this means: \(\lambda(\phi)\) will be equal to the number of ones in the rule column, divided by eight—e.g., \(\lambda(\phi) = 5/8\) for \(\phi\) = Rule 110 and \(\lambda(\phi) = 1/2\) for \(\phi\) = Rule 46. Langton’s major finding was that a simple measure such as \(\lambda\) correlates with the system behavior: as \(\lambda\) goes from 0 to 1, the average behavior of the systems goes from freezing to periodic patterns to chaos. Langton singled out 1/2 as the value of \(\lambda\) at which the average behavior first shows evidence of chaos: rules \(\phi\) with \(\lambda(\phi) \sim 1/2\) were highlighted as being on the Edge (see Miller & Page 2007: 133).
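For this class of rules, \(\lambda\) can be read directly off the Wolfram code: it is the number of set bits in the rule’s 8-bit number, divided by 8. A sketch (the helper name is ours):

```python
def langton_lambda(rule_number):
    """Langton's lambda for an elementary (k = 2, r = 1) CA rule:
    the fraction of the 8 neighborhood configurations mapped to a
    non-zero state, i.e., the ones in the rule's 8-bit Wolfram code."""
    return bin(rule_number & 0xFF).count("1") / 8
```

For instance, `langton_lambda(110)` gives 5/8 and `langton_lambda(46)` gives 1/2, matching the values in the text.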
| \(\lambda\) | All Rules | Chaotic Rules | Complex Rules |
|---|---|---|---|
| 0 | 1 | 0 | 0 |
| 1/8 | 8 | 0 | 0 |
| 1/4 | 28 | 2 | 0 |
| 3/8 | 56 | 4 | 1 |
| 1/2 | 70 | 20 | 4 |
| 5/8 | 56 | 4 | 1 |
| 3/4 | 28 | 3 | 0 |
| 7/8 | 8 | 0 | 0 |
| 1 | 1 | 0 | 0 |
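The “All Rules” column of the table can be checked directly: the number of elementary rules whose table contains exactly \(8\lambda\) ones is the binomial coefficient \(\binom{8}{8\lambda}\). A quick verification sketch:

```python
from math import comb

# Number of elementary CA rules at each lambda value 0, 1/8, ..., 1:
# choose which of the 8 table entries map to 1.
all_rules = [comb(8, ones) for ones in range(9)]
print(all_rules)          # [1, 8, 28, 56, 70, 56, 28, 8, 1]
print(sum(all_rules))     # 256, the full rule space
```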
Both chaotic and complex rules have an average \(\lambda\) value around 1/2, thus apparently supporting the Edge of Chaos hypothesis. It is fair to say, though, that some have cast doubts on the explanatory role of the parameter \(\lambda\) and the inferences drawn from it. In particular, the transition region of the Edge of Chaos seems to be itself complex. Miller and Page note that “there are multiple edges, not just a single one” (Miller & Page 2007: 133). Aggregate results do not hold when we analyze individual rules, even paradigmatic ones:
| | 110 | 111 | 108 | 106 | 102 | 126 | 78 | 46 | 228 |
|---|---|---|---|---|---|---|---|---|---|
| \(\lambda\) | 5/8 | 3/4 | 1/2 | 1/2 | 1/2 | 3/4 | 1/2 | 1/2 | 3/4 |
As the table shows, among the Rule 110 neighbors, some chaotic rules \(\phi\) have \(\lambda(\phi) = 3/4\), some cyclic ones have \(\lambda(\phi) = 1/2\) and, indeed,
every one of the rules classified as complex in this space has at least one chaotic neighbor with a lower \(\lambda\) value and one with a higher value. (Miller & Page 2007: 135)
Melanie Mitchell, Peter Hraber and James Crutchfield replicated Langton and Packard’s experiments, reporting very different results (Mitchell, Hraber, & Crutchfield 1994). In particular, they report that serious computational phenomena take place much closer to a chaotic \(\lambda(\phi) = 1/2\) than previously thought. Apart from technical points, a conceptual flaw in the original findings is the use of aggregate statistics, which are difficult to interpret in a high-variance context:
if, instead, the hypotheses are concerned with generic, statistical properties of CA rule space—the “average” behavior of an “average CA” at a given \(\lambda\)—then the notion of “average behavior” must be better defined. (Mitchell, Hraber, & Crutchfield 1994: 14)
While it is fair to conclude that complex behavior does not lie at the Edge of Chaos taken in a simplistic sense (i.e., it is not straightforwardly correlated with the simple \(\lambda\)), interest in the connection between computational capabilities and phase transitions in the CA rule space has been growing since then. We will consider such developments below, specifically in the context of CA and the philosophy of computation.
Notwithstanding the computational interest of one-dimensional CA, philosophical issues have been discussed more often in connection with two-dimensional CA. The first CA, von Neumann’s self-reproducing automaton, inhabited a two-dimensional grid. Besides, two-dimensional CA are suitable for representing many physical, biological, and even human phenomena, ranging from the dynamics of perfect gases to the movements of birds in a storm and soldiers on a battlefield. The most common configurations have either square or hexagonal cells, given their translational and rotational symmetries. Moving to two dimensions, of course, also expands the possibly interesting combinations of rules and neighborhoods. As for the latter, the two most common options in a grid of squares are the von Neumann neighborhood, where each cell interacts only with its four horizontally and vertically adjacent mates, and the Moore neighborhood, comprising all eight immediately adjacent cells.
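The two neighborhoods differ only in whether the four diagonal cells count. A minimal sketch (function names are ours):

```python
def von_neumann(i, j):
    """The four orthogonally adjacent cells of (i, j)."""
    return {(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)}

def moore(i, j):
    """All eight immediately adjacent cells of (i, j)."""
    return {(i + di, j + dj)
            for di in (-1, 0, 1) for dj in (-1, 0, 1)
            if (di, dj) != (0, 0)}
```

Every von Neumann neighborhood is thus a subset of the corresponding Moore neighborhood.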
By way of example, we introduce the famous Game of Life (or, more briefly, Life) by John Conway (see Berlekamp, Conway, & Guy 1982). Life fits well with our usual schema: a two-dimensional lattice of square cells, each either alive or dead; each cell updates on the basis of its Moore neighborhood, with a dead cell coming alive just in case exactly three of its eight neighbors are alive, and a living cell remaining alive just in case two or three of them are.
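Life’s rule (birth on exactly three living Moore neighbors; survival on two or three) fits in a few lines. A minimal sketch, representing a configuration as the set of coordinates of living cells, a choice of ours:

```python
from collections import Counter

def life_step(live):
    """One synchronous step of Conway's Game of Life.
    `live` is the set of (x, y) coordinates of living cells."""
    # Count, for every cell, how many living Moore neighbors it has.
    counts = Counter((x + dx, y + dy)
                     for (x, y) in live
                     for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                     if (dx, dy) != (0, 0))
    # Birth on exactly 3 neighbors; survival on 2 or 3.
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in live)}
```

A Blinker, one of the simplest oscillators, returns to itself after two steps: `life_step(life_step({(0, 0), (1, 0), (2, 0)}))` gives back the original three cells.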
Life would definitely be considered a Class 4 CA by Wolfram’s taxonomy. In this simple setting, periodic structures, stable blocks and complex moving patterns come into existence, even starting from a very simple initial configuration. Conway remarked:
It’s probable, given a large enough Life space, initially in a random state, that after a long time, intelligent, self-reproducing animals will emerge and populate some parts of the space. (cited in Ilachinski 2001: 131)
Life-fans explored the CA’s possible patterns of evolution, and shared their findings in what has been called Life’s zoology (Dennett 2003: 41). Here is a small gallery of samples together with snapshots of a typical simulation (for more pictures and animations, see Other Internet Resources at the end). Gliders are the most popular among the basic Life inhabitants: a simple 5-bit structure, a glider can travel the Life grid in a 4-time-step cycle:
Glider

\(t_{0}\)

\(t_{1}\)

\(t_{12}\)
Toads are period-2 blinking configurations: together with Blinkers and Beacons they are the simplest oscillators of the universe.
Toad

\(t_{0}\)

\(t_{1}\)

\(t_{3}\)
Eaters have the feature of devouring other configurations, e.g., gliders, while maintaining their own form intact (because of this, they play an important role in Life’s computational abilities).
An Eater devouring a Glider

\(t_{0}\)

\(t_{2}\)

\(t_{4}\)
A typical evolution of Life starting from random initial conditions may contain all of the above notable figures and much more. Some initial configurations may end up, even after a few time steps, in static or simple periodic structures. Other configurations, though, can produce non-periodic, increasingly complex environments whose development can be unpredictable (even in the computational sense we are about to explore). As Ilachinski has suggestively conjectured from this:
Upon observing the seemingly unlimited complexity and variety of Life’s evolving patterns, it becomes almost impossible to refrain from imagining, along with Conway, that, were the game really to be played on an infinite lattice, there must surely arise true living “life-forms”, perhaps themselves evolving into more complex, possibly sentient, “organisms”. (Ilachinski 2001: 133)

\(t_{0}\)

\(t_{10}\)

\(t_{20}\)

\(t_{30}\)

\(t_{40}\)

\(t_{175}\)
The mathematical literature on CA does not refrain from describing the Life configurations using the same imaginative vocabulary we used: items are born, live, move around, eat other figures, die, etc. The universe these patterns inhabit may also be described, though, as a collection of individual cells, each of which does not directly depend on what is happening on the macro-scale. And the life in Life can also be described in the simple mathematical language of matrices and discrete sequences. But if one is only told the basic Life rule, one could hardly imagine the complexity it can generate—until one sees it. Life’s reputation among scientists and philosophers arguably comes from its challenging naive intuitions about complexity, pattern formation and reality, persistence, and continuity: as a toy universe we ourselves built, we feel we should know in advance what dynamics are allowed. This has been shown to be impossible, in a mathematically precise sense.
Like any other CA, Life can be considered a computational device: an initial configuration of the automaton can encode an input string. One can let the system run and, at some point, read the current configuration as the result of the computation performed so far, decoding it into an output string. But exactly what can Life compute? It turns out that Life can compute everything a universal Turing machine can and can therefore, taking on board Turing’s Thesis, function as a general-purpose computer: a suitable selection of initial conditions can ensure that the system carries out arbitrary algorithmic procedures.
The proof of the universal computational capacities of Life presented in Berlekamp, Conway, and Guy 1982 consists in showing that the basic building blocks or primitives of standard digital computation can be emulated by appropriate patterns generated by Life—in particular: (a) data storage or memorization, (b) data transmission, requiring wires and an internal clock, and (c) data processing, requiring a universal set of logic gates, like negation, conjunction and disjunction. An actual Turing machine was later explicitly implemented in Life by Paul Rendell (see Other Internet Resources).
This finding is not of great engineering importance (no one would spend their time translating “\(24 + 26 / 13\)” into Life). However, it raises a conceptual issue about any universe sharing the capacity to produce and host universal computers: because of the aforementioned Halting Theorem, no general algorithm can decide whether, given some initial configuration as input, Life will eventually die out or halt. It is in this sense that the evolution of the automaton is unpredictable. Given that the development of CA that are computationally universal cannot be predicted by direct mathematical analysis alone, it is no surprise that CA practitioners have adopted the language of philosophy and talked of phenomenological studies of CA (we will come back to this terminology in more detail in Section 3.4 below, discussing how CA model whatever they can model). Here the automaton is realized as computer software, and the observable emergent properties of its evolution are empirically registered as the computer simulation advances. In Wolfram’s turn of phrase, Life is algorithmically irreducible: no algorithmic shortcut is available to anticipate the outcome of the system given its initial input. “Life—like all computationally universal systems—defines the most efficient simulation of its own behavior” (Ilachinski 2001: 15). This raises the important philosophical question of the limits of the predictability of any universe capable, just as Life is, of producing and hosting universal computers.
Notwithstanding the historical and conceptual centrality of the CA described in this section, many important developments in the field could not be presented in the space allowed for this entry. One can relax some of the assumptions in the general characterization of CA provided in Section 2.1 and get interesting results. The transition rule can be probabilistic and take into consideration more than just one time step (see Richards, Meyer, & Packard 1990: probabilistic automata are widely used to represent the stochastic dynamics of microphysical systems); the cell-state updating can be asynchronous (see Ingerson & Buvel 1984); the lattice can be made of non-homogeneous cells following different transition rules (see Kauffman 1984); even the discreteness constraint can be relaxed by having the set of states be the set of real numbers (see Wolfram 2002: 155–157).
CA are also being fruitfully used in connection with the issue of the thermodynamic limits of computation: is there a minimum amount of energy needed to perform a logical operation? Landauer (1961) argued that irreversible logical operations (i.e., operations that, not corresponding to bijections, cannot be run backwards, as they entail some information loss) necessarily dissipate energy. The invention of the Fredkin reversible logical gate and of the Billiard Ball Model of reversible computation (Fredkin & Toffoli 1982) strengthened the importance of the link between universal reversible automata and the physical properties of computation (for an overview, see Ilachinski 2001: 309–323; for a sample reversible CA, see Berto, Rossi, & Tagliabue 2016).
In recent years, the growth of Artificial Intelligence (AI) as a prominent sub-field of computer science has led to interesting contamination between AI and CA. On the one hand, prominent AI researchers have explicitly mentioned contributions from the complex-systems literature – such as CA – as alternative ways to model collective behavior (Ha & Tang 2022). On the other, traditional CA-based modeling has been extended to leverage the “powerful language of loss functions” (Mordvintsev et al. 2020) for differentiable rules, and to take advantage of the extensive tooling built for gradient-based numerical optimization: CA built on top of neural networks showcase (learned) asynchronous rules (Mordvintsev et al. 2020) for morphogenesis, as well as (learned) variable neighborhood composition (Grattarola et al. 2021).
Finally, it is worth mentioning that genetic algorithms have been used with CA to study how evolution creates computation (for a survey of important results, see Mitchell, Crutchfield, & Das 1996). While the aforementioned sources further explore these possibilities, the sample CA models discussed so far will be sufficient for the philosophical arguments we are going to address henceforth.
A growing number of CA-related philosophical arguments are being produced, both by philosophers and by scientists interested in the conceptual implications of their work. Among the interesting issues addressed through the CA approach in the philosophical market are the structure of emergence, free will, the nature of computation, and the physical plausibility of a digital world.
CA can be considered a paradigmatic locus for the study of phenomena related to emergence (for an introduction, see the entry on emergent properties). One can initially divide the problem of emergence into two separate questions, roughly corresponding to epistemological and ontological issues: How can we recognize emergence? What is the ontological status of the high-level properties and features? As a matter of historical fact, CA have been invoked mainly to address the former, but we will see in Section 3.4 below that there is work for CA also on the ontological side.
Epistemological issues have often been raised in connection with complex systems in general. In their open agenda for complex social systems, Miller and Page include the following question: “Is there an objective basis for recognizing emergence and complexity?” (Miller & Page 2007: 233–234). The literature on CA has variously addressed the issue. On the one hand, being a low-level simple and controllable environment, CA present themselves as a natural framework for tackling the problem in its purest form. On the other hand, CA researchers have recognized how the systemic and global features of a complex CA system can be hard to predict even with perfect knowledge of low-level entities and laws:
Over and over again we will see the same kind of thing: that even though the underlying rules for a system are simple, and even though the system is started from simple initial conditions, the behavior that the system shows can nevertheless be highly complex. (Wolfram 2002: 28)
Due to the local nature of a CA’s operations, it is typically very hard, if not impossible, to understand the CA’s global behavior (…) by directly examining either the bits in the lookup table or the temporal sequence of raw 1–0 spatial configurations of the lattice. (Hordijk, Crutchfield, & Mitchell 1996: 2)
Now the issue of detecting emergence is connected to the conceptual problems of defining what an emerging feature is: we need some idea of what we are looking for in order to scan the space-time evolution of a system and recognize its patterns. We may start with the fourfold characterization of emergence provided in Clark 2013. Emergence may be taken:
In the first sense, an emergent feature is “any interesting behaviour that arises as a direct result of multiple self-organizing … interactions occurring in a system of simple elements” (Clark 2013: 132). CA clearly fit the E1 bill, but that is because this is a rather generic characterization (What counts as interesting? What is self-organization?). Things get a bit more precise with E2: emergent features here are taken as unprogrammed, that is, as such that there is no program explicitly encoding the relevant phenomena, features, or processes in the target system (a typical example is cricket phonotaxis, see Clark 2013: 120: female crickets move towards males via a mechanical body system directing them to the source of sounds of particular wavelengths; one can describe this as females aiming towards males after hearing their sound, but what is encoded in the cricket’s body functions is just an automatic earlier activation of the side of the body first reached by the sounds). According to the concept embodied in E3, we get emergence as interactive complexity when “complex, cyclic interactions give rise to stable and salient patterns of systemic behavior” (Clark 2013: 134); in particular, the interactions are supposed to be nonlinear (a typical example is provided by convection rolls in a fluid heated in a pan: see Kelso 1995: 5).
Now the emergent properties attracting CA scholars’ attention, unsurprisingly, have mainly been computational properties, i.e., the features enabling a system to perform complex calculations in spite of not being explicitly computationally encoded at the base level (which sits in the vicinity of E2). Additionally, as we have seen during our discussion of the Edge of Chaos hypothesis, CA scholars have focused on the study of nonlinear global dynamics emerging from the local interactions of the CA cells (which sits in the vicinity of E3). In order to introduce the formal work on emergent CA computation, and to compare these findings with the available philosophical accounts, we can start again with a concrete example. This is the “classification problem”.
We want to design a one-dimensional automaton that answers a simple question: Are there more white or black cells at a given time \(t_0\)? Starting from any initial conditions with more white (black) cells, the ideal automaton will halt having all its cells turned white (black) after a given number of time steps (designing an automaton that always gives the right answer is not feasible in practice; so the performance is judged by the fraction of random initial conditions that are correctly classified).
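To see why the task is hard, consider the most obvious candidate rule, in which each cell simply adopts the majority state of its own small neighborhood. The following Python sketch (our own illustration; the lattice sizes and the radius are arbitrary choices, not from the literature) shows that this naive local rule freezes into stable black and white blocks instead of converging to a uniform answer:

```python
def majority_step(cells, radius=1):
    """Each cell takes the majority state (0 or 1) in its neighborhood.
    A purely local guess at density classification."""
    n = len(cells)
    out = []
    for i in range(n):
        neigh = [cells[(i + d) % n] for d in range(-radius, radius + 1)]
        out.append(1 if sum(neigh) * 2 > len(neigh) else 0)
    return out

def classify(cells, steps=100):
    """Iterate the majority rule until it converges or runs out of steps."""
    for _ in range(steps):
        nxt = majority_step(cells)
        if nxt == cells:
            break  # fixed point reached -- often a non-uniform, 'stuck' one
        cells = nxt
    return cells
```

A scattered minority is absorbed correctly, but a solid minority block is a fixed point of the rule: the lattice never becomes uniform, so the classification fails. This is the failure that motivates the search for more sophisticated rules discussed next.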
[Figure: a sample initial configuration at \(t_0\) and the corresponding uniform final configuration at \(t_n\)]
The task is far from trivial in a CA, involving the intricacy of emergence in both the E2 and the E3 sense. The answer requires a global perspective (how many cells are white (black) in the lattice as a whole?). However, the cells work with only local rules explicitly encoded in them: no single cell can do the counting. The ideal automaton should find a way to aggregate information from several parts of its own lattice to give the final answer. A kind of emergent computation is needed to successfully solve this density classification task. It has been proved that no CA can solve this problem precisely (see Land & Belew 1995). Since, however, CA with larger neighborhoods achieve better results in tasks of this kind, genetic algorithms are used to efficiently search the solution space (genetic algorithms are computational analogues of genetic evolution, in which a computational problem is addressed by producing, combining and selecting solutions to it; the details of the procedure as applied to the classification problem are not important for our purpose, but, for an accessible presentation, see Mitchell 2009: 22; for a general introduction see Mitchell 1998). The following is a diagram of the “Rule \(\phi_{17083}\)”, discovered by James Crutchfield and Melanie Mitchell (1995). The CA implementing the rule starts from an initial state with more white cells:
[Fig. 5: space-time diagram of Rule \(\phi_{17083}\) run on an initial condition with a majority of white cells]
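The genetic-algorithm search just mentioned can be sketched in a bare-bones form (purely illustrative and of our own devising: we evolve radius-1 rules on tiny lattices, whereas the actual experiments by Mitchell, Crutchfield and colleagues evolved radius-3 rules on much larger lattices, so no rule found here will approach the performance of \(\phi_{17083}\)):

```python
import random

def run_rule(rule_bits, cells, steps):
    """Iterate an elementary (radius-1) CA given as an 8-entry lookup table."""
    n = len(cells)
    for _ in range(steps):
        cells = [rule_bits[4 * cells[(i - 1) % n] + 2 * cells[i]
                           + cells[(i + 1) % n]] for i in range(n)]
    return cells

def fitness(rule_bits, rng, trials=20, n=15, steps=30):
    """Fraction of random initial conditions classified correctly:
    all-1 when 1s are the majority, all-0 otherwise."""
    ok = 0
    for _ in range(trials):
        cells = [rng.randint(0, 1) for _ in range(n)]
        target = 1 if sum(cells) * 2 > n else 0
        if run_rule(rule_bits, cells, steps) == [target] * n:
            ok += 1
    return ok / trials

def evolve(rng, pop_size=20, generations=15):
    """Minimal genetic algorithm: keep the fitter half of the population,
    refill it by single-point crossover plus a point mutation."""
    pop = [[rng.randint(0, 1) for _ in range(8)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda r: fitness(r, rng), reverse=True)
        survivors = pop[:pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(8)
            child = a[:cut] + b[cut:]
            child[rng.randrange(8)] ^= 1  # point mutation
            children.append(child)
        pop = survivors + children
    return max(pop, key=lambda r: fitness(r, rng))
```

The produce/combine/select loop in `evolve` is exactly the schema described in the parenthetical above: candidate rule tables are bred and culled by their classification accuracy on random lattices.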
At time step 250 (not shown in Fig. 5), the grey areas disappear and all cells end up white, i.e., the classification made by the automaton is correct. A high-level description of what is going on would have it that the white, black and grey regions are “expanding”, “contracting”, “moving”, these being nonlinear effects of the low-level working of the cells, which compute their local states and, by doing so, manage to carry signals along the lattice. But how are we to explain the emergent computation CA of this kind perform via such nonlinear dynamics? Building on previous work on computation theory applied to CA (Hanson & Crutchfield 1992; Crutchfield & Hanson 1993), Mitchell and Crutchfield filtered out from the original diagram what they call “domains”, i.e., dynamically homogeneous spatial regions (Crutchfield & Mitchell 1995: 10745):
[Fig. 6: the same space-time diagram with the domains filtered out]
Although in this case the “domains” are manifest, it is crucial to point out that their definition is rigorously mathematical. The whole process of domain-detection can be carried out algorithmically (see Hanson & Crutchfield 1992 for details). When the boundary of a domain remains spatially localized over time, that boundary becomes a “particle”:
Embedded particles are a primary mechanism for carrying information (…) over long space-time distances. (…) Logical operations on the signals are performed when the particles interact. The collection of domains, domain walls, particles and particle interactions for a CA represents the basic information processing elements embedded in the CA’s behavior—the CA’s “intrinsic” computation. (Crutchfield & Mitchell 1995: 10744)
There are five stable particles (termed α, γ, δ, ε, µ) and one unstable particle (β) for this automaton: their interaction (annihilation, decay, reaction) supports the emergent logic of the system. The two circles in the image above are examples of what may happen during an interaction. In the first case, \(\alpha + \delta \rightarrow \mu\), a spatial configuration representing high, low, and then ambiguous densities is mapped to a high-density signal µ; in the second, \(\mu + \gamma \rightarrow \alpha\), a spatial configuration representing high, ambiguous, and then low density is mapped to an ambiguous-density signal α (Crutchfield & Mitchell 1995: 10745). The whole computational mechanics is worth exploring in more detail, but we can already safely generalize to the basic philosophical point of this and related works.
According to O’Connor and Wong 2015, in the context of dynamic systems and the study of complexity, emergence is characterized by most authors
strictly in terms of limits on human knowledge of complex systems. Emergence for such theorists is fundamentally an epistemological, not metaphysical, category. (O’Connor & Wong 2015: Sec. 2)
But the Crutchfield-Mitchell approach suggests a different perspective. Firstly, the emergent (both in the E2- and in the E3-sense) computational properties in CA can in an important sense be objectively defined (see Crutchfield 1994a and the more accessible Crutchfield 1994b): although it is customary in this setting to talk of emergent computation being “in the eye of the beholder”, because not explicitly encoded at the base level, the detection and classification of patterns is itself algorithmic. Secondly, Crutchfield characterizes CA-emergence of this kind as, in a sense, intrinsic: the emerging patterns “are important within the system” (Crutchfield 1994b: 3), not merely important for an observer outside the system. More precisely: they are mathematically grounded in the basic features of the system, despite not being explicitly mentioned in the standard abstract characterization of the program, that is, the transition rule implemented in the CA cells (Crutchfield mentions as non-intrinsic emergent phenomena the patterns in the Belousov-Zhabotinsky reaction and the energy recurrence in harmonic oscillator chains reported by Fermi, Pasta and Ulam—see Crutchfield 1994b).
Crutchfield infers from this that many cases of emergence are indeed not reducible to some interaction with an observer. They are genuine instances of an intrinsic phenomenon, not the results of some human-biased discovery (Crutchfield 1994b: 2). If emergence were not intrinsic, scientific activity would indeed be a subjective enterprise of “pattern discovery”:
Why? Simply because emergence defined without this closure leads to an infinite regress of observers detecting patterns of observers detecting patterns… (Crutchfield 1994b: 10)
Summarizing Crutchfield’s work, Miller and Page say that the concept of emergence
has thus made the transition from metaphor to a measure, from something that could only be identified by ocular magic to something that can be captured using standard statistics. (Miller & Page 2007: 234)
Such remarks may sound philosophically naive to a trained epistemologist. It is not obvious, for instance, that the existence of a mathematical or specifically algorithmic emergent pattern would block a supposed regress entailed by a non-mathematical emergence. Why would science as an activity of (occasionally) non-algorithmic pattern discovery be a merely subjective enterprise? If these claims can be re-phrased in a philosophically sophisticated way, though, they may challenge standard definitions of weakly emergent properties (in the sense of Chalmers 2002). Teller 1992, Clark 1996, and Bedau 1997, for instance, all run together instances of “pattern discovery”—taken as a subjective, observer-dependent activity—with instances of intrinsic emergence—a phenomenon that, as we have just seen, can be characterized in the context of CA as objective and statistically significant (see the relevant section of the entry on emergent properties).
Finally, on to E4: emergence as incompressible unfolding. Bedau 1997 defines a macro-state emergent in this sense just in case it can be derived from knowledge of the system’s micro-components only by direct simulation of the overall system evolution. Here the idea is one of “emergent phenomena as those for which prediction requires simulation” (Clark 2013: 134): E4-emergent macro-features would be those that can only be predicted by directly modelling the micro-features, with no computational shortcut to compress the information at micro-level. The first thing to notice, according to Clark, is that this notion of emergence is at odds with at least some of the previous ones: E3-emergence has it that
emergent phenomena are often precisely those phenomena in which complex interactions yield robust, salient patterns capable of supporting prediction. (ibid)
that is, patterns that deliver compressible information. Next, while the characterization of E4, Bedau-style emergence, may work pretty well in the case of completely chaotic systems, it does not sit well with such CA as Rule \(\phi_{17083}\). According to the proposed definition, the answer to the classification problem given by Rule \(\phi_{17083}\) is an emergent phenomenon just in case the only way to go from \(t_0\) to \(t_n\) is by explicitly simulating the system evolution.
[Figure: the Rule \(\phi_{17083}\) configurations at \(t_0\) and \(t_n\)]
As it turns out, however, this is not the case. Using Crutchfield’s particle model it is possible to predict the result of the classification by simply making particle-calculations, without bothering about the underlying dynamics (Hordijk, Crutchfield, & Mitchell 1996). Here then, the emerging computation in the CA would not be a case of emergence for Bedau.
What about the ontological side of emergence? The issue of the reality of emerging patterns in CA has been examined both by reductionist (Dennett 2003) and by emergentist philosophers (Thompson 2007). It is fair to say that the CA literature so far has not significantly contributed to the ongoing philosophical debate on the purely ontological side of reductionism. CA patterns that are “objectively” detected via computation, as per the Mitchell-Crutchfield approach, are not ipso facto new primitives to be included in an ontology. It may well be that features of a CA that are objective, in the sense of not depending on the interaction with an observer, are nevertheless ontologically reducible to more basic entities via suitable definitions (see Kim 1999; Dennett 1991).
Philosophers have debated the relationship between determinism and free will for over two millennia. Two opposite stances can be taken towards the problem: compatibilism maintains that free will is compatible with a deterministic world, which incompatibilism denies (see the entries on free will and compatibilism). Surprisingly enough, both Daniel Dennett and Stephen Wolfram argued that adopting the CA perspective can provide a solution, or perhaps a dissolution, of the longstanding mystery of free will.
A major obstacle to accepting compatibilism is our persuasion that determinism implies inevitability (Dennett 2003: 25). We may thus make compatibilism palatable by exhibiting an intuitive counterexample to that conviction: a deterministic world in which, however, not everything is inevitable, i.e., something is avoidable (Ibid: 56). Dennett maintains that CA can do this. He takes Life as a vivid illustration of how, in a deterministic but complex enough world, we can abstract away from the bottom level and the micro-laws deterministically governing it, and describe what is happening by taking the emergent level seriously. Recall the eater-glider dynamics:
[Figure: An Eater devouring a Glider, shown at \(t_0\), \(t_2\), and \(t_4\)]
At \(t_0\), an observer aiming at predicting the evolution of this space-time region has basically two choices: she can take into account (what Dennett calls) the physical level, and compute pixel by pixel the value of each cell state at each time step; or, she can focus on the design level, and employ high-level concepts such as those of glider and eater to ground her predictions (Dennett 2003: 39). The first option is perfectly deterministic, but has a flaw: it is time consuming, in such a way that, by the time you have made the required computation, the world has already evolved (this is especially true of a universal CA—as we have already hinted at above, and shall expand on soon). The second option is much faster: you know without much calculation what is going to happen to a glider meeting an eater. The predictions, though, cannot be 100% reliable:
Whereas at the physical level, there are absolutely no exceptions to the general law, at the design level our generalizations have to be hedged: they require “usually” clauses (…). Stray bits of debris from earlier events can “break” or “kill” one of the objects in the ontology at this level. Their salience as real things is considerable, but not guaranteed. (Dennett 2003: 40)
Dennett’s point is that avoidance itself is a high-level concept. As such, it is compatible with a deterministic bottom level (because the concepts at the emergent level are, by design, independent from the micro-laws). The physical description and the design description of Life are different interpretations of the same basic ontology, namely, the sparse ontology of CA. While in theory we could avoid the introduction of emergent concepts, in practice it is only by speaking of gliders, movements and avoidance that we can make sense of the evolution of the system (Dennett 2003: 43–44). Even without knowing Life’s physics, we could do a good job if we predicted the future by mentioning only high-level patterns. Life is just a toy universe but, Dennett claims, these new intuitions are sufficient to see that, in some deterministic worlds, something is avoidable. For instance, it is true at the design level that gliders actually avoid eaters. Thus, the inference from determinism to inevitability can be blocked.
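What prediction at the "physical level" amounts to can be made concrete: it is nothing more than recomputing every cell from its eight neighbors, with no mention of gliders or eaters anywhere in the rule. A minimal Python sketch (on a small toroidal grid of our own choosing):

```python
def life_step(grid):
    """One 'physical level' update of Conway's Life on a toroidal grid:
    every cell is recomputed from its eight neighbors, with no reference
    to high-level objects such as gliders or eaters."""
    rows, cols = len(grid), len(grid[0])

    def live_neighbors(r, c):
        return sum(grid[(r + dr) % rows][(c + dc) % cols]
                   for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                   if (dr, dc) != (0, 0))

    new = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            n = live_neighbors(r, c)
            # Birth on exactly 3 live neighbors; survival on 2 or 3.
            new[r][c] = 1 if n == 3 or (grid[r][c] == 1 and n == 2) else 0
    return new

# A lone glider: after four purely 'physical level' steps it reappears one
# cell down and one cell right -- the design-level fact "gliders move"
# emerges without being encoded anywhere above.
glider = {(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)}
grid = [[1 if (r, c) in glider else 0 for c in range(8)] for r in range(8)]
for _ in range(4):
    grid = life_step(grid)
assert {(r, c) for r in range(8) for c in range(8) if grid[r][c]} == \
       {(r + 1, c + 1) for (r, c) in glider}
```

The design-level shortcut ("the glider will be one cell down-right in four steps") predicts the same outcome at a fraction of the computational cost, which is exactly the trade-off Dennett describes.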
A reply to Dennett’s argument consists in denying that Life-avoidance is real avoidance. Dennett himself puts a version of this argument in the mouth of Conrad, a fictional skeptic philosopher discussing Dennett’s idea throughout his book:
It may look like avoidance, but it’s not real avoidance. Real avoidance involves changing something that was going to happen into something that doesn’t happen. (Dennett 2003: 58)
Rephrasing Dennett’s example, we can identify an ambiguity in Conrad’s argument. Imagine that a baseball is going to hit you in the face—but you dodge it: a clear case of real human avoidance. In what sense was the baseball “going to” hit you in the face? (Dennett 2003: 59) One might say that it was never really going to hit you, precisely because it triggered the reaction of whatever “avoidance system” you have. What is the difference between this avoidance and Life-avoidance? For Dennett, this is not a difference in kind, but in complexity: gliders and humans both have avoidance systems, but human systems are much more sophisticated. The choice of a universal CA as a toy universe allows us to draw a stronger conclusion: since we know that Life is equivalent to a universal Turing machine, as explained above, some patterns in that universe may display avoidance systems at least as complex as ours. Dennett claims that compatibilism has thus won the first round:
you agree that (…) I’ve shifted the burden of proof: there shall be no inferring inevitability in any sense from determinism without mounting a supporting argument. (Dennett 2003: 61)
Stephen Wolfram addresses the phenomenon of free will in his book on CA, with an ambitious tone:
From the discoveries in this book it finally now seems possible to give an explanation for this [free will]. And the key, I believe, is the phenomenon of computational irreducibility. (Wolfram 2002: 750)
We have introduced the issue of computational (or algorithmic) irreducibility when we explained the philosophical consequence of a universality proof for an automaton, namely that, although a system follows definite underlying laws, “its overall behavior can still have aspects that fundamentally cannot be described by reasonable laws” (Wolfram 2002: 750). This is again the issue of predictability via step-by-step micro-computations. In this “separation between the underlying rules for the system and its overall behavior” (Wolfram 2002: 751) lies the secret of free will, since it seems that we attribute free will to a system just when “we cannot readily make predictions about the behavior of the system” (Wolfram 2002: 751). According to Wolfram, CA play a leading role in providing a new framework to understand the phenomenon. While explanations from chaos theory and quantum randomness have recently been proposed (see the entry on chaos), “nothing like this is actually needed” (Wolfram 2002: 752). By observing CA, we can understand how something with simple and definite micro-rules, like those governing our neurons, can produce a behavior free of obvious rules:
the crucial point is that this happens just through the intrinsic evolution of the system—without the need for any additional input from outside or from any sort of explicit source of randomness. (Wolfram 2002: 752)
Wolfram’s point is similar to some of Dennett’s remarks, namely: taking some sort of “design stance”, Wolfram suggests that one can talk about a cellular automaton as if it just “decides” to do this or that—“thereby effectively attributing to it some sort of free will” (Wolfram 2002: 752). One easily sees the closeness to Dennett’s famous intentional stance (Dennett 1987; see the entry on intentionality, esp. Section 9).
How important are CA to these accounts? Dennett and Wolfram both use CA as intuition pumps. However, their positions seem to be slightly different. For while the former sees CA as a “useful toolkit” to develop intuitions and vividly illustrate his arguments (Dennett 2003: 40), the latter claims that CA provide a “new kind of intuition”, one that “no branch of everyday experience” could provide (Wolfram 2002: 41).
The importance Wolfram attaches to CA seems to rely on a single, generic “indispensability argument” to the conclusion that CA justify the foundation of “a new kind of science” (Wolfram 2002: 7–16). We can reconstruct this argument as follows:
(NKS1) is to be taken at face value. It entails that the concepts involved were not previously known. Wolfram talks in terms of “the single most surprising scientific discovery I have ever made” (Wolfram 2002: 27). Is (NKS1) true? For sure, the idea that a deterministic and simple system may produce unpredictable behavior started circulating in the scientific community well before Wolfram’s work. Signs of what is now chaos theory can be traced back to the 19th and early 20th century, e.g., to the work of Poincaré 1914. One might grant that CA allowed the discovery that simple systems may produce complex behavior via the proof that they have unpredictable emergent computational complexity (although this discovery itself, as outlined in our brief historical section, was not made but only greatly publicized by Wolfram). Why was this discovery not made earlier? Wolfram’s own diagnosis is twofold: on the one hand, we have the “engineering” intuition that to produce something complex we should build something complex—that is because this is how ordinary machines work. On the other, CA were not obviously connected to any established discipline, and therefore they were not studied in academic circles.
As for (NKS2), we just examined the case of free will. In Wolfram’s perspective, free will looks just like another puzzling philosophical phenomenon explained away by the advance of (a new kind of) science. Just as life was puzzling before the discovery of the double helix, free will was puzzling before the discovery of a suitable scientific theory, one that can finally account for the separation between micro and macro level. Many reductionist philosophers are no strangers to this kind of argument. The concepts and intuitions used in contemporary philosophy are often rooted in current scientific practices. When groundbreaking discoveries are made, old arguments may be revised: troublesome concepts become harmless and new challenges are introduced. From this perspective, Wolfram’s account of the free will problem may lack philosophical rigor, but it is a promising start to re-address the challenge armed with new scientific models of determinism and complexity—pretty much as Dennett does. While many successful applications are needed to fully vindicate (NKS2), our first assessment concludes that, at the very least, it is not obviously false. As for the “new regularities” promised by (NKS2), we will address them in the next section.
CA are computational systems that perform complex tasks on the basis of the collective behavior of simple items. What, if anything, does this tell us about the importance of computation for systems in nature?
Different conclusions have been drawn by practitioners in the field. Some have endorsed the more modest claim that the computational features of CA are important to understand and compare social, biological, and physical systems modeled by them; but others have taken CA to support the view that computation and information processing in a discrete setting lie at the very foundations of reality. We will explore the stronger claim in Section 3.4 below. As for the weaker claim, it is not possible here to address the general importance of computational properties across the sciences (see Mitchell 2009: 169–185). We will focus instead on a specific, and controversial, principle put forward by Stephen Wolfram, the so-called “Principle of Computational Equivalence”:
There are various ways to state the Principle of Computational Equivalence, but probably the most general is just to say that almost all processes that are not obviously simple can be viewed as computations of equivalent sophistication. (Wolfram 2002: 716–717)
The Principle is the most fundamental law of Wolfram’s New Kind of Science, as well as a prominent regularity featured by (NKS2): “its implications are broad and deep, addressing a host of longstanding issues not only in science, but also in mathematics, philosophy and elsewhere” (Wolfram 2002: 715). Contrary to Wolfram’s claims, the Principle may not be new to philosophy at all. That “all processes can be viewed as computations” (Wolfram 2002: 715) has often been argued for in the history of philosophy, just as the claim that universal computation is a widespread phenomenon in the natural world (see, e.g., Searle 1992; Putnam 1988 and the entry on computation in physical systems). However, Wolfram’s explanation of the Principle includes two further, and more specific, statements: i) No natural system can compute more things than a universal digital computer (see Wolfram 2002: 730), that is, “universal computation is an upper limit on the complexity of computation” (Mitchell 2009: 157); and ii) The computations performed by natural systems are essentially equivalent in sophistication (see Wolfram 2002: 719–726).
The first point is relevant once we compare digital computation with the idea of a computer working with real numbers in continuous time. It has been proved (see C. Moore 1996) that such a device would be able to compute more functions than a traditional Turing machine. However, proponents of a discrete space-time like Wolfram treat continuous processes as, in a sense, epiphenomenal, since they already have independent reasons (some of which will be addressed below) to believe in a fundamentally discrete universe. As for the second point, its main problem is that the interpretation of “equivalent sophistication” is not straightforward. For even assuming that universal computation is widespread, it does not seem to follow that all computation is equivalent in sophistication. Complexity scientists, even after having agreed with Wolfram on the importance of computation for social, biological and physical systems, and even on the extent to which universal computation is supported in nature, are puzzled by his claim:
I find it plausible that my brain can support universal computation (…) and that the brain of the worm C. elegans is also (approximately) universal, but I don’t buy the idea that the actual computations we engage in, respectively, are equivalent in sophistication. (Mitchell 2009: 158)
It is not clear what to make of computational equivalence. Yes, there is a threshold in which systems are related to one another, but given the difficulty of moving among them, is this any more useful than saying that skateboards and Ferraris are equivalent means of moving about? (Miller & Page 2007: 232)
Miller and Page argue that, for all scientific purposes, “representations do matter, and what can be easily computed in one system is often difficult (but still possible) to compute in another”. Even if Wolfram is right when he claims that a simple automaton can calculate the first few prime numbers (Wolfram 2002: 640), the calculation we have to do to encode the input and decode the output is very complex:
This latter calculation could be much more “difficult” to compute than the original problem, just as the complexity of a compiler can far exceed the complexity of the programs it produces. (Miller & Page 2007: 232)
Taking these objections even further, the crucial consideration is that any system with a large enough state space could be shown to be (in Wolfram’s sense) equivalent to “intelligent systems”. Far from supporting some form of universality, Aaronson argues that this type of “equivalence” stems from a misunderstanding of the role of computational reductions:
Suppose we want to claim, for example, that a computation that plays chess is “equivalent” to some other computation that simulates a waterfall. Then our claim is only non-vacuous if it’s possible to exhibit the equivalence (i.e., give the reductions) within a model of computation that isn’t itself powerful enough to solve the chess or waterfall problems. (2011: 285–286)
In other words, unless it can be proved that the encoding/decoding functions are not doing all the heavy lifting (and just use a secondary system, a waterfall or a CA, to vacuously transmit information), it is hard to consider the alleged “equivalence” meaningful at all: “we are merely trading an infeasible search among programs for an infeasible search among input encoding schemes” (Aaronson 2002: 413). Moreover, a rationale for studying CA (Ilachinski 2001: 8) is that their implementation can be massively optimized for specific problems, with significant performance gains on standard computers (see for example Zaheer et al. 2016). Unless Wolfram’s notion of “equivalent sophistication” simply means “they compute the same functions”—in which case, the claim is a truism—the Principle cannot explain this empirical difference. The Principle may have a more substantive reading if understood as a metaphysical thesis about the universe in general, not as a scientific generalization having merely heuristic value. Under this stronger reading, the Principle is no longer concerned with particular systems that can be fruitfully analyzed via computation theory, but with the fact that (different epistemological properties notwithstanding) the world itself is a computer. In a sense, any system would just be the emergent manifestation of a unique, underlying computational reality. Which naturally leads us to finally address the boldest question: What if the universe itself were a CA?
When discussing CA as models of reality we need to carefully distinguish the different meanings of modelling. (CA1) above (section 1.2) discussed CA as “models of computation”: CA model parallel computations in the rather trivial sense that they perform them; for that is what their cells do: they associate inputs to outputs by implementing algorithmic functions, together with their mates. In other words, they model computation as Turing machines do (but, of course, with different underlying ideas).
(CA2) introduced a different sense of modelling for CA, i.e., the idea that CA are fruitfully used in current scientific practice to study an incredible variety of phenomena: chemical systems (e.g., Kier, Seybold, & Cheng 2005), urban growth (e.g., Aburas et al. 2016), traffic flow (e.g., Lárraga et al. 2005), even warfare (e.g., Ilachinski 2004). According to the characterization of Barberousse, Franceschelli, and Imbert 2007 (Other Internet Resources), a common technique is “phenomenological” modelling. Phenomenological modelling happens when one models in a direct way, that is, without making use of a previous explanatory theory: one looks at how traffic flows and tries to build a CA that reproduces a sufficiently similar behaviour and allows one to make useful predictions. The key question for modellers here is,
Are there well established correspondence rules that I can use to translate features of the system I want to model into specifications for an adequate cellular automaton model of it? (Toffoli & Margolus 1990: 244)
In this sense, CA modelling is a special case of “agent-based modelling” (Miller & Page 2007): the modeller starts with micro-rules to explore macro-behavior: e.g., in the social sciences see the classic Schelling 1978, in decision theory Grim et al. 1997, in political theory Grim et al. 2005.
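The traffic-flow case can be made concrete with elementary Rule 184, which is standardly cited as a minimal traffic CA (the sketch below is our own illustration, not drawn from the works cited above): each cell is a road segment, 1 means a car, and a car advances exactly when the cell ahead is free.

```python
import random

def rule184_step(road):
    """One step of elementary CA Rule 184 on a circular road.

    A cell becomes occupied if a car arrives from the left cell,
    or stays occupied if its own car is blocked by the car ahead.
    This is exactly the rule table encoded by the number 184.
    """
    n = len(road)
    return [1 if (road[(i - 1) % n] == 1 and road[i] == 0)
                 or (road[i] == 1 and road[(i + 1) % n] == 1)
            else 0
            for i in range(n)]

# Start from a random road at 50% density and watch jams form:
# jams appear as clusters of '#' drifting backwards against traffic.
random.seed(0)
road = [1 if random.random() < 0.5 else 0 for _ in range(60)]
for _ in range(30):
    print("".join("#" if c else "." for c in road))
    road = rule184_step(road)
```

Note that the number of cars is conserved at each step, as a traffic model requires: the rule only moves cars, it never creates or destroys them.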
Starting from (CA2), it is natural to ask whether it is possible to push the boundaries even further, i.e., to use CA to model more “fundamental” parts of reality. For example, Toffoli 1984 conjectures that CA may allow us to replace physical modelling with differential equations (and related notions of real variables, continuity, etc.). Computations with differential equations, Toffoli claims, are:
at least three levels removed from the physical world that they try to represent. That is, first (a) we stylize physics into differential equations, then (b) we force these equations into the mold of discrete space and time and truncate the resulting power series, so as to arrive at finite difference equations, and finally, in order to commit the latter to algorithms, (c) we project real-valued variables onto finite computer words (“round-off”). At the end of the chain we find the computer – again a physical system; isn’t there a less roundabout way to make nature model itself? (Toffoli 1984: 121)
Such a less roundabout way can be provided, so the proposal goes, by CA. We see here a path from the view that CA are useful as a phenomenological heuristic to predict the behaviour of some aspects of reality, to the claim that CA modelling may, in a sense, be closer to the underlying physics than any non-discrete alternative, as anticipated in (CA4) above.
We are now ready to move to the final step, taking us into the speculative metaphysics of physics. In the last fifty years various scientists (see Zuse 1982; Fredkin 1993; Wolfram 2002) have advanced a bold conjecture: that the physical universe is, fundamentally, a discrete computational structure. Everything in our world—quarks, trees, human beings, remote galaxies—is just a pattern in a CA, much like a glider in Life.
One may dispute the very meaningfulness of claims of this kind, concerning the world as a whole: something that, to speak Kantian, is never given to us in experience. Floridi 2009 has argued against such digital ontology, not by defending a continuous picture of reality, but by arguing that the world is not the right kind of thing to which such notions as discreteness and continuity can meaningfully apply. These concern rather, in Kantian fashion, our ways of modelling reality, or “modes of presentation of being”. If one, on the contrary, thinks that there must be a fact of the matter about the discrete vs. continuous nature of the world (as argued in Berto & Tagliabue 2014, on the basis of considerations from cardinality and general mereology [see entry on mereology]), then the next issue is: what does (the philosophy of) fundamental physics have to say about this?

It is fair to claim that the issue is open. Scholars such as Nobel prize winner ’t Hooft (1997) seriously explore discretist views, and approaches based on so-called causal set theory (see Dowker 2003; Malament 2006) take the geometry of real-world spacetime to be discrete at the Planck length (\(10^{-33}\) cm). Cognate strategies take spacetime as made of polysimplexes, usually polydimensional counterparts of tetrahedra (see Ambjorn et al. 2004); adding the claim that such polysimplexes compute functions already takes us into the vicinity of CA. Other scholars, instead, are against the idea of a digital world. Deutsch (2005) and Hardy (2005) reject the view that quantum probabilities and quantum computing vindicate a discrete structure of space-time, and claim that quantum mechanics fits the idea of a continuous world even better than classical physics does. While we are, thus, in the realm of speculation, we can nevertheless single out two main reasons to investigate the provocative claim that the world is a discrete CA.
First, the arguments put forward to support the view may be philosophically interesting in themselves. Second, the ontological structure of a CA-world can be fruitfully compared to existing metaphysical accounts. Let us take each point in turn.
The picture of nature as a CA is supported by an epistemological desideratum, i.e., having exact computational models of the physical world (see for instance the discussion of ontic pancomputationalism in the entry on computation in physical systems). While this is certainly one of the arguments involved, it is not the only one, and probably not the strongest. As Piccinini points out, even someone who shares that desire “may well question why we should expect nature to fulfill it” (Piccinini 2010: Section 3.4).
Ilachinski proposes a different “argument from epistemology” (Ilachinski 2001: 661–2). Let us consider again the space-time diagram of Rule 110:

Fig. 7: the space-time diagram of Rule 110
Let us imagine we ignore its being generated by the iteration of a simple local rule, or even that it is an automaton. Then, says Ilachinski:
Noticing that the figure consists of certain particle-like objects sprinkled on a more-or-less static background, the simplest (most natural?) thing for you to do (…) is to begin cataloging the various “particles” and their “interactions.” (…) What you almost assuredly will not have, is any idea that the underlying physics really consists of a single—very simple—local deterministic rule (…). How different is this alien two-dimensional world from our own? (Ilachinski 2001: 662)
This highlights how CA may generate situations that we view as physically realistic. But one may consider this as a mere suggestion: that we cannot rule out a priori our universe’s being, at its bottom level, a CA does not entail that it actually is a CA.
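Readers who want to reproduce a diagram of this kind for themselves can do so in a few lines (a minimal sketch of our own; lattice size, initial condition, and step count are arbitrary choices): the rule table is decoded from the bits of the number 110, following Wolfram’s numbering scheme, and each printed row is one time step of a periodic one-dimensional lattice.

```python
def make_rule(number):
    """Map each 3-cell neighborhood, read as a 3-bit value v = 4l + 2c + r,
    to bit v of the rule number (Wolfram's numbering scheme)."""
    return {v: (number >> v) & 1 for v in range(8)}

def step(cells, rule):
    """Apply the rule once to a periodic 1D lattice."""
    n = len(cells)
    return [rule[(cells[(i - 1) % n] << 2) | (cells[i] << 1) | cells[(i + 1) % n]]
            for i in range(n)]

rule110 = make_rule(110)
cells = [0] * 79 + [1]          # a single live cell at the right edge
for _ in range(40):
    print("".join("#" if c else " " for c in cells))
    cells = step(cells, rule110)
```

Starting from a single live cell, the printed rows show the characteristic growing triangular texture of Rule 110, with particle-like structures moving against a periodic background, exactly the features Ilachinski’s hypothetical observer would start cataloging.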
A firmer ground to explore the hypothesis comes from some independent reasons of theoretical dissatisfaction with contemporary physics. We will limit ourselves to what we may call conceptual complaints, as opposed to ones more closely related to scientific practice, such as the failure of reductions of quantum mechanics and relativity to a Theory of Everything. We will examine the following three: (i) the problem of infinity, (ii) the need for a transparent ontology, (iii) the physical role of information.
As for complaint (i): while infinite and infinitesimal quantities provide us with powerful tools to model and advance predictions on the physical world, it remains controversial what ontological conclusions should be drawn from this fact. Since the discovery of Zeno’s Paradox (see entry), the continuity of space-time, as well as of other fundamental physical variables, has puzzled philosophers and scientists alike. In the words of the physicist Richard Feynman:
It bothers me that, according to the laws as we understand them today, it takes a computing machine an infinite number of logical operations to figure out what goes on in no matter how tiny a region of space, and no matter how tiny a region of time. How can all that be going on in that tiny space? Why should it take an infinite amount of logic to figure out what a tiny piece of space-time is going to do? So I have often made the hypothesis that ultimately physics will not require a mathematical statement, that in the end the machinery will be revealed and the laws will turn out to be simple, like the checker board with all its apparent complexities. (Feynman 1965)
One way theoretical physicists have approached the problem is to conjecture a fundamental layer of reality along the lines of Edward Fredkin’s “Finite Nature Hypothesis”:
Finite Nature is a hypothesis that ultimately every quantity of physics, including space and time, will turn out to be discrete and finite; that the amount of information in any small volume of space-time will be finite and equal to one of a small number of possibilities. (…) We take the position that Finite Nature implies that the basic substrate of physics operates in a manner similar to the workings of certain specialized computers called cellular automata. (Fredkin 1993: 116)
If a cellular automaton is a model satisfying this hypothesis, then “underneath the laws of physics as we know them today it could be that there lies a simple program from which all the known laws (…) emerge” (Wolfram 2002: 434). If, as we have seen above, there is currently no agreement on the issue of whether physical reality is fundamentally continuous or discrete, at least the Finite Nature Hypothesis seems to be a no less falsifiable prediction (see Fredkin 1990) than many speculative metaphysical pictures. Unfortunately, although there have been attempts to recapture field theory within CA theory (see, e.g., Svozil 1987, Lee 1986), there is no agreed-upon derivation of today’s continuous physics within a CA framework; it is therefore safe to say that no party has a clear advantage here.
As for complaint (ii): one reason to adopt the view of CA as models of a fundamentally discrete world is the desire for a transparent ontology. Take a materialist philosopher for whom the task of physics is to provide an ultimate description of reality on the basis of a handful of basic physical properties and relations. As argued in Beraldo-de-Araújo & Baravalle forthcoming, a digital ontology may take different models of computation at its foundation: by analyzing the ontological commitments of CA (vs. traditional Turing machines), they conclude that CA are very close to supporting a traditional form of physicalism. In this perspective, a CA-based physics may provide a neat and elegant ontological picture: one that would be describable in a first-order formal theory including the axioms of standard mereology (see entry) (even of mereotopology, as presented, e.g., in Casati, Varzi 1999), and whose theorems would be computable in finite time (see Berto, Rossi, Tagliabue 2010: 73–87). Besides, CA make it easier to reconcile prima facie contradictory properties of different physical laws, such as the reversibility of micro-laws and the irreversibility of the Second Law of Thermodynamics (see for example Wolfram 2002: 441–457; Berto, Rossi, Tagliabue 2010: 43–46). There is no agreement on whether the Second Law gives us a fundamental feature of physical reality, or whether it is a spin-off of underlying principles which are time-reversal invariant and act on an initial state of the universe at low entropy (see Albert 2000). If the world is discrete and temporal reversibility is fundamental, reversible CA like, e.g., that of Berto, Rossi, Tagliabue (2016) may be more than mere computational tools achieving some degree of computational efficiency via their reversibility.
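How micro-reversibility can be built into a CA is worth seeing concretely. A standard construction (this is Fredkin’s “second-order” technique, offered here as an illustrative sketch, not the specific automaton of Berto, Rossi & Tagliabue 2016) takes the new state to be an arbitrary function of the current neighborhood XORed with the previous state; the XOR makes the dynamics exactly invertible, whatever the local rule.

```python
def second_order_step(prev, curr, f):
    """One forward step of a second-order reversible CA on a ring:
    new[i] = f(neighborhood of curr at i) XOR prev[i]."""
    n = len(curr)
    return [f(curr[(i - 1) % n], curr[i], curr[(i + 1) % n]) ^ prev[i]
            for i in range(n)]

# Any local rule works; this one is an arbitrary choice for the demo.
f = lambda l, c, r: (l + c + r) % 2

prev = [0, 1, 0, 0, 1, 1, 0, 1]
curr = [1, 0, 0, 1, 0, 1, 1, 0]
new = second_order_step(prev, curr, f)

# Running the same step on the time-reversed pair recovers the past:
# new[i] XOR f(...) = (f(...) XOR prev[i]) XOR f(...) = prev[i].
recovered = second_order_step(new, curr, f)
print(recovered == prev)   # True: no information is ever lost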
As for point (iii), concerning the physical role of information: CA can accommodate a speculative hypothesis entertained by a number of scientists (Wheeler 1990, Ilachinski 2001) and philosophers (Chalmers 1996), namely that information is not just one aspect of the physical world, but, in a sense, the most fundamental. For instance, Fredkin’s Finite Nature Hypothesis not only stresses the importance of the informational aspects of physics, but “it insists that the informational aspects are all there is to physics at the most microscopic level” (see Fredkin 1993).
One way in which this idea has been developed is the so-called “it from bit” theory (see again Wheeler 1990). In David Chalmers’ words, this approach “stems from the observation that in physical theories, fundamental physical states are individuated as informational states” (Chalmers 1996: 302). Physics is silent on what accomplishes the specified functional roles, so “any realization of these information states will serve as well for the purpose of a physical theory” (Chalmers 1996: 302). The “it from bit” approach is particularly appealing to Chalmers as a philosopher of mind committed to qualia being intrinsic, non-reducible properties, because it allows for a simple unification: we need intrinsic properties to make sense of conscious experience, and we need intrinsic properties to ground the informational states that make up the world’s physics. If we claim that all the informational states are grounded in phenomenal or proto-phenomenal properties, we “get away with a cheap and elegant ontology, and solve two problems in a single blow” (Chalmers 1996: 305). The cell states in a CA fit the bill: if we interpret them as proto-phenomenal properties, we obtain the intrinsic structure of some sort of computational neutral monism (for a historical introduction, see the entry on neutral monism).
Albeit individually controversial, taken together the three points support a simple and elegant metaphysical picture which is not evidently false or incoherent.
Supposing that the actual physical world is a giant, discrete automaton, are there philosophically interesting conclusions to be drawn? A first one has already been partially explored in connection with Life: if nature is a CA, it has to be a universal CA, given that universal computers (e.g., the one on which you are probably reading this entry) uncontroversially exist in the physical world. Then its evolution is algorithmically irreducible, given the Halting Problem. Notwithstanding the possibility of devising approximate forecasting tools, we are left with a universe whose evolution is unpredictable for a reason quite different from the ones commonly adduced by resorting to standard physics, such as quantum effects or random fluctuations: it is unpredictable just because of its computational complexity.
A second philosophical topic is the connection between a CA-world and one of the most famous and controversial contemporary metaphysical theses, namely David Lewis’ Humean Supervenience (HS). Using Ned Hall’s characterization (see the section on Humean supervenience in the entry on David Lewis’s metaphysics), we can state HS as the collection of these four claims about the structure of our world:
If we substitute “space-time points” with “cells”, HS gets very close to a CA ontology: cells are arranged in a lattice, have various spatiotemporal relations to one another (e.g., being in the neighborhood of), and have monadic properties (states) which can be considered perfectly natural, i.e., the basic properties out of which any other can be construed. A CA universe is thus a prima facie abstract model of Lewis’ HS and can be fruitfully used to illustrate Lewis’ original point, which was a reductionist one:
The point of defending Humean Supervenience is not to support reactionary physics, but rather to resist philosophical arguments that there are more things in heaven and earth than physics has dreamt of. (Lewis 1994: 474)
There are, however, two differences between the ontology naturally suggested by CA theories and Lewis’ view. First, for Lewis space-time is an essentially continuous, four-dimensional manifold, eternalistically conceived (see the section on eternalism in the entry on time and the discussion on four-dimensionalism in the entry on temporal parts for an introduction), while in a standard CA-driven ontology it is not. Second, Lewis reduces laws of nature to particulars while, as we have seen in Section 2 above, CA rules are always included as a further specification of the model.
The first disagreement may not be very substantial. A CA-world is compatible, for instance, with an eternalist conception. The idea that the world’s next state is computed at any time step can be seen as a merely heuristic device, a “shortcut” for a more proper eternalist description in which the cells’ states are once and for all “stuck” in their space-time positions (we did this ourselves when describing the first picture in this section as the complete space-time evolution of a micro-universe).
The second disagreement, concerning the laws of nature, may instead be a thorny issue. According to Lewis 1973, laws of nature are the true generalizations found in the deductive system that best describes our world (where “best” basically refers to the optimal trade-off between strength and simplicity; see the entry on laws of nature for an introduction to, and further details on, this debate): laws of nature supervene on the four-dimensional arrangement of particulars and their properties. To the contrary, the standard description of a CA does not take the space-time evolution for granted: it takes the automaton’s initial conditions as given, and generates the system’s evolution over time via the CA transition function. Particulars depend on laws, not vice versa. A CA world is not laid out in advance, but grows as the laws are applied to particulars (a similar point is also made in Wolfram 2002: 484–486).
On the other hand, one may tentatively interpret, in Lewisian fashion, the laws of a CA as the generalizations contained within the deductive system that best describes the CA behavior. Let us consider one last time our micro-universe from Rule 110:

Fig. 7: the space-time diagram of Rule 110 (repeated)
A suitable deductive system for this space-time diagram may be obtained with just two axioms, one stating the initial conditions of the system, the other phrased as a conditional expressing the CA transition rule. If this conditional is a true generalization embedded in the deductive system that best describes our toy universe, then Rule 110 can count as a Lewisian law of nature in this universe, as expected.
Although some CA topics are still relatively untouched by philosophers (e.g., the nature of space and time (see Wolfram 2002: 481–496), the representation of knowledge in Artificial Intelligence (see Berto, Rossi, Tagliabue: 15–26), the relationship between information and energy (see Fredkin & Toffoli 1982)), there are many conceptual challenges raised in connection with CA. While in some cases the CA contribution was indeed overrated by practitioners, in others CA proved to be useful models of important phenomena.
As a final comment: what is left, from a purely scientific perspective, of the NKS Argument? Let us go through it again:
Even granting the truth of the two premises (that is, even granting the troublesome Principle of Computational Equivalence), it is doubtful that the desired conclusion would follow. Surely, CA have provided new intuitions and explanations for a set of phenomena—Wolfram quite successfully applies his “discovery” to biology, computer science, physics, finance. However, there is no evidence that many of our best scientific explanations will soon be reduced to the CA framework, and indeed many aspects of complexity itself still lie outside the CA paradigm, with no unification in sight. Paradigm shifts usually require the new paradigm to explain the same phenomena the old one did, and some more. CA are a promising field, but many developments are still needed for (NKS3) to be true.
Miller & Page 2007 and Mitchell 2009 both contain a chapter devoted to CA: they are accessible introductions written by notable scholars. Ilachinski 2001 is an excellent starting point for the exploration of the CA literature: although not up-to-date on some technical points, the volume nicely introduces the field and covers its most important applications. Wolfram 2002 took some twenty years and 1,200 pages to be finished, and is a passionate journey including bold speculations on the role of CA for understanding the universe and our place in it.
chaos | Church-Turing Thesis | compatibilism | computability and complexity | computation: in physical systems | emergent properties | free will | laws of nature | quantum theory: quantum field theory | recursive functions | supervenience | Turing machines
The authors would like to thank three anonymous referees, Scott Aaronson, Anouk Barberousse, Matteo Colombo, Michele Di Francesco, Cyrille Imbert, Giulia Livio, Massimo Mastrangeli, Mattia Pavoni, Andrea Polonioli, Gabriele Rossi, Marta Rossi, and Katherine Yoshida for helpful comments, discussions, suggestions, and references, and Dr. Robert Plant for checking our English.
The Stanford Encyclopedia of Philosophy is copyright © 2025 by The Metaphysics Research Lab, Department of Philosophy, Stanford University
Library of Congress Catalog Data: ISSN 1095-5054