Quantum mechanics is generally regarded as the physical theory that is our best candidate for a fundamental and universal description of the physical world. The conceptual framework employed by this theory differs drastically from that of classical physics. Indeed, the transition from classical to quantum physics marks a genuine revolution in our understanding of the physical world.
One striking aspect of the difference between classical and quantum physics is that whereas classical mechanics presupposes that exact simultaneous values can be assigned to all physical quantities, quantum mechanics denies this possibility, the prime example being the position and momentum of a particle. According to quantum mechanics, the more precisely the position (momentum) of a particle is given, the less precisely can one say what its momentum (position) is. This is (a simplistic and preliminary formulation of) the quantum mechanical uncertainty principle for position and momentum. The uncertainty principle played an important role in many discussions on the philosophical implications of quantum mechanics, in particular in discussions on the consistency of the so-called Copenhagen interpretation, the interpretation endorsed by the founding fathers Heisenberg and Bohr.
This should not suggest that the uncertainty principle is the only aspect of the conceptual difference between classical and quantum physics: the implications of quantum mechanics for notions such as (non-)locality, entanglement and identity play no less havoc with classical intuitions.
The uncertainty principle is certainly one of the most famous aspects of quantum mechanics. It has often been regarded as the most distinctive feature in which quantum mechanics differs from classical theories of the physical world. Roughly speaking, the uncertainty principle (for position and momentum) states that one cannot assign exact simultaneous values to the position and momentum of a physical system. Rather, these quantities can only be determined with some characteristic “uncertainties” that cannot become arbitrarily small simultaneously. But what is the exact meaning of this principle, and indeed, is it really a principle of quantum mechanics? (In his original work, Heisenberg only speaks of uncertainty relations.) And, in particular, what does it mean to say that a quantity is determined only up to some uncertainty? These are the main questions we will explore in the following, focusing on the views of Heisenberg and Bohr.
The notion of “uncertainty” occurs in several different meanings in the physical literature. It may refer to a lack of knowledge of a quantity by an observer, or to the experimental inaccuracy with which a quantity is measured, or to some ambiguity in the definition of a quantity, or to a statistical spread in an ensemble of similarly prepared systems. Also, several different names are used for such uncertainties: inaccuracy, spread, imprecision, indefiniteness, indeterminateness, indeterminacy, latitude, etc. As we shall see, even Heisenberg and Bohr did not decide on a single terminology for quantum mechanical uncertainties. Forestalling a discussion about which name is the most appropriate one in quantum mechanics, we use the name “uncertainty principle” simply because it is the most common one in the literature.
Heisenberg introduced his famous relations in an article of 1927, entitled Ueber den anschaulichen Inhalt der quantentheoretischen Kinematik und Mechanik. A (partial) translation of this title is: “On the anschaulich content of quantum theoretical kinematics and mechanics”. Here, the term anschaulich is particularly notable. Apparently, it is one of those German words that defy an unambiguous translation into other languages. Heisenberg’s title is translated as “On the physical content …” by Wheeler and Zurek (1983). His collected works (Heisenberg 1984) translate it as “On the perceptible content …”, while Cassidy’s biography of Heisenberg (Cassidy 1992) refers to the paper as “On the perceptual content …”. Literally, the closest translation of the term anschaulich is “visualizable”. But, as in most languages, words that make reference to vision are not always intended literally. Seeing is widely used as a metaphor for understanding, especially for immediate understanding. Hence, anschaulich also means “intelligible” or “intuitive”.[1]
Why was this issue of the Anschaulichkeit of quantum mechanics such a prominent concern to Heisenberg? This question has already been considered by a number of commentators (Jammer 1974; Miller 1982; de Regt 1997; Beller 1999). For the answer, it turns out, we must go back a little in time. In 1925 Heisenberg had developed the first coherent mathematical formalism for quantum theory (Heisenberg 1925). His leading idea was that only those quantities that are in principle observable should play a role in the theory, and that all attempts to form a picture of what goes on inside the atom should be avoided. In atomic physics the observational data were obtained from spectroscopy and associated with atomic transitions. Thus, Heisenberg was led to consider the “transition quantities” as the basic ingredients of the theory. Max Born, later that year, realized that the transition quantities obeyed the rules of matrix calculus, a branch of mathematics that was not so well-known then as it is now. In a famous series of papers Heisenberg, Born and Jordan developed this idea into the matrix mechanics version of quantum theory.
Formally, matrix mechanics remains close to classical mechanics. The central idea is that all physical quantities must be represented by infinite self-adjoint matrices (later identified with operators on a Hilbert space). It is postulated that the matrices \(\bQ\) and \(\bP\) representing the canonical position and momentum variables of a particle satisfy the so-called canonical commutation rule
\[\tag{1} \bQ\bP - \bP\bQ = i\hslash \]
where \(\hslash = h/2\pi\), \(h\) denotes Planck’s constant, and boldface type is used to represent matrices (or operators). The new theory scored spectacular empirical success by encompassing nearly all spectroscopic data known at the time, especially after the concept of the electron spin was included in the theoretical framework.
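The requirement that the matrices be infinite can be illustrated numerically. The following is a minimal sketch, not a construction taken from the matrix mechanics papers themselves: it assumes the familiar harmonic-oscillator basis, units in which \(\hslash = 1\), and a hypothetical truncation size `N`. It builds finite versions of \(\bQ\) and \(\bP\) and evaluates \(\bQ\bP - \bP\bQ\).

```python
import numpy as np

HBAR = 1.0  # units with hbar = 1 (assumption made for convenience)
N = 6       # truncation size (assumption); the true matrices are infinite

# Lowering operator in the oscillator basis: a|n> = sqrt(n)|n-1>.
a = np.diag(np.sqrt(np.arange(1, N)), k=1)

# Position and momentum with mass and frequency set to 1 (assumption).
Q = np.sqrt(HBAR / 2) * (a + a.conj().T)
P = 1j * np.sqrt(HBAR / 2) * (a.conj().T - a)

# Commutator QP - PQ; for infinite matrices this would be i*hbar times identity.
commutator = Q @ P - P @ Q
print(np.round(np.diag(commutator), 10))
```

In this sketch every diagonal entry of the commutator equals \(i\hslash\) except the last one, which is forced to compensate because the trace of a commutator of finite matrices always vanishes. The canonical commutation rule can therefore hold exactly only for infinite matrices.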
It came as a big surprise, therefore, when one year later, Erwin Schrödinger presented an alternative theory, that became known as wave mechanics. Schrödinger assumed that an electron in an atom could be represented as an oscillating charge cloud, evolving continuously in space and time according to a wave equation. The discrete frequencies in the atomic spectra were not due to discontinuous transitions (quantum jumps) as in matrix mechanics, but to a resonance phenomenon. Schrödinger also showed that the two theories were equivalent.[2]
Even so, the two approaches differed greatly in interpretation and spirit. Whereas Heisenberg eschewed the use of visualizable pictures, and accepted discontinuous transitions as a primitive notion, Schrödinger claimed as an advantage of his theory that it was anschaulich. In Schrödinger’s vocabulary, this meant that the theory represented the observational data by means of continuously evolving causal processes in space and time. He considered this condition of Anschaulichkeit to be an essential requirement on any acceptable physical theory. Schrödinger was not alone in appreciating this aspect of his theory. Many other leading physicists were attracted to wave mechanics for the same reason. For a while, in 1926, before it emerged that wave mechanics had serious problems of its own, Schrödinger’s approach seemed to gather more support in the physics community than matrix mechanics.
Understandably, Heisenberg was unhappy about this development. In a letter of 8 June 1926 to Pauli he confessed that “The more I think about the physical part of Schrödinger’s theory, the more disgusting I find it”, and: “What Schrödinger writes about the Anschaulichkeit of his theory, … I consider Mist” (Pauli 1979: 328). Again, this last German term is translated differently by various commentators: as “junk” (Miller 1982), “rubbish” (Beller 1999), “crap” (Cassidy 1992), “poppycock” (Bacciagaluppi & Valentini 2009) and, perhaps more literally, as “bullshit” (Moore 1989; de Regt 1997). Nevertheless, in published writings, Heisenberg voiced a more balanced opinion. In a paper in Die Naturwissenschaften (1926) he summarized the peculiar situation that the simultaneous development of two competing theories had brought about. Although he argued that Schrödinger’s interpretation was untenable, he admitted that matrix mechanics did not provide the Anschaulichkeit which made wave mechanics so attractive. He concluded:
to obtain a contradiction-free anschaulich interpretation, we still lack some essential feature in our image of the structure of matter.
The purpose of his 1927 paper was to provide exactly this lacking feature.
Let us now look at the argument that led Heisenberg to his uncertainty relations. He started by redefining the notion of Anschaulichkeit. Whereas Schrödinger associated this term with the provision of a causal space-time picture of the phenomena, Heisenberg, by contrast, declared:
We believe we have gained anschaulich understanding of a physical theory, if in all simple cases, we can grasp the experimental consequences qualitatively and see that the theory does not lead to any contradictions. (Heisenberg 1927: 172)
His goal was, of course, to show that, in this new sense of the word, matrix mechanics could lay the same claim to Anschaulichkeit as wave mechanics.
To do this, he adopted an operational assumption: terms like “the position of a particle” have meaning only if one specifies a suitable experiment by which “the position of a particle” can be measured. We will call this assumption the “measurement=meaning principle”. In general, there is no lack of such experiments, even in the domain of atomic physics. However, experiments are never completely accurate. We should be prepared to accept, therefore, that in general the meaning of these quantities is also determined only up to some characteristic inaccuracy.
As an example, he considered the measurement of the position of an electron by a microscope. The accuracy of such a measurement is limited by the wavelength of the light illuminating the electron. Thus, it is possible, in principle, to make such a position measurement as accurate as one wishes, by using light of a very short wavelength, e.g., \(\gamma\)-rays. But for \(\gamma\)-rays, the Compton effect cannot be ignored: the interaction of the electron and the illuminating light should then be considered as a collision of at least one photon with the electron. In such a collision, the electron suffers a recoil which disturbs its momentum. Moreover, the shorter the wavelength, the larger is this change in momentum. Thus, at the moment when the position of the particle is accurately known, Heisenberg argued, its momentum cannot be accurately known:
At the instant of time when the position is determined, that is, at the instant when the photon is scattered by the electron, the electron undergoes a discontinuous change in momentum. This change is the greater the smaller the wavelength of the light employed, i.e., the more exact the determination of the position. At the instant at which the position of the electron is known, its momentum therefore can be known only up to magnitudes which correspond to that discontinuous change; thus, the more precisely the position is determined, the less precisely the momentum is known, and conversely. (Heisenberg 1927: 174–5)
This is the first formulation of the uncertainty principle. In its present form it is an epistemological principle, since it limits what we can know about the electron. From “elementary formulae of the Compton effect” Heisenberg estimated the “imprecisions” to be of the order
\[\tag{2} \delta p\, \delta q \sim h \]
He continued: “In this circumstance we see the direct anschaulich content of the relation \(\boldsymbol{QP} - \boldsymbol{PQ} = i\hslash\).”
He went on to consider other experiments, designed to measure other physical quantities, and obtained analogous relations for time and energy:
\[\tag{3} \delta t\, \delta E \sim h \]
and action \(J\) and angle \(w\)
\[\tag{4} \delta w\, \delta J \sim h \]
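which he saw as corresponding to the “well-known” relations
\[\tag{5} \boldsymbol{Et} - \boldsymbol{tE} = i\hslash \quad \text{or} \quad \boldsymbol{Jw} - \boldsymbol{wJ} = i\hslash \]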
However, these generalisations are not as straightforward as Heisenberg suggested. In particular, the status of the time variable in his several illustrations of relation (3) is not at all clear (Hilgevoord 2005; see also Section 2.5).
Heisenberg summarized his findings in a general conclusion: all concepts used in classical mechanics are also well-defined in the realm of atomic processes. But, as a pure fact of experience (rein erfahrungsgemäß), experiments that serve to provide such a definition for one quantity are subject to particular indeterminacies, obeying relations (2)–(4) which prohibit them from providing a simultaneous definition of two canonically conjugate quantities. Note that in this formulation the emphasis has slightly shifted: he now speaks of a limit on the definition of concepts, i.e., not merely on what we can know, but what we can meaningfully say about a particle. Of course, this stronger formulation follows by application of the above measurement=meaning principle: if there are, as Heisenberg claims, no experiments that allow a simultaneous precise measurement of two conjugate quantities, then these quantities are also not simultaneously well-defined.
Heisenberg’s paper has an interesting “Addition in proof” mentioning critical remarks by Bohr, who saw the paper only after it had been sent to the publisher. Among other things, Bohr pointed out that in the microscope experiment it is not the change of the momentum of the electron that is important, but rather the circumstance that this change cannot be precisely determined in the same experiment. An improved version of the argument, responding to this objection, is given in Heisenberg’s Chicago lectures of 1930.
Here (Heisenberg 1930: 16), it is assumed that the electron is illuminated by light of wavelength \(\lambda\) and that the scattered light enters a microscope with aperture angle \(\varepsilon\). According to the laws of classical optics, the accuracy of the microscope depends on both the wavelength and the aperture angle; Abbe’s criterion for its “resolving power”, i.e., the size of the smallest discernible details, gives
\[\tag{6} \delta q \sim \frac{\lambda}{\sin \varepsilon} \]
On the other hand, the direction of a scattered photon, when it enters the microscope, is unknown within the angle \(\varepsilon\), rendering the momentum change of the electron uncertain by an amount
\[\tag{7} \delta p \sim \frac{h}{\lambda} \sin \varepsilon \]
leading again to the result (2).
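As a short worked step, assuming the usual forms of the two estimates, \(\delta q \sim \lambda/\sin\varepsilon\) for the resolving power and \(\delta p \sim (h/\lambda)\sin\varepsilon\) for the uncertainty in the recoil, their product shows why neither the wavelength nor the aperture can be used to evade the relation:

\[ \delta q\, \delta p \sim \frac{\lambda}{\sin \varepsilon} \cdot \frac{h}{\lambda} \sin \varepsilon = h \]

Both \(\lambda\) and \(\varepsilon\) cancel. Shortening the wavelength or widening the aperture improves \(\delta q\) only at the expense of \(\delta p\), and the product remains of the order of \(h\).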
Let us now analyse Heisenberg’s argument in more detail. Note that, even in this improved version, Heisenberg’s argument is incomplete. According to Heisenberg’s “measurement=meaning principle”, one must also specify, in the given context, what the meaning is of the phrase “momentum of the electron”, in order to make sense of the claim that this momentum is changed by the position measurement. A solution to this problem can again be found in the Chicago lectures (Heisenberg 1930: 15). Here, he assumes that initially the momentum of the electron is precisely known, e.g., it has been measured in a previous experiment with an inaccuracy \(\delta p_{i}\), which may be arbitrarily small. Then, its position is measured with inaccuracy \(\delta q\), and after this, its final momentum is measured with an inaccuracy \(\delta p_{f}\). All three measurements can be performed with arbitrary precision. Thus, the three quantities \(\delta p_{i}, \delta q\), and \(\delta p_{f}\) can be made as small as one wishes. If we assume further that the initial momentum has not changed until the position measurement, we can speak of a definite momentum until the time of the position measurement. Moreover, we can give operational meaning to the idea that the momentum is changed during the position measurement: the outcome of the second momentum measurement (say \(p_{f}\)) will generally differ from the initial value \(p_{i}\). In fact, one can also show that this change is discontinuous, by varying the time between the three measurements.
Let us try to see, adopting this more elaborate set-up, if we can complete Heisenberg’s argument. We have now been able to give empirical meaning to the “change of momentum” of the electron, \(p_{f} - p_{i}\). Heisenberg’s argument claims that the order of magnitude of this change is at least inversely proportional to the inaccuracy of the position measurement:
\[\tag{8} \abs{p_{f} - p_{i}} \sim h/\delta q \]
However, can we now draw the conclusion that the momentum is only imprecisely defined? Certainly not. Before the position measurement, its value was \(p_{i}\), after the measurement it is \(p_{f}\). One might, perhaps, claim that the value at the very instant of the position measurement is not yet defined, but we could simply settle this by a convention, e.g., we might assign the mean value \((p_{i} + p_{f})/2\) to the momentum at this instant. But then, the momentum is precisely determined at all instants, and Heisenberg’s formulation of the uncertainty principle no longer follows. The above attempt to complete Heisenberg’s argument thus overshoots its mark.
A solution to this problem can again be found in the Chicago Lectures. Heisenberg admits that position and momentum can be known exactly. He writes:
If the velocity of the electron is at first known, and the position then exactly measured, the position of the electron for times previous to the position measurement may be calculated. For these past times, \(\delta p\delta q\) is smaller than the usual bound. (Heisenberg 1930: 15)
Indeed, Heisenberg says: “the uncertainty relation does not hold for the past”.
Apparently, when Heisenberg refers to the uncertainty or imprecision of a quantity, he means that the value of this quantity cannot be given beforehand. In the sequence of measurements we have considered above, the uncertainty in the momentum after the measurement of position has occurred refers to the idea that the value of the momentum is not fixed just before the final momentum measurement takes place. Once this measurement is performed, and reveals a value \(p_{f}\), the uncertainty relation no longer holds; these values then belong to the past. Clearly, then, Heisenberg is concerned with unpredictability: the point is not that the momentum of a particle changes, due to a position measurement, but rather that it changes by an unpredictable amount. It is, however, always possible to measure, and hence define, the size of this change in a subsequent measurement of the final momentum with arbitrary precision.
Although Heisenberg admits that we can consistently attribute values of momentum and position to an electron in the past, he sees little merit in such talk. He points out that these values can never be used as initial conditions in a prediction about the future behavior of the electron, or subjected to experimental verification. Whether or not we grant them physical reality is, as he puts it, a matter of personal taste. Heisenberg’s own taste is, of course, to deny their physical reality. For example, he writes,
I believe that one can formulate the emergence of the classical “path” of a particle succinctly as follows: the “path” comes into being only because we observe it. (Heisenberg 1927: 185)
Apparently, in his view, a measurement does not only serve to give meaning to a quantity, it creates a particular value for this quantity. This may be called the “measurement=creation” principle. It is an ontological principle, for it states what is physically real.
This then leads to the following picture. First we measure the momentum of the electron very accurately. By “measurement=meaning”, this entails that the term “the momentum of the particle” is now well-defined. Moreover, by the “measurement=creation” principle, we may say that this momentum is physically real. Next, the position is measured with inaccuracy \(\delta q\). At this instant, the position of the particle becomes well-defined and, again, one can regard this as a physically real attribute of the particle. However, the momentum has now changed by an amount that is unpredictable by an order of magnitude \(\abs{p_{f} - p_{i}} \sim h/\delta q\). The meaning and validity of this claim can be verified by a subsequent momentum measurement.
The question is then what status we shall assign to the momentum of the electron just before its final measurement. Is it real? According to Heisenberg it is not. Before the final measurement, the best we can attribute to the electron is some unsharp, or fuzzy momentum. These terms are meant here in an ontological sense, characterizing a real attribute of the electron.
Heisenberg’s relations were soon considered to be a cornerstone of the Copenhagen interpretation of quantum mechanics. Just a few months later, Kennard (1927) already called them the “essential core” of the new theory. Taken together with Heisenberg’s contention that they provide the intuitive content of the theory and their prominent role in later discussions on the Copenhagen interpretation, a dominant view emerged in which the uncertainty relations were regarded as a fundamental principle of the theory.
The interpretation of these relations has often been debated. Do Heisenberg’s relations express restrictions on the experiments we can perform on quantum systems, and, therefore, restrictions on the information we can gather about such systems; or do they express restrictions on the meaning of the concepts we use to describe quantum systems? Or else, are they restrictions of an ontological nature, i.e., do they assert that a quantum system simply does not possess a definite value for its position and momentum at the same time? The difference between these interpretations is partly reflected in the various names by which the relations are known, e.g., as “inaccuracy relations”, or: “uncertainty”, “indeterminacy” or “unsharpness relations”. The debate between these views has been addressed by many authors, but it has never been settled completely. Let it suffice here to make only two general observations.
First, it is clear that in Heisenberg’s own view all the above questions stand or fall together. Indeed, we have seen that he adopted an operational “measurement=meaning” principle according to which the meaningfulness of a physical quantity was equivalent to the existence of an experiment purporting to measure that quantity. Similarly, his “measurement=creation” principle allowed him to attribute physical reality to such quantities. Hence, Heisenberg’s discussions moved rather freely and quickly from talk about experimental inaccuracies to epistemological or ontological issues and back again.
However, ontological questions seemed to be of somewhat less interest to him. For example, there is a passage (Heisenberg 1927: 197), where he discusses the idea that, behind our observational data, there might still exist a hidden reality in which quantum systems have definite values for position and momentum, unaffected by the uncertainty relations. He emphatically dismisses this conception as an unfruitful and meaningless speculation, because, as he says, the aim of physics is only to describe observable data. Similarly, in the Chicago Lectures, he warns against the fact that the human language permits the utterance of statements which have no empirical content, but nevertheless produce a picture in our imagination. He notes,
One should be especially careful in using the words “reality”, “actually”, etc., since these words very often lead to statements of the type just mentioned. (Heisenberg 1930: 11)
So, Heisenberg also endorsed an interpretation of his relations as rejecting a reality in which particles have simultaneous definite values for position and momentum.
The second observation is that although for Heisenberg experimental, informational, epistemological and ontological formulations of his relations were, so to say, just different sides of the same coin, this is not so for those who do not share his operational principles or his view on the task of physics. Alternative points of view, in which, e.g., the ontological reading of the uncertainty relations is denied, are therefore still viable. The statement, often found in the literature of the thirties, that Heisenberg had proved the impossibility of associating a definite position and momentum to a particle is certainly wrong. But the precise meaning one can coherently attach to Heisenberg’s relations depends rather heavily on the interpretation one favors for quantum mechanics as a whole. And because no agreement has been reached on this latter issue, one cannot expect agreement on the meaning of the uncertainty relations either.
Let us now move to another question about Heisenberg’s relations: do they express a principle of quantum theory? Probably the first influential author to call these relations a “principle” was Eddington, who, in his Gifford Lectures of 1928 referred to them as the “Principle of Indeterminacy”. In the English literature the name uncertainty principle became most common. It is used both by Condon and Robertson in 1929, and also in the English version of Heisenberg’s Chicago Lectures (Heisenberg 1930), although, remarkably, nowhere in the original German version of the same book (see also Cassidy 1998). Indeed, Heisenberg never seems to have endorsed the name “principle” for his relations. His favourite terminology was “inaccuracy relations” (Ungenauigkeitsrelationen) or “indeterminacy relations” (Unbestimmtheitsrelationen). We know only one passage, in Heisenberg’s own Gifford lectures, delivered in 1955–56 (Heisenberg 1958: 43), where he mentioned that his relations “are usually called relations of uncertainty or principle of indeterminacy”. But this can well be read as his yielding to common practice rather than his own preference.
But does the relation (2) qualify as a principle of quantum mechanics? Several authors, foremost Karl Popper (1967), have contested this view. Popper argued that the uncertainty relations cannot be granted the status of a principle on the grounds that they are derivable from the theory, whereas one cannot obtain the theory from the uncertainty relations. (The argument being that one can never derive any equation, say, the Schrödinger equation, or the commutation relation (1), from an inequality.)
Popper’s argument is, of course, correct but we think it misses the point. There are many statements in physical theories which are called principles even though they are in fact derivable from other statements in the theory in question. A more appropriate departing point for this issue is not the question of logical priority but rather Einstein’s distinction between “constructive theories” and “principle theories”.
Einstein proposed this famous classification in Einstein 1919. Constructive theories are theories which postulate the existence of simple entities behind the phenomena. They endeavour to reconstruct the phenomena by framing hypotheses about these entities. Principle theories, on the other hand, start from empirical principles, i.e., general statements of empirical regularities, employing no or only a bare minimum of theoretical terms. The purpose is to build up the theory from such principles. That is, one aims to show how these empirical principles provide sufficient conditions for the introduction of further theoretical concepts and structure.
The prime example of a theory of principle is thermodynamics. Here the role of the empirical principles is played by the statements of the impossibility of various kinds of perpetual motion machines. These are regarded as expressions of brute empirical fact, providing the appropriate conditions for the introduction of the concepts of energy and entropy and their properties. (There is a lot to be said about the tenability of this view, but that is not our topic here.)
Now obviously, once the formal thermodynamic theory is built, one can also derive the impossibility of the various kinds of perpetual motion. (They would violate the laws of energy conservation and entropy increase.) But this derivation should not misguide one into thinking that they were no principles of the theory after all. The point is just that empirical principles are statements that do not rely on the theoretical concepts (in this case entropy and energy) for their meaning. They are interpretable independently of these concepts and, further, their validity on the empirical level still provides the physical content of the theory.
A similar example is provided by special relativity, another theory of principle, which Einstein deliberately designed after the ideal of thermodynamics. Here, the empirical principles are the light postulate and the relativity principle. Again, once we have built up the modern theoretical formalism of the theory (Minkowski space-time), it is straightforward to prove the validity of these principles. But again this does not count as an argument for claiming that they were no principles after all. So the question whether the term “principle” is justified for Heisenberg’s relations, should, in our view, be understood as the question whether they are conceived of as empirical principles.
One can easily show that this idea was never far from Heisenberg’s intentions. We have already seen that Heisenberg presented the relations as the result of a “pure fact of experience”. A few months after his 1927 paper, he wrote a popular paper “Über die Grundprincipien der Quantenmechanik” (“On the fundamental principles of quantum mechanics”) where he made the point even more clearly. Here Heisenberg described his recent break-through in the interpretation of the theory as follows: “It seems to be a general law of nature that we cannot determine position and velocity simultaneously with arbitrary accuracy”. Now actually, and in spite of its title, the paper does not identify or discuss any “fundamental principle” of quantum mechanics. So, it must have seemed obvious to his readers that he intended to claim that the uncertainty relation was a fundamental principle, forced upon us as an empirical law of nature, rather than a result derived from the formalism of the theory.
This reading of Heisenberg’s intentions is corroborated by the fact that, even in his 1927 paper, applications of his relation frequently present the conclusion as a matter of principle. For example, he says “In a stationary state of an atom its phase is in principle indeterminate” (Heisenberg 1927: 177, [emphasis added]). Similarly, in a paper of 1928, he described the content of his relations as:
It has turned out that it is in principle impossible to know, to measure the position and velocity of a piece of matter with arbitrary accuracy. (Heisenberg 1984: 26, [emphasis added])
So, although Heisenberg did not originate the tradition of calling his relations a principle, it is not implausible to attribute the view to him that the uncertainty relations represent an empirical principle that could serve as a foundation of quantum mechanics. In fact, his 1927 paper expressed this desire explicitly:
Surely, one would like to be able to deduce the quantitative laws of quantum mechanics directly from their anschaulich foundations, that is, essentially, relation [(2)]. (ibid: 196)
This is not to say that Heisenberg was successful in reaching this goal, or that he did not express other opinions on other occasions.
Let us conclude this section with three remarks. First, if the uncertainty relation is to serve as an empirical principle, one might well ask what its direct empirical support is. In Heisenberg’s analysis, no such support is mentioned. His arguments concerned thought experiments in which the validity of the theory, at least at a rudimentary level, is implicitly taken for granted. Jammer (1974: 82) conducted a literature search for high precision experiments that could seriously test the uncertainty relations and concluded they were still scarce in 1974. Real experimental support for the uncertainty relations, in experiments in which the inaccuracies are close to the quantum limit, has come about only more recently (see Kaiser, Werner, and George 1983; Uffink 1985; Nairz, Arndt, and Zeilinger 2002).
A second point is the question whether the theoretical structure or the quantitative laws of quantum theory can indeed be derived on the basis of the uncertainty principle, as Heisenberg wished. Serious attempts to build up quantum theory as a full-fledged Theory of Principle on the basis of the uncertainty principle have never been carried out. Indeed, the most Heisenberg could and did claim in this respect was that the uncertainty relations created “room” (Heisenberg 1927: 180) or “freedom” (Heisenberg 1931: 43) for the introduction of some non-classical mode of description of experimental data, not that they uniquely lead to the formalism of quantum mechanics. A serious proposal to approach quantum mechanics as a theory of principle was provided more recently by Bub (2000) and Chiribella & Spekkens (2016). But, remarkably, this proposal does not use the uncertainty relation as one of its fundamental principles.

Third, it is remarkable that in his later years Heisenberg put a somewhat different gloss on his relations. In his autobiography Der Teil und das Ganze of 1969 he described how he had found his relations inspired by a remark by Einstein that “it is the theory which decides what one can observe”, thus giving precedence to theory above experience, rather than the other way around. Some years later he even admitted that his famous discussions of thought experiments were actually trivial since
… if the process of observation itself is subject to the laws of quantum theory, it must be possible to represent its result in the mathematical scheme of this theory. (Heisenberg 1975: 6)
When Heisenberg introduced his relation, his argument was based only on qualitative examples. He did not provide a general, exact derivation of his relations.[3] Indeed, he did not even give a definition of the uncertainties \(\delta q\), etc., occurring in these relations. Of course, this was consistent with the announced goal of that paper, i.e., to provide some qualitative understanding of quantum mechanics for simple experiments.
The first mathematically exact formulation of the uncertainty relations is due to Kennard. He proved in 1927 the theorem that for all normalized state vectors \(\ket{\psi}\) the following inequality holds:
\[\tag{9} \Delta_{\psi}\bP \, \Delta_{\psi}\bQ \geq \hbar/2 \]
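Here, \(\Delta_{\psi}\bP\) and \(\Delta_{\psi}\bQ\) are the standard deviations of momentum and position in the state vector \(\ket{\psi}\), i.e.,
\[ (\Delta_{\psi}\bP)^2 = \expval{\bP^2}_{\psi} - \expval{\bP}_{\psi}^2, \quad (\Delta_{\psi}\bQ)^2 = \expval{\bQ^2}_{\psi} - \expval{\bQ}_{\psi}^2, \]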
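where \(\expval{\cdot}_{\psi} = \expvalexp{\cdot}{\psi}\) denotes the expectation value in state \(\ket{\psi}\). Equivalently we can use the wave function \(\psi(q)\) and its Fourier transform:
\[ \psi(q) := \langle q | \psi \rangle, \quad \tilde{\psi}(p) := \langle p | \psi \rangle = \frac{1}{\sqrt{2\pi\hbar}} \int \!\! dq\, e^{-ipq/\hbar}\, \psi(q) \]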
to write
\[\begin{align*} \tag{11} (\Delta_\psi {\bQ})^2 & = \! \int\!\! dq\, \abs{\psi(q)}^2 q^2 - \left(\int \!\!dq \, \abs{\psi(q)}^2 q \right)^2 \\ \notag (\Delta_\psi {\bP})^2 & = \! \int \!\!dp \, \abs{\tilde{\psi}(p)}^2 p^2 - \left(\int\!\!dp \, \abs{\tilde{\psi}(p)}^2 p \right)^2 \end{align*}\]
The inequality (9) was generalized by Robertson (1929), who proved that for all observables (self-adjoint operators) \(\bA\) and \(\bB\):
\[\tag{12} \Delta_{\psi}\bA \, \Delta_{\psi}\bB \geq \frac{1}{2} \abs{\expval{[\bA,\bB]}_{\psi}} \]
where \([\bA,\bB] := \bA\bB - \bB\bA\) denotes the commutator.
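As a hedged illustration of Robertson's inequality (not part of the original argument), the following sketch checks it numerically for a spin-1/2 system, taking \(\bA\) and \(\bB\) to be the Pauli matrices \(\sigma_x\) and \(\sigma_y\). The choice of system, the helper names (`matmul`, `expval`, `std_dev`, `robertson_sides`, `spin_state`) and the sample states are assumptions made for illustration only.

```python
import cmath
import math

# Pauli matrices sigma_x and sigma_y as 2x2 nested lists
# (an illustrative choice for the observables A and B).
SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def expval(a, psi):
    """Expectation value <psi|a|psi> for a normalized two-component state."""
    a_psi = [sum(a[i][k] * psi[k] for k in range(2)) for i in range(2)]
    return sum(psi[i].conjugate() * a_psi[i] for i in range(2))


def std_dev(a, psi):
    """Standard deviation sqrt(<a^2> - <a>^2) of a self-adjoint matrix a."""
    variance = expval(matmul(a, a), psi).real - expval(a, psi).real ** 2
    return math.sqrt(max(variance, 0.0))  # clip tiny negative rounding errors


def robertson_sides(a, b, psi):
    """Return (left-hand side, right-hand side) of Robertson's inequality."""
    ab, ba = matmul(a, b), matmul(b, a)
    commutator = [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]
    lhs = std_dev(a, psi) * std_dev(b, psi)
    rhs = 0.5 * abs(expval(commutator, psi))
    return lhs, rhs


def spin_state(theta, phi):
    """Normalized spin-1/2 state parametrized by angles on the Bloch sphere."""
    return [complex(math.cos(theta / 2)),
            cmath.exp(1j * phi) * math.sin(theta / 2)]


if __name__ == "__main__":
    for theta, phi in [(0.0, 0.0), (math.pi / 2, 0.0), (1.0, 2.0)]:
        lhs, rhs = robertson_sides(SIGMA_X, SIGMA_Y, spin_state(theta, phi))
        print(f"theta={theta:.2f} phi={phi:.2f}: {lhs:.4f} >= {rhs:.4f}")
```

For the state with `theta = 0` both sides equal 1, while for an eigenstate of \(\sigma_x\) (`theta = pi/2`, `phi = 0`) both sides vanish up to rounding. The second case shows that the lower bound in (12) depends on the state and can be zero.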
Since the above inequalities (9) and (12) have the virtue of being exact, in contrast to Heisenberg’s original semi-quantitative formulation, it is tempting to regard them as the exact counterpart of Heisenberg’s relations (2)–(4). Indeed, such was Heisenberg’s own view. In his Chicago Lectures (Heisenberg 1930: 15–19), he presented Kennard’s derivation of relation (9) and claimed that “this proof does not differ at all in mathematical content” from his semi-quantitative argument, the only difference being that now “the proof is carried through exactly”.
But it may be useful to point out that both in status and intended role there is a difference between Kennard’s inequality and Heisenberg’s previous formulation (2). The inequalities discussed here are not statements of empirical fact, but theorems of the quantum mechanical formalism. As such, they presuppose the validity of this formalism, and in particular the commutation relation (1), rather than elucidating its intuitive content or creating “room” or “freedom” for the validity of this formalism. At best, one should see the above inequalities as showing that the formalism is consistent with Heisenberg’s empirical principle.
This situation is similar to that arising in other theories of principle where, as noted in Section 2.4, one often finds that, next to an empirical principle, the formalism also provides a corresponding theorem. And similarly, this situation should not, by itself, cast doubt on whether Heisenberg’s relation can be regarded as a principle of quantum mechanics.
There is a second notable difference between (2) and (9). Heisenberg did not give a general definition for the “uncertainties” \(\delta p\) and \(\delta q\). The most definite remark he made about them was that they could be taken as “something like the mean error”. In the discussions of thought experiments, he and Bohr would always quantify uncertainties on a case-to-case basis by choosing some parameters which happened to be relevant to the experiment at hand. By contrast, the inequalities (9) and (12) employ a single specific expression as a measure for “uncertainty”: the standard deviation. At the time, this choice was not unnatural, given that this expression is well-known and widely used in error theory and the description of statistical fluctuations. However, there was very little or no discussion of whether this choice was appropriate for a general formulation of the uncertainty relations. A standard deviation reflects the spread or expected fluctuations in a series of measurements of an observable in a given state. It is not at all easy to connect this idea with the concept of the “inaccuracy” of a measurement, such as the resolving power of a microscope. In fact, even though Heisenberg had taken Kennard’s inequality as the precise formulation of the uncertainty relation, he and Bohr never relied on standard deviations in their many discussions of thought experiments, and indeed, it has been shown (Uffink and Hilgevoord 1985; Hilgevoord and Uffink 1988) that these discussions cannot be framed in terms of standard deviations.
Another problem with the above elaboration is that the “well-known” relations (5) are actually false if energy \(\boldsymbol{E}\) and action \(\boldsymbol{J}\) are to be positive operators (Jordan 1927). In that case, self-adjoint operators \(\boldsymbol{t}\) and \(\boldsymbol{w}\) do not exist and inequalities analogous to (9) cannot be derived. Also, these inequalities do not hold for angle and angular momentum (Uffink 1990). These obstacles have led to a quite extensive literature on time-energy and angle-action uncertainty relations (Busch 1990; Hilgevoord 1996, 1998, 2005; Muga et al. 2002; Hilgevoord and Atkinson 2011; Pashby 2015).
In spite of the fact that Heisenberg’s and Bohr’s views on quantum mechanics are often lumped together as (part of) “the Copenhagen interpretation”, there is considerable difference between their views on the uncertainty relations.
Long before the development of modern quantum mechanics, Bohr had been particularly concerned with the problem of particle-wave duality, i.e., the problem that experimental evidence on the behaviour of both light and matter seemed to demand a wave picture in some cases, and a particle picture in others. Yet these pictures are mutually exclusive. Whereas a particle is always localized, the very definition of the notions of wavelength and frequency requires an extension in space and in time. Moreover, the classical particle picture is incompatible with the characteristic phenomenon of interference.
His long struggle with wave-particle duality had prepared him for a radical step when the dispute between matrix and wave mechanics broke out in 1926–27. For the main contestants, Heisenberg and Schrödinger, the issue at stake was which view could claim to provide a single coherent and universal framework for the description of the observational data. The choice was essentially between a description in terms of continuously evolving waves, or else one of particles undergoing discontinuous quantum jumps. By contrast, Bohr insisted that elements from both views were equally valid and equally needed for an exhaustive description of the data. His way out of the contradiction was to renounce the idea that the pictures refer, in a literal one-to-one correspondence, to physical reality. Instead, the applicability of these pictures was to become dependent on the experimental context. This is the gist of the viewpoint he called “complementarity”.
Bohr first conceived the general outline of his complementarity argument in early 1927, during a skiing holiday in Norway, at the same time that Heisenberg wrote his uncertainty paper. When he returned to Copenhagen and found Heisenberg’s manuscript, they got into an intense discussion. On the one hand, Bohr was quite enthusiastic about Heisenberg’s ideas, which seemed to fit wonderfully with his own thinking. Indeed, in his subsequent work, Bohr always presented the uncertainty relations as the symbolic expression of his complementarity viewpoint. On the other hand, he criticized Heisenberg severely for his suggestion that these relations were due to discontinuous changes occurring during a measurement process. Rather, Bohr argued, their proper derivation should start from the indispensability of both particle and wave concepts. He pointed out that the uncertainties in the experiment did not exclusively arise from the discontinuities but also from the fact that in the experiment we need to take into account both the particle theory and the wave theory. It is not so much the unknown disturbance which renders the momentum of the electron uncertain but rather the fact that the position and the momentum of the electron cannot be simultaneously defined in this experiment (see the “Addition in Proof” to Heisenberg’s paper).
We shall not go too deeply into the matter of Bohr’s interpretation of quantum mechanics since we are mostly interested in Bohr’s view on the uncertainty principle. For a more detailed discussion of the former we refer to Scheibe (1973), Folse (1985), Honner (1987) and Murdoch (1987). It may be useful, however, to sketch some of the main points. Central in Bohr’s considerations is the language we use in physics. No matter how abstract and subtle the concepts of modern physics may be, they are essentially an extension of our ordinary language and a means to communicate the results of our experiments. These results, obtained under well-defined experimental circumstances, are what Bohr calls the “phenomena”. A phenomenon is “the comprehension of the effects observed under given experimental conditions” (Bohr 1939: 24); it is the resultant of a physical object, a measuring apparatus and the interaction between them in a concrete experimental situation. The essential difference between classical and quantum physics is that in quantum physics the interaction between the object and the apparatus cannot be made arbitrarily small; the interaction must at least comprise one quantum. This is expressed by Bohr’s quantum postulate:
[… the] essence [of the formulation of the quantum theory] may be expressed in the so-called quantum postulate, which attributes to any atomic process an essential discontinuity or rather individuality, completely foreign to classical theories and symbolized by Planck’s quantum of action. (Bohr 1928: 580)
A phenomenon, therefore, is an indivisible whole and the result of a measurement cannot be considered as an autonomous manifestation of the object itself independently of the measurement context. The quantum postulate forces upon us a new way of describing physical phenomena:
In this situation, we are faced with the necessity of a radical revision of the foundation for the description and explanation of physical phenomena. Here, it must above all be recognized that, however far quantum effects transcend the scope of classical physical analysis, the account of the experimental arrangement and the record of the observations must always be expressed in common language supplemented with the terminology of classical physics. (Bohr 1948: 313)
This is what Scheibe (1973) has called the “buffer postulate” because it prevents the quantum from penetrating into the classical description: A phenomenon must always be described in classical terms; Planck’s constant does not occur in this description.
Together, the two postulates induce the following reasoning. In every phenomenon the interaction between the object and the apparatus comprises at least one quantum. But the description of the phenomenon must use classical notions in which the quantum of action does not occur. Hence, the interaction cannot be analysed in this description. On the other hand, the classical character of the description allows us to speak in terms of the object itself. Instead of saying: “the interaction between a particle and a photographic plate has resulted in a black spot in a certain place on the plate”, we are allowed to forgo mentioning the apparatus and say: “the particle has been found in this place”. The experimental context, rather than changing or disturbing pre-existing properties of the object, defines what can meaningfully be said about the object.
Because the interaction between object and apparatus is left out in our description of the phenomenon, we do not get the whole picture. Yet, any attempt to extend our description by performing the measurement of a different observable quantity of the object, or indeed, on the measurement apparatus, produces a new phenomenon and we are again confronted with the same situation. Because of the unanalyzable interaction in both measurements, the two descriptions cannot, generally, be united into a single picture. They are what Bohr calls complementary descriptions:
[the quantum of action]… forces us to adopt a new mode of description designated as complementary in the sense that any given application of classical concepts precludes the simultaneous use of other classical concepts which in a different connection are equally necessary for the elucidation of the phenomena. (Bohr 1929: 10)
The most important example of complementary descriptions is provided by the measurements of the position and momentum of an object. If one wants to measure the position of the object relative to a given spatial frame of reference, the measuring instrument must be rigidly fixed to the bodies which define the frame of reference. But this implies the impossibility of investigating the exchange of momentum between the object and the instrument and we are cut off from obtaining any information about the momentum of the object. If, on the other hand, one wants to measure the momentum of an object, the measuring instrument must be able to move relative to the spatial reference frame. Bohr here assumes that a momentum measurement involves the registration of the recoil of some movable part of the instrument and the use of the law of momentum conservation. The looseness of the part of the instrument with which the object interacts entails that the instrument cannot serve to accurately determine the position of the object. Since a measuring instrument cannot be rigidly fixed to the spatial reference frame and, at the same time, be movable relative to it, the experiments which serve to precisely determine the position and the momentum of an object are mutually exclusive. Of course, in itself, this is not at all typical for quantum mechanics. But, because the interaction between object and instrument during the measurement can neither be neglected nor determined, the two measurements cannot be combined. This means that in the description of the object one must choose between the assignment of a precise position or of a precise momentum.
Similar considerations hold with respect to the measurement of time and energy. Just as the spatial coordinate system must be fixed by means of solid bodies, so must the time coordinate be fixed by means of unperturbed, synchronised clocks. But it is precisely this requirement which prevents one from taking into account the exchange of energy with the instrument if this is to serve its purpose. Conversely, any conclusion about the object based on the conservation of energy prevents following its development in time.
The conclusion is that in quantum mechanics we are confronted with a complementarity between two descriptions which are united in the classical mode of description: the space-time description (or coordination) of a process and the description based on the applicability of the dynamical conservation laws. The quantum forces us to give up the classical mode of description (also called the “causal” mode of description by Bohr[4]): it is impossible to form a classical picture of what is going on when radiation interacts with matter as, e.g., in the Compton effect.
Any arrangement suited to study the exchange of energy and momentum between the electron and the photon must involve a latitude in the space-time description sufficient for the definition of wave-number and frequency which enter in the relation [\(E = h\nu\) and \(p = h\sigma\)]. Conversely, any attempt of locating the collision between the photon and the electron more accurately would, on account of the unavoidable interaction with the fixed scales and clocks defining the space-time reference frame, exclude all closer account as regards the balance of momentum and energy. (Bohr 1949: 210)
A causal description of the process cannot be attained; we have to content ourselves with complementary descriptions. “The viewpoint of complementarity may be regarded”, according to Bohr, “as a rational generalization of the very ideal of causality”.
In addition to complementary descriptions Bohr also talks about complementary phenomena and complementary quantities. Position and momentum, as well as time and energy, are complementary quantities.[5]
We have seen that Bohr’s approach to quantum theory puts heavy emphasis on the language used to communicate experimental observations, which, in his opinion, must always remain classical. By comparison, he seemed to put little value on arguments starting from the mathematical formalism of quantum theory. This informal approach is typical of all of Bohr’s discussions on the meaning of quantum mechanics. One might say that for Bohr the conceptual clarification of the situation has primary importance while the formalism is only a symbolic representation of this situation.
This is remarkable since, ultimately, it is the formalism which needs to be interpreted. This neglect of the formalism is one of the reasons why it is so difficult to get a clear understanding of Bohr’s interpretation of quantum mechanics and why it has aroused so much controversy. We close this section by citing from an article of 1948 to show how Bohr conceived the role of the formalism of quantum mechanics:
The entire formalism is to be considered as a tool for deriving predictions, of definite or statistical character, as regards information obtainable under experimental conditions described in classical terms and specified by means of parameters entering into the algebraic or differential equations of which the matrices or the wave-functions, respectively, are solutions. These symbols themselves, as is indicated already by the use of imaginary numbers, are not susceptible to pictorial interpretation; and even derived real functions like densities and currents are only to be regarded as expressing the probabilities for the occurrence of individual events observable under well-defined experimental conditions. (Bohr 1948: 314)
In his Como lecture, published in 1928, Bohr gave his own version of a derivation of the uncertainty relations between position and momentum and between time and energy. He started from the relations
\[\tag{13} E = h\nu \quad \text{and} \quad p = h/\lambda \]
which connect the notions of energy \(E\) and momentum \(p\) from the particle picture with those of frequency \(\nu\) and wavelength \(\lambda\) from the wave picture. He noticed that a wave packet of limited extension in space and time can only be built up by the superposition of a number of elementary waves with a large range of wave numbers and frequencies. Denoting the spatial and temporal extensions of the wave packet by \(\Delta x\) and \(\Delta t\), and the extensions in the wave number \(\sigma := 1/\lambda\) and frequency by \(\Delta \sigma\) and \(\Delta \nu\), it follows from Fourier analysis that in the most favorable case \(\Delta x \, \Delta\sigma \approx \Delta t \, \Delta \nu \approx 1\), and, using (13), one obtains the relations
\[\tag{14} \Delta t \, \Delta E \approx \Delta x \, \Delta p \approx h \]
Note that \(\Delta x, \Delta \sigma\), etc., are not standard deviations but unspecified measures of the size of a wave packet. (The original text has equality signs instead of approximate equality signs, but, since Bohr does not define the spreads exactly, the use of approximate equality signs seems more in line with his intentions. Moreover, Bohr himself used approximate equality signs in later presentations.) These equations determine, according to Bohr:
the highest possible accuracy in the definition of the energy and momentum of the individuals associated with the wave field. (Bohr 1928: 571).
He noted,
This circumstance may be regarded as a simple symbolic expression of the complementary nature of the space-time description and the claims of causality. (ibid).[6]
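The Fourier-analytic step in Bohr’s derivation can be made concrete with a simple worked example, offered here as an illustration rather than as Bohr’s own calculation. Assume a plane wave with wave number \(\sigma_0\) that is cut off outside an interval of length \(\Delta x\); the symbols \(f\), \(F\) and \(\sigma_0\) are introduced only for this sketch.

\[\begin{align*} f(x) &= e^{2\pi i \sigma_0 x} \quad \text{for } \abs{x} \leq \Delta x/2, \qquad f(x) = 0 \text{ otherwise}, \\ F(\sigma) &= \int \!\! dx\, f(x)\, e^{-2\pi i \sigma x} = \frac{\sin\bigl(\pi(\sigma-\sigma_0)\Delta x\bigr)}{\pi(\sigma-\sigma_0)} \end{align*}\]

The amplitude \(F(\sigma)\) is concentrated around \(\sigma_0\) and first vanishes at \(\sigma - \sigma_0 = \pm 1/\Delta x\). Taking this distance as the extension \(\Delta \sigma\) gives \(\Delta x \, \Delta \sigma \approx 1\), and multiplying by \(h\), with \(p = h\sigma\), gives \(\Delta x \, \Delta p \approx h\). Other reasonable conventions for the width change the right-hand side only by a factor of order one, which is consistent with stating the relation as an approximate equality.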
We note a few points about Bohr’s view on the uncertainty relations. First of all, Bohr does not refer to discontinuous changes in the relevant quantities during the measurement process. Rather, he emphasizes the possibility of defining these quantities. This view is markedly different from Heisenberg’s view. A draft version of the Como lecture is even more explicit on the difference between Bohr and Heisenberg:
These reciprocal uncertainty relations were given in a recent paper of Heisenberg as the expression of the statistical element which, due to the feature of discontinuity implied in the quantum postulate, characterizes any interpretation of observations by means of classical concepts. It must be remembered, however, that the uncertainty in question is not simply a consequence of a discontinuous change of energy and momentum say during an interaction between radiation and material particles employed in measuring the space-time coordinates of the individuals. According to the above considerations the question is rather that of the impossibility of defining rigorously such a change when the space-time coordination of the individuals is also considered. (Bohr 1985: 93)
Indeed, Bohr not only rejected Heisenberg’s argument that these relations are due to discontinuous disturbances implied by the act of measuring, but also his view that the measurement process creates a definite result:
The unaccustomed features of the situation with which we are confronted in quantum theory necessitate the greatest caution as regard all questions of terminology. Speaking, as it is often done of disturbing a phenomenon by observation, or even of creating physical attributes to objects by measuring processes is liable to be confusing, since all such sentences imply a departure from conventions of basic language which even though it can be practical for the sake of brevity, can never be unambiguous. (Bohr 1939: 24)
Nor did he approve of an epistemological formulation or one in terms of experimental inaccuracies:
[…] a sentence like “we cannot know both the momentum and the position of an atomic object” raises at once questions as to the physical reality of two such attributes of the object, which can be answered only by referring to the mutual exclusive conditions for an unambiguous use of space-time concepts, on the one hand, and dynamical conservation laws on the other hand. (Bohr 1948: 315; also Bohr 1949: 211)
It would in particular not be out of place in this connection to warn against a misunderstanding likely to arise when one tries to express the content of Heisenberg’s well-known indeterminacy relation by such a statement as “the position and momentum of a particle cannot simultaneously be measured with arbitrary accuracy”. According to such a formulation it would appear as though we had to do with some arbitrary renunciation of the measurement of either the one or the other of two well-defined attributes of the object, which would not preclude the possibility of a future theory taking both attributes into account on the lines of the classical physics. (Bohr 1937: 292)
Instead, Bohr always stressed that the uncertainty relations are first and foremost an expression of complementarity. This may seem odd since complementarity is a dichotomic relation between two types of description whereas the uncertainty relations allow for intermediate situations between two extremes. They “express” the dichotomy in the sense that if we take the energy and momentum to be perfectly well-defined, symbolically \(\Delta E = \Delta p = 0\), the position and time variables are completely undefined, \(\Delta x = \Delta t = \infty\), and vice versa. But they also allow intermediate situations in which the mentioned uncertainties are all non-zero and finite. This more positive aspect of the uncertainty relation is mentioned in the Como lecture:
At the same time, however, the general character of this relation makes it possible to a certain extent to reconcile the conservation laws with the space-time coordination of observations, the idea of a coincidence of well-defined events in space-time points being replaced by that of unsharply defined individuals within space-time regions. (Bohr 1928: 571)
However, Bohr never followed up on this suggestion that we might be able to strike a compromise between the two mutually exclusive modes of description in terms of unsharply defined quantities. Indeed, an attempt to do so would take the formalism of quantum theory more seriously than the concepts of classical language, and this step Bohr refused to take. Instead, in his later writings he would be content with stating that the uncertainty relations simply defy an unambiguous interpretation in classical terms:
These so-called indeterminacy relations explicitly bear out the limitation of causal analysis, but it is important to recognize that no unambiguous interpretation of such a relation can be given in words suited to describe a situation in which physical attributes are objectified in a classical way. (Bohr 1948: 315)
Finally, on a more formal level, we note that Bohr’s derivation does not rely on the commutation relations (1) and (5), but on Fourier analysis. These two approaches are equivalent as far as the relationship between position and momentum is concerned, but this is not so for time and energy since most physical systems do not have a time operator. Indeed, in his discussion with Einstein (Bohr 1949), Bohr considered time as a simple classical variable. This even holds for his famous discussion of the “clock-in-the-box” thought-experiment where the time, as defined by the clock in the box, is treated from the point of view of classical general relativity. Thus, in an approach based on commutation relations, the position-momentum and time-energy uncertainty relations are not on equal footing, which is contrary to Bohr’s approach in terms of Fourier analysis. For more details see Hilgevoord (1996, 1998).
In the previous two sections we have seen how both Heisenberg and Bohr attributed a far-reaching status to the uncertainty relations. They both argued that these relations place fundamental limits on the applicability of the usual classical concepts. Moreover, they both believed that these limitations were inevitable and forced upon us. However, we have also seen that they reached such conclusions by starting from radical and controversial assumptions. This entails, of course, that their radical conclusions remain unconvincing for those who reject these assumptions. Indeed, the operationalist-positivist viewpoint adopted by these authors has long since lost its appeal among philosophers of physics.
So the question may be asked what alternative views of the uncertainty relations are still viable. Of course, this problem is intimately connected with that of the interpretation of the wave function, and hence of quantum mechanics as a whole. Since there is no consensus about the latter, one cannot expect consensus about the interpretation of the uncertainty relations either. Here we only describe a point of view, which we call the “minimal interpretation”, that seems to be shared by both the adherents of the Copenhagen interpretation and of other views.
In quantum mechanics a system is supposed to be described by its wave function, also called its quantum state or state vector. Given the state vector \(\ket{\psi}\), one can derive probability distributions for all the physical quantities pertaining to the system, usually called its observables, such as its position, momentum, angular momentum, energy, etc. The operational meaning of these probability distributions is that they correspond to the distribution of the values obtained for these quantities in a long series of repetitions of the measurement. More precisely, one imagines a great number of copies of the system under consideration, all prepared in the same way. On each copy the momentum, say, is measured. Generally, the outcomes of these measurements differ and a distribution of outcomes is obtained. The theoretical momentum distribution derived from the quantum state is supposed to coincide with the hypothetical distribution of outcomes obtained in an infinite series of repetitions of the momentum measurement. The same holds, mutatis mutandis, for all the other physical quantities pertaining to the system. Note that no simultaneous measurements of two or more quantities are required in defining the operational meaning of the probability distributions.
The uncertainty relations discussed above can be considered as statements about the spreads of the probability distributions of the several physical quantities arising from the same state. For example, the uncertainty relation between the position and momentum of a system may be understood as the statement that the position and momentum distributions cannot both be arbitrarily narrow (in some sense of the word “narrow”) in any quantum state. Inequality (9) is an example of such a relation in which the standard deviation is employed as a measure of spread. From this characterization of uncertainty relations it follows that a more detailed interpretation of the quantum state than the one given in the previous paragraph is not required to study uncertainty relations as such. In particular, a further ontological or linguistic interpretation of the notion of uncertainty, as limits on the applicability of our concepts given by Heisenberg or Bohr, need not be supposed.
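The operational reading described above can be sketched in a small simulation. This is a toy illustration under stated assumptions, not a model of any real experiment: the state is assumed to be a Gaussian wave packet, for which the position and momentum distributions are both Gaussian with standard deviations \(s\) and \(\hbar/2s\); units with \(\hbar = 1\) are used; and the names `measure_spread`, `spread_product`, `HBAR`, `SIGMA_Q` and the sample size are hypothetical. Positions and momenta are drawn in two separate runs on different copies, so no joint measurement is involved.

```python
import random
import statistics

HBAR = 1.0       # units with hbar = 1 (assumption for this sketch)
SIGMA_Q = 0.7    # assumed position spread s of the Gaussian wave packet


def measure_spread(true_sigma, copies, rng):
    """Simulate one measurement on each of `copies` identically prepared
    systems and return the standard deviation of the outcomes."""
    outcomes = [rng.gauss(0.0, true_sigma) for _ in range(copies)]
    return statistics.pstdev(outcomes)


def spread_product(copies=20000, seed=1):
    """Estimate the product of the position and momentum spreads from two
    separate ensembles prepared in the same Gaussian state."""
    rng = random.Random(seed)
    sigma_p = HBAR / (2.0 * SIGMA_Q)                 # momentum spread of that state
    delta_q = measure_spread(SIGMA_Q, copies, rng)   # position run
    delta_p = measure_spread(sigma_p, copies, rng)   # momentum run
    return delta_q * delta_p


if __name__ == "__main__":
    print("estimated product:", spread_product())
    print("Kennard bound:    ", HBAR / 2.0)
```

With many copies the estimated product should approach \(\hbar/2\), the lower bound in (9), since a Gaussian state saturates that inequality. For finite samples the estimate fluctuates around this value and may fall slightly below it, which is a sampling effect and not a violation of the relation.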
Of course, this minimal interpretation leaves the question open whether it makes sense to attribute precise values of position and momentum to an individual system. Some interpretations of quantum mechanics, e.g., those of Heisenberg and Bohr, deny this; while others, e.g., the interpretation of de Broglie and Bohm, insist that each individual system has a definite position and momentum (see the entry on Bohmian mechanics). The only requirement is that, as an empirical fact, it is not possible to prepare pure ensembles in which all systems have the same values for these quantities, or ensembles in which the spreads are smaller than allowed by quantum theory. Although interpretations of quantum mechanics in which each system has a definite value for its position and momentum are still viable, this is not to say that they are without strange features of their own; they do not imply a return to classical physics.
We end with a few remarks on this minimal interpretation. First, it may be noted that the minimal interpretation of the uncertainty relations is little more than filling in the empirical meaning of inequality (9). As such, this view shares many of the limitations we have noted above about this inequality. Indeed, it is not straightforward to relate the spread in a statistical distribution of measurement results with the inaccuracy of this measurement, such as, e.g., the resolving power of a microscope, or of a disturbance of the system by the measurement. Moreover, the minimal interpretation does not address the question whether one can make simultaneous accurate measurements of position and momentum.
As a matter of fact, one can show that the standard formalism of quantum mechanics does not allow such simultaneous measurements. But this is not a consequence of relation (9). Rather, it follows from the fact that this formalism simply does not contain any observable that would accomplish such a task. The extension of this formalism that allows observables to be represented by positive-operator-valued measures, or POVMs, does allow the formal introduction of observables describing joint measurements (see also section 6.1). But even here, for the case of position and momentum, one finds that such measurements have to be “unsharp”, which entails that they cannot be regarded as simultaneous accurate measurements.
If one feels that statements about inaccuracy of measurement, or the possibility of simultaneous measurements, belong to any satisfactory formulation of the uncertainty principle, one will need to look for other formulations of the uncertainty principle. Some candidates for such formulations will be discussed in Section 6. First, however, we will look at formulations of the uncertainty principle that stay firmly within the minimal interpretation, and differ from (9) only by using measures of uncertainty other than the standard deviation.
While the standard deviation is the most well-known quantitative measure for uncertainty or the spread in the probability distribution, it is not the only one, and indeed it has distinctive drawbacks that other such measures may lack. For example, in the definition of the standard deviation (11) one can see that the probability density function \(\abs{\psi(q)}^2\) is weighted by a quadratic factor \(q^2\) that puts increasing emphasis on its tails. Therefore, the value of \(\Delta_\psi \bQ\) will depend predominantly on how this density behaves at the tails: if these fall off very quickly, e.g., like a Gaussian, it will be small, but if the tails drop off only slowly the standard deviation may be very large, even when most probability is concentrated in a small interval.
The upshot of this objection is that having a lower bound on the product of the standard deviations of position and momentum, as the Heisenberg-Kennard uncertainty relation (9) gives, does not by itself rule out a state where both the probability densities for position and momentum are extremely concentrated, in the sense of having more than \((1- \epsilon)\) of their probability concentrated in a region of size smaller than \(\delta\), for any choice of \(\epsilon, \delta >0\). This means, in our view, that relation (9) actually fails to express what most physicists would take to be the very core idea of the uncertainty principle.
One way to deal with this objection is to consider alternative measures to quantify the spread or uncertainty associated with a probability density. Here we discuss two such proposals.
The most straightforward alternative is to pick some value \(\alpha\) close to one, say \(\alpha = 0.9\), and ask for the width of the smallest interval that supports the fraction \(\alpha\) of the total probability distribution in position, and similarly for momentum:
\[\begin{align*} \tag{15} W_{\alpha}(\bQ, \psi) &:= \inf_{I} \left\{ \abs{I} : \int_I \abs{\psi(q)}^2 dq \geq \alpha \right\} \\ \notag W_{\beta}(\bP,\psi) &:= \inf_{I} \left\{ \abs{I} : \int_I \abs{\tilde\psi(p)}^2 dp \geq \beta \right\} \end{align*}\]where \(\abs{I}\) denotes the length of the interval \(I\). In a previous work (Uffink and Hilgevoord 1985) we called such measures bulk widths, because they indicate how concentrated the “bulk” (i.e., fraction \(\alpha\) or \(\beta\)) of the probability distribution is. Landau and Pollak (1961) obtained an uncertainty relation in terms of these bulk widths.
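\[\tag{16} W_\alpha(\bQ,\psi)\, W_\beta(\bP,\psi) \geq 2\pi\hbar \left( \sqrt{\alpha\beta} - \sqrt{(1-\alpha)(1-\beta)} \right)^2 \quad \text{if } \alpha + \beta \geq 1\]This Landau-Pollak inequality shows that if the choices of \(\alpha, \beta\) are not too low, there is a state-independent lower bound on the product of the bulk widths of the position and momentum distributions for any quantum state.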
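The contrast between the standard deviation and the bulk width of definition (15) can be made concrete with a small numerical sketch. The discrete toy distribution below (a narrow core plus two distant lumps) and the helper names `std_dev` and `bulk_width` are illustrative assumptions, not taken from the literature cited here; the bulk width is approximated on a grid by a sliding-window search.

```python
import math

def std_dev(points, weights):
    """Standard deviation of a discrete distribution (weights sum to 1)."""
    mean = sum(w * x for x, w in zip(points, weights))
    return math.sqrt(sum(w * (x - mean) ** 2 for x, w in zip(points, weights)))

def bulk_width(points, weights, alpha):
    """Length of the shortest interval [points[i], points[j]] whose total
    weight is at least alpha; points must be sorted in increasing order."""
    best, mass, j = math.inf, 0.0, 0
    for i in range(len(points)):
        # Grow the window to the right until it carries enough weight.
        while j < len(points) and mass < alpha:
            mass += weights[j]
            j += 1
        if mass >= alpha:
            best = min(best, points[j - 1] - points[i])
        mass -= weights[i]  # drop the left end before moving on
    return best

# Hypothetical toy distribution: 95% of the weight on a fine grid in
# [-0.05, 0.05], plus two distant lumps of 2.5% each at q = -100 and q = +100.
core = [-0.05 + 0.001 * k for k in range(101)]
points = [-100.0] + core + [100.0]
weights = [0.025] + [0.95 / 101] * 101 + [0.025]

sd = std_dev(points, weights)
w90 = bulk_width(points, weights, 0.9)
print(f"standard deviation = {sd:.2f}, bulk width W_0.9 = {w90:.3f}")
```

With these made-up numbers the standard deviation is dominated by the two small lumps far away (it comes out near 22), whereas the bulk width for \(\alpha = 0.9\) stays close to the width of the core (about 0.095).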
Note that bulk widths are not so sensitive to the behavior of the tails of the distributions and, therefore, the Landau-Pollak inequality is immune to the objection above. Thus, this inequality expresses constraints on quantum mechanical states not contained in relation (9). Further, by the well-known Bienaymé-Chebyshev inequality, one has
\[\begin{align*} \tag{17} W_\alpha (\bQ,\psi) &\leq \frac{2}{\sqrt {1- \alpha}} \Delta_\psi \bQ \\ \notag W_\beta (\bP, \psi) &\leq \frac{2}{\sqrt {1- \beta}} \Delta_\psi \bP \end{align*}\]so that inequality (16) implies (by choosing \(\alpha,\beta\) optimal) that \( \Delta_\psi\bQ \Delta_\psi \bP \geq 0.12 \hbar \). This, obviously, is not the best lower bound for the product of standard deviations, but the important point here is that the Landau-Pollak inequality (16) in terms of bulk widths implies the existence of a lower bound on the product of standard deviations, while conversely, the Heisenberg-Kennard inequality (9) does not imply any bound on the product of bulk widths. A generalization of this approach to non-commuting observables in a finite-dimensional Hilbert space is discussed in Uffink 1990.
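The numerical constant quoted above can be checked with a brute-force search. The sketch below assumes that the Landau-Pollak bound has the form \(W_\alpha W_\beta \geq 2\pi\hbar\,(\sqrt{\alpha\beta} - \sqrt{(1-\alpha)(1-\beta)})^2\) for \(\alpha + \beta \geq 1\), combines it with the Chebyshev estimates (17), and works in units where \(\hbar = 1\); the function name `product_bound` and the grid resolution are arbitrary choices made for this illustration.

```python
import math

def product_bound(alpha, beta):
    """Lower bound on (Delta Q)(Delta P) in units of hbar, obtained by
    inserting the Chebyshev estimates (17) into the assumed form of the
    Landau-Pollak inequality."""
    if alpha + beta < 1:
        return 0.0  # the assumed bulk-width bound only applies for alpha + beta >= 1
    tail = math.sqrt((1 - alpha) * (1 - beta))
    landau_pollak = 2 * math.pi * (math.sqrt(alpha * beta) - tail) ** 2
    # (17) gives W_alpha * W_beta <= 4 * (Delta Q)(Delta P) / tail
    return landau_pollak * tail / 4

# Coarse grid search over alpha, beta in [0.5, 0.999]
best_value, best_alpha, best_beta = max(
    (product_bound(a / 1000, b / 1000), a / 1000, b / 1000)
    for a in range(500, 1000)
    for b in range(500, 1000)
)
print(f"best bound = {best_value:.4f} hbar at alpha = {best_alpha}, beta = {best_beta}")
```

Under these assumptions the search settles near \(\alpha = \beta \approx 5/6\), with a bound of about \(0.116\,\hbar\) (analytically \(\pi\hbar/27\)), which rounds to the value \(0.12\,\hbar\) quoted in the text.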
Another approach to express the uncertainty principle is to use entropic measures of uncertainty. The foremost example of these is the Shannon entropy, which for the position and momentum distribution of a given state vector \(\ket{\psi}\) may be defined as:
\[\begin{align*} \tag{18} H(\bQ, \psi) &:= -\int \abs{\psi(q)}^2 \ln \abs{\psi(q)}^2 dq \\ \notag H(\bP, \psi) &:= -\int \abs{\tilde{\psi}(p)}^2 \ln \abs{\tilde{\psi}(p)}^2 dp \end{align*}\]One can then show (see Beckner 1975; Białynicki-Birula and Mycielski 1975) that
\[\tag{19} H(\bQ, \psi) + H(\bP, \psi) \geq \ln (e \pi \hbar)\]
A nice feature of this entropic uncertainty relation is that it provides a strict improvement of the Heisenberg-Kennard relation. That is to say, one can show (independently of quantum theory) that for any probability density function \(p(x)\)
\[\tag{20} -\int\! p(x) \ln p(x) dx \leq \ln (\sqrt{2 \pi e} \Delta x )\]Applying this to both the position and the momentum distribution in inequality (19) we get:
\[\tag{21}\frac{\hbar}{2} \leq (2\pi e)^{-1} \exp (H(\bQ, \psi) + H(\bP,\psi)) \leq \Delta_\psi \bQ \Delta_\psi \bP\]showing that the entropic uncertainty relation implies the Heisenberg-Kennard uncertainty relation. A drawback of this relation is that it does not completely evade the objection mentioned above (i.e., these entropic measures of uncertainty can become as large as one pleases while \(1-\epsilon\) of the probability in the distribution is concentrated on a very small interval), but the examples needed to show this are admittedly more far-fetched.
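As a sanity check on the chain of inequalities (21), one can evaluate the entropies (18) numerically for a minimum-uncertainty Gaussian wave packet, for which the position and momentum densities are Gaussians with \(\sigma_q \sigma_p = \hbar/2\). The sketch below assumes units with \(\hbar = 1\), an arbitrary width `sigma_q`, and a simple midpoint rule; the helper names are hypothetical.

```python
import math

HBAR = 1.0  # assumption: units in which hbar = 1

def gaussian_density(x, sigma):
    """Normal probability density with mean 0 and standard deviation sigma."""
    return math.exp(-x * x / (2 * sigma * sigma)) / (sigma * math.sqrt(2 * math.pi))

def differential_entropy(density, lo, hi, n=20000):
    """Midpoint-rule approximation of -∫ p ln p dx over [lo, hi]."""
    h = (hi - lo) / n
    total = 0.0
    for k in range(n):
        p = density(lo + (k + 0.5) * h)
        if p > 0.0:
            total -= p * math.log(p) * h
    return total

sigma_q = 0.7                    # arbitrary illustrative width
sigma_p = HBAR / (2 * sigma_q)   # minimum-uncertainty Gaussian packet

H_q = differential_entropy(lambda x: gaussian_density(x, sigma_q), -12 * sigma_q, 12 * sigma_q)
H_p = differential_entropy(lambda x: gaussian_density(x, sigma_p), -12 * sigma_p, 12 * sigma_p)

entropy_sum = H_q + H_p
entropic_bound = math.log(math.e * math.pi * HBAR)
middle_term = math.exp(entropy_sum) / (2 * math.pi * math.e)
print(entropy_sum, entropic_bound, middle_term, sigma_q * sigma_p)
```

For such a Gaussian packet the entropy sum should agree with \(\ln(e\pi\hbar)\) up to discretization error, and the middle term of (21) should coincide with both \(\hbar/2\) and \(\Delta_\psi\bQ\,\Delta_\psi\bP\), so that all the inequalities become equalities.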
For non-commuting observables in an \(n\)-dimensional Hilbert space, one can similarly define an entropic uncertainty in the probability distribution \(\abs{\braket{a_i}{\psi}}^2\) for a given state \(\ket{\psi}\) and a complete set of eigenstates \(\ket{a_i}\), \(i= 1, \ldots, n\), of the observable \(\bA\):
\[\tag{22} H(\bA ,\psi) := -\sum_{i=1}^n \abs{\braket{a_i}{\psi}}^2 \ln \abs{\braket{a_i}{\psi}}^2\]and \(H(\bB,\psi)\) similarly in terms of the probability distribution \(\abs{\braket{b_j}{\psi}}^2\) for a complete set of eigenstates \(\ket{b_j}\), \(j =1, \ldots, n\), of observable \(\bB\). Then we obtain the uncertainty relation (Maassen and Uffink 1988):
\[\tag{23} H(\bA, \psi) + H(\bB, \psi) \geq -2 \ln \max_{i,j} \abs{\braket{a_i}{b_j}},\]which was further generalized and improved by Frank and Lieb (2012). The most important advantage of these relations is that, in contrast to Robertson’s inequality (12), the lower bound is a constant, independent of the state, which is positive whenever the two observables have no eigenvector in common.
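For a concrete instance of the finite-dimensional entropic relation, consider a qubit with \(\bA = \sigma_z\) and \(\bB = \sigma_x\), so that \(\max_{i,j} \abs{\braket{a_i}{b_j}} = 1/\sqrt{2}\) and the Maassen-Uffink bound equals \(\ln 2\). The sketch below scans pure states on a grid over the Bloch sphere; the parametrization by `theta` and `phi` and the grid resolution are choices made for this illustration only.

```python
import math

def shannon(probs):
    """Shannon entropy (natural logarithm) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def entropy_sum(theta, phi):
    """H(sigma_z) + H(sigma_x) for the qubit state
    cos(theta/2)|0> + exp(i*phi) sin(theta/2)|1>."""
    p_z = [math.cos(theta / 2) ** 2, math.sin(theta / 2) ** 2]
    x = math.sin(theta) * math.cos(phi)  # expectation value of sigma_x
    p_x = [(1 + x) / 2, (1 - x) / 2]
    return shannon(p_z) + shannon(p_x)

# Eigenbases of sigma_z and sigma_x are mutually unbiased: overlap 1/sqrt(2)
mu_bound = -2 * math.log(1 / math.sqrt(2))  # equals ln 2

smallest = min(
    entropy_sum(math.pi * i / 200, 2 * math.pi * j / 200)
    for i in range(201)
    for j in range(200)
)
print(f"bound = {mu_bound:.6f}, smallest entropy sum on the grid = {smallest:.6f}")
```

On this grid the smallest entropy sum coincides with the bound \(\ln 2\), attained for instance at an eigenstate of \(\sigma_z\), for which Robertson’s inequality (12) would give only the trivial lower bound zero.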
Both the standard deviation and the alternative measures of uncertainty considered in the previous subsection (and many more that we have not mentioned!) are designed to indicate the width or spread of a single given probability distribution. Applied to quantum mechanics, where the probability distributions for position and momentum are obtained from a given quantum state vector, one can use them to formulate uncertainty relations that characterize the spread in those distributions for any given state. The resulting inequalities then express limitations on what state-preparations quantum mechanics allows. They are thus expressions of what may be called a preparation uncertainty principle:
In quantum mechanics, it is impossible to prepare any system in a state \(\ket{\psi}\) such that its position and momentum are both precisely predictable, in the sense of having both the expected spread in a measurement of position and the expected spread in a momentum measurement arbitrarily small.
The relations (9), (16) and (19) all belong to this category; the only difference being that they employ different measures of spread: viz. the standard deviation, the bulk width or the Shannon entropy.
Note that in this formulation, there is no reference to simultaneous or joint measurements, nor to any notion of accuracy like the resolving power of the measurement instrument, nor to the issue of how much the system in the state that is being measured is disturbed by this measurement. This section is devoted to attempts that go beyond the mold of this preparation uncertainty principle.
We have seen that in 1927 Heisenberg argued that the measurement of (say) position must necessarily disturb the conjugate variable (i.e., momentum) by an amount that is inversely proportional to the inaccuracy of measurement of the former. We have also seen that this idea was not maintained in Kennard’s uncertainty relation (9), a relation that was embraced by Heisenberg (1930) and most textbooks.
A rather natural question thus arises whether there are further inequalities in quantum mechanics that would address Heisenberg’s original thinking more directly, i.e., that do deal with how much one variable is disturbed by the accurate measurement of another. That is, we will look at attempts that would establish a claim which may be called a measurement uncertainty principle.
In quantum mechanics, there is no measurement procedure by which one can accurately measure the position of a system without disturbing its momentum, in the sense that some measure of inaccuracy in position and some measure of the disturbance of momentum of the system by the measurement cannot both be arbitrarily small.
This formulation of the uncertainty principle has always remained controversial. Uncertainty relations that would express this alleged principle are often called “error-disturbance” relations or “noise-disturbance” relations. We will look at two recent proposals to search for such relations: Ozawa (2003) and Busch, Lahti, and Werner (2013).
In Ozawa’s approach, we assume that a system \(\cal S\) of interest in state \(\ket{\psi}\) is coupled to a measurement device \(\cal M\) in state \(\ket{\chi}\), and their interaction is governed by a unitary operator \(U\). On the Hilbert space of the joint system the observable \(\bQ\) of the system \(\cal S\) we are interested in is represented by
\[\tag{24} \bQ_{\rm in} = \bQ \otimes \mathbb{1}\]The measurement interaction will allow us to perform an (inaccurate) measurement of this quantity by reading off a pointer observable \(\boldsymbol{Q'}\) of the measurement device after the interaction. Hence this inaccurate observable may be represented as
\[\tag{25}\bQ'_{\rm out} = U^\dagger( \mathbb{1} \otimes \bQ') U\]The measure of noise in the measurement of \(\bQ\) is then chosen as:
\[\tag{26} \epsilon_\psi(\bQ) := \expval{(\bQ'_{\rm out} - \bQ_{\rm in})^2}_{\psi \otimes \chi}^{1/2}\]A comparison of the initial momentum \(\bP_{\rm in} = \bP \otimes \mathbb{1}\) and the final momentum after the measurement \(\bP_{\rm out} = U^\dagger (\bP \otimes \mathbb{1})U\) is made by choosing a measure of the disturbance of \(\bP\) by the measurement procedure:
\[\tag{27} \eta_\psi(\bP):= \expval{(\bP_{\rm in} - \bP_{\rm out})^2}_{\psi\otimes\chi}^{1/2}\]Ozawa obtained an inequality involving those two measures, which is more involved than previous uncertainty relations. For our purposes, however, the important point is that Ozawa showed that the product \(\epsilon_\psi (\bQ) \eta_\psi (\bP)\) has no positive lower bound. His conclusion from this was that Heisenberg’s noise-disturbance relation is violated.
Yet, whether Ozawa’s result indeed succeeds in formulating Heisenberg’s qualitative discussion of disturbance and accuracy in the microscope example has come under dispute. See Busch, Lahti and Werner (2013, and 2014 (Other Internet Resources)), and Ozawa (2013, Other Internet Resources).
An objection raised in this dispute is that a quantity like \(\expval{(\bQ'_{\rm out} - \bQ_{\rm in})^2}^{1/2}\) tells us very little about how well the observable \(\bQ'_{\rm out}\) can stand in as an inaccurate measurement of \(\bQ_{\rm in}\). The main point to observe here is that these operators generally do not commute, and that measurements of \(\bQ'_{\rm out}\), of \(\bQ_{\rm in}\) and of their difference will require altogether three different measurement contexts. To require that \(\epsilon_\psi(\bQ)\) vanishes, for example, means only that the state prepared belongs to the linear subspace corresponding to the zero eigenvalue of the operator \(\bQ'_{\rm out} - \bQ_{\rm in}\), and therefore that \(\expval{\bQ'_{\rm out}}_\psi = \expval{\bQ_{\rm in}}_\psi\), but this does not preclude that the probability distribution of \(\bQ'_{\rm out}\) in state \(\psi\) might be wildly different from that of \(\bQ_{\rm in}\). But then no one would think of \(\bQ'_{\rm out}\) as an accurate measurement of \(\bQ_{\rm in}\), so that the definition of \(\epsilon_\psi(\bQ)\) does not express what it is supposed to express. A similar objection can also be raised against \(\eta_\psi (\bP)\).
Another observation is that Ozawa’s conclusion that there is no lower bound for his error-disturbance product is not at all surprising. That is, even without probing the system by a measurement apparatus, one can show that such a lower bound does not exist. If the initial state of a system is prepared at time \(t=0\) as a Gaussian quasi-monochromatic wave packet with \(\expval{\bQ_0}_\psi =0\) and evolves freely, we can use a time-of-flight measurement to learn about its later position. Ehrenfest’s theorem tells us: \(\expval{\bQ_t}_\psi = \frac{t}{m} \expval{\bP}_\psi\).
Hence, as an approximate measurement of the position \(\bQ_t\), one could propose the observable \(\bQ'_t = \frac{t}{m}\bP\). It is known that under the stated conditions (and with \(m\) and \(t\) large) this approximation holds very well, i.e., we do not only have \(\expval{\bQ'_t -\bQ_t}_\psi =0\), but also \(\expval{(\bQ'_t -\bQ_t)^2} \approx 0\), as nearly as we please. But since \(\bQ'_t\) is just the momentum multiplied by a constant, its measurement will obviously not disturb the momentum of the system. In other words, for this example, one has \(\epsilon_\psi (\bQ)\) as small as we please with zero disturbance of the momentum. Therefore, any hopes that there could be a positive lower bound for the product \(\epsilon_\psi(\bQ) \eta_\psi (\bP)\) seem to be dashed, even with the simplest of measurement schemes, i.e., a free evolution.
Ozawa’s results do not show that Heisenberg’s analysis of the microscope argument was wrong. Rather, they throw doubt on the appropriateness of the definitions Ozawa used to formalize Heisenberg’s informal argument.
An entirely different analysis of the problem of substantiating a measurement uncertainty relation was offered by Busch, Lahti, and Werner (2013). These authors consider a measurement device \(\cal M\) that makes a joint unsharp measurement of both position and momentum. To describe such joint unsharp measurements, they employ the extended modern formalism that characterizes observables not by self-adjoint operators but by positive-operator-valued measures (POVM’s). In the present case, this means that the measurement procedure is characterized by a collection of positive operators, \(M(p,q)\), where the pair \(p,q\) represent the outcome variables of the measurement, with
\[\tag{28} M(p,q) \geq 0, \qquad \iint\! dp\, dq\, M(p,q) = \mathbb{1}\]
The two marginals of this POVM,
\[\begin{align*} \tag{29} M_1(q) &= \int\! dp\, M(p,q)\\ \notag M_2(p) &= \int\! dq\, M(p,q)\end{align*}\]are also POVM’s in their own right and represent the unsharp position \(Q'\) and unsharp momentum \(P'\) observables respectively. (Note that these do not refer to a self-adjoint operator!)
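For a system prepared in a state \(\ket{\psi}\), the joint probability density of obtaining outcomes \((p,q)\) in the joint unsharp measurement (28) is then
\[\tag{30} \rho(p,q) := \expval{M(p,q)}_\psi\]while the marginals of this joint probability distribution give the distributions for \(Q'\) and \(P'\):
\[\begin{align*} \tag{31} \mu'(q) &:= \int\! dp\, \rho(p,q) = \expval{M_1(q)}_\psi \\ \notag \nu'(p) &:= \int\! dq\, \rho(p,q) = \expval{M_2(p)}_\psi \end{align*}\]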
Since a joint sharp measurement of position and momentum is impossible in quantum mechanics, these marginal distributions (31) obtained from \(M\) will differ from those of ideal measurements of \(\bQ\) and of \(\bP\) on the system of interest in state \(\ket{\psi}\). However, one can indicate how much these marginals deviate from separate exact position and momentum measurements on the state \(\ket{\psi}\) by a pairwise comparison of (31) to the exact distributions
\[\begin{align*} \tag{32} \mu(q) &:= \abs{\psi(q)}^2 \\ \notag \nu(p) &:= \abs{\tilde\psi(p)}^2 \end{align*}\]
In order to do so, BLW propose a distance function \(D\) between probability distributions, such that \(D(\mu, \mu')\) tells us how close the marginal position distribution \(\mu'(q)\) for the unsharp position \(Q'\) is to the exact distribution \(\mu(q)\) in a sharp position measurement, and likewise, \(D(\nu ,\nu')\) tells us how close the marginal momentum distribution \(\nu'(p)\) for \(P'\) is to the exact momentum distribution \(\nu(p)\).
The distance they chose is the Wasserstein-2 distance, a.k.a. (a variation on) the earth mover’s distance.
Definition (Wasserstein-2 distance)
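Let \(\mu(x)\) and \(\mu'(y)\) be any two probability distributions on the real line, and \(\gamma(x,y)\) any joint probability distribution that has \(\mu\) and \(\mu'\) as its marginals. Then:
\[\tag{33} D(\mu, \mu') := \inf_{\gamma} \left[ \iint (x-y)^2\, \gamma(x,y)\, dx\, dy \right]^{1/2}\]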
Applying this definition to the case at hand, i.e., pairwise to the quantum mechanical distributions \(\mu'(q)\) and \(\mu(q)\) and to \(\nu'(p)\) and \(\nu(p)\) in (31) and (32), BLW’s final step is to take a supremum over all possible input states \(\ket{\psi}\) to obtain
\[\begin{align*} \tag{34} \Delta(Q, Q') & = \sup_{\ket{\psi}} D(\mu, \mu') \\ \notag \Delta(P, P') & = \sup_{\ket{\psi}} D(\nu, \nu')\end{align*}\]From these definitions, they obtain
\[\tag{35}\Delta(Q, Q') \Delta (P,P') \geq \frac{\hbar}{2}\]Arguing that \(\Delta(Q, Q')\) provides a sensible measure for the inaccuracy or noise in position, and \(\Delta(P, P')\) for the disturbance of momentum by any such joint unsharp measurement, the authors conclude, in contrast to Ozawa’s analysis, that an error-disturbance uncertainty relation does hold, which they take as “a remarkable vindication of Heisenberg’s intuitions” in the microscope thought experiment.
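To get a feeling for the Wasserstein-2 distance on which the BLW quantities are built, it may help to compute it for two empirical distributions on the real line. The sketch below relies on the standard fact that in one dimension, for two samples of equal size with equal weights, the optimal coupling pairs the sorted values; the function name `wasserstein2_empirical` and the sample data are hypothetical, and nothing here reproduces the supremum over states in (34).

```python
import math

def wasserstein2_empirical(xs, ys):
    """Wasserstein-2 distance between two equally weighted empirical
    distributions on the real line with the same number of points.
    In one dimension the optimal coupling matches sorted samples."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes samples of equal size")
    pairs = zip(sorted(xs), sorted(ys))
    return math.sqrt(sum((x - y) ** 2 for x, y in pairs) / len(xs))

# Illustrative data: the second sample is the first one shifted by 3,
# listed in a different order.
sample_a = [0.0, 1.0, 2.0, 5.0]
sample_b = [8.0, 3.0, 5.0, 4.0]

distance = wasserstein2_empirical(sample_a, sample_b)
print(distance)
```

A rigid shift of a distribution by an amount \(s\) gives distance \(\abs{s}\), whatever the shape of the distribution, which is part of why such a distance can serve as a measure of how far the outcomes of one measurement lie from those of another.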
Comparing the two approaches, there are a few positive remarks to make about the Busch-Lahti-Werner (BLW) approach. First of all, by focusing on the distance (33) this approach is comparing entire probability distributions rather than just the expectations of operator differences. When this distance is very small, one is justified to conclude that the distribution has changed very little under the measurement procedure. This brings us closer to the conclusion that the error or disturbance introduced is small. Secondly, by introducing a supremum over all states to obtain \(\Delta( Q, Q')\), it follows that when this latter expression is small, the measured distribution \(\mu'\) differs only little from the exact distribution \(\mu\) whatever the state of the system is. As the authors argue, this means that \(\Delta(Q,Q')\) can be seen as a figure-of-merit of the measurement device alone, and in this sense analogous to the resolving power of a microscope.
But we also think there is an undesirable feature of the BLW approach. This is due to the supremum over states appearing twice, both in \(\Delta(Q,Q')\) and in \(\Delta(P,P')\). This feature, we argue, deprives their result of practical applicability.
To elucidate: In concrete applications, one would prepare a system in some state (not exactly known) and perform a given joint measurement \(M\) of \(Q'\) and \(P'\). If it is given that, say, \(\Delta(Q,Q')\) is very small, one can safely infer that \(Q\) has been measured with small inaccuracy, since this guarantees that the measured position distribution differs very little from what an exact position measurement would give, regardless of the state of the system. Now, one would like to be able to infer that in this case the disturbance of the momentum \(P\) from \(P'\) must be considerable for the state prepared. But the BLW relation only gives us:
\[\Delta(P, P') = \sup_{\ket{\psi}} D(\nu, \nu') \geq \frac{\hbar}{2 \Delta(Q, Q')}\]and this does not imply anything for the state in question! Thus, the BLW uncertainty relation does not rule out that for some states it might be possible to perform a joint measurement in which both \(D(\mu, \mu')\) and \(D(\nu, \nu')\) are very small, and in this sense have negligible error and disturbance. It seems premature to say that this vindicates Heisenberg’s intuitions.
Summing up, we emphasize that there is no contradiction between the BLW analysis and the Ozawa analysis: where Ozawa claims that the product of two quantities might for some states be less than the usual limit, BLW show that the product of different quantities will satisfy this limit. The dispute is not about mathematical validity, but about how reasonable these quantities are to capture Heisenberg’s qualitative considerations. The present authors feel that, in this dispute, Ozawa’s analysis fails to be convincing. On the other hand, we also think that the BLW uncertainty relation is not satisfactory. Also, we would like to remark that both protagonists employ measures that are akin to standard deviations in being very sensitive to the tail behavior of probability distributions, and thus face a similar objection as raised in section 5. The final word in this dispute on whether a measurement uncertainty principle holds has not been reached, in our view.
The Stanford Encyclopedia of Philosophy is copyright © 2023 by The Metaphysics Research Lab, Department of Philosophy, Stanford University
Library of Congress Catalog Data: ISSN 1095-5054