Philosophy of perception has typically centered on colors, as has the metaphysics of mind when discussing the mind-dependence of secondary qualities. Possibly, the philosophical privilege of the visible just reflects the cognitive privilege of the visible, as vision is considered to account for most useful sensory information gathering.
What makes sounds worthy of philosophical analysis is that they are not only an important element of the perceptual scene but are also philosophically idiosyncratic in many intriguing ways; in particular, their temporal and spatial unfolding, as presented in perception, has interesting metaphysical and epistemological aspects. There is, however, an advantage to the neglect. Many philosophical aspects of sound and sound perception are not idiosyncratic and indeed make for general issues in philosophy of perception. Hence in this article we will take advantage of the many discussions that have used other sensory features such as colors as a paradigm of a sensory feature. For instance, we shall not rehearse the discussion about the subjectivity of secondary qualities, as the example of sounds does not seem to introduce new philosophically interesting elements that could challenge generalizations obtained, say, from the example of colors.
The main issues on the table concern the nature of sounds. Sounds enter the content of auditory perception. But what are they? Are sounds individuals? Are they events? Are they properties of sounding objects? If they are events, what type of event are they? What is the relation between sounds and sounding objects? Temporal and causal features of sounds will be important in deciding these and related questions. However, it turns out that a fruitful way to organize these issues deals with the spatial properties of sounds.
Indeed, the various philosophical pronouncements about the nature of sounds can be rather neatly classified according to the spatial status each of them assigns to sounds. Where are sounds? Are they anywhere? The main relevant families of answers include proximal, medial, distal, and aspatial theories. Proximal theories would claim that sounds are where the hearer is. Medial theories (exemplified by mainstream acoustics) locate sounds in the medium between the resonating object and the hearer. Distal theories consider sounds to be located at the resonating object. Finally, aspatial theories deny spatial relevance to sounds. There are significant variants of each of these. Sound theories can also be classified according to other dimensions, such as the metaphysical status they accord to sounds (for instance, as occurring events as opposed to properties or dispositions). We shall see some of the interactions between these different accounts. For a discussion that is more focused on perception, see the entry on auditory perception.
Proximal theories of sounds construe sounds as located at or beneath the bodily surface of the hearer. We distinguish two main strands.
Modern philosophical accounts of sounds informed by psychology of perception construe sounds as “sensations”, or states/properties of hearers. Consider,
It seems…reasonable to suggest that the sounds directly perceived are sensations of some sort produced in the observer when the sound waves strike the ear. (Maclachlan 1989: 26)
The sound-as-sensations theory is justified by some facts about auditory experience. People report hearing voices and bells even when no one is speaking and no bell is ringing. Various examples of subjective sounds are documented under the label of tinnitus. In an anechoic chamber, most subjects experience subjective buzzing and whistling. Some subjects undergo pathological tinnitus when they hear sounds that disrupt their normal auditory capabilities. When the Russian composer Shostakovich turned his head to one side, he was subject to a flow of melodies (Sacks 2008). Tinnitus and other subjective auditory phenomena have different causes, which can be related to the mechanical properties of the inner ear or to more central features. The objects of these experiences are naturally and spontaneously categorized as sounds. If sounds are simply defined as the objects of audition, then they are easily identified with the qualitative aspects of auditory perception. Various strands of indirect realism in perception would make this view mandatory. According to them, it is by hearing the immediate, proximal items that we hear some distal events or objects. In such a case sounds would be defined as the immediate objects of auditory perception.
A bit more peripheral, albeit still proximal, is a position defended by O’Shaughnessy (2000), who claims that the sound heard is where the hearer is:
…while the sound originates at a distance and we can hear that it is coming from a direction and even place, and while there is no auditory experience of hearing that the sound is where we are, the sound that we hear is nonetheless where we are (2000: 447).
This leaves open the possibility that an unheard sound be located away from the hearer.
Support for this position comes from the following example. If I hear the noise of a motorcycle far away, the physical event at my ear is qualitatively different from the physical event that is produced at the motorcycle’s place. If I were close to the motorcycle, the physical event at my ear would be completely different from, and would not correspond to, that of a motorcycle driving far away. A party upstairs does not sound the way it sounds to participants in the party. However, these arguments can be resisted on two counts. First, one should distinguish between source and informational channel. Consider a visual analogue. From the fact that reflected light is the only light that hits the eye, it does not follow that one does not see reflecting surfaces or is not aware of incident light. Reflected light contains information about the reflecting object and the illumination, which is unpacked by the brain. In the case of sound, distance, echoes, reverberations, and filtering affect the informational channel in a way that informs about the position of the source. Second, the examples used to support the proximal theory are unable to account for the perceived constancy of auditory features of distant objects. The motorcycle and the party can be judged to be very noisy, even if the physical events at the hearer are faint because of the distance from the respective sources. We do possess a notion of distal volume of a sound.
The locatedness of the sound at the hearer’s position entails that there are as many sounds as there are actual (or potential) hearers around. An alternative account would consider that one and the same sound is present which is, however, multiply located.
These examples, and the related difficulties, thus suggest that a major shortcoming of proximal theories is that they do not locate sounds where an untutored description of what is perceived suggests they are. This means that if sounds were inner sensations, or mechanical events at the ear, we would be almost always mistaken in our aural perceptions, at least on important aspects of the sound. In turn, this amounts to accounting for auditory perception in terms of a massive error theory. We shall see that proximal theories are not alone in endorsing it.
Medial theories construe sounds as features of the medium in which a sounding object and a hearer are immersed. The identification of sounds with sound waves is the major example of medial theories.
When speaking about voice in his treatise De Anima (On the Soul), Aristotle wrote that sound is a “certain movement of air” (De Anima II.8 420b12) but, even though he claimed that sound and motion are tightly connected, he did not seem to identify them (Pasnau 2000: 32). The natural scientists of the seventeenth century refined the intuition that sound is a movement of air into the wave theory of sounds, which appeared to be an obvious competitor for the quality or sensation (proximal) view. Galileo registered that
sounds are made and heard by us when…the air…is ruffled…and moves certain cartilages of a tympanum in our ear.…high tones are produced by frequent waves and low tones by sparse ones. (1623 [1957: 276])
Descartes joined in and in his Passions of the Soul considered that what we actually hear are not the objects themselves, but some “movements coming from them” (1649: XXIII). Indeed, around 1636, Mersenne measured the speed of propagation of sound waves.
Both Galileo and Descartes were aware that the medial account was revisionary relative to a common sense view of sounds, or at least as revisionary as is the sensation view. Sounds for the wave view or the sensation view are not what we unreflectively take them to be on the basis of the content of auditory perception. (Indeed, Galileo himself endorsed both a proximal theory, on which sounds are sensations, and a medial theory, thereby possibly originating a dualist account.) At the same time Galileo and Descartes, as well as other modern philosophers, were not particularly keen on detailing the phenomenological content of auditory perception.
The wave account is, of course, endorsed by modern acoustics. Sounds are construed as mechanical vibrations transmitted by an elastic medium. They are thus described as longitudinal waves, defined by their frequency and amplitude. A vibrating object (the sound source, such as a moving vocal cord or a vibrating tuning fork) creates a disturbance in the surrounding medium (say, air or water). Each particle of the medium is set in back-and-forth motion at a given frequency and with a given amplitude, and the motion propagates to neighboring particles at the same frequency, undergoing an energy loss that entails a decrease in amplitude. Seen macroscopically, the propagation of sound is the propagation of a compression in the medium followed by a rarefaction, that is, the propagation of a wave. The behavior of each particle is described by a sinusoid that maps the cyclical pattern of compressions and rarefactions against time. A cycle is the complete path of the sinusoid from crest to crest, at the end of which the particle is back to its starting position. Amplitude is the maximum displacement from the rest position (half the vertical distance between crests and troughs in the sinusoid), period is the time between a crest and its successor, and frequency is the number of cycles per time unit.
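The definitions in the paragraph above can be collected into a single expression. This is a minimal sketch for an idealized pure tone observed at one fixed point in the medium; the symbols $s$, $A$, $f$, $T$ and $\varphi$ are notation introduced here for illustration and do not come from the sources discussed.

```latex
s(t) = A \sin(2\pi f t + \varphi), \qquad T = \frac{1}{f}
```

Here $s(t)$ is the displacement of a particle from its rest position at time $t$, $A$ is the amplitude, $f$ the frequency, $T$ the period, and $\varphi$ the phase at $t = 0$. Real sounds are in general sums of many such components, so the formula covers only the simplest case.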
Contemporary philosophers of perception of the physicalist strand tend to align themselves with the wave theory (Nudds 2010b, 2018; Kalderon 2017; Meadows 2018). Perkins thus summarizes the view: “…the sound we hear is identical with the train of airwaves that stretches from the distant sounding object to our ear” (1983: 168). And indeed, the physicalist account of sounds seems to make a good claim to successful reduction of key auditory phenomena based on the identification of sounds with sound waves in a medium in which a sounding object (and possibly a hearer) is present.
Many perceptual properties of sounds are neatly explained by the presence of strong correlations with properties of waves, in particular pitch and intensity (i.e., volume).
Interestingly enough, the reduction of sounds to waves in a medium is arguably more successful than the corresponding attempt to reduce colors to properties of electromagnetic waves. The latter attempt is affected by some major problems, such as the existence of non-spectral colors like purple, or the fact that some spectral monochromatic colors such as orange are seen as being composite colors.
However, for all its merits, the medial identification of sounds with sound waves raises some objections and leaves some matters wanting.
For instance, there are metameric sounds (as there are metamers among colors), that is, sounds that feel identical to the ear although the corresponding medial properties are different. There is no one-to-one psychophysical correlation between auditory content and sounds as waves (Churchland 2007: 222). Moreover, ultrasounds, above 20,000 Hz, and infrasounds, below 20 Hz, have the same physical nature as sounds (they are mechanical vibrations transmitted by an elastic medium) but they are not audible (as infrared and ultraviolet are not visible in the domain of colors): do they count as sounds? It further appears that the relationship between sound and sounding object remains underspecified. Do sound waves depend upon sounding objects in the sense in which we usually think sounds (as auditory events) do?
Most importantly, as happened with proximal theories, medial theories do not locate sounds where an untutored conception of auditory perception suggests they are. If sounds were sound waves, we would be almost always mistaken in our aural perceptions on important aspects, which fact, once more, amounts to accounting for auditory perception in terms of a massive error theory.
What is the nature of sounds under the wave theory? Relevant to our purposes, there are two main metaphysical conceptions about waves, in both cases construed as individuals. Either (a) waves are considered to be of the same nature as processes (temporally extended entities, with temporal parts), or (b) they are taken as individuals of a peculiar kind. In case (a), it may be argued that they do not move, for arguably processes do not move (Dretske 1967), but rather have phases (temporal parts) located at different spatial regions. Although we do say that the party moved from John’s room to Mary’s room, the party was never fully in either room. Objects like people (John, Mary, Sue, Lynn), on the other hand, were completely in John’s room first, and completely in Mary’s room later. Objects and people moved, as movement occurs whenever a whole entity is fully first in a place and then in another. The party did not move, but part of it was in John’s room and part of it in Mary’s at different times.
If waves were processes like parties, one would have to construe sound waves’ “movement” in terms of the presence of different phases of the wave at different spatial locations. However, one does not hear a sound wave’s phase as being at a particular location, in particular at any of the locations between the source and oneself. Sound waves do not appear to be perceptually located where sounds are. A wave-process has some starting phases in the object, and some end-phases around the perceiver: but perception locates sounds wholly where the object is.
In case (b), (sound) waves are different from processes and are peculiar just because waves move, as individual substances do. But again, if sound waves move, the corresponding sounds are not generally heard as moving. (Contrast this case with seeing a sea wave’s phase at a particular place.) Sound waves propagate in all or most directions from a sounding object, but the corresponding sounds are not actually heard as propagating in any direction: the only moving sounds are the sounds emitted by a moving source. It follows that if sounds were sound waves in this sense, we would not be hearing them as they are.
Consider an analogy with light. In the realm of vision, the closest analogue to sounds are the activities of light sources. We perceive these activities, and we perceive them as located where their sources are: the emission of a light bulb out there, the glowing of candlelight over the table, the irradiation of the sun at the horizon; each with its own respective color. Do we perceive light itself, as opposed to the events of light emission at light sources? Clearly, light travels from the source to our eye; if it didn’t, we would not perceive the source. But what we perceive is the emission event, and not the light. An irrelevant element of disanalogy is related to the temporal unfolding of sounds and light events respectively. Typically in the environment light is emitted continuously, whereas sounds occur episodically and have short lives. If most light sources were intermittent as in piezoelectric lighters, or if most objects were buzzing all the time, this element of disanalogy would be less salient (Pasnau 2007).
In some conditions it looks as if we perceive light rays, e.g., in a dusty room. But what is actually perceived is a set of particles of water or of dust that intercept light. In order for a light ray to be visible, it would have to send information to our eye without the mediation of interposed matter. Coming back to sounds, the argument based on the constraint of fidelity to the content of auditory perception is thus, in a compact form: in order for sound waves to be audible, they would have to transmit audible information that reaches our ears. They don’t. So we do not hear them. But we do hear sounds. Hence sounds are not (medial) sound waves.
Some other remarks against the identification of sounds with sound waves in the medium between the object and the perceiver apply independently of the metaphysical construal of sound waves as processes or sui generis individuals.
Consider first the fact that sounds are sometimes loud and sometimes soft; as we have seen, in the wave theory this feature is correlated with amplitude of the sound waves. However, the location of the sound plays a role in establishing the correlation. A loud sound, which is heard as being far from us, is different from a soft sound, which is heard as being close to us. The spectra of the two sounds are different. If you want a vivid example, consider what happens if you amplify the sound of a person who is speaking low. You do not have the impression of a person who speaks loudly. You have, on the other hand, the impression of having come very close to a low-speaking giant.
In some conditions, however, it can be difficult for us to tell a loud sound in the distance from a soft sound nearby, because in the part of space closer to our ears the sound waves that reach us and correspond to the two sounds can have the same amplitude. (This only holds for an ideal case of a long lasting sine wave with no sharp attack. This type of sound is practically distance-proof because, having a single component, the only possible variation is in amplitude, and no other spectral differences arise.)
Indeed, the fact that the two sine waves have the same amplitude when they are close to the ear accounts for the indistinguishability of the two sounds. Nevertheless, the two sounds, even if indistinguishable, are distinct, whereas the two sound waves are not. A way to put the difficulty is as follows: a loud sinusoid heard at a distance is still a loud sinusoid; but the corresponding sound waves decrease in amplitude. As we have seen discussing proximal theories, we can make perfect sense of the notion of distal volume of sounds, even in the case in which we cannot distinguish a loud distant sound from a soft sound in the vicinity. Something (decrease in amplitude) happens to sound waves which does not happen to sounds (the distal volume does not change). Hence it can be argued that on this account too sounds are distinct from sound waves.
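The contrast between distal volume and amplitude at the ear can be made concrete with a small numerical sketch. It assumes an idealized point source in a free field, where pressure amplitude falls off in inverse proportion to distance and absorption is ignored. That law, the function name `amplitude_at_listener`, and the numbers are illustrative assumptions introduced here, not part of the argument's sources.

```python
def amplitude_at_listener(source_amplitude: float, distance_m: float,
                          ref_distance_m: float = 1.0) -> float:
    """Pressure amplitude reaching a listener at distance_m.

    Assumes free-field spherical spreading (amplitude proportional to 1/r),
    with source_amplitude measured at ref_distance_m from the source.
    Hypothetical helper for illustration only.
    """
    if distance_m <= 0 or ref_distance_m <= 0:
        raise ValueError("distances must be positive")
    return source_amplitude * ref_distance_m / distance_m


# A loud source far away and a soft source nearby (arbitrary units).
loud_far = amplitude_at_listener(source_amplitude=10.0, distance_m=100.0)
soft_near = amplitude_at_listener(source_amplitude=0.2, distance_m=2.0)

# The two "distal volumes" differ by a factor of 50, yet the amplitudes
# arriving at the listener coincide under the stated assumption.
print(loud_far, soft_near)
```

On this toy model the proximal amplitudes are equal while the source amplitudes are not, which is the situation the pure sine wave case describes.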
As a matter of fact, one may want to distinguish two possible lines of argument here:
The thesis that sounds are sound waves is also often motivated by the argument from vacuums. Surely, it is argued, sounds cannot exist in a vacuum. As Hylas says in Berkeley’s Three Dialogues Between Hylas and Philonous,
a bell struck in the exhausted receiver of an air-pump, sends forth no sound. The air therefore must be thought the subject of sound. (1713 [1975: 171–172])
But is the claim that there are no sounds in vacuums really obvious? On pain of question begging, it cannot be made to follow from any particular metaphysics of sounds. In order to assess it on its own merits, consider once more the analogy between, on the one hand, sounds and air, and, on the other hand, emission events at light sources and light. Air is the medium of auditory perception, and light is the medium of visual perception. The reasoning now is that just as things can sensibly be taken to have colors in the dark, they can sensibly be taken to produce sounds in a vacuum.
In the above arguments an important role is played by the following requirement: a theory of sounds should be true to the phenomenological content of auditory perception. It seems quite reasonable to require that as sounds are the objects of hearing, whatever they are should be somewhat revealed in hearing. In point of fact, there are two ways in which the fidelity to auditory content requirement can enter the arguments. In a strong version (O’Callaghan 2009) it may be held that no theory of sounds “should make the fact of location perception a wholesale illusion” (2009: 29); hence, as sounds are represented as located, it would follow that they are correctly so represented. It may be argued that this principle is too strong because it is unjustifiably specific. Why should location, among all possible features of sounds, be protected against the possibility of systematic illusion?
In another, weaker version (Casati & Dokic 1994) it is the representational power of auditory content that is appealed to. Auditory experience has the power to represent sounds, and it has the power to represent movement (as when one hears the sound of a moving train’s whistle). It is then natural to assume that auditory experience would be able to represent sounds as moving, if sounds were indeed moving entities by their nature. But such is not the case; hence auditory experience correctly represents them as firmly located. This construal of the requirement is compatible with the existence of systematic, though possibly selective, illusion.
The requirement of fidelity to auditory content may be challenged on a number of grounds. First, it may be challenged by opposing its rationale. Auditory content may well be massively illusory, and this could be just the price to pay for any realist account of sounds.
Possibly there is here an analogue to the case of colors, in which there is room for selective illusion and for the choice between realism about location, say, and realism about hue. Most arguments for the subjectivity of colors start from the existence of strong correlations between phenomenal hue structures (such as the relation of complementarity between colors, e.g., red and green, which is manifest in phenomena such as afterimages or color-blindness) and neural structures (such as bipolar cells). These arguments then stress the absence of physical correlates for hue structure (nothing in the wavelengths corresponding to red and green hues can make one predict the subjective complementarity between red and green). They finally conclude that colors are mind-dependent. However, the arguments only establish the mind-dependence of hue structures, not of colors per se. The location of colors as “outside the brain” can still be taken for granted. Brought back to sounds, this form of selective realism would allow for subjective tonal qualities and for an external (non-proximal) location of sounds.
Second, the requirement of fidelity to auditory content may be challenged by questioning the phenomenological claim that motivates it. The move would consist in suggesting that sounds do not have the strong spatial property of locatedness, but the weaker property of directionality. The distinction between two senses of “locatedness” in relation to sounds can be traced back to Malpas (1965), based on ordinary language arguments, and is echoed in Urmson (1968) and Hacker (1987: 102 ff.); cf. O’Shaughnessy (1957). Locatedness in the strong sense specifies an address for the sound, e.g., by specifying both directionality and distance from the hearer (thus including egocentric directionality as a component), or by locating sounds in allocentric space (e.g., a siren from the boat at Pier 3). According to a weaker sense of “locatedness”, sounds would only be perceived as “coming from” a certain direction, without any information about the distance they travel. Now surely this is not the general case. Although in some cases (e.g., the decrease in amplitude of a sine wave) it may be difficult or impossible to tell, say, whether what we hear is a soft sound nearby or a loud sound far away, in most cases the distinction is perfectly available to the subject, as we noticed earlier when we introduced the notion of distal volume: someone screaming in the distance is never confused with someone speaking low near your ear. Indeed the issue of the locatedness of sounds is the subject matter of specialized branches of cognitive science (Blauert 1974; Bregman 1990; Schnupp, Nelken, & King 2011; Grimshaw & Garner 2015).
Incidentally, claiming that sounds are heard in a direction rather than at a location mixes up two ways of accounting for auditory experience: phenomenology on the one hand and commonsense reflections on the directional transmission of auditory information on the other hand. The commonsense picture may have been made a bit too sophisticated by exposure to some physical accounts of sounds.
Finally, the requirement of fidelity to auditory content may also be challenged by proposing a different phenomenology. For example, Kalderon (2017) argues that sound is an event that is identical to the propagation in every direction of a patterned disturbance by means of a medium, such as air or water (2017: 105). He claims, indeed, that auditory phenomenology is essentially emanative phenomenology. According to emanative phenomenology, we hear sound as an ever-expanding sphere which is the medium disturbance propagating in every direction from its source. Sound is like
an expanding ripple caused by a drop in an otherwise calm body of water, except that the sound event occurs in three dimensions, not two, and so takes the form of a sphere rather than a circle. (2017: 106)
When facing the challenge of what we hear when we say we hear birds in the garden outside or people in the corridor outside our office (cases in which it is clear that auditory phenomenology is distally locating sounds and not making us hear sounds as pervading the surrounding medium), Kalderon replies by saying that what we hear in these cases as distally located are sound sources, rather than sounds (2017: 115). Nevertheless, this reply raises the question of how we make the distinction between hearing sound as located and sound sources as located. Is there a way to distinguish their different location phenomenologically? Another question which might challenge the medial view based on emanative phenomenology is what exactly are the properties of sound sources which are audible and which provide us with spatial information on their location.
Wave theorists typically give up the requirement that an account of sound be faithful to auditory content (although they would not typically acknowledge this; as Pasnau 1999 has remarked, the very same textbook on sounds may simultaneously endorse a medial, a proximal and a distal theory). However, wave theorists may also try to reconcile auditory content with the wave conception. Sorensen (2008) proposes for instance that it is not a purely auditory phenomenon that we do sometimes identify and localize objects and events at their center, whenever a center is available. For instance, earthquakes are localized (in the epistemic sense) at their hypocenter, although it is admitted that they are not located at their hypocenter. An earthquake is everywhere it can be felt or measured. Analogously, according to the wave theorist sounds are waves in a medium, but they are located at their center (at their origin). The wave conception considers as relatively benign the error of locating a sound (i.e., a sound wave) at its center/origin.
An important dialectical limitation of Sorensen’s suggestion is that it does not provide us with an independent argument in favor of the Wave Theory. The identification of sounds with sound waves is of course compatible with the fact that we locate sounds at a point (the sounding object’s location) which happens to be the center of expanding sound waves. However, the analogy with the localization of earthquakes breaks down at a crucial point. Of course, we can usefully identify a certain region as the center of a particular earthquake. (In general, earthquakes can be localized at their hypocenter only when we have at least a rough representation of their full extension in space.) By contrast, as Sorensen admits, the auditory system does not identify the sounding object’s location as the center of expanding sound waves. Indeed, it does not identify this location as the center of anything. (Compare the way the visual system identifies the landing area of a stone thrown in water as the center of a series of concentric, expanding waves. Nothing of the sort is available to the auditory system.) Now many entities other than sound waves are at the object’s location when we hear a sound, including events (monadic or relational, as we shall see below). Thus, facts about the apparent location of sounds do not justify the Wave Theory better than the Event Theory; quite the contrary, given our other independent considerations against the former approach.
Several among the previous remarks jointly point to the necessity of better accounting for the distinction between events in the sounding thing and events in the surrounding medium. As for sounds, this distinction is consequent upon the distinction between two kinds of medium: the source medium (that is, the stuff the thing is composed of) and the medium proper or environment medium, surrounding the sounding thing, the one in which the hearer could be immersed as well. Let us take them in turn. First, a thing is a sounding entity only insofar as the stuff or the stuffs the thing is composed of is or are vibrating. For a simple example, there are no properties of a tuning fork which account for its sounding, which are not properties of the stuff(s) the fork consists of (including shape). A more complex example is the case of the flute, in which the “sounding object” is air inside the flute. In both cases a portion of matter, the source medium, is vibrating.
But, second, do we ever happen to hear events in the medium? We actually do, but in a somewhat indirect way. Consider the visual realm. In some cases both a thing and (a part of) a medium between us and the thing are seen. This happens when we look at things through irregularly warmed air, or through moving water. In these cases, we see both the thing and the medium, the thing in an unclear way, and the medium as that which makes the thing appear in an unclear way.
But these cases are not the norm. Perceptual media are in the norm cognitively transparent: they are imperceptible insofar as they transmit without significant alteration information about some relevant properties of the thing perceived through them. Media become perceivable when this transmitting function is impaired by some event or disturbance occurring in them. Auditorily, this occurs in the case of the Doppler effect. The vibration of the air carries information both about the sounding object and about the effect its speed has on the medium.
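For readers who want the Doppler case in numbers, here is a minimal sketch using the textbook formula for a source moving directly toward or away from a stationary listener in still air. The function name `observed_frequency`, the sign convention, and the value taken for the speed of sound are assumptions made for this illustration.

```python
SPEED_OF_SOUND_M_S = 343.0  # approximate value for air at about 20 °C (assumed)


def observed_frequency(source_hz: float, source_speed_m_s: float,
                       speed_of_sound: float = SPEED_OF_SOUND_M_S) -> float:
    """Frequency reaching a stationary listener from a moving source.

    Convention assumed here: a positive source_speed_m_s means the source
    approaches the listener, a negative one means it recedes.
    Valid only for subsonic motion along the line joining source and listener.
    """
    if abs(source_speed_m_s) >= speed_of_sound:
        raise ValueError("source speed must be below the speed of sound")
    return source_hz * speed_of_sound / (speed_of_sound - source_speed_m_s)


# The same 440 Hz whistle, approaching and then receding at 30 m/s.
approaching = observed_frequency(440.0, 30.0)
receding = observed_frequency(440.0, -30.0)
print(round(approaching, 1), round(receding, 1))
```

The emitted frequency is the same in both calls; what differs is the pattern in the medium, which is why the effect can be read as carrying information about the medium and the source's motion rather than about a change in the source's own vibration.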
The affectedness of the medium is a feature which is mostly evident, and almost pervasive in the case of sounds, because of the relatively similar size of the phenomena involved at the source and in the medium proper. Vibrations in the sounding objects are macroscopic phenomena, of a size which is fairly comparable to the size of the sounding object itself. Therefore the interaction of these vibrations with the surrounding medium can easily be a source of misperception, for their impact on the medium brings about processes which are of the same order of magnitude as the object involved.
After proximal and medial theories, one should consider anothercandidate for the physical identification of sounds, namely distalproperties, processes or events in the medium inside (or at thesurface of) sounding objects, or in the stuff of the sounding object.Distal views claim their superiority to non-distal competitors invirtue of their adherence to the spatial structure of auditorycontent. As we have seen, we do hear sounds both as externalized(hence auditory content is at odds with proximal views) and asdistally located (hence auditory content is at odds with medialviews).
There exist at least four varieties of the distal account of sounds:the Property Theory, the Located Event Theory, the Relational EventTheory, and the Dispositional Theory. These accounts all subscribe tothe idea that sounds are distally located, but they differ inascribing to sounds different ontological status. Let us take them inturn.
According to the Property View, sounds are properties of material objects just like colors and shapes.
The property view is in part endorsed by the founding fathers of modern philosophy of perception, Galileo and Locke, who opened a tradition of lumping various sensory items in the class of secondary qualities. The typical seventeenth century list of secondary qualities includes colors, smells and sounds. No significant internal metaphysical differentiation is made within the class of sensory qualities, hence the charge of an oversimplification cannot be directly addressed to historical accounts. Other philosophers may have added shapes to the list (as Berkeley did), without addressing the issue of the homogeneity of the class: the issue was only whether shapes are secondary as sounds are, not whether they are on a par with sounds as to their structure.
The Property View has contemporary endorsers (Pasnau 1999; although Pasnau takes sounds to be properties like colors, he comes close to the event view when he writes that sounds “either are the vibrations of [objects that have sounds], or supervene on those vibrations” [1999: 316]; indeed Pasnau 2007 rejoined the Event Theory). Leddington (2019) also defends a property view of sound within a distal position, since he claims that sound itself is not an event but a property of the event which produces it (i.e., a property of the collision).
The property view faces a number of objections. First, we ordinarily describe objects as “having” colors or shapes, but we do not ordinarily describe sources as “having sounds”.
Rather, we say that they make or produce sounds (conversely, a red table does not “produce” or “make” red). This is an ordinary language argument, and as such it might not be very strong.
The main consideration against the Property View is that it underestimates the important differences between colors and shapes on the one hand, and sounds on the other hand. The latter, unlike the former, are dynamic dependent individuals. And even if colors and shapes can be theoretically conceived as individuals, they are not dynamic. Sounds take up time. They start and cease. They are intrinsically temporal entities. Their temporal profile is essential to individuating them, in a way which has no analogue in the case of colors and shapes. However, Roberts (2017) explicitly defends the Property View by discussing salient disanalogies between sounds and colors. He also suggests an exhaustive taxonomy of the different positions available within the property view space. Cohen (2010), although not directly endorsing the property view, criticizes arguments that conclude that there is an asymmetry between sounds and colors, in particular with regard to temporality.
Di Bona and Santarcangelo (2018: chapter 4) discuss at some length the relationship between sound and time, especially when this relationship grounds a metaphysical difference between sounds and colors. They investigate different temporal experiences of sounds and conclude that temporal experiences of sounds are similar to some temporal experiences of colors. Sounds and colors differ with regard to temporality only insofar as one focuses on the role that time plays when the auditory system has to segregate auditory stimuli into auditory streams (Bregman 1990). When segregating colors, space is far more important than time. For a discussion of vision and audition with regard not only to sounds and colors but also to auditory and visual objects, always in relation to spatiality and temporality, see O’Callaghan 2008; Kubovy & Schutz 2010; and Di Bona & Santarcangelo 2017.
In this and the following section, we shall present two Event Theories of sounds, for which we use two distinct labels. An earlier one, The Located Event Theory, was defended by Casati and Dokic (1994). It has been rejoined by Pasnau (2009) and extended to the field of sonic art by Roden (2010). A second, more recent version, the Relational Event Theory, has received an articulated defense by O’Callaghan (2007, 2010a). The two accounts agree on categorizing sounds as events, that is, located temporal particulars, and diverge on some specifics of the class of particulars that are admitted to be sounds. The Relational Event Theory makes sounds depend upon the existence of a medium that carries information about them. In this sense, only a subclass of situations involving sounds for the Located Event Theory are situations involving sounds for the Relational Event Theory.
According to the Located Event Theory, sounds are events happening to material objects. They are located at their source, and are identical with, or at least supervene on, vibration processes in the source. On this view (Casati & Dokic 1994), auditory perception of sounds requires a medium which transmits information from the vibrating object to the ears; however, the transmitting medium is not essential to the existence of sounds. One can see at once the fit of this view with those features of sound which were sources of trouble in the cases discussed above when criticizing proximal and medial theories.
A by-product of the Located Event Theory is that it makes plain what category sounds belong to, as opposed to views that construe sounds generically as qualities. Sounds are either instantaneous events or temporally extended processes. They start and cease. They are intrinsically temporal entities.
Another feature of the Located Event Theory is that it provides us with a clear example of the compatibility of a theory of non-direct perception, according to which we hear external events by hearing their perceptual deputies, with a non-phenomenalistic theory, according to which perceptual deputies are not mental items. The case of sound perception shows that there can be indirect perception without mental deputies. We hear coaches and telephones by hearing their sounds, i.e., by hearing some (cluster of) vibratory processes or events occurring in those objects. Sounds are both physical events and perceptual deputies.
We shall now briefly discuss some objections to the identification proposed by the Located Event Theory, as they allow us to highlight certain metaphysically interesting features of sounds.
The first objection concerns sound location. Even if sounds are heard as located, it could be held that location is often imprecise or even erroneous, and that this in turn depends on, and is explained by, the nature of sound waves. Here is a relatively common echo example. Suppose you walk in the rain, your umbrella open. At some point you enter a building with a glass roof. Rain drops on the roof, and no longer on your umbrella. The umbrella attenuates the noise from the roof, which is reflected by the ground. You hear the raindrops as if they were below you, and not above you. Erroneous location is here explained by the path of sound waves.
The temptation of identifying sounds with sound waves can arise because of this fact: that sounds can be mislocated in audition. They can be heard as located in a region which is larger than, or removed from, the one occupied by a sounding object, a region which it is reasonable to take as being occupied by sound waves.
This example poses no particular threat to the distal view. Consider again a visual analogy. Seeing an object in a mirror is not seeing another, immaterial object located in an immaterial space beyond the mirror-plane. There is no such immaterial object; we see the one and only material object, and we locate it incorrectly, as if it were behind the mirror.
The mirror sophism should be credited to Hobbes’ Leviathan (1651: I, I; cf. Casati & Dokic 1994: 49–51), which explicitly linked perception in a mirror and perception of echoes:
The cause of sense, is the external body, or object, which presseth the organ proper to each sense, either immediately, as in the taste and touch; or mediately, as in seeing, hearing, and smelling; which pressure, by the mediation of the nerves, and other strings and membranes of the body, continued inwards to the brain and heart, causeth there a resistance, or counter-pressure, or endeavour of the heart to deliver itself, which endeavour, because outward, seemeth to be some matter without. And this seeming, or fancy, is that which men call sense; and consisteth, as to the eye, in a light, or colour figured; to the ear, in a sound… if those colours and sounds were in the bodies, or objects that cause them, they could not be severed from them, as by glasses, and in echoes by reflection, we see they are; where we know the thing we see is in one place, the appearance in another.
The deviant path of sound waves (in echoes) is responsible for the perceptual difficulty in locating sounds, much as the deviant path of light rays (in mirrors) is responsible for the analogous difficulty for visible objects (Casati & Dokic 1994; see also O’Callaghan 2007 for a lengthy discussion of echoes, and Fowler 2013 for arguments against O’Callaghan’s view). But it is not the case that sound waves are sounds just because of their responsibility. From the fact that a subject hears something as imperfectly located, it does not follow that she hears something which is imperfectly located.
The second objection concerns typical acoustical effects, like the Doppler effect, which are perfectly accounted for by appealing to (medial) sound waves. The Doppler effect is a shift in frequency of the sound heard by an observer who moves relative to the sound source. As waves in the direction of movement are compressed, and waves in the opposite direction are expanded, the frequency drops dramatically when the hearer and the source go past each other. Such explanations of the Doppler effect are harmless for a distal account. The Doppler effect is dependent on something going on in the medium, but this should not allow one to conclude that what we hear are sound waves in the medium. The situation can be described as follows in a way that is relatively uncommitting: When we hear sounds as undergoing the Doppler effect, we do not hear anything different from a vibration process in a sounding object, a process which is heard in a sort of perspectival shortening because the movement of the sounding object causes, among other things, the Doppler effect.
As a matter of fact, the objection could be turned on its head. On a train passing by, a trumpet player is delivering a concert. According to the distal view, the melody does not change, it is just perceived as changing. If we repeated the experience a number of times, we would find it suspect that the melody’s key drops only when the train passes by. Surely, we would infer, there is something wrong with the medium that blocks our perception of the true melody. And surely the train passengers would disconfirm our impression: they do not hear any key drop. The medial theory here indeed predicts that there are two melodies, the one that we hear from the platform, and the one the passengers hear.
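The frequency shift at issue in the Doppler objection and in the trumpet example can be made concrete with the standard textbook formula for a source moving directly toward or away from a stationary listener. The following is a minimal sketch only: the function name, the speed of sound (343 m/s) and the train speed (30 m/s) are illustrative assumptions that do not come from the text, and the passenger is modeled simply as a listener at rest relative to the trumpet.

```python
# Illustrative sketch (assumed values): Doppler shift for a moving source
# and a stationary listener located on the line of motion.

SPEED_OF_SOUND = 343.0  # m/s, air at roughly 20 °C (assumption)


def observed_frequency(emitted_hz, source_speed, approaching, c=SPEED_OF_SOUND):
    """Return the frequency heard by a stationary listener.

    f_heard = f_emitted * c / (c - v) when the source approaches,
    f_heard = f_emitted * c / (c + v) when the source recedes.
    """
    if not 0 <= source_speed < c:
        raise ValueError("source speed must be non-negative and subsonic")
    signed_speed = -source_speed if approaching else source_speed
    return emitted_hz * c / (c + signed_speed)


emitted = 440.0     # the note the trumpet player actually plays (Hz)
train_speed = 30.0  # m/s, hypothetical train speed

heard_approaching = observed_frequency(emitted, train_speed, approaching=True)
heard_receding = observed_frequency(emitted, train_speed, approaching=False)
# A passenger does not move relative to the trumpet: zero relative speed.
heard_on_train = observed_frequency(emitted, 0.0, approaching=True)

print(f"platform, train approaching: {heard_approaching:.1f} Hz")
print(f"platform, train receding:    {heard_receding:.1f} Hz")
print(f"passenger on the train:      {heard_on_train:.1f} Hz")
```

On these assumptions the listener on the platform hears a pitch above the emitted one while the train approaches and below it once the train has passed, whereas the passenger hears the emitted frequency throughout. This is the asymmetry on which the trumpet example relies.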
A third objection is as follows.
Sounds are phenomenologically high or low (they have high or low pitch). But processes in objects cannot be high or low. Therefore sounds are not processes in objects.
This can be answered in the following way. Notice first that sound waves fare no better on this objection: they cannot literally be said to be high or low. But a more substantial answer is available. What one needs is a way of systematically correlating predicates like “… is high”, “… is higher than …” to processes in sounding objects. It is likely that a high sound corresponds to a quickly vibrating process, and so forth.
A fourth objection has it that
surely there are sound waves in the ambient medium, otherwise no causal link could be set between the sounding object and our perception of the latter.
And such sound waves can certainly be measured and physically described. Now there is no point in denying that there are sound waves in the ambient medium: of course there are, and they are causally responsible for our aural perceptions when these are perceptions of anything at all. A defender of the Located Event Theory ought to just contend that such sound waves are not what we hear.
Consider an analogy we discussed before. Light is causally responsible for your perception of an object’s surface. But this does not make you see the light when you see the surface. We can see luminescent sources, but never light in itself: in order to be seen, light would have to emit light carrying information about it.
Finally, another objection concerns the alleged meaningful use of expressions such as “the sound fills the room”, “sounds fill the room”. It seems that what makes these sentences true is best found in the spreading of sound waves, which could actually be everywhere in the room. But one should not be too impressed by idioms. “The sound fills the room” does not describe any phenomenological fact which is different from the fact that the sound is audible from any place in the room (in this respect sounds are unlike fog, which can literally be seen to fill a room).
However, this point deserves closer attention. For sometimes sounds do seem to fill space. Thunder seems to. This is a case in which the only vibrating entity is the medium. Nonetheless this case too can be accommodated by the Located Event Theory: what we hear is the sudden heating of air due to the electric discharge, whose impact is confusedly propagated by the medium. A portion of air (the portion that is suddenly heated up) is the vibrating object and another portion is the transmitting medium.
According to the Relational Event Theory, sounds are events which involve both the source and the surrounding medium. They are relational rather than “monadic” events. (The distinction between monadic and relational events is not to be taken as cast in iron, since the latter can be reduced to the former by making the mereological sum of sources and surrounding medium the subject of sounds.) O’Callaghan (2002, 2007) has developed such a view at some length. He notes that the wave conception of sounds is not the only possible interpretation of Aristotle’s remarks about sounds in On the Soul. Aristotle writes that “everything that makes a sound does so by the impact of something against something else, across a space filled with air” (De Anima II.8, 420b15). On O’Callaghan’s view, what Aristotle might have meant is that the sound itself is not a movement of the air; it is rather the event in which a vibrating object disturbs a surrounding medium and sets it “moving”. Waves in the medium are not the sounds themselves, but rather the effects of sounds. According to the Relational Event Theory, sounds are “disturbings” of a medium, hence depend existentially on a medium that is disturbed and that will transmit information to a listener. This account differs from the Located Event Theory insofar as the latter allows for sounds that exist in a vacuum, and thus distinguishes between a medium that hosts the vibration and a medium that transmits the vibration.
The Relational Event Theory shares with medial theories an endorsement of Berkeley’s argument that sounds do not exist in a vacuum. It is, in point of fact, a hybrid theory (O’Callaghan 2007: 55), sharing with medial theories the tenet of the indispensability of a medium to the existence of a sound.
An argument against the Relational Event Theory capitalizes upon the fact that we have the conceptual resources to distinguish between not hearing a sound because the sounding object is no longer resonating and not hearing it because we do not have informational access to the sounding object anymore. The Relational Event Theory faces the risk of collapsing into a medial position.
To develop this line of thought and make it vivid, imagine a vacuum jar which has the property of immediately creating a vacuum upon closing the lid, and of immediately recalling air upon opening the lid. Take now a sounding object like a tuning fork at 440 Hz and have it vibrate, supposing that the vibration fades and becomes inaudible after 10 seconds. What you hear is an A that becomes feebler and feebler until it disappears. Now, place the tuning fork inside the jar, have it vibrate as before, and repeatedly open and close the lid of the jar, say once in a bit less than a second. What do you hear? You may have the feeling that a few short sounds, each feebler than the preceding one, come into existence and pass away. But you may as well have the impression of a sound that is revealed by the opening and closing of the jar. Indeed, the fading of the sound should be audible from each “window” to the next, implying that there is a single sound that fades. If sounds were either sound waves in the medium surrounding the object, or items dependent on the medium as per the Relational Event Theory, we would be forced to admit that the tuning fork repeatedly started and ceased to sound, because the relevant sound waves would not be present in the surrounding medium while the lid is closed. Only the first impression, that of a series of short sounds, would be accounted for. But the second impression, that of a continuous underlying sound that is revealed, is supported too. A visual analogue of this would be the perception of an object in the dark, on which light is shed at intervals. We would not have the impression that the object gets its colors and then loses them at intervals.
An advantage of the Relational Event Theory over the Located Event Theory is that the former provides a criterion for specifying which among vibratory events at a source are sounds, namely, those that create medial disturbances that are (or can be) heard. The Located Event Theorist can, on the one hand, observe that the minimal requirement of a metaphysics of sounds is to specify which type of entities are sounds, and not, more specifically, which, among entities of that type, are sounds. Thus she would claim that sounds are events at a source, without caring to discuss whether some events at a source are not sounds. On the other hand, she can observe that too fine-grained a classification would create a problem with those events that share with sounds all interesting metaphysical properties except for the property of being audible; a problem which, incidentally, affects a number of physicalist reductions of sounds.
The Relational Event Theory and the Located Event Theory differ concerning the way in which they account for the audible relationship between sound and sound source. One way of characterizing this relation within the Relational Event Theory is as a part-whole relationship, according to which sounds are heard as the constituent parts of wholes that are everyday audible events (O’Callaghan 2011a, 2011b). This is a Mereological View, which presupposes a distinction between a sound and the broad event of which the sound is part. The parthood relationship explains a striking and quite ambiguous aspect of audition, namely, the fact that we hear the sound and its source as two different events but, at the same time, we hear them as not wholly distinct.
The advantages of the Mereological View over both the Property Instantiation view (which claims that sound is heard as a property of the sounding object) and the Causal View (for which sound is heard as the effect of its source; O’Callaghan 2011a: 383 and following) are that:
The Located Event Theory proposes a metaphysical Ockhamization of the Mereological View (Casati, Di Bona, & Dokic 2013), which clears the metaphysical landscape of entities which are not necessary. As a starting point, it is useful to distinguish between two components of sound sources: thing sources (such as keys or musical instruments) and event sources (such as collisions or vibratory events at the sounding object) (2013: 462). The Located Event Theory maintains an Identity View for which sound, instead of being a proper part of a distinct event that is its source, is identical with the event source. That is, the collision we hear is the sound we hear. The Identity View appears to simplify the metaphysical landscape depicted by the Mereological View (2013: 463). The metaphysical claim which grounds the Identity View is supported by considerations about the phenomenology of audition. These considerations suggest that we do not commonly hear sounds as part of larger events, at least when, with “larger events”, we refer to event sources. We hear sounds as a unified entity, and not as parts of larger events. The biggest price the Identity View has to pay is that some ordinary language statements have to be renounced: instead of saying “I heard the bang (produced by/and) the collision”, it would be more appropriate to say that we heard the collision. This is a disadvantage of the theory which is acceptable, though, if the advantage is to propose a plausible theory of sound (2013: 465).
As for the advantages that the Mereological View has over the Property View and the Causal View, the Identity View can account for them as well. One of the advantages of the Mereological View is that it justifies the supposedly different properties that we usually attribute to sounds and to sound sources. The Identity View responds to this by arguing that:
The Identity View and the Mereological View have an equal explanatory capacity, but the Identity View, in addition to the metaphysical reduction, seems to untangle the ambiguity of hearing sound and sound sources as two different events and, at the same time, of hearing them as not wholly distinct. This ambiguity evaporates if we identify sound with the event source.
Different options concerning the relationship between sound and sound sources and how we perceive them have been proposed. According to Matthen (2010) both sounds (and sound composites like melodies, harmonies or sequences of phonemes) and sound sources can be heard directly. Leddington supports the Heideggerian view of hearing, for which we “hear sound sources directly, in hearing the sounds they make—not, à la Berkeley, merely in virtue of hearing those sounds” (2014: 340). Nudds, instead, suggests that when listening to sounds we hear them as apparently being produced by the same source (2010a, 2010b). He suggests that we can also perceive the production of sound bi-modally: by simultaneously listening to and seeing the cause of sound (2001).
Nevertheless, issues about the relationship between sound and the thing source (namely, sound and the material object which produces it) and about how we hear both the objects that produce sounds and the relation between objects and sounds still have to be fully worked out within both views. As a starting point for developing an account of the perception of sound sources or, at least, of some aspects of the thing source, Di Bona (2017) proposes an argument for the perception of a specific characteristic of human voices. She offers an account of how a certain feature of the sound source is perceivable when the source is the human voice. Di Bona focuses on gender and engages with the debate on the admissible content of auditory experience, which has been mostly developed within the field of visual perception, and argues for a “rich” view of auditory experience. This view is defended by means of the method of contrast applied to a case of auditory adaptation to human voices. The idea is that we hear not only the low-level auditory properties of pitch, timbre, and loudness but also the high-level property of gender, that is, the property commonly referred to as being a female voice or a male voice. This high-level property displays adaptational effects and, given that displaying these effects is a clear mark that a property is perceivable, we can conclude that we hear gender properties instantiated by human voices. O’Callaghan (2011b) also engages with the debate on the admissible content of auditory perception, endorsing a more restricted view of the content of audition. He focuses on speech perception and argues that we do not hear the semantic properties of voices (which can also be seen as properties of sound sources, of voices), since what is auditorily perceivable are, instead, the phonological properties.
Soteriou (2018) claims that both versions of the event theory are too strict about what they count as sounds: the Located Event Theory counts as sounds only monadic events happening at material objects; the Relational Event Theory, instead, admits only relational events that are bearers of acoustic properties. Soteriou challenges the assumption that sounds are one kind of thing and suggests that the sounds we hear are both pure audibilia (such as the barking of a dog), which are events or act-types bearing acoustic properties that can also be disconnected from their material causes and cannot exist in a vacuum, and events or act-types that can lack acoustic properties (footsteps), which are perceivable by means of modalities that are different from audition, and can exist in a vacuum. Soteriou’s “simple view” or “catholic view” (2018: 48) starts from the need to reject the idea that experiences of echoes and recorded sounds are illusory experiences or distortions of space and time, which is how he thinks both event theories must regard them in order to be consistent. The event theories classify the experiences of echoes and recorded sounds as illusory for the sake of preserving the idea that, usually, sounds are the bearers of acoustic properties located where we experience them to be located, namely distally. That is because if sound is something located at its own source and the echo seems to be a sound not perceived at its source, then we are hallucinating a sound. When hearing recorded sound, we experience an illusion, since the bearer of the acoustic properties that we actually hear to be distally located is distinct from the “original” bearer of acoustic properties (2018: 44). Soteriou suggests that when hearing echoes or recorded sounds, even though we do not hear event-like individuals distally located, we do not have an illusory experience, since what we hear are pure audibilia.
Soteriou’s proposal has the merit of broadening the range of the audible, but it might encounter some worries because of the non-univocal characterization of what sounds are. For example, a challenge to this view is to explain, on the one hand, how pure audibilia are connected to their physical causes and, on the other hand, how collisions or non-audible act-types “acquire” acoustic properties that make them audible.
Nudds (2018) challenges both event theories by virtue of “the argument from the medium”, which is elaborated to show that environmental events are not the bearers of acoustic features (2018: 54). The main premise of the argument, the existence premise, states that we can hear a sound without the occurrence of an environmental event. This premise can be used against both event theories, which claim that sounds are either identical to environmental events or to parts of such events (2018: 67). Nudds imagines two situations, S1 and S2. In S1 there is a disturbance of the medium and an environmental event, such as a collision; S2 is just like S1 except for the fact that the disturbance of the medium takes place without there being an environmental event. In both situations you hear a sound, but in S2 there is no environmental event; therefore, it is possible to have a sound and its acoustic features without the occurrence of an environmental event. In order to check whether this conclusion is true, Nudds elaborates an analogous case within vision and concludes that while there are two different arguments to show that the premise is false in vision, neither of them applies to the auditory case. He asks us to imagine a situation S3 in which you see a red cube, and a situation S4 in which we keep the same pattern of light which determines the appearance of a red cube in S3, except for the fact that there is no red cube. The arguments to show that the premise is false in the visual case are as follows.
(1) The premise is false since you cannot see a red cube without there being something, a real thing, that is the bearer of the visual features that a red cube usually has. When it seems to you that you are seeing a red cube without there actually being a red cube, it is because you are hallucinating it: there is nothing that instantiates the visual features of a real red cube. Can we apply the same reasoning to audition in order to claim that the premise is false concerning auditory appearances too? When hearing a sound in S2, you do not hallucinate the sound of a collision, since you can perfectly well hear a sound without there being a collision, as when listening to music played by loudspeakers. Therefore, there is still something that is the bearer of the auditory qualities that you hear in S2, which can be produced, for example, by loudspeakers, and you are clearly not having a hallucination (Nudds 2018: 56). That is to say that while it seems that, in the visual case, we hallucinate, in the auditory case we are clearly hearing a sound, and we can conclude that the premise is true in audition. Nudds’ conclusion can be challenged by the event view, since one can say that an environmental event is still present in S2, that is, the event involved in the mechanics responsible for the functioning of the loudspeaker. Therefore, the premise is false in the auditory case as well, but for a different reason: not because you are hallucinating a sound, but because the loudspeaker case does not show that you hear sound in the absence of an environmental event.
(2) Nudds (2018: 56) adds that the falsity of the existence premise in the visual case can also be shown by imagining that what you see in S4 is a hologram. If what you see when it seems to you that you see a red cube is a hologram of a red cube (that is, you have the experience of seeing something that is indistinguishable from seeing a real red cube), you do not literally see the object, but you do not hallucinate it either. When seeing a hologram of a red cube, you see something which is the bearer of visual qualities without the actual object that has these visual features being present. Starting from Martin’s (2012) discussion of visual images, according to which a hologram is the appearance of an object “in the absence of any object which might possess it” (2012: 339), Nudds concludes that a hologram presents the appearance of something that it does not instantiate. Therefore, the existence premise is false not because you literally see nothing in S4, as usually happens in hallucinations, but because what you see is not the bearer of visual features; and yet it is indistinguishable from a real red cube. Is there a similar case in audition which can be used in order to show that the existence premise is false? Given that auditory holograms do not seem to exist, namely, “something that presents the acoustic appearance of a sound-producing event, an appearance it does not itself instantiate” (2018: 58), it is difficult to claim that the existence premise is false in audition for the same reasons for which it is false in vision; therefore, Nudds concludes that the existence premise is true: environmental events are not the bearers of acoustic features. This conclusion is based on the debatable notions of auditory hologram and auditory “image”. Is it really legitimate to look for analogous auditory concepts for notions that are authentically visual?
Important questions that are triggered by Nudds’ argument are: how far can we go with the analogy between vision and audition, and how admissible is it to ground our reasoning on the analogy between the two sensory modalities, especially when dealing with items as intrinsically visual as holograms and images?
There are yet other ways to construe sounds under the distal umbrella. According to the dispositional account of sounds,
sounds are dispositions of objects to vibrate in response to being stimulated. Sounds are perceived transiently, but they are not perceived as being transient and they are not in fact transient. (Kulvicki 2008: 2)
The account takes up Pasnau’s original insight that sounds should be considered akin to colors in being located features of objects. Indeed, Kulvicki draws a set of analogies between sounds and colors based on the assumption that colors are dispositions to reflect light in a certain manner. The analogies include the fact that as colors exist in the dark (being dispositions) sounds exist in a vacuum, and that as light waves are a way to get knowledge of colors, compression waves in a medium are a way to get knowledge of sounds. To complete the analogy, as “colors give off light when stimulated by light, objects in a medium give off compression waves when they are thwacked” (2008: 4). More controversially, “Without vibrating, objects have sounds, but these sounds cannot be heard” (2008: 4). Thwacking is the key element in the dispositional proposal. Thwacking reveals sounds. A good thwack (an impulse that contains all relevant frequencies) is considered to be to sounds what white light is to colors. White light contains waves of all lengths in roughly the same proportion and thus samples adequately the dispositions of the object to reflect some of them. If you light up a surface with a monochromatic laser beam at 500 nm, the disposition the surface may have to reflect light of 600 nm will not be revealed. A good thwack makes the object resonate at all frequencies and thus reveals the vibratory modes of an object by stimulating it at the frequencies at which it responds better.
The impulse proposal solves an alleged disanalogy between color and sound, i.e., the absence of a normative analogue of white light in the case of sounds (O’Callaghan 2007). White light is normatively significant for revealing colors; a good impulse is normatively significant for revealing a sounding disposition.
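The signal-processing idea behind this analogy can be illustrated with a minimal sketch. It is not part of Kulvicki’s account: it assumes an idealized discrete unit impulse as a stand-in for the thwack and a pure sine tone as a stand-in for the monochromatic beam, and the names `dft_magnitudes`, `active_bins`, and the sample count `N` are chosen here purely for illustration.

```python
import cmath
import math


def dft_magnitudes(signal):
    """Return the magnitude of each discrete Fourier coefficient of `signal`."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(signal)))
        for k in range(n)
    ]


def active_bins(spectrum, threshold=1e-6):
    """Indices of the frequency bins that carry non-negligible energy."""
    return [k for k, magnitude in enumerate(spectrum) if magnitude > threshold]


N = 64  # number of samples (illustrative assumption)

# Idealized "thwack": a single unit spike followed by silence.
impulse = [1.0] + [0.0] * (N - 1)

# "Monochromatic" stimulus: a pure tone completing 5 cycles over the window.
tone = [math.sin(2 * math.pi * 5 * t / N) for t in range(N)]

impulse_spectrum = dft_magnitudes(impulse)
tone_spectrum = dft_magnitudes(tone)

print("bins excited by the impulse:", len(active_bins(impulse_spectrum)))
print("bins excited by the tone:", active_bins(tone_spectrum))
```

Under these assumptions the impulse carries equal energy in every frequency bin, whereas the pure tone carries energy only in bin 5 and its mirror image, bin 59. In that sense the impulse probes every vibratory mode at once, while the tone, like the laser beam, probes only one.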
Kulvicki’s dispositional theory neatly accounts for some distal intuitions about sounds. (Other intuitions, such as the idea that sounds have a loudness, are beyond the descriptive power of the theory, which on that score considers loudness a property not of the sound but of the thwack.) In particular, it highlights the importance of action in bringing about auditory information about an object: most objects sound because we deliberately impart a thwack to them, and in many cases in which we want to know how an object sounds, we do impart the thwack, and subtly modulate it. When we hear a sound we get knowledge of the kind of resistance an object offers to thwacking; it is knowledge of an elastic disposition. The account explains the fact that the more the object reacts, the more sonorous it is. And an object (say, a guitar string) is “A sharp” because at that frequency it responds optimally to thwacking.
Kulvicki alleges that an analysis of the phenomenon of sound constancy, modeled on an analogue to color constancy, militates in favor of the dispositional view. Colors are notoriously resilient to many changes of illumination, and even when incident light composition is very distant from standard light, surfaces may be seen as having their standard color. “The green grass looks green in the orange light at dusk, but it looks like green grass illuminated oddly” (Kulvicki 2008: 9). This is in sync with the likely function of vision: to inform about distal, stable properties of surfaces. Analogously, Kulvicki claims, “objects sound roughly the same whether it is white noise [i.e., an impulse] or something reasonably far from it that stimulates them” (2008: 9). Our ability to recognize voices across a large variation of ways to produce them is a case in point. In general, humans are able to identify sounds in terms of their spectral slopes and spectral envelope patterns, which are “fairly constant across changes in the fundamental frequency produced by the vocal chords” (2008: 10). Now, it can be objected that voices are but a part of the auditory world, and that they are the object of dedicated neural machinery. More generally, however, sound constancy is exhibited in hearing objects of various sorts.
There is no obvious reason the auditory system should exhibit constancy for something like spectral slope or envelope pattern, unless the auditory system has the function of identifying stable properties of objects. (2008: 10)
Facts of auditory constancy thus help out the dispositional theory insofar as the function of the auditory system is to find out whatever stable property of the environment is heard as being indeed constant.
We shall list two objections to the dispositional account.
A first problem with the dispositional account is that of specifying what precisely the disposition in question is. An object’s disposition to vibrate at its natural frequencies does not exhaust the possibilities.
When struck, an object vibrates in an odd but characteristic fashion for a brief time before it settles into vibrating at its natural frequencies, if it has any. Like natural frequencies, these attack patterns are relatively stable dispositions of objects to vibrate when thwacked. Similarly, objects’ vibrations decay characteristically. Attacks and decays are stable dispositions of objects to vibrate when thwacked, just as natural frequencies of vibration are. (Kulvicki 2008: 14)
Hence there are (at least) three dispositions here: a disposition to react in the attack mode, a disposition to react in the decay mode, and a disposition to vibrate in a certain way. There are many more: for instance, a disposition to respond by emitting a certain pattern of waves when rubbed, when brushed, or when broken. Many are the dispositions, and the dispositional account should endorse the idea that many are the (possibly unheard) sounds here. Call this the many-disposition problem.
Normativity half-solves the many-disposition problem. Some ways of imparting energy to an object are more telling than others. Light is the sole “thwacker” for colors (modulo differences in intensity and light spectrum); by contrast, there are many thwackers for eliciting the sonority of an object. Still, the dispositional theory of colors also suffers from some variant of the many-disposition problem. The color of many objects and stuffs depends on their temperature. At 1000 °C, iron glows. At −20 °C, water is white.
A different problem for the dispositional theory is: what are those individuals that we hear when the object sounds? The main problem a dispositional account of sounds has to face is that even if we accept that sounds are sound dispositions, there appear prima facie to exist, on top of those dispositions, the individual, occurrent sounds: those that last while the disposition is realized. (A similar problem plagued Chisholm’s account of events as repeatable entities: as Davidson pointed out, any such account must assign a status to the single occurrences of the repetitions, which cannot themselves be repeatable items.) Indeed, as Kulvicki notes, “now and again one encounters objects making sounds they do not have” (2008: 8); he dismisses these encounters as rather exceptional, but no matter their frequency (and radio listening provides a very common example), the “sound made”, as distinct from the “sound had”, must be accommodated by any theory of sounds. Kulvicki addresses the problem in a general way, by distinguishing between sounds as dispositions and the hearings of sounds. There are no individual sounds, only episodes of hearing the dispositional sounds. We do hear sounds, and these are the dispositions, but the impression that sounds last depends on the fact that our hearing episodes last.
Kulvicki denies that this account entails an error theory because he thinks that intuitions are not particularly telling about the distinction between hearing as an individual event and hearing a sound. However, a threefold distinction should apply here. Grant the dispositional sound. Grant hearing episodes. Now, the duration of a hearing episode is different from the duration of an occurrent sound. You set a tuning fork in motion. It vibrates for thirty seconds. But then, after ten seconds, you get bored and block your ears with your hand. The duration of your hearing is ten seconds. But the sound is still there, for another twenty seconds. Generally speaking, unheard sounds are accounted for in the dispositional theory only as unrealized dispositions. Occurrent unheard sounds are invisible to the dispositional account.
Kulvicki answers this objection by saying that this phenomenon is consistent with the dispositional view (2014: 218). In order to have sounds, it is required that the dispositions of an object to vibrate in a certain way are realized, that these dispositions generate waves in a medium, and that ears can detect these waves. Occurrent unheard sounds produced when you block your ears and do not hear the tuning fork vibrating for thirty seconds are not a problem simply because they are not sounds. For the dispositional view “sounds are not occurrent anythings: they are qualities” (2014: 218). But the questions that naturally arise and need to be answered are: if the unheard dispositions are not sounds, what are they? Moreover, suppose that while I am still blocking my ears with my hand, someone else is there listening to the tuning fork: would she be listening to sounds? What are the items she would be listening to?
Another critical point for the dispositional view is that it does not take into account the intuitive fact that sounds are heard as unfolding over time and as having a duration, which is a clear sign of their being events. Kulvicki (2014: 211) discusses this by elaborating on the plausibility of the argument from “perceptual seeming” that he attributes to those who claim that sounds are events. The argument is based on the commonsensical idea that if a sound appears to have a duration, it is because it actually has a duration. This argument rests on three assumptions.
For Kulvicki, even though the concurrence of these three assumptions does not imply that sounds have durations, the claim that sounds have a duration is still the best explanation of the three uncontroversial claims. Therefore, even though he does not propose a way to undermine this argument completely, he suggests a way at least to weaken it. The dispositional view can account for the third auditory intuition (that the time the events we hear take to occur often coincides with the duration of the related auditory experience) by emphasizing the role of mechanical stimulants. If we imagine the circumstance of a “long-lasting, broad-spectrum mechanical stimulant” (2014: 214) which thwacks the sounding object in a stable vibratory disposition, it is not surprising that we hear events for about as long as we hear sounds. Kulvicki himself recognizes that this is not a conclusive way of ruling out that sounds have durations, but it is nonetheless a way of weakening this claim.
Recently Kulvicki (2017: 91) has proposed a more inclusive view of what sounds are. He admits that identifying sounds with the mere vibratory dispositions of objects does not seem to capture the full range of auditory experience. Therefore, among sounds he also includes aspects of events and the contexts in which those aspects are heard, and he distinguishes between non-perspectival and perspectival qualities. The former are the dispositions of objects to vibrate in response to being mechanically stimulated, but also happenings; the latter are “abstractions over features of objects, events, and environs” (2017: 87). Both kinds of qualities are taken to be sounds, which are a “mongrel” category. Kulvicki explains that the perspectival qualities are “abstraction over intrinsic and relational features of ordinary objects, environments, and range events in which they participate” (2017: 90). That is, things sound one way in some respects and a different way in others, depending on the perspective. When we imitate someone’s voice, the reproduction is similar to the original in some respects but different in others. The worry is that, on Kulvicki’s view, sounds seem to be different kinds of things, very distant from and often in opposition to each other (features of objects but also happenings, perspectival properties but also non-perspectival qualities). Rather than claiming that these are different ways of characterizing sounds, Kulvicki is exhaustively telling us the different things we can hear, without explaining in full what sounds are.
We divided up existing accounts of sounds according to the spatial location they assign to sounds. However, there are aspatial theories of sounds as well. Aspatial theories deny either (i) that sounds are intrinsically spatial, or (ii) that auditory perception is intrinsically spatial. Arguably, claim (i) implies claim (ii), but the converse is not true, which leaves room for an interesting aspatial theory of auditory perception which nevertheless acknowledges that sounds have some spatial locations.
We have seen the use of phenomenological arguments against both medial and proximal theories of sound. To sum them up, one may claim that, first, auditory experience has a spatial content whereby sounds seem to be located in egocentric space (to the left, above, in front of us, etc.). Second, unless one subscribes to an error theory of auditory experience, sounds are where they are (normally) heard to be, namely at their sources.
Before presenting aspatial theories, let us conclude the discussion of the spatial theories of sound by adding that they all share the assumption that reflection on the spatiality of audition is crucial for tackling issues in the metaphysics of sound; that is, we can say what sounds are when we know where they are located (Di Bona 2019). Moreover, considerations about the spatiality of audition and sounds are also important for understanding the segregation of auditory streams, which constitute the auditory landscape, and the experience of some musical compositions (Di Bona 2019). Finally, there is a remarkable difference between the spatiality of vision and the spatiality of audition: in audition, we perceive where sources are, the space between us and the sources (O’Callaghan 2010b) and the space between the sources, whereas in vision we also perceive where objects are potentially to be seen (Nudds 2009). Despite this difference, spatial cues can also enter the auditory content and provide information on the volume of empty space, especially when derived from reverberations produced in a specific closed space (Young 2017).
But are sounds really located in space? There is in the literature a strong non-locational view of sounds. Strawson has made the plausibility of a thought-experiment of a purely auditory world rest upon the tenet of an intrinsic non-spatiality of sounds. (Philosophers and psychologists such as Lotze (1841 [1884: Ch. 6, 123–6]), Binet (1905: Ch. 3), Heymans (1911 [1905]: Ch. 1), Stumpf (1883–1890: Vol. 1, 29), Wellek (1934), Révész (1946), Ihde (1976: Ch. 6), and Evans (1980 [1986: 248 ff.]) have investigated the phenomenology of auditory space and suggested the possibility of a purely auditory world.) Strawson (1959: 65–66) thus wrote:
Sounds…have no intrinsic spatial characteristics: such expressions as “to the left of”, “spatially above”, “nearer”, “farther” have no intrinsically auditory significance (…). A purely auditory concept of space…is an impossibility. The fact that, with the variegated types of sense-experience which we in fact have, we can, as we say, “on the strength of hearing alone” assign directions and distances to sounds, and things that emit or cause them, counts against this not at all. For this fact is sufficiently explained by the existence of correlations between the variations of which sound is intrinsically capable and other non-auditory features of sense-experience.
Strawson designed his thought-experiment in the context of an analysis of the Kantian claim that the notion of there being objective entities (entities which do not depend on our perception of them) involves the notion of space:
[t]he question we are to consider, then, is this: Could a being whose experience was purely auditory have a conceptual scheme which provided for objective particulars? (Strawson 1959: 66)
Thus, he imagined a character (later on called “Hero” by Evans 1980) who has only non-spatial auditory experiences. Hero perceives sounds, but he does not perceive them to be located in physical space. What Strawson tried to show is that Hero needs an “analogue” to the notion of space in order to “locate” sounds when they are not actually heard. This analogue is provided by what Strawson calls a “master-sound”, namely a constant sound varying in pitch. Any particular sound is heard against the background of the master-sound. Thanks to the master-sound, Hero can distinguish between experiencing the same particular sound again (when its “location” as provided by the master-sound is the same), and experiencing successively two particular sounds of the same type (when they have different “locations” on the master-sound map). Evans (1980) questioned the claim that the master-sound could play the role of physical space in grounding the notion of objective particulars; see also Strawson’s (1980) reply to Evans.
In evaluating Strawson’s thought-experiment, we should distinguish the claim that there can be non-spatial auditory experiences from the claim that there can be a world populated only with sounds. Strawson’s thought-experiment justifies the former claim, but it does not obviously lead to the latter. One cannot infer from the fact that we can perceptually represent a sound without representing its location, that we can perceptually represent a non-located sound.
It is possible to argue that auditory perception is not intrinsically spatial independently of a commitment to the claim that sounds do not have spatial locations. This is the view of O’Shaughnessy, who writes that
while we have the auditory experience of hearing that a sound comes from p, we do not have any experience that it is here where it now sounds…And this is so for a very interesting reason: namely, that we absolutely never immediately perceive sounds to be at any place. (2000: 446)
However, O’Shaughnessy does not draw the conclusion that sounds have no spatial locations. On the contrary, as we have seen, he defends a proximal account of the location of sounds, according to which sounds are where hearers, rather than sources, are (2009).
O’Shaughnessy would not be impressed by allegedly phenomenological arguments according to which one normally hears sounds as located at their sources. One may still have the feeling that his sophisticated version of proximal theories does not locate sounds where an untutored description of what is perceived suggests they are. As a consequence, it appears to be another massive error theory of auditory perception.
Scruton (2009a, 2009b; see also his 1997) proposes a non-physicalist account of sounds. He is impressed by the fact that when we hear sounds as music we (can) hear them as events detached from their physical causes. He then suggests that sounds are “pure events”, things that happen but which don’t happen to anything, and that they are “secondary objects”, entities whose nature is bound up with the way we perceive them.
One way of reconciling Scruton’s interesting suggestions with a physicalist account of sounds is to draw a distinction (which of course should be properly developed) between the ontology we need to account for our ecological ability to hear sounds in our natural environment and the ontology we need to account for our (at least partly acquired) ability to hear sounds as music. In the end, the ontology of sounds as music, which Scruton wants to focus on, might be quite different from the ontology of natural sounds, which can still be of the physicalist kind.
We have suggested that a fruitful way to classify the various accounts of sounds that have been given in the literature is in terms of their spatial locations. If the sounds we hear have spatial locations, they can be thought to be located either where the material sources are (distal theories), or where the hearers are (proximal theories), or somewhere in between (medial theories). It has also been denied that sounds have any spatial locations, which gives rise to a fourth class of theories, aspatial theories. All these theories have interestingly different phenomenological, epistemological and metaphysical implications.
This entry was prepared with the help of funds from the IST-2002-002114 Enactive Network of Excellence of the 6th Framework programme of the European Commission. Thanks to Maurizio Giri for help on some parts of the draft.
The Stanford Encyclopedia of Philosophy is copyright © 2023 by The Metaphysics Research Lab, Department of Philosophy, Stanford University
Library of Congress Catalog Data: ISSN 1095-5054