ADJUSTING SPATIAL AUDIO PLAYBACK BASED ON DEVICE ORIENTATION
CROSS-REFERENCE TO RELATED APPLICATION
[0001] This application claims priority to co-pending U.S. Provisional Application No. 63/557,322 filed on February 23, 2024, which is hereby incorporated herein by reference in its entirety.
FIELD OF THE DISCLOSURE
[0002] The present disclosure is related to consumer goods and, more particularly, to methods, systems, products, features, services, and other elements directed to media playback or some aspect thereof.
BACKGROUND
[0003] Options for accessing and listening to digital audio in an out-loud setting were limited until 2002, when Sonos, Inc. began development of a new type of playback system. Sonos then filed one of its first patent applications in 2003, entitled “Method for Synchronizing Audio Playback between Multiple Networked Devices,” and began offering its first media playback systems for sale in 2005. The SONOS Wireless Home Sound System enables people to experience music from many sources via one or more networked playback devices. Through a software control application installed on a controller (e.g., smartphone, tablet, computer, voice input device), one can play what she wants in any room having a networked playback device. Media content (e.g., songs, podcasts, video sound) can be streamed to playback devices such that each room with a playback device can play back corresponding different media content. In addition, rooms can be grouped together for synchronous playback of the same media content, and/or the same media content can be heard in all rooms synchronously.
BRIEF DESCRIPTION OF THE DRAWINGS
[0004] Features, aspects, and advantages of the presently disclosed technology may be better understood with regard to the following description, appended claims, and accompanying drawings, as listed below. A person skilled in the relevant art will understand that the features shown in the drawings are for purposes of illustration, and variations, including different and/or additional features and arrangements thereof, are possible.
[0005] Figure 1A is a partial cutaway view of an environment having a media playback system configured in accordance with aspects of the disclosed technology.
[0006] Figure 1B is a schematic diagram of the media playback system of Figure 1A and one or more networks.
[0007] Figure 1C is a block diagram of a playback device.
[0008] Figure 1D is a block diagram of a playback device.
[0009] Figure 1E is a block diagram of a bonded playback device.
[0010] Figure 1F is a block diagram of a network microphone device.
[0011] Figure 1G is a block diagram of a playback device.
[0012] Figure 1H is a partial schematic diagram of a control device.
[0013] Figures 1I through 1L are schematic diagrams of corresponding media playback system zones.
[0014] Figure 1M is a schematic diagram of media playback system areas.
[0015] Figure 2A is a front isometric view of a playback device configured in accordance with aspects of the disclosed technology.
[0016] Figure 2B is a front isometric view of the playback device of Figure 2A without a grille.
[0017] Figure 2C is an exploded view of the playback device of Figure 2A.
[0018] Figure 3A is an exploded view showing an example of a playback device in accordance with examples of the disclosed technology.
[0019] Figure 3B is an isometric side view of the playback device of Figure 3A.
[0020] Figure 3C is a partial exploded view of the playback device of Figure 3A.
[0021] Figure 3D is a perspective view of the playback device of Figure 3A.
[0022] Figure 4A is a flow diagram of a method of tuning a playback device in accordance with aspects of the disclosed technology.
[0023] Figure 4B is a flow diagram of a method of tuning a playback device in accordance with aspects of the disclosed technology.
[0024] Figure 4C is a flow diagram of a method of tuning a playback device in accordance with aspects of the disclosed technology.
[0025] Figure 5A is a circuit diagram showing an example of digital signal processing circuitry for rendering audio when a playback device is in a standard orientation, in accordance with aspects of the disclosed technology.
[0026] Figure 5B is a circuit diagram showing an example of the digital signal processing circuitry for rendering audio when a playback device is in an inverted orientation, in accordance with aspects of the disclosed technology.
[0027] Figure 6A is a circuit diagram showing an example of digital signal processing circuitry for rendering audio when a playback device is in the standard orientation, in accordance with aspects of the disclosed technology.
[0028] Figure 6B is a circuit diagram showing an example of the digital signal processing circuitry for rendering audio when a playback device is in the inverted orientation, in accordance with aspects of the disclosed technology.
[0029] Figures 7A-7E are partial schematic diagrams showing examples of playback device configurations in accordance with aspects of the disclosed technology.
[0030] Figure 8 is a flow diagram of a method of tuning one or more playback devices in accordance with aspects of the disclosed technology.
[0031] The drawings are for the purpose of illustrating example embodiments, but those of ordinary skill in the art will understand that the technology disclosed herein is not limited to the arrangements and/or instrumentality shown in the drawings.
DETAILED DESCRIPTION
I. Overview
[0032] Embodiments described herein relate to tuning playback device parameters related to multi-channel audio playback so as to compensate for otherwise potentially sub-optimal placement of one or more playback devices within a listening environment.
[0033] Home theater audio configurations can involve a plurality of playback devices distributed about a listening environment. In some instances, for example, a primary playback device (e.g., a soundbar) can be configured to be placed in a front center position of the listening environment, and one or more satellite playback devices can be placed in various positions about the listening environment. Depending on the type of audio content, the number and type of playback devices, and/or user preferences, satellite playback devices may be placed in front right, front left, rear left, rear right, right side, left side, or other suitable positions relative to the intended listening position.
[0034] Typical wireless home theater approaches assume that individual satellite playback devices output a single audio channel (e.g., a left rear satellite playback device outputs only a left rear audio channel; a right rear satellite playback device outputs only a right rear audio channel). While such single-channel satellite playback devices provide significant benefits over systems that do not utilize satellite playback devices at all, multichannel satellite playback devices can provide additional benefits for the listener. As described in more detail below, using satellite playback devices capable of outputting multiple audio channels (for example, outputting different audio channels along different sound axes) can provide a more immersive listening experience for the user. Moreover, such multichannel satellite playback devices are better able to capitalize on spatial audio formats (e.g., DOLBY ATMOS or DTS:X) that may allow for a greater number of channels than conventional audio formats.
[0035] The use of such multi-channel satellite playback devices presents certain challenges, however. In some cases, a user’s placement of a multi-channel playback device within the listening environment differs from the intended placement, either in terms of device location or device orientation. For example, when a satellite playback device is placed relatively high on a wall (e.g., closer to the ceiling than the floor), the user may orient the playback device upside-down (referred to herein as being inverted). For some playback devices, this placement may not present a significant issue in terms of the resulting listening experience for the user. However, for playback devices capable of rendering spatial audio channels, including a height channel, for example, orienting the playback device upside-down can be problematic because, among other reasons, the height audio information would be directed downward towards the floor, rather than upward toward the ceiling. The result is a suboptimal listening experience. One approach to resolving this problem might be to simply disable height channel audio, but doing so may defeat the purpose of a highly-capable spatial audio playback device.
[0036] Accordingly, techniques are described herein for modifying playback parameters of multichannel playback devices to compensate for placement and/or orientation within the listening environment. As a result, even when a user places multichannel audio satellite playback devices in undesirable locations, positions, and/or orientations, the system can adapt playback to provide an improved listening experience. According to certain examples, when a spatial audio playback device (e.g., one capable of rendering a height audio channel in addition to one or more lateral audio channels) is oriented upside-down, the transducer array can be reconfigured to re-assign responsibility for rendering the height audio channel, and optionally one or more lateral audio channels, to different transducers in the array to compensate for the inverted orientation of the playback device. In addition, one or more digital signal processing parameters associated with audio playback, such as amplification levels, equalization parameters, and/or filtering, can be adjusted to compensate for the inverted orientation of the playback device, as described in more detail below. For example, a sound field corresponding to one or more audio channels (e.g., a center channel) can be repositioned based on whether the playback device is in the standard or inverted orientation. Furthermore, in some examples, techniques are provided for balancing audio rendering among multiple playback devices in a bonded zone or group (e.g., a stereo pair or home theater arrangement) when one (or more) of the playback devices is in an inverted orientation.
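For illustration only, the sketch below shows one way the re-assignment and digital signal processing adjustment described above could be organized in software; the driver names, channel labels, and adjustment values are hypothetical placeholders and are not taken from any particular implementation.

```python
# Minimal sketch, assuming a hypothetical array with top, bottom, left, and
# right drivers; all names and values here are illustrative only.

STANDARD_MAP = {
    "height": "top_driver",     # points toward the ceiling in the standard orientation
    "left":   "left_driver",
    "right":  "right_driver",
}

INVERTED_MAP = {
    "height": "bottom_driver",  # this driver faces the ceiling when the device is upside-down
    "left":   "right_driver",   # lateral drivers swap to preserve left/right imaging
    "right":  "left_driver",
}

# Hypothetical per-orientation digital signal processing adjustments.
DSP_ADJUSTMENTS = {
    "standard": {"height_gain_db": 0.0, "eq_preset": "standard"},
    "inverted": {"height_gain_db": 2.0, "eq_preset": "inverted"},
}

def configure_playback(orientation):
    """Return the channel-to-driver map and DSP settings for the given orientation."""
    channel_map = INVERTED_MAP if orientation == "inverted" else STANDARD_MAP
    return channel_map, DSP_ADJUSTMENTS.get(orientation, DSP_ADJUSTMENTS["standard"])
```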
[0037] In some examples, there is provided a method for a playback device comprising a plurality of audio transducers configured to output audio along a plurality of sound axes including at least a first lateral sound axis and a vertical sound axis, the first lateral sound axis being aligned, within an inclination angle of less than 30 degrees, with a first plane extending along a horizontal axis of the playback device and the vertical sound axis being angled with respect to the first horizontal axis by between 50 and 90 degrees. The method may comprise receiving, at the playback device, multichannel audio content including a first audio channel, obtaining an indication of an orientation of the playback device, and determining, based at least in part on the indication of the orientation of the playback device, a vertical offset relative to the horizontal axis that corresponds to at least one perceived source location of the first audio channel. In some examples, the method further comprises determining, based on the vertical offset, proportions of the first audio channel assigned for playback via the first lateral sound axis and the vertical sound axis that generate a sound field corresponding to the first audio channel at a height with respect to a center of the playback device that corresponds to the vertical offset, and playing back the first audio channel via the first lateral sound axis and the vertical sound axis.
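As a rough numerical illustration of the proportioning step, the sketch below splits one audio channel between a lateral sound axis and a vertical sound axis using a constant-power pan driven by a normalized vertical offset; the panning law and parameter names are assumptions made for this example rather than the method of any claim.

```python
import math

def axis_proportions(vertical_offset, max_offset=1.0):
    """Split one channel between a lateral axis and a vertical (height) axis.

    A normalized offset of 0 keeps the perceived source at the device's
    horizontal plane; 1.0 pushes it fully onto the vertical axis. The
    constant-power pan used here is purely illustrative.
    """
    t = max(0.0, min(1.0, vertical_offset / max_offset))
    return math.cos(t * math.pi / 2), math.sin(t * math.pi / 2)

def render_channel(samples, vertical_offset, max_offset=1.0):
    """Produce the per-axis feeds for one block of samples of the first audio channel."""
    lateral_gain, vertical_gain = axis_proportions(vertical_offset, max_offset)
    return ([s * lateral_gain for s in samples],
            [s * vertical_gain for s in samples])
```

Under these assumptions, a vertical offset of half the maximum yields a gain of roughly 0.71 on each axis, placing the resulting sound field between the device's horizontal plane and its maximum perceived height.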
[0038] These and other examples and aspects described herein improve upon earlier-developed systems and methods including, for example, the systems and methods disclosed and described in the following earlier-filed patent applications assigned to Sonos, Inc.
[0039] U.S. Patent No. 8,234,395 titled “System and method for synchronizing operations among a plurality of independently clocked digital data processing devices,” filed on April 1, 2004 and issued on July 31, 2012 (“Millington ‘395”) describes, among other features, examples of synchronizing audio playback among a plurality of playback devices or groups of playback devices.
[0040] U.S. Patent No. 10,712,997 titled “Room Association Based on Name,” filed on August 21, 2017 and issued on July 14, 2020 (“Wilberding ‘997”) describes, among other features, the use of playback device attributes by a controller application to control one or more playback devices in a media playback system. According to Wilberding ‘997, the playback device attributes can include one or more of (i) a player name for the playback device, (ii) a player type of the playback device, (iii) a player icon for the playback device, (iv) a player configuration for the playback device, (v) a zone name for a zone associated with the playback device (e.g., the “downstairs zone” or “bedroom zone”), (vi) a session name for a session associated with the playback device, (vii) a room name where the playback device is located, (viii) a room type where the playback device is located, or (ix) a name of an area where the playback device is located (e.g., “downstairs” or “patio”). According to Wilberding ‘997, the controller application can be installed on a control device that may present a graphical user interface to facilitate user access and control of the media playback system, optionally using one or more of the playback device attributes.
[0041] U.S. Patent No. 8,483,853 titled “Controlling and manipulating groupings in a multizone media system,” filed on September 11, 2007 and issued on July 9, 2013 (“Lambourne ‘853”) describes, among other features, techniques of controlling a plurality of multimedia players in groups. According to Lambourne ‘853, a user can group some of the players according to a theme or scene, where each of the players is located in a zone. Lambourne ‘853 discloses that when the scene is activated, the players in the scene react in a synchronized manner. For example, the players in the scene can all be caused to play a multimedia source or music in a playlist, wherein the multimedia source may be located anywhere on a network.
[0042] U.S. Patent No. 8,788,080 titled “Multi-channel pairing in a media system,” filed on April 8, 2011 and issued on July 22, 2014 (“Kallai ‘080”) describes, among other features, techniques for grouping, consolidating, and/or pairing two or more playback devices together to create or enhance multi-channel audio reproduction, such as stereo, surround sound, or some other multi-channel environment.
[0043] U.S. Patent No. 10,499,146 titled “Voice control of a media playback system,” filed on February 21, 2017 and issued on December 3, 2019 (“Lang ‘146”) discloses voice control and related features and functionality for media playback devices, networked microphone devices, microphone-equipped media playback devices, and speaker-equipped networked microphone devices. Lang ‘146 describes, among other features, designating and managing default networked devices, audio response playback, room-corrected voice detection, content mixing, music service selection, metadata exchange between networked playback systems and networked microphone systems, handling loss of pairing between networked devices, actions based on user identification, and other voice control of networked devices.
[0044] U.S. Patent No. 9,886,234 titled “Systems and methods of distributing audio to one or more playback devices,” filed on January 28, 2016 and issued on February 8, 2018 (“Lin ‘234”) describes, among other features, playback devices that comprise multiple transducers (e.g., one or more woofers and/or tweeters) and techniques for configuring the playback devices to generate audio streams to be sent to the various transducers. According to Lin ‘234, playback devices may stream and play audio content according to audio processing algorithms that may be customized for different transducers of the playback device. For example, Lin ‘234 discloses that a playback device may use the audio content to generate a first audio stream for a first woofer, a second audio stream for a second woofer, and a third audio stream for a tweeter. According to Lin ‘234, to avoid delay in rendering the audio content, at least some of the processing to generate these different audio streams can be performed by a computing device that is not designated to play the audio content. The computing device can then send the audio streams to the corresponding playback device(s).
[0045] U.S. Patent No. 9,042,556 titled “Shaping sound responsive to speaker orientation,” filed on July 19, 2011 and issued on May 26, 2015 (“Kallai ‘556”) discloses, among other features, adjusting audio output based on a detected orientation of a playback device. According to Kallai ‘556, an audio data stream is obtained by a player having one or more speaker drivers, an orientation of the player is determined, and sound is reproduced by the player based on the orientation. Kallai ‘556 further discloses that the sound may be further shaped based on other states of the player in addition to orientation (such as whether the player is grouped with another player, for example, in a stereo pair or other group configuration), and that the overall sound may be shaped from one player or from a collection of players. According to Kallai ‘556, sound reproduced by the player may be shaped differently depending on the orientation of the player. For example, the sound coming from each speaker driver may be configured to reproduce a different frequency range, channel, or both frequency range and channel depending on the orientation. For example, Kallai ‘556 discloses that the sound coming from a plurality of speakers in the player may be in stereo when in horizontal position, whereas the sound coming from the same plurality of speakers may be in monaural when in vertical position.
[0046] U.S. Patent No. 9,456,277 titled “Systems, methods, and apparatus to filter audio,” filed on July 11, 2014 and issued on September 27, 2016 (“Burlingame ‘277”) describes, among other features, applying individual audio filters to corresponding transducers according to a determined orientation. According to Burlingame ‘277, the filters may be based on a device parameter, such as a distance between the transducers.
[0047] U.S. Patent No. 9,973,851 titled “Multi-channel playback of audio content,” filed on December 1, 2014 and issued on May 15, 2018 (“Chamness ‘851”) discloses, among other features, adjusting radiation patterns of a playback device based on orientation (and/or other parameters). According to Chamness ‘851, multi-channel playback of audio content (using multiple audio drivers and/or multiple playback devices) may enhance a listener’s experience by causing the listener to perceive a balanced directional effect when the audio content is played back. Chamness ‘851 discloses that, in order to widen an area over which a balanced directional effect may be perceivable, signal processing may be used to produce target radiation patterns corresponding to different sets of audio drivers. Chamness ‘851 describes generating transfer functions based on the desired target radiation patterns and causing individual drivers to output sound accordingly. In some examples, the drivers include those that are oriented upward toward a ceiling of a room.
[0048] International Patent Publication No. WO/2025/014856 titled “Height audio adjustment based on listening environment characteristics,” with an international filing date of July 8, 2024 (“Jones ‘4856”) describes, among other aspects, techniques for determining ceiling heights and/or relative heights of devices in a room or zone.
[0049] U.S. Patent No. 9,264,839 titled “Playback device configuration based on proximity detection,” filed on March 17, 2014 and issued on February 16, 2016 (“Oishi ‘839”) discloses detecting a barrier (e.g., a wall or a ceiling) near a playback device and adjusting acoustic output accordingly. According to Oishi ‘839, in some circumstances, when a barrier, such as an object or wall, is in proximity to the playback device, the barrier can affect the audio output of the playback device. For example, an object placed in front of a speaker of the playback device may distort the audio output of the speaker. Accordingly, Oishi ‘839 discloses techniques for dynamically configuring a playback device to compensate for the presence of a barrier when the playback device detects that the barrier is within a close proximity of the playback device. Oishi ‘839 discloses that dynamic configuration of the playback device can involve deactivating a speaker of the playback device (e.g., a particular speaker of the playback device may be deactivated when a barrier is detected within close proximity to the particular speaker) and/or modifying the sound output of a particular speaker to help compensate for the presence of a barrier detected within close proximity to the particular speaker. For example, Oishi ‘839 discloses that the playback device may adjust high range frequency components of the audio output to compensate for distortion of the perceived frequency response of the playback device within that high frequency range caused by the barrier. Oishi ‘839 further discloses that in some examples, dynamic configuration of the playback device involves deactivating a first speaker and modifying the sound output of a second speaker. In some examples, the first and second speakers are part of the same playback device, and in other examples, the first speaker is part of a first playback device and the second speaker is part of a second playback device.
[0050] U.S. Patent Publication No. 2023/0179937 titled “Manipulation of Playback Device Response Using Signal Processing,” filed on December 9, 2022 and published on June 8, 2023 (“Chamness ‘9937”) describes outputting multiple audio channels using a multiple driver playback device. According to Chamness ‘9937, each group of audio driver(s) may be configured to generate sound waves corresponding to a certain audio channel according to a particular radiation pattern. Chamness ‘9937 discloses that such radiation patterns may define a direction-dependent amplitude of sound waves produced by the corresponding group of audio drivers (i) at a given audio frequency (or range of audio frequencies), (ii) at a given radius from the audio driver, and (iii) for a given amplitude of input signal. Chamness ‘9937 further describes adjusting audio drivers for one or more audio channels to distribute responsibility for audio channel rendering among different transducers and along different sound axes in different scenarios.
[0051] International Patent Publication No. WO/2024/073401 with an international filing date of September 26, 2023 (“Peace ‘105”) describes the use of multichannel satellite playback devices in home theater configurations. Peace ‘105 describes, among other features, techniques for adjusting the acoustic output of multichannel satellite playback devices to compensate for sub-optimal placement and/or orientation of the playback devices.
[0052] U.S. Patent No. 11,212,635 titled “Systems and methods of spatial audio playback with enhanced immersiveness,” filed on November 24, 2020 and issued on December 28, 2021 (“MacLean ‘635”) describes, among other features, the psychoacoustic effects of vertically directed audio versus forward propagating audio. For example, MacLean ‘635 discloses that some 3D audio rendering formats may include one or more vertical channels configured to represent sounds originating from above a listener, and in some instances, such vertical channels can be played back via transducers positioned over a user’s head (e.g., ceiling mounted speakers). MacLean ‘635 further discloses that in the case of soundbars or other multitransducer devices, an upwardly oriented transducer can output audio along a sound axis that is at least partially vertically oriented with respect to a forward horizontal plane of a playback device. According to MacLean ‘635, this audio output can reflect off an acoustically reflective surface (e.g., a ceiling) to be directed toward a listener at a target location, and because the listener perceives the audio as originating from the point of reflection on the ceiling, the psychoacoustic perception is that the sound originates “above” the listener.
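As a simple geometric illustration of this ceiling-bounce effect (an illustration only, not a formula from MacLean ‘635), treating the ceiling as an acoustic mirror gives the location of the reflection point and the elevation angle at which the listener perceives the reflected sound:

```latex
% Device driver at height h_d, listener ears at height h_l, horizontal
% listener distance d, flat ceiling at height H (with H > h_d and H > h_l).
x      = \frac{d\,(H - h_d)}{(H - h_d) + (H - h_l)}   % horizontal distance from device to reflection point
\theta = \arctan\!\left(\frac{H - h_l}{d - x}\right)  % perceived elevation angle at the listener
% Example: H = 2.7 m, h_d = 0.5 m, h_l = 1.0 m, d = 3 m
% gives x ~ 1.69 m and theta ~ 52 degrees.
```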
[0053] U.S. Patent Publication No. 2022/0066008 titled “Ultrasonic Transmission for Presence Detection,” filed on August 30, 2021 and published on March 3, 2022 (“Jones ‘6008”) describes, among other features, using transmission and reception of acoustic/audio (e.g., ultrasonic) signals to determine the proximity of a playback device to one or more other playback devices. According to Jones ‘6008, one or more playback devices can be caused to transmit/output unique audio signals that may be detected by a portable playback device (via a microphone, for example). Jones ‘6008 discloses that the portable playback device may detect audio signals from a plurality of other playback devices, and from the detected audio signals, it can be determined which of the plurality of playback devices is closest (based on the strength of the audio signal, for example) to the portable playback device.
[0054] U.S. Patent Publication No. 2021/0099736 titled “Systems and Methods for Playback Device Management,” filed on January 28, 2020 and published on April 1, 2021 (“Soto ‘9736”) describes, among other features, using wireless signal patterns among playback devices to determine locations of playback devices and/or other objects or individuals in a space.
[0055] U.S. Patent Publication No. 2021/0297168 titled “Systems and Methods for State Detection via Wireless Radios,” filed on March 17, 2021 and published on September 23, 2021 (“van Erven ‘7168”) describes, among other features, localizing individuals in a region using wireless signals. Van Erven ‘7168 discloses techniques for using signal strength measurements of portions of a wireless channel to evaluate signals between wireless devices. According to van Erven ‘7168, signal strength indicators for at least some subcarriers within a wireless channel can be used to detect the presence (or absence) of people in a region. Van Erven ‘7168 further describes that audio characteristics or variables (e.g., volume, balance, etc.) can be adjusted based on a user’s location in the area between the playback devices.
[0056] U.S. Patent No. 11,178,504 titled “Wireless multi-channel headphone systems and methods,” filed on May 17, 2019 and issued on November 16, 2021 (“Beckhardt ‘504”) describes, among other features, playing surround sound audio with a wireless headphone set based at least in part on the position of the listener (and thus, the position of the headphone set) relative to a screen configured to display video content corresponding to the surround sound audio. Beckhardt ‘504 describes tracking or monitoring the position of the listener’s wireless headphone set relative to the screen so that the surround sound audio generated by the listener’s wireless headphone set remains consistent with the listener’s position as the listener moves about the environment. Beckhardt ‘504 further discloses that the position of the headphone set relative to another playback device (e.g., a soundbar) can be determined/tracked using properties of wireless signals, such as signal strength measurements (e.g., received signal strength (RSS) measurements), wireless round-trip transmission time (RTT) measurements, wireless time of flight (TOF) measurements, wireless time of arrival (TOA) measurements, beamforming, wireless ranging approaches, or any combination thereof. Beckhardt ‘504 further discloses that the playback device can additionally or alternatively determine the position of the headphone set relative to the playback device via ultrasonic acoustic signaling transmitted by one or both of the playback device and/or the headphone set in combination with acoustic beamforming, RTT, TOF, TOA, or other approaches for sound location.
[0057] However, none of the aforementioned earlier-filed applications/patents, individually or in combination, disclose the particular combinations of features and functions shown, described, and claimed herein that relate to (i) re-assigning responsibility for rendering the height audio channel (and optionally one or more lateral audio channels) to different transducers in the array to compensate for the inverted orientation of the playback device; (ii) adjusting one or more digital signal processing parameters associated with audio playback to compensate for the inverted orientation of the playback device, including, for example, repositioning the sound field corresponding to one or more audio channels (e.g., a center channel) based on the playback device being in the inverted orientation; and/or (iii) balancing audio rendering among multiple playback devices in a bonded zone or group (e.g., a stereo pair or home theater arrangement) when one (or more) of the playback devices is in the inverted orientation.
[0058] Each of U.S. Patent Nos. 8,234,395, 8,483,853, 8,788,080, 9,042,556, 9,264,839, 9,456,277, 9,886,234, 9,973,851, 10,712,997, 11,178,504, and 11,212,635, U.S. Patent Publication Nos. 2021/0099736, 2021/0297168, 2022/0066008, and 2023/0179937, and International Patent Publication Nos. WO/2025/014856 and WO/2024/073401 is hereby incorporated herein by reference in its entirety for all purposes.
[0059] While some examples described herein may refer to functions performed by given actors such as “users,” “listeners,” and/or other entities, it should be understood that such references are for purposes of explanation only. The claims should not be interpreted to require action by any such example actor unless explicitly required by the language of the claims themselves.
[0060] In the Figures, identical reference numbers identify generally similar, and/or identical, elements. To facilitate the discussion of any particular element, the most significant digit or digits of a reference number refers to the Figure in which that element is first introduced. For example, element 110a is first introduced and discussed with reference to Figure 1A. Many of the details, dimensions, angles, and other features shown in the Figures are merely illustrative of particular embodiments of the disclosed technology. Accordingly, other embodiments can have other details, dimensions, angles, and features without departing from the spirit or scope of the disclosure. In addition, those of ordinary skill in the art will appreciate that further embodiments of the various disclosed technologies can be practiced without several of the details described below.
II. Suitable Operating Environment
[0061] Figure 1A is a partial cutaway view of a media playback system 100 distributed in an environment 101 (e.g., a house). The media playback system 100 comprises one or more playback devices 110 (identified individually as playback devices 110a-n), one or more network microphone devices 120 (“NMDs”) (identified individually as NMDs 120a-c), and one or more control devices 130 (identified individually as control devices 130a and 130b).
[0062] As used herein the term “playback device” can generally refer to a network device configured to receive, process, and output data of a media playback system. For example, a playback device can be a network device that receives and processes audio content. In some embodiments, a playback device includes one or more transducers or speakers powered by one or more amplifiers. In other embodiments, however, a playback device includes one of (or neither of) the speaker and the amplifier. For instance, a playback device can comprise one or more amplifiers configured to drive one or more speakers external to the playback device via a corresponding wire or cable.
[0063] Moreover, as used herein the term “NMD” (i.e., a “network microphone device”) can generally refer to a network device that is configured for audio detection. In some embodiments, an NMD is a stand-alone device configured primarily for audio detection. In other embodiments, an NMD is incorporated into a playback device (or vice versa).
[0064] The term “control device” can generally refer to a network device configured to perform functions relevant to facilitating user access, control, and/or configuration of the media playback system 100.
[0065] Each of the playback devices 110 is configured to receive audio signals or data from one or more media sources (e.g., one or more remote servers, one or more local devices, etc.) and play back the received audio signals or data as sound. The one or more NMDs 120 are configured to receive spoken word commands, and the one or more control devices 130 are configured to receive user input. In response to the received spoken word commands and/or user input, the media playback system 100 can play back audio via one or more of the playback devices 110. In certain embodiments, the playback devices 110 are configured to commence playback of media content in response to a trigger. For instance, one or more of the playback devices 110 can be configured to play back a morning playlist upon detection of an associated trigger condition (e.g., presence of a user in a kitchen, detection of a coffee machine operation, etc.). In some embodiments, for example, the media playback system 100 is configured to play back audio from a first playback device (e.g., the playback device 110a) in synchrony with a second playback device (e.g., the playback device 110b). Interactions between the playback devices 110, NMDs 120, and/or control devices 130 of the media playback system 100 configured in accordance with the various embodiments of the disclosure are described in greater detail below with respect to Figures 1B-1H.
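As a small, purely hypothetical sketch of the trigger-based behavior described above, a mapping from trigger conditions to playback actions might look like the following; the trigger names, zone names, and playlist titles are invented for illustration.

```python
# Hypothetical trigger conditions mapped to playback actions.
TRIGGER_ACTIONS = {
    "user_in_kitchen":   {"zone": "Kitchen", "playlist": "Morning Playlist"},
    "coffee_machine_on": {"zone": "Kitchen", "playlist": "Morning Playlist"},
}

def on_trigger(trigger, start_playback):
    """Start the configured playlist in the configured zone when a trigger fires."""
    action = TRIGGER_ACTIONS.get(trigger)
    if action is not None:
        start_playback(zone=action["zone"], playlist=action["playlist"])
```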
[0066] In the illustrated embodiment of Figure 1A, the environment 101 comprises a household having several rooms, spaces, and/or playback zones, including (clockwise from upper left) a master bathroom 101a, a master bedroom 101b, a second bedroom 101c, a family room or den 101d, an office 101e, a living room 101f, a dining room 101g, a kitchen 101h, and an outdoor patio 101i. While certain embodiments and examples are described below in the context of a home environment, the technologies described herein may be implemented in other types of environments. In some embodiments, for example, the media playback system 100 can be implemented in one or more commercial settings (e.g., a restaurant, mall, airport, hotel, a retail or other store), one or more vehicles (e.g., a sports utility vehicle, bus, car, a ship, a boat, an airplane, etc.), multiple environments (e.g., a combination of home and vehicle environments), and/or another suitable environment where multi-zone audio may be desirable.
[0067] The media playback system 100 can comprise one or more playback zones, some of which may correspond to the rooms in the environment 101. The media playback system 100 can be established with one or more playback zones, after which additional zones may be added, or removed, to form, for example, the configuration shown in Figure 1A. Each zone may be given a name according to a different room or space such as the office 101e, master bathroom 101a, master bedroom 101b, the second bedroom 101c, kitchen 101h, dining room 101g, living room 101f, and/or the balcony 101i. In some aspects, a single playback zone may include multiple rooms or spaces. In certain aspects, a single room or space may include multiple playback zones.
[0068] In the illustrated embodiment of Figure 1A, the second bedroom 101c, the office 101e, the living room 101f, the dining room 101g, the kitchen 101h, and the outdoor patio 101i each include one playback device 110, and the master bathroom 101a, the master bedroom 101b, and the den 101d include a plurality of playback devices 110. In the master bedroom 101b, the playback devices 110l and 110m may be configured, for example, to play back audio content in synchrony as individual ones of playback devices 110, as a bonded playback zone, as a consolidated playback device, and/or any combination thereof. Similarly, in the den 101d, the playback devices 110h-k can be configured, for instance, to play back audio content in synchrony as individual ones of playback devices 110, as one or more bonded playback devices, and/or as one or more consolidated playback devices. Additional details regarding bonded and consolidated playback devices are described below with respect to Figures 1B, 1E, and 1I-1M.
[0069] In some aspects, one or more of the playback zones in the environment 101 may each be playing different audio content. For instance, a user may be grilling on the patio 101i and listening to hip hop music being played by the playback device 110c while another user is preparing food in the kitchen 101h and listening to classical music played by the playback device 110b. In another example, a playback zone may play the same audio content in synchrony with another playback zone. For instance, the user may be in the office 101e listening to the playback device 110f playing back the same hip hop music being played back by playback device 110c on the patio 101i. In some aspects, the playback devices 110c and 110f play back the hip hop music in synchrony such that the user perceives that the audio content is being played seamlessly (or at least substantially seamlessly) while moving between different playback zones. Additional details regarding audio playback synchronization among playback devices and/or zones can be found, for example, in Millington ‘395 referenced above.
a. Suitable Media Playback System
[0070] Figure 1B is a schematic diagram of the media playback system 100 and a cloud network 102. For ease of illustration, certain devices of the media playback system 100 and the cloud network 102 are omitted from Figure 1B. One or more communication links 103 (referred to hereinafter as “the links 103”) communicatively couple the media playback system 100 and the cloud network 102.
[0071] The links 103 can comprise, for example, one or more wired networks, one or more wireless networks, one or more wide area networks (WAN), one or more local area networks (LAN), one or more personal area networks (PAN), one or more telecommunication networks (e.g., one or more Global System for Mobiles (GSM) networks, Code Division Multiple Access (CDMA) networks, Long-Term Evolution (LTE) networks, 5G communication networks, and/or other suitable data transmission protocol networks), etc. The cloud network 102 is configured to deliver media content (e.g., audio content, video content, photographs, social media content, etc.) to the media playback system 100 in response to a request transmitted from the media playback system 100 via the links 103. In some embodiments, the cloud network 102 is further configured to receive data (e.g., voice input data) from the media playback system 100 and correspondingly transmit commands and/or media content to the media playback system 100.
[0072] The cloud network 102 comprises computing devices 106 (identified separately as a first computing device 106a, a second computing device 106b, and a third computing device 106c). The computing devices 106 can comprise individual computers or servers, such as, for example, a media streaming service server storing audio and/or other media content, a voice service server, a social media server, a media playback system control server, etc. In some embodiments, one or more of the computing devices 106 comprise modules of a single computer or server. In certain embodiments, one or more of the computing devices 106 comprise one or more modules, computers, and/or servers. Moreover, while the cloud network 102 is described above in the context of a single cloud network, in some embodiments the cloud network 102 comprises a plurality of cloud networks comprising communicatively coupled computing devices. Furthermore, while the cloud network 102 is shown in Figure 1B as having three of the computing devices 106, in some embodiments, the cloud network 102 comprises fewer (or more than) three computing devices 106.
[0073] The media playback system 100 is configured to receive media content from the networks 102 via the links 103. The received media content can comprise, for example, a Uniform Resource Identifier (URI) and/or a Uniform Resource Locator (URL). For instance, in some examples, the media playback system 100 can stream, download, or otherwise obtain data from a URI or a URL corresponding to the received media content. A network 104 communicatively couples the links 103 and at least a portion of the devices (e.g., one or more of the playback devices 110, NMDs 120, and/or control devices 130) of the media playback system 100. The network 104 can include, for example, a wireless network (e.g., a WI-FI network, a BLUETOOTH network, a Z-WAVE network, a ZIGBEE network, and/or other suitable wireless communication protocol network) and/or a wired network (e.g., a network comprising Ethernet, Universal Serial Bus (USB), and/or another suitable wired communication). As those of ordinary skill in the art will appreciate, as used herein, “WI-FI” can refer to several different communication protocols including, for example, Institute of Electrical and Electronics Engineers (IEEE) 802.11a, 802.11b, 802.11g, 802.11n, 802.11ac, 802.11ad, 802.11af, 802.11ah, 802.11ai, 802.11aj, 802.11aq, 802.11ax, 802.11ay, 802.15, etc., transmitted at 2.4 Gigahertz (GHz), 5 GHz, and/or another suitable frequency.
[0074] In some embodiments, the network 104 comprises a dedicated communication network that the media playback system 100 uses to transmit messages between individual devices and/or to transmit media content to and from media content sources (e.g., one or more of the computing devices 106). In certain embodiments, the network 104 is configured to be accessible only to devices in the media playback system 100, thereby reducing interference and competition with other household devices. In other embodiments, however, the network 104 comprises an existing household or commercial facility communication network (e.g., a household or commercial facility WI-FI network). In some embodiments, the links 103 and the network 104 comprise one or more of the same networks. In some aspects, for example, the links 103 and the network 104 comprise a telecommunication network (e.g., an LTE network, a 5G network, etc.). Moreover, in some embodiments, the media playback system 100 is implemented without the network 104, and devices comprising the media playback system 100 can communicate with each other, for example, via one or more direct connections, PANs, telecommunication networks, and/or other suitable communication links. The network 104 may be referred to herein as a “local communication network” to differentiate the network 104 from the cloud network 102 that couples the media playback system 100 to remote devices, such as cloud servers that host cloud services.
[0075] In some embodiments, audio content sources may be regularly added or removed from the media playback system 100. In some embodiments, for example, the media playback system 100 performs an indexing of media items when one or more media content sources are updated, added to, and/or removed from the media playback system 100. The media playback system 100 can scan identifiable media items in some or all folders and/or directories accessible to the playback devices 110, and generate or update a media content database comprising metadata (e.g., title, artist, album, track length, etc.) and other associated information (e.g., URIs, URLs, etc.) for each identifiable media item found. In some embodiments, for example, the media content database is stored on one or more of the playback devices 110, network microphone devices 120, and/or control devices 130.
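The indexing behavior described above can be sketched as follows; the file extensions and metadata fields are assumptions made for illustration, and a real implementation would read embedded tag metadata (title, artist, album, track length) rather than relying on filenames.

```python
import os

AUDIO_EXTENSIONS = {".mp3", ".flac", ".m4a", ".wav"}  # illustrative set only

def index_media(folders):
    """Scan folders for identifiable media items and build a simple metadata database."""
    database = []
    for folder in folders:
        for root, _dirs, files in os.walk(folder):
            for name in files:
                stem, ext = os.path.splitext(name)
                if ext.lower() in AUDIO_EXTENSIONS:
                    database.append({
                        "title": stem,  # placeholder; real systems read tag metadata
                        "uri": "file://" + os.path.join(root, name),
                    })
    return database
```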
[0076] In the illustrated embodiment of Figure 1B, the playback devices 110l and 110m comprise a group 107a. The playback devices 110l and 110m can be positioned in different rooms and be grouped together in the group 107a on a temporary or permanent basis based on user input received at the control device 130a and/or another control device 130 in the media playback system 100. When arranged in the group 107a, the playback devices 110l and 110m can be configured to play back the same or similar audio content in synchrony from one or more audio content sources. In certain embodiments, for example, the group 107a comprises a bonded zone in which the playback devices 110l and 110m comprise left audio and right audio channels, respectively, of multi-channel audio content, thereby producing or enhancing a stereo effect of the audio content. In some embodiments, the group 107a includes additional playback devices 110. In other embodiments, however, the media playback system 100 omits the group 107a and/or other grouped arrangements of the playback devices 110. Additional details regarding groups and other arrangements of playback devices are described in further detail below with respect to Figures 1I through 1M.
[0077] The media playback system 100 includes the NMDs 120a and 120b, each comprising one or more microphones configured to receive voice utterances from a user. In the illustrated embodiment of Figure 1B, the NMD 120a is a standalone device and the NMD 120b is integrated into the playback device 110n. The NMD 120a, for example, is configured to receive voice input 121 from a user 123. In some embodiments, the NMD 120a transmits data associated with the received voice input 121 to a voice assistant service (VAS) configured to (i) process the received voice input data and (ii) facilitate one or more operations on behalf of the media playback system 100.
[0078] In some aspects, for example, the computing device 106c comprises one or more modules and/or servers of a VAS (e.g., a VAS operated by one or more of SONOS, AMAZON, GOOGLE, APPLE, MICROSOFT, etc.). The computing device 106c can receive the voice input data from the NMD 120a via the network 104 and the links 103.
[0079] In response to receiving the voice input data, the computing device 106c processes the voice input data (i.e., “Play Hey Jude by The Beatles”), and determines that the processed voice input includes a command to play a song (e.g., “Hey Jude”). In some embodiments, after processing the voice input, the computing device 106c accordingly transmits commands to the media playback system 100 to play back “Hey Jude” by the Beatles from a suitable media service (e.g., via one or more of the computing devices 106) on one or more of the playback devices 110. In other embodiments, the computing device 106c may be configured to interface with media services on behalf of the media playback system 100. In such embodiments, after processing the voice input, instead of the computing device 106c transmitting commands to the media playback system 100 causing the media playback system 100 to retrieve the requested media from a suitable media service, the computing device 106c itself causes a suitable media service to provide the requested media to the media playback system 100 in accordance with the user’s voice utterance.
b. Suitable Playback Devices
[0080] Figure 1C is a block diagram of the playback device 110a comprising an input/output 111. The input/output 111 can include an analog I/O 111a (e.g., one or more wires, cables, and/or other suitable communication links configured to carry analog signals) and/or a digital I/O 111b (e.g., one or more wires, cables, or other suitable communication links configured to carry digital signals). In some embodiments, the analog I/O 111a is an audio line-in input connection comprising, for example, an auto-detecting 3.5mm audio line-in connection. In some embodiments, the digital I/O 111b comprises a Sony/Philips Digital Interface Format (S/PDIF) communication interface and/or cable and/or a Toshiba Link (TOSLINK) cable. In some embodiments, the digital I/O 111b comprises a High-Definition Multimedia Interface (HDMI) interface and/or cable. In some embodiments, the digital I/O 111b includes one or more wireless communication links comprising, for example, a radio frequency (RF), infrared, WI-FI, BLUETOOTH, or another suitable communication link. In certain embodiments, the analog I/O 111a and the digital I/O 111b comprise interfaces (e.g., ports, plugs, jacks, etc.) configured to receive connectors of cables transmitting analog and digital signals, respectively, without necessarily including cables.
[0081] The playback device 110a, for example, can receive media content (e.g., audio content comprising music and/or other sounds) from a local audio source 105 via the input/output 111 (e.g., a cable, a wire, a PAN, a BLUETOOTH connection, an ad hoc wired or wireless communication network, and/or another suitable communication link). The local audio source 105 can comprise, for example, a mobile device (e.g., a smartphone, a tablet, a laptop computer, etc.) or another suitable audio component (e.g., a television, a desktop computer, an amplifier, a phonograph (such as an LP turntable), a Blu-ray player, a memory storing digital media files, etc.). In some aspects, the local audio source 105 includes local music libraries on a smartphone, a computer, a network-attached storage (NAS), and/or another suitable device configured to store media files. In certain embodiments, one or more of the playback devices 110, NMDs 120, and/or control devices 130 comprise the local audio source 105. In other embodiments, however, the media playback system omits the local audio source 105 altogether. In some embodiments, the playback device 110a does not include an input/output 111 and receives all audio content via the network 104.
[0082] The playback device 110a further comprises electronics 112, a user interface 113 (e.g., one or more buttons, knobs, dials, touch-sensitive surfaces, displays, touchscreens, etc.), and one or more transducers 114 (referred to hereinafter as “the transducers 114”). The electronics 112 are configured to receive audio from an audio source (e.g., the local audio source 105) via the input/output 111 or one or more of the computing devices 106a-c via the network 104 (Figure 1B), amplify the received audio, and output the amplified audio for playback via one or more of the transducers 114. In some embodiments, the playback device 110a optionally includes one or more microphones 115 (e.g., a single microphone, a plurality of microphones, a microphone array) (hereinafter referred to as “the microphones 115”). In certain embodiments, for example, the playback device 110a having one or more of the optional microphones 115 can operate as an NMD configured to receive voice input from a user and correspondingly perform one or more operations based on the received voice input.
[0083] In the illustrated embodiment of Figure 1C, the electronics 112 comprise one or more processors 112a (referred to hereinafter as “the processors 112a”), memory 112b, software components 112c, a network interface 112d, one or more audio processing components 112g (referred to hereinafter as “the audio components 112g”), one or more audio amplifiers 112h (referred to hereinafter as “the amplifiers 112h”), and power 112i (e.g., one or more power supplies, power cables, power receptacles, batteries, induction coils, Power-over-Ethernet (POE) interfaces, and/or other suitable sources of electric power). In some embodiments, the electronics 112 optionally include one or more other components 112j (e.g., one or more sensors, video displays, touchscreens, battery charging bases, etc.).
[0084] The processors 112a can comprise clock-driven computing component(s) configured to process data, and the memory 112b can comprise a computer-readable medium (e.g., a tangible, non-transitory computer-readable medium loaded with one or more of the software components 112c) configured to store instructions for performing various operations and/or functions. The processors 112a are configured to execute the instructions stored on the memory 112b to perform one or more of the operations. The operations can include, for example, causing the playback device 110a to retrieve audio data from an audio source (e.g., one or more of the computing devices 106a-c (Figure 1B)), and/or another one of the playback devices 110. In some embodiments, the operations further include causing the playback device 110a to send audio data to another one of the playback devices 110a and/or another device (e.g., one of the NMDs 120). Certain embodiments include operations causing the playback device 110a to pair with another of the one or more playback devices 110 to enable a multi-channel audio environment (e.g., a stereo pair, a bonded zone, etc.).
[0085] The processors 112a can be further configured to perform operations causing the playback device 110a to synchronize playback of audio content with another of the one or more playback devices 110. As those of ordinary skill in the art will appreciate, during synchronous playback of audio content on a plurality of playback devices, a listener will preferably be unable to perceive time-delay differences between playback of the audio content by the playback device 110a and the other one or more other playback devices 110. Additional details regarding audio playback synchronization among playback devices can be found, for example, in Millington ‘395, which is incorporated by reference above.
[0086] In some embodiments, the memory 112b is further configured to store data associated with the playback device 110a, such as one or more zones and/or zone groups of which the playback device 110a is a member, audio sources accessible to the playback device 110a, and/or a playback queue that the playback device 110a (and/or another of the one or more playback devices) can be associated with. The stored data can comprise one or more state variables that are periodically updated and used to describe a state of the playback device 110a. The memory 112b can also include data associated with a state of one or more of the other devices (e.g., the playback devices 110, NMDs 120, control devices 130) of the media playback system 100. In some aspects, for example, the state data is shared during predetermined intervals of time (e.g., every 5 seconds, every 10 seconds, every 60 seconds, etc.) among at least a portion of the devices of the media playback system 100, so that one or more of the devices have the most recent data associated with the media playback system 100.
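As an illustration of the kind of state sharing described above, the sketch below assembles a state snapshot and sends it to peer devices on a fixed interval; the field names and the transport callback are hypothetical and not drawn from any particular implementation.

```python
import json
import time

def build_state_snapshot(device):
    """Collect illustrative state variables for a playback device."""
    return {
        "device_id": device["id"],
        "zone": device.get("zone"),
        "group_members": device.get("group_members", []),
        "playback_state": device.get("playback_state", "stopped"),
        "timestamp": time.time(),
    }

def share_state(device, send, interval_s=10):
    """Send the snapshot to other devices every interval_s seconds (e.g., 5, 10, or 60)."""
    while True:
        send(json.dumps(build_state_snapshot(device)))
        time.sleep(interval_s)
```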
[0087] The network interface 112d is configured to facilitate a transmission of data between the playback device 110a and one or more other devices on a data network such as, for example, the links 103 and/or the network 104 (Figure 1B). The network interface 112d is configured to transmit and receive data corresponding to media content (e.g., audio content, video content, text, photographs) and other signals (e.g., non-transitory signals) comprising digital packet data including an Internet Protocol (IP)-based source address and/or an IP-based destination address. The network interface 112d can parse the digital packet data such that the electronics 112 properly receive and process the data destined for the playback device 110a.
[0088] In the illustrated embodiment of Figure 1C, the network interface 112d comprises one or more wireless interfaces 112e (referred to hereinafter as “the wireless interface 112e”). The wireless interface 112e (e.g., a suitable interface comprising one or more antennae) can be configured to wirelessly communicate with one or more other devices (e.g., one or more of the other playback devices 110, NMDs 120, and/or control devices 130) that are communicatively coupled to the network 104 (Figure 1B) in accordance with a suitable wireless communication protocol (e.g., WI-FI, BLUETOOTH, LTE, etc.). In some embodiments, the network interface 112d optionally includes a wired interface 112f (e.g., an interface or receptacle configured to receive a network cable such as an Ethernet, a USB-A, USB-C, and/or Thunderbolt cable) configured to communicate over a wired connection with other devices in accordance with a suitable wired communication protocol. In certain embodiments, the network interface 112d includes the wired interface 112f and excludes the wireless interface 112e. In some embodiments, the electronics 112 exclude the network interface 112d altogether and transmit and receive media content and/or other data via another communication path (e.g., the input/output 111).
[0089] The audio processing components 112g are configured to process and/or filter data comprising media content received by the electronics 112 (e.g., via the input/output 111 and/or the network interface 112d) to produce output audio signals. In some embodiments, the audio processing components 112g comprise, for example, one or more digital-to-analog converters (DACs), audio preprocessing components, audio enhancement components, digital signal processors (DSPs), and/or other suitable audio processing components, modules, circuits, etc. In certain embodiments, one or more of the audio processing components 112g can comprise one or more subcomponents of the processors 112a. In some embodiments, the electronics 112 omit the audio processing components 112g. In some aspects, for example, the processors 112a execute instructions stored on the memory 112b to perform audio processing operations to produce the output audio signals.
[0090] The amplifiers 112h are configured to receive and amplify the audio output signals produced by the audio processing components 112g and/or the processors 112a. The amplifiers 112h can comprise electronic devices and/or components configured to amplify audio signals to levels sufficient for driving one or more of the transducers 114. In some embodiments, for example, the amplifiers 112h include one or more switching or class-D power amplifiers. In other embodiments, however, the amplifiers 112h include one or more other types of power amplifiers (e.g., linear gain power amplifiers, class-A amplifiers, class-B amplifiers, class-AB amplifiers, class-C amplifiers, class-D amplifiers, class-E amplifiers, class-F amplifiers, class-G amplifiers, class-H amplifiers, and/or another suitable type of power amplifier). In certain embodiments, the amplifiers 112h comprise a suitable combination of two or more of the foregoing types of power amplifiers. Moreover, in some embodiments, individual ones of the amplifiers 112h correspond to individual ones of the transducers 114. In other embodiments, however, the electronics 112 include a single one of the amplifiers 112h configured to output amplified audio signals to a plurality of the transducers 114. In some other embodiments, the electronics 112 omit the amplifiers 112h.
[0091] The transducers 114 (e.g., one or more speakers and/or speaker drivers) receive the amplified audio signals from the amplifiers 112h and render or output the amplified audio signals as sound (e.g., audible sound waves having a frequency between about 20 Hertz (Hz) and 20 kilohertz (kHz)). In some embodiments, the transducers 114 can comprise a single transducer. In other embodiments, however, the transducers 114 comprise a plurality of audio transducers. In some embodiments, the transducers 114 comprise more than one type of transducer. For example, the transducers 114 can include one or more low frequency transducers (e.g., subwoofers, woofers), one or more mid-range frequency transducers (e.g., mid-range transducers, mid-woofers), and one or more high frequency transducers (e.g., one or more tweeters). As used herein, “low frequency” can generally refer to audible frequencies below about 500 Hz, “mid-range frequency” can generally refer to audible frequencies between about 500 Hz and about 2 kHz, and “high frequency” can generally refer to audible frequencies above 2 kHz. In certain embodiments, however, one or more of the transducers 114 comprise transducers that do not adhere to the foregoing frequency ranges. For example, one of the transducers 114 may comprise a mid-woofer transducer configured to output sound at frequencies between about 200 Hz and about 5 kHz.
[0092] By way of illustration, Sonos, Inc. presently offers (or has offered) for sale certain playback devices including, for example, a “SONOS ONE,” “PLAY:1,” “PLAY:3,” “PLAY:5,” “PLAYBAR,” “PLAYBASE,” “CONNECT:AMP,” “CONNECT,” “AMP,” “PORT,” and “SUB.” Other suitable playback devices may additionally or alternatively be used to implement the playback devices of example embodiments disclosed herein. Additionally, one of ordinary skill in the art will appreciate that a playback device is not limited to the examples described herein or to Sonos product offerings. In some embodiments, for example, one or more playback devices 110 comprise wired or wireless headphones (e.g., over-the-ear headphones, on-ear headphones, in-ear earphones, etc.). In other embodiments, one or more of the playback devices 110 comprise a docking station and/or an interface configured to interact with a docking station for personal mobile media playback devices. In certain embodiments, a playback device may be integral to another device or component such as a television, an LP turntable, a lighting fixture, or some other device for indoor or outdoor use. In some embodiments, a playback device omits a user interface and/or one or more transducers. For example, FIG. 1D is a block diagram of a playback device 110p comprising the input/output 111 and electronics 112 without the user interface 113 or transducers 114.
[0093] Figure 1E is a block diagram of a bonded playback device 110q comprising the playback device 110a (Figure 1C) sonically bonded with the playback device 110i (e.g., a subwoofer) (Figure 1A). In the illustrated embodiment, the playback devices 110a and 110i are separate ones of the playback devices 110 housed in separate enclosures. In some embodiments, however, the bonded playback device 110q comprises a single enclosure housing both the playback devices 110a and 110i. The bonded playback device 110q can be configured to process and reproduce sound differently than an unbonded playback device (e.g., the playback device 110a of Figure 1C) and/or paired or bonded playback devices (e.g., the playback devices 110l and 110m of Figure 1B). In some embodiments, for example, the playback device 110a is a full-range playback device configured to render low frequency, mid-range frequency, and high frequency audio content, and the playback device 110i is a subwoofer configured to render low frequency audio content. In some aspects, the playback device 110a, when bonded with the playback device 110i, is configured to render only the mid-range and high frequency components of a particular audio content, while the playback device 110i renders the low frequency component of the particular audio content. In some embodiments, the bonded playback device 110q includes additional playback devices and/or another bonded playback device. Additional playback device embodiments are described in further detail below with respect to Figures 2A-C.
c. Suitable Network Microphone Devices (NMDs)
[0094] Figure 1F is a block diagram of the NMD 120a (Figures 1A and 1B). The NMD 120a includes one or more voice processing components 124 (hereinafter “the voice components 124”) and several components described with respect to the playback device 110a (Figure 1C) including the processors 112a, the memory 112b, and the microphones 115. The NMD 120a optionally comprises other components also included in the playback device 110a (Figure 1C), such as the user interface 113 and/or the transducers 114. In some embodiments, the NMD 120a is configured as a media playback device (e.g., one or more of the playback devices 110), and further includes, for example, one or more of the audio components 112g (Figure 1C), the amplifiers 112h, and/or other playback device components. In certain embodiments, the NMD 120a comprises an Internet of Things (IoT) device such as, for example, a thermostat, alarm panel, fire and/or smoke detector, etc. In some embodiments, the NMD 120a comprises the microphones 115, the voice processing components 124, and only a portion of the components of the electronics 112 described above with respect to Figure 1C. In some aspects, for example, the NMD 120a includes the processor 112a and the memory 112b (Figure 1C), while omitting one or more other components of the electronics 112. In some embodiments, the NMD 120a includes additional components (e.g., one or more sensors, cameras, thermometers, barometers, hygrometers, etc.).
[0095] In some embodiments, an NMD can be integrated into a playback device. Figure 1G is a block diagram of a playback device 110r comprising an NMD 120d. The playback device 110r can comprise many or all of the components of the playback device 110a and further include the microphones 115 and voice processing components 124 (Figure 1F). The playback device 110r optionally includes an integrated control device 130c. The control device 130c can comprise, for example, a user interface (e.g., the user interface 113 of Figure 1C) configured to receive user input (e.g., touch input, voice input, etc.) without a separate control device. In other embodiments, however, the playback device 110r receives commands from another control device (e.g., the control device 130a of Figure 1B).
[0096] Referring again to Figure IF, the microphones 115 are configured to acquire, capture, and/or receive sound from an environment (e.g., the environment 101 of Figure 1A) and/or a room in which the NMD 120a is positioned. The received sound can include, for example, vocal utterances, audio played back by the NMD 120a and/or another playback device, background voices, ambient sounds, etc. The microphones 115 convert the received sound into electrical signals to produce microphone data. The voice processing components 124 receive and analyze the microphone data to determine whether a voice input is present in the microphone data. The voice input can comprise, for example, an activation word followed by an utterance including a user request. As those of ordinary skill in the art will appreciate, an activation word is a word or other audio cue signifying a user voice input. For instance, in querying the AMAZON VAS, a user might speak the activation word "Alexa." Other examples include "Ok, Google" for invoking the GOOGLE VAS and "Hey, Siri" for invoking the APPLE VAS.
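By way of illustration only, the following Python sketch shows a simplified version of the flow described above: scanning transcribed microphone data for an activation word and treating the remainder of the utterance as the user request. The function names and the assumption that the audio has already been transcribed to text are hypothetical simplifications, not the actual voice-processing implementation.

```python
def detect_activation_word(transcript, activation_words=("alexa", "ok google", "hey siri")):
    """Illustrative sketch: find an activation word in transcribed microphone data.

    A real system operates on audio samples rather than text; the transcription
    step is assumed here to keep the sketch short."""
    lowered = transcript.lower()
    for word in activation_words:
        if word in lowered:
            return word
    return None

def extract_user_request(transcript, activation_word):
    # Treat everything spoken after the activation word as the user request.
    _, _, request = transcript.lower().partition(activation_word)
    return request.strip()

# Example voice input: an activation word followed by an utterance.
transcript = "Alexa set the thermostat to 68 degrees"
word = detect_activation_word(transcript)
if word:
    print(extract_user_request(transcript, word))  # -> set the thermostat to 68 degrees
```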
[0097] After detecting the activation word, voice processing components 124 monitor the microphone data for an accompanying user request in the voice input. The user request may include, for example, a command to control a third-party device, such as a thermostat (e.g., NEST thermostat), an illumination device (e.g., a PHILIPS HUE lighting device), or a media playback device (e.g., a SONOS playback device). For example, a user might speak the activation word “Alexa” followed by the utterance “set the thermostat to 68 degrees” to set a temperature in a home (e.g., the environment 101 of Figure 1A). The user might speak the same activation word followed by the utterance “turn on the living room” to turn on illumination devices in a living room area of the home. The user may similarly speak an activation word followed by a request to play a particular song, an album, or a playlist of music on a playback device in the home.
d. Suitable Control Devices
[0098] Figure 1H is a partial schematic diagram of the control device 130a (Figures 1A and 1B). As used herein, the term “control device” can be used interchangeably with “controller” or “control system.” Among other features, the control device 130a is configured to receive user input related to the media playback system 100 and, in response, cause one or more devices in the media playback system 100 to perform an action(s) or operation(s) corresponding to the user input. In the illustrated embodiment, the control device 130a comprises a smartphone (e.g., an iPhone™, an Android phone, etc.) on which media playback system controller application software is installed. In some embodiments, the control device 130a comprises, for example, a tablet (e.g., an iPad™), a computer (e.g., a laptop computer, a desktop computer, etc.), and/or another suitable device (e.g., a television, an automobile audio head unit, an IoT device, etc.). In certain embodiments, the control device 130a comprises a dedicated controller for the media playback system 100. In other embodiments, as described above with respect to Figure 1G, the control device 130a is integrated into another device in the media playback system 100 (e.g., one or more of the playback devices 110, NMDs 120, and/or other suitable devices configured to communicate over a network).
[0099] The control device 130a includes electronics 132, a user interface 133, one or more speakers 134, and one or more microphones 135. The electronics 132 comprise one or more processors 132a (referred to hereinafter as “the processors 132a”), a memory 132b, software components 132c, and a network interface 132d. The processor 132a can be configured to perform functions relevant to facilitating user access, control, and configuration of the media playback system 100. The memory 132b can comprise data storage that can be loaded with one or more of the software components executable by the processor 132a to perform those functions. The software components 132c can comprise applications and/or other executable software configured to facilitate control of the media playback system 100. The memory 132b can be configured to store, for example, the software components 132c, media playback system controller application software, and/or other data associated with the media playback system 100 and the user.
[0100] The network interface 132d is configured to facilitate network communications between the control device 130a and one or more other devices in the media playback system 100, and/or one or more remote devices. In some embodiments, the network interface 132d is configured to operate according to one or more suitable communication industry standards (e.g., infrared, radio, wired standards including IEEE 802.3, wireless standards including IEEE 802.11a, 802.11b, 802.11g, 802.11n, 802.11ac, 802.15, 4G, LTE, etc.). The network interface 132d can be configured, for example, to transmit data to and/or receive data from the playback devices 110, the NMDs 120, other ones of the control devices 130, one of the computing devices 106 of Figure 1B, devices comprising one or more other media playback systems, etc. The transmitted and/or received data can include, for example, playback device control commands, state variables, playback zone and/or zone group configurations. For instance, based on user input received at the user interface 133, the network interface 132d can transmit a playback device control command (e.g., volume control, audio playback control, audio content selection, etc.) from the control device 130a to one or more of the playback devices 110. The network interface 132d can also transmit and/or receive configuration changes such as, for example, adding/removing one or more playback devices 110 to/from a zone, adding/removing one or more zones to/from a zone group, forming a bonded or consolidated player, separating one or more playback devices from a bonded or consolidated player, among others. Additional description of zones and groups is presented below with respect to Figures 1I through 1M.
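As a loose illustration of the kind of control command a control device might transmit, the Python sketch below builds a playback control message; the message format, field names, and device identifier are hypothetical assumptions made for this sketch, not a disclosed protocol.

```python
import json

def build_playback_command(target_device_id, command, **parameters):
    """Illustrative sketch of a playback control message a control device might
    send over its network interface; all field names here are hypothetical."""
    return json.dumps({
        "target": target_device_id,
        "command": command,        # e.g., "set_volume", "play", "pause", "group"
        "parameters": parameters,  # e.g., {"level": 30} or {"zone": "Kitchen"}
    })

# Example: a volume-control command generated from user input at the user interface.
print(build_playback_command("playback-device-110a", "set_volume", level=30))
```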
[0101] The user interface 133 is configured to receive user input and can facilitate control of the media playback system 100. The user interface 133 includes media content art 133a (e.g., album art, lyrics, videos, etc.), a playback status indicator 133b (e.g., an elapsed and/or remaining time indicator), media content information region 133c, a playback control region 133d, and a zone indicator 133e. The media content information region 133c can include a display of relevant information (e.g., title, artist, album, genre, release year, etc.) about media content currently playing and/or media content in a queue or playlist. The playback control region 133d can include selectable (e.g., via touch input and/or via a cursor or another suitable selector) icons to cause one or more playback devices in a selected playback zone or zone group to perform playback actions such as, for example, play or pause, fast forward, rewind, skip to next, skip to previous, enter/exit shuffle mode, enter/exit repeat mode, enter/exit cross fade mode, etc. The playback control region 133d may also include selectable icons to modify equalization settings, playback volume, and/or other suitable playback actions. In the illustrated embodiment, the user interface 133 comprises a display presented on a touch screen interface of a smartphone (e.g., an iPhone™, an Android phone, etc.). In some embodiments, however, user interfaces of varying formats, styles, and interactive sequences may alternatively be implemented on one or more network devices to provide comparable control access to a media playback system.
[0102] The one or more speakers 134 (e.g., one or more transducers) can be configured to output sound to the user of the control device 130a. In some embodiments, the one or more speakers comprise individual transducers configured to correspondingly output low frequencies, mid-range frequencies, and/or high frequencies. In some aspects, for example, the control device 130a is configured as a playback device (e.g., one of the playback devices 110). Similarly, in some embodiments the control device 130a is configured as an NMD (e.g., one of the NMDs 120), receiving voice commands and other sounds via the one or more microphones 135.
[0103] The one or more microphones 135 can comprise, for example, one or more condenser microphones, electret condenser microphones, dynamic microphones, and/or other suitable types of microphones or transducers. In some embodiments, two or more of the microphones 135 are arranged to capture location information of an audio source (e.g., voice, audible sound, etc.) and/or configured to facilitate filtering of background noise. Moreover, in certain embodiments, the control device 130a is configured to operate as a playback device and an NMD. In other embodiments, however, the control device 130a omits the one or more speakers 134 and/or the one or more microphones 135. For instance, the control device 130a may comprise a device (e.g., a thermostat, an IoT device, a network device, etc.) comprising a portion of the electronics 132 and the user interface 133 (e.g., a touch screen) without any speakers or microphones.
e. Suitable Playback Device Configurations
[0104] Figures 1I through 1M show example configurations of playback devices in zones and zone groups. Referring first to Figure 1M, in one example, a single playback device may belong to a zone. For example, the playback device 110g in the second bedroom 101c (FIG. 1A) may belong to Zone C. In some implementations described below, multiple playback devices may be “bonded” to form a “bonded pair” which together form a single zone. For example, the playback device 110l (e.g., a left playback device) can be bonded to the playback device 110m (e.g., a right playback device) to form Zone B. Bonded playback devices may have different playback responsibilities (e.g., channel responsibilities). In another implementation described below, multiple playback devices may be merged to form a single zone. For example, the playback device 110h (e.g., a front playback device) may be merged with the playback device 110i (e.g., a subwoofer), and the playback devices 110j and 110k (e.g., left and right surround speakers, respectively) to form a single Zone D. In another example, the playback devices 110b and 110d can be merged to form a merged group or a zone group 108b. The merged playback devices 110b and 110d may not be specifically assigned different playback responsibilities. That is, the merged playback devices 110b and 110d may, aside from playing audio content in synchrony, each play audio content as they would if they were not merged.
[0105] Each zone in the media playback system 100 may be provided for control as a single user interface (UI) entity. For example, Zone A may be provided as a single entity named Master Bathroom. Zone B may be provided as a single entity named Master Bedroom. Zone C may be provided as a single entity named Second Bedroom.
[0106] Playback devices that are bonded may have different playback responsibilities, such as responsibilities for certain audio channels. For example, as shown in Figure 1I, the playback devices 110l and 110m may be bonded so as to produce or enhance a stereo effect of audio content. In this example, the playback device 110l may be configured to play a left channel audio component, while the playback device 110m may be configured to play a right channel audio component. In some implementations, such stereo bonding may be referred to as “pairing.”
[0107] Additionally, bonded playback devices may have additional and/or different respective speaker drivers. As shown in Figure 1J, the playback device 110h named Front may be bonded with the playback device 110i named SUB. The Front device 110h can be configured to render a range of mid to high frequencies and the SUB device 110i can be configured to render low frequencies. When unbonded, however, the Front device 110h can be configured to render a full range of frequencies. As another example, Figure 1K shows the Front and SUB devices 110h and 110i further bonded with Left and Right playback devices 110j and 110k, respectively. In some implementations, the Left and Right devices 110j and 110k can be configured to form surround or “satellite” channels of a home theater system. The bonded playback devices 110h, 110i, 110j, and 110k may form a single Zone D (FIG. 1M).
[0108] Playback devices that are merged may not have assigned playback responsibilities, and may each render the full range of audio content the respective playback device is capable of. Nevertheless, merged devices may be represented as a single UI entity (i.e., a zone, as discussed above). For instance, the playback devices 110a and 110n in the master bathroom have the single UI entity of Zone A. In one embodiment, the playback devices 110a and 110n may each output, in synchrony, the full range of audio content that each respective playback device 110a and 110n is capable of.
[0109] In some embodiments, an NMD is bonded or merged with another device so as to form a zone. For example, the NMD 120b may be bonded with the playback device 110e, which together form Zone F, named Living Room. In other embodiments, a stand-alone network microphone device may be in a zone by itself. In other embodiments, however, a stand-alone network microphone device may not be associated with a zone. Additional details regarding associating network microphone devices and playback devices as designated or default devices may be found, for example, in Lang ‘146 referenced above.
[0110] Zones of individual, bonded, and/or merged devices may be grouped to form a zone group. For example, referring to Figure 1M, Zone A may be grouped with Zone B to form a zone group 108a that includes the two zones. Similarly, Zone G may be grouped with Zone H to form the zone group 108b. As another example, Zone A may be grouped with one or more other Zones C-I. The Zones A-I may be grouped and ungrouped in numerous ways. For example, three, four, five, or more (e.g., all) of the Zones A-I may be grouped. When grouped, the zones of individual and/or bonded playback devices may play back audio in synchrony with one another, as described in Millington ‘395 referenced above. Playback devices may be dynamically grouped and ungrouped to form new or different groups that synchronously play back audio content.
[0111] In various implementations, the name of a zone group may be the default name of a zone within the group or a combination of the names of the zones within the zone group. For example, Zone Group 108b can be assigned a name such as “Dining + Kitchen”, as shown in Figure 1M. In some embodiments, a zone group may be given a unique name selected by a user.
[0112] Certain data may be stored in a memory of a playback device (e.g., the memory 112b of Figure 1C) as one or more state variables that are periodically updated and used to describe the state of a playback zone, the playback device(s), and/or a zone group associated therewith. The memory may also include the data associated with the state of the other devices of the media system, and shared from time to time among the devices so that one or more of the devices have the most recent data associated with the system.
[0113] In some embodiments, the memory may store instances of various variable types associated with the states. Variable instances may be stored with identifiers (e.g., tags) corresponding to type. For example, certain identifiers may be a first type “a1” to identify playback device(s) of a zone, a second type “b1” to identify playback device(s) that may be bonded in the zone, and a third type “c1” to identify a zone group to which the zone may belong. As a related example, identifiers associated with the second bedroom 101c may indicate that the playback device is the only playback device of the Zone C and not in a zone group. Identifiers associated with the Den may indicate that the Den is not grouped with other zones but includes bonded playback devices 110h-110k. Identifiers associated with the Dining Room may indicate that the Dining Room is part of the Dining + Kitchen zone group 108b and that devices 110b and 110d are grouped (FIG. 1L). Identifiers associated with the Kitchen may indicate the same or similar information by virtue of the Kitchen being part of the Dining + Kitchen zone group 108b. Other example zone variables and identifiers are described below.
[0114] In yet another example, the memory may store variables or identifiers representing other associations of zones and zone groups, such as identifiers associated with Areas, as shown in Figure 1M. An area may involve a cluster of zone groups and/or zones not within a zone group. For instance, Figure 1M shows an Upper Area 109a including Zones A-D and I, and a Lower Area 109b including Zones E-I. In one aspect, an Area may be used to invoke a cluster of zone groups and/or zones that share one or more zones and/or zone groups of another cluster. In another aspect, this differs from a zone group, which does not share a zone with another zone group. Further examples of techniques for implementing Areas may be found, for example, in Wilberding ‘997 and Lambourne ‘853, both of which are incorporated herein by reference above. In some embodiments, the media playback system 100 may not implement Areas, in which case the system may not store variables associated with Areas.
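As a loose illustration of the tagged state variables described above (the “a1”, “b1”, and “c1” identifier types), the following Python sketch stores hypothetical zone data; the data structure, device assignments, and values shown are illustrative assumptions, not the actual storage format.

```python
# Illustrative sketch of zone state variables tagged by identifier type, following
# the "a1"/"b1"/"c1" scheme described above; structure and values are hypothetical.
zone_state = {
    "Second Bedroom": {
        "a1": ["110g"],                               # playback device(s) of the zone
        "b1": [],                                     # bonded playback devices (none)
        "c1": None,                                   # zone group membership (not grouped)
    },
    "Den": {
        "a1": ["110h", "110i", "110j", "110k"],
        "b1": ["110h", "110i", "110j", "110k"],       # bonded home theater devices
        "c1": None,
    },
    "Dining Room": {
        "a1": ["110b"],
        "b1": [],
        "c1": "Dining + Kitchen",                     # part of zone group 108b
    },
    "Kitchen": {
        "a1": ["110d"],
        "b1": [],
        "c1": "Dining + Kitchen",
    },
}

def zone_group_members(state, group_name):
    # Return the zones whose "c1" identifier places them in the named zone group.
    return [zone for zone, tags in state.items() if tags["c1"] == group_name]

print(zone_group_members(zone_state, "Dining + Kitchen"))  # -> ['Dining Room', 'Kitchen']
```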
III. Example Playback Devices
[0115] Figure 2A is a front isometric view of a playback device 210 configured in accordance with aspects of the disclosed technology. Figure 2B is a front isometric view of the playback device 210 without a grille 216e. Figure 2C is an exploded view of the playback device 210. Referring to Figures 2A-2C together, the playback device 210 comprises a housing 216 that includes an upper portion 216a, a right or first side portion 216b, a lower portion 216c, a left or second side portion 216d, the grille 216e, and a rear portion 216f. A plurality of fasteners 216g (e.g., one or more screws, rivets, clips) attaches a frame 216h to the housing 216. A cavity 216j (Figure 2C) in the housing 216 is configured to receive the frame 216h and electronics 212. The frame 216h is configured to carry a plurality of transducers 214 (identified individually in Figure 2B as transducers 214a-f). The electronics 212 (e.g., the electronics 112 of Figure 1C) are configured to receive audio content from an audio source and send electrical signals corresponding to the audio content to the transducers 214 for playback.
[0116] The transducers 214 are configured to receive the electrical signals from the electronics
112, and further configured to convert the received electrical signals into audible sound during playback. For instance, the transducers 214a-c (e.g., tweeters) can be configured to output high frequency sound (e.g., sound waves having a frequency greater than about 2 kHz). The transducers 214d-f (e.g., mid-woofers, woofers, midrange speakers) can be configured to output sound at frequencies lower than the transducers 214a-c (e.g., sound waves having a frequency lower than about 2 kHz). In some embodiments, the playback device 210 includes a number of transducers different than those illustrated in Figures 2A-2C. For example, as described in further detail below with respect to Figures 3A-3C, the playback device 210 can include fewer than six transducers (e.g., one, two, three). In other embodiments, however, the playback device 210 includes more than six transducers (e.g., nine, ten). Moreover, in some embodiments, all or a portion of the transducers 214 are configured to operate as a phased array to desirably adjust (e.g., narrow or widen) a radiation pattern of the transducers 214, thereby altering a user’s perception of the sound emitted from the playback device 210.
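By way of illustration only, the Python sketch below shows the standard delay calculation used to steer the main lobe of a simple phased array; the uniform linear geometry, spacing, and angles are assumptions made for this sketch and do not describe the actual transducer arrangement or processing of the playback device 210.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # approximate speed of sound in air at room temperature

def steering_delays(num_transducers, spacing_m, steer_angle_deg):
    """Minimal phased-array sketch: per-transducer delays (seconds) that steer the
    main lobe of a uniform linear array toward `steer_angle_deg` off the array's
    broadside axis. The geometry and numbers are illustrative only."""
    angle_rad = math.radians(steer_angle_deg)
    return [index * spacing_m * math.sin(angle_rad) / SPEED_OF_SOUND_M_S
            for index in range(num_transducers)]

# Example: six transducers spaced 5 cm apart, steered 20 degrees off-axis.
for index, delay in enumerate(steering_delays(6, 0.05, 20.0)):
    print(f"transducer {index}: delay {delay * 1e6:.1f} microseconds")
```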
[0117] In some examples, a filter is axially aligned with the transducer 214b. The filter can be configured to desirably attenuate a predetermined range of frequencies that the transducer 214b outputs to improve sound quality and a perceived sound stage output collectively by the transducers 214. In some embodiments, however, the playback device 210 omits the filter. In other embodiments, the playback device 210 includes one or more additional filters aligned with the transducer 214b and/or at least another of the transducers 214.
[0118] In some examples, various techniques described herein may be carried out with a playback device that includes multiple audio transducers 214, and which may optionally be used as a multi-channel satellite playback device for home theater applications. In particular, various techniques described herein may be implemented with a multi-channel spatial playback device, that is, a playback device that includes at least one upward-firing transducer in addition to one or more lateral-firing transducers. By way of illustration, an example of such a playback device is illustrated in Figures 3A-D. In the illustrated example, the playback device 310 includes a plurality of transducer assemblies 314a-f (collectively transducers/speakers 314) oriented in different directions or otherwise configured to direct sound along different sound axes. In particular, the transducer assemblies 314 include a forward firing transducer assembly 314a, a side-firing transducer assembly 314b, a side-firing transducer assembly 314c, an upward-firing (or vertically-firing) transducer assembly 314d, a side-firing transducer assembly 314e, and a side-firing transducer assembly 314f (not shown). The transducer assemblies 314 can be similar or identical to any of the transducers 214 described above.
[0119] Figure 3A illustrates an exploded view of the playback device 310. The plurality of transducer assemblies 314 are carried in a housing 330. In the example shown in Figure 3A, the forward-firing transducer assembly 314a comprises several components, including a first component 314a-1 (e.g., an audio transducer, such as a tweeter) and a second component 314a-2. In assembly, the first component 314a-1 and the second component 314a-2 are joined to form the forward-firing transducer assembly 314a. In other examples, the forward-firing transducer assembly 314a may be formed from a single component. Within example implementations, the other speakers 314 as well as the other components may be formed from one or multiple components as well. Further, the playback device 310 may otherwise include components the same as or similar to the playback devices 110a (e.g., Figure 1C) or 210 (e.g., Figures 2A-C), which may be carried by the housing 330.
[0120] Referring to FIG. 3B, there is illustrated a further view showing the playback device 310 as partially assembled (without the exterior speaker grilles and trim). Figure 3B shows the housing 330 carrying the side-firing transducer assembly 314b, the side-firing transducer assembly 314c, the up-firing transducer assembly 314d, and the side-firing transducer assembly 314e. The transducer assemblies 314 of the playback device 310 are arranged to output audio along a variety of sound axes. Figure 3B illustrates three axes 302, 304, 306 of a three-dimensional space relative to the playback device 310. A forward axis 302 extends generally through a center of the playback device 310, as shown. A horizontal axis 304 extends along a horizontal dimension of the playback device 310, perpendicular to the forward axis 302. The third axis 306 is a vertical axis, extending perpendicular to both the forward and horizontal axes 302, 304. In certain examples, the forward-firing transducer assembly 314a can be configured to output audio along a first lateral sound axis (also referred to as a forward sound axis) that is parallel with the forward axis 302. In some examples, the first lateral sound axis is inclined or vertically angled relative to the plane of the forward axis 302 by an inclination angle that is less than 30 degrees. Similarly, the side-firing transducer assemblies 314b, 314c, 314e, and 314f (not shown in Figure 3B) can be configured, either individually or in combination, to output audio primarily along one or more second lateral sound axes (also referred to as side sound axes) that are aligned, within an inclination angle, with a plane extending along the horizontal axis 304. In some examples, the one or more side sound axes are parallel with the horizontal axis 304. In other examples, any of the one or more side sound axes may be horizontally angled with respect to the horizontal axis 304, for example, by up to 45 degrees. The one or more side sound axes can be vertically angled with respect to the forward and horizontal axes by the same or different inclination angles, any of which may be, for example, up to 10, 20, 30, or 40 degrees in some examples. The vertically-firing transducer assembly 314d can be configured to output audio primarily along a vertical sound axis that, in some examples, is substantially parallel to the vertical axis 306 (and therefore vertically angled with respect to the forward and horizontal axes 302, 304). In some examples, the vertical sound axis may be vertically angled relative to the forward axis 302 by between about 50 degrees and about 90 degrees, between about 60 degrees and about 80 degrees, or about 70 degrees. In some examples, the vertical sound axis is vertically angled relative to the forward axis 302 by a suitable oblique angle.
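To make the axis conventions above concrete, the Python sketch below computes a unit vector for a sound axis from a horizontal angle and a vertical inclination; the frame convention (x along the forward axis 302, y along the horizontal axis 304, z along the vertical axis 306) and the example angles are assumptions chosen for illustration only.

```python
import math

def sound_axis_vector(horizontal_angle_deg, inclination_deg):
    """Illustrative sketch: a unit vector for a sound axis in an assumed device
    frame (x = forward axis 302, y = horizontal axis 304, z = vertical axis 306),
    given a horizontal angle and a vertical inclination measured from the
    forward axis."""
    h = math.radians(horizontal_angle_deg)
    v = math.radians(inclination_deg)
    return (math.cos(v) * math.cos(h),   # forward component
            math.cos(v) * math.sin(h),   # lateral component
            math.sin(v))                 # vertical component

# Examples: a forward sound axis, a side sound axis angled 45 degrees horizontally
# with a 10 degree inclination, and a vertical sound axis inclined about
# 70 degrees from the forward axis.
print(sound_axis_vector(0, 0))
print(sound_axis_vector(45, 10))
print(sound_axis_vector(0, 70))
```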
[0121] Within examples, the transducer assemblies 314 of the playback device 310 have a particular arrangement relative to one another. Figure 3C is a partial view of the playback device 310 which illustrates the speakers 314 in an example arrangement. In one example, the side-firing transducer assembly 314b and the side-firing transducer assembly 314f are implemented as respective woofers (e.g., left woofer 314b and right woofer 314f) and are oriented in second and third directions that are approximately 180° from one another and approximately 90° from the first direction in the horizontal plane. In this example, three of the speakers 314 are implemented as tweeters. These include the side-firing transducer assembly 314c (e.g., left tweeter) and the side-firing transducer assembly 314e (e.g., right tweeter), which are oriented similarly to the side-firing transducer assembly 314b and the side-firing transducer assembly 314f, respectively. The tweeters also include the vertical-firing transducer assembly 314d, which is oriented in a fourth direction approximately 70° from the first direction in the vertical plane. As shown, the side-firing transducer assembly 314c, the side-firing transducer assembly 314e, and the upward-firing transducer assembly 314d also include respective horns. Any one or more of the transducer assemblies 314 may include mounting/positioning hardware 318 for coupling the respective transducer assembly 314 to the housing 330 and/or angling the transducer with respect to any of the forward axis 302, horizontal axis 304, and/or vertical axis 306.
[0122] The arrangements of the transducer assemblies 314 may have particular acoustic effects. For instance, the arrangement of the side-firing transducer assembly 314c and the side-firing transducer assembly 314e may provide an ambient effect when surround content is output via the side-firing transducer assembly 314c and the side-firing transducer assembly 314e respectively. The similar arrangement of the side-firing transducer assembly 314b and the side-firing transducer assembly 314f may have a similar effect. In contrast, the forward-firing transducer assembly 314a has a relatively more direct sound (assuming that the playback device 310 is oriented such that the primary direction of forward-firing transducer assembly 314a is more oriented toward the user(s) relative to the primary direction of output of the side-firing transducer assemblies 314c and 314e).
[0123] To provide further illustration, Figure 3D is a view showing the playback device 310 as partially assembled. Figure 3D shows the housing 330 carrying the side-firing transducer assembly 314b, the up-firing transducer assembly 314d, the side-firing transducer assembly 314e, and the side-firing transducer assembly 314f, as well as the second component 314a-2 of the forward-firing transducer assembly 314a. The first component 314a-1 is not shown in Figure 3D in order to provide a partial interior view of the housing 330.
[0124] In operation, the playback device 310 can be utilized to play back 3D audio content that includes a vertical component (also referred to herein as a “height component”), either as a standalone device or as one component of a home theater arrangement, as described further below.
IV. Example Techniques for Tuning Playback Device Parameters Based on Position
[0125] As described above, conventional surround sound audio rendering formats include a plurality of channels configured to represent different lateral positions with respect to a listener (e.g., front, right, left). More recently, three-dimensional (3D) or other immersive audio rendering formats have been developed that include one or more vertical or height channels in addition to any lateral channels. Examples of such 3D audio formats include DOLBY ATMOS, MPEG-H, and DTS:X formats. Such 3D audio rendering formats may include one or more vertical channels configured to represent sounds originating from above a listener. In some instances, such vertical channels can be played back using an up-firing transducer, such as the up-firing transducer assembly 314d of the playback device 310, for example, that can output audio along a sound axis that is at least partially vertically oriented with respect to a forward horizontal plane of the playback device. This audio output can reflect off an acoustically reflective surface (e.g., a ceiling) to be directed toward a listener at a target location. As described above, because the listener perceives the audio as originating from a point or region of reflection on the ceiling, the psychoacoustic perception is that the sound originates “above” the listener. Examples of psychoacoustic effects of vertically directed audio versus forward propagating audio are described in MacLean ‘635 referenced above.
[0126] In some cases, placement or position of multichannel satellite playback devices around the listening environment may differ from the intended placement or position, either in terms of device location or device orientation (or both). As a result, the audio output by the multichannel satellite playback devices can have unintended properties, including distorted spatial perception of the audio. Examples of the present technology can address such issues by modifying playback parameters of multi-channel satellite playback devices to compensate for their placement within the environment. As a result, even when multi-channel audio satellite playback devices are placed or positioned in suboptimal or undesirable locations or orientations from an acoustic standpoint, the system can adapt playback to provide an improved listening experience.
[0127] Figures 7A-7E illustrate examples of various positions of two multi-channel playback devices 710a, 710b within a listening environment. Examples of modifying playback parameters of multi-channel satellite playback devices, such as the playback devices 710a, 710b, based on their positioning within the environment are described further below with reference to Figures 7A-E. The playback devices 710a, 710b may be examples of the playback device 310, for example.
[0128] In some instances, modifying playback parameters can involve adjusting a characteristic phase response and/or a characteristic magnitude response of audio output for one or more of the transducers of the playback device, as described in more detail below. Furthermore, the distribution of the audio channels of the multi-channel audio content among the plurality of transducers can be altered. For example, when a multi-channel playback device is oriented upside-down, the transducer array can be reconfigured to re-assign responsibility for rendering the height audio channel, and optionally one or more lateral audio channels, to different transducers than would be assigned were the playback device oriented right-side up (a first or “standard” orientation). In some examples, a center tweeter, such as the forward transducer assembly 314a of the playback device 310, for example, is reassigned height audio responsibility, whereas in the first orientation, the upward-firing transducer assembly 314d may be primarily responsible for rendering the height audio channel. Instead, responsibility for rendering some or all of a center audio channel and/or one or more side audio channels (e.g., left, right, etc.) can be reassigned to the upward-firing (downward-firing in the inverted scenario) transducer assembly 314d. Examples are described in more detail below.
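As a simple illustration of this kind of remapping, the Python sketch below holds one channel-to-transducer assignment per orientation and swaps height and center responsibility when the device is inverted; the transducer labels follow the assemblies 314a-314f described above, but the exact mapping shown is a hypothetical example rather than the disclosed assignment.

```python
# Illustrative channel-to-transducer assignments for the two orientations; the
# specific mapping is hypothetical, not the disclosed configuration.
STANDARD_ASSIGNMENT = {
    "height": ["314d"],           # up-firing assembly renders the height channel
    "center": ["314a"],           # forward-firing assembly renders the center channel
    "left":   ["314b", "314c"],   # left woofer and left tweeter
    "right":  ["314e", "314f"],   # right tweeter and right woofer
}

INVERTED_ASSIGNMENT = {
    "height": ["314a"],           # center tweeter takes over height responsibility
    "center": ["314d"],           # (now downward-firing) assembly takes center content
    "left":   ["314b", "314c"],
    "right":  ["314e", "314f"],
}

def channel_assignment(is_inverted):
    """Return the channel-to-transducer mapping for the current orientation."""
    return INVERTED_ASSIGNMENT if is_inverted else STANDARD_ASSIGNMENT

print(channel_assignment(is_inverted=True)["height"])  # -> ['314a']
```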
[0129] In addition, according to certain examples, the media playback system can be configured to support a mismatched pair of playback devices in which one playback device is oriented right-side up (“standard”) and the other is inverted, as illustrated in Figure 7E, for example. The pair of playback devices 710a, 710b may be a stereo pair, for example, or may be part of a home theater arrangement (e.g., left and right rear surround speakers). In such cases, playback parameters of the inverted playback device 710a can be configured to operate the inverted playback device in an inverted mode (which may include altering audio channel assignment among transducers and/or adjusting aspects of the digital signal processing chain(s) associated with one or more transducer assemblies). Playback parameters for the standard-oriented playback device 710b may remain unchanged or may also be adjusted to account for effects of the other playback device in the pair being inverted such that the pair, in combination, produces audio playback with an improved listener experience.
[0130] Referring to Figures 4A-C, there are illustrated flow diagrams of examples of methods for tuning or otherwise adjusting playback devices to compensate for playback device placement, position, and/or orientation (such as inversion), according to certain aspects. The methods 400, 412, and/or 416 may be implemented in whole or in part by a playback device (e.g., at least one of the playback devices 110a-n (Figure 1B), such as playback device 110l, 110m, 110j, or 110k (Figure 1K); the playback device 210 (Figures 2A-2C); the playback device 310 (Figures 3A-3D)) whose configuration settings are being modified, or in whole or in part by an administrator device in the media playback system (e.g., another playback device, a controller such as the control device 130a (Figure 1H), a computing device, etc.) which may then provide determined playback parameter settings to the playback device(s) whose settings are to be modified. In some examples, one or more of the methods 400, 412, and/or 416 is at least partially implemented by one or more remote computing devices (e.g., one or more servers and/or cloud computing environments such as the cloud network 102 of Figure 1B) that are communicatively coupled to the media playback system via a WAN or another suitable network connection (e.g., at least one of the links 103 of Figure 1B).
[0131] Referring to Figure 4A, at operation 402, the playback device receives multi-channel audio content. Accordingly, the playback device may prepare to render one or more audio channels of the multi-channel audio content. In some examples, the playback device receives the multi-channel audio content via a network interface such as the network interface 112d (Figure 1C), either via the wireless interface(s) 112e, the wired interface(s) 112f, or perhaps both. In certain examples, the playback device receives the multi-channel audio via a hardware interface such as the input/output 111 (Figure 1C). In some examples, the playback device receives the multi-channel audio from a primary device (e.g., a playback device such as the playback device 110h (Figure 1K), a local audio source (e.g., the local audio source 105 of Figure 1C), and/or a network device such as a mobile device, television, digital streaming device, set-top box, etc.). In some cases, the primary device receives first audio data comprising multi-channel audio content, decodes a portion of the first audio data to generate second audio content, and transmits the generated second audio content to the playback device. In certain examples, the second audio comprises a downmixed version of a portion of the first audio that the playback device receives and up-mixes to produce the multi-channel audio for which it is assigned playback. Further details regarding the distribution of audio among multi-channel devices can be found in U.S. Patent No. 8,788,080, U.S. Patent No. 9,886,234, and Peace ‘105 referenced above.
[0132] Operation 404 includes obtaining orientation information for a playback device. For purposes of explanation, the following discussion may refer to the playback device 310 described above; however, it will be appreciated that the method may be applied to playback devices having a different configuration from that shown in Figures 3A-D. In some examples, obtaining an indication of device orientation at operation 404 can involve sensing an orientation of the playback device 310 via on-board sensors (e.g., one or more gyroscopes, accelerometers, inertial motion units (IMUs), and/or other suitable motion sensors that can determine whether the playback device is inverted or not), analyzing acoustic output of the playback device 310, using other sensors not associated with the playback device 310, or any other suitable technique.
[0133] In various examples, the orientation of the playback device can be determined by the playback device itself or can be determined via other devices and the indication can be transmitted to the playback device or other device of the media playback system. The orientation can include, for example, an angular orientation of the device (e.g., rotation about a vertical axis extending through the playback device 310) relative to the environment, relative to a listener, and/or relative to other playback devices. In some examples, the orientation can also include other positional information (e.g., absolute location within an environment, distance between the playback device 310 and the listening location, distance to other devices within the environment, a height of the playback device 310 relative to the environment, etc.). In some examples, determining that the playback device 310 is inverted includes detecting that the physical orientation of the playback device 310 has changed relative to the upright orientation (e.g., depicted in Figures 3B and 3D) by more than a threshold amount of change. The threshold amount of change may be 90° of rotation about the rotational axis of symmetry, but other examples are possible.
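By way of illustration only, the Python sketch below infers inversion from an accelerometer reading along the device's vertical axis and compares the resulting tilt against the 90° threshold mentioned above; the sensor convention (roughly +1 g when upright, roughly -1 g when inverted) is an assumption made for this sketch, not a description of the actual sensing implementation.

```python
import math

def is_inverted(accel_vertical_g, threshold_deg=90.0):
    """Illustrative sketch: infer inversion from an accelerometer reading along
    the device's vertical axis (in g). `threshold_deg` is the amount of rotation
    from upright beyond which the device is treated as inverted."""
    clamped = max(-1.0, min(1.0, accel_vertical_g))   # guard against noise above 1 g
    tilt_deg = math.degrees(math.acos(clamped))       # angle away from upright
    return tilt_deg > threshold_deg

print(is_inverted(0.98))    # upright     -> False
print(is_inverted(-0.95))   # upside-down -> True
```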
[0134] In some examples, one or more sensors disposed in and/or on a wall mount, ceiling mount, or another fastening device can be used in conjunction with (or instead of) the on-board sensors to determine the playback device orientation. In certain examples, other sensors, such as one or more cameras, microphones, or radio frequency (RF) sensors, either on-board the playback device or another device, provide indications of the playback device orientation. For instance, in some examples, a camera carried on board the playback device or another device can determine the orientation of the device by comparing a detected placement or orientation to a default orientation. In some examples, the playback device emits sound (e.g., a predetermined waveform such as a chirp, tone, sweep, and/or another suitable signal) toward another device (e.g., another playback device, a television, or a mobile device) (or vice versa). The emitted signal is received by one or more microphones of the other device, and based on a comparison between the emitted sound and the received microphone data, the orientation of the playback device is determined. Techniques for determining the orientation, or other positioning information, of playback devices are described, for example, in Beckhardt ‘504, Kallai ‘556, Jones ‘6008, Soto ‘9736, and/or van Erven ‘7168 referenced above.
[0135] In some instances, aspects of operation 404 may be performed during a playback device set-up operation, or in response to one or more sensors detecting and/or providing an indication that the position (e.g., orientation and/or location) of the playback device has changed. Accordingly, in some examples, aspects of operation 404 may be performed prior to operation 402. In such instances, after beginning operation 402, the playback device may access stored information indicating a current orientation of the playback device, or receive an indication of the orientation of the playback device from an external device. However, determination of the orientation of the playback device may have been performed prior to the start of operation 402, for a given playback session.
[0136] If the indication of the device orientation indicates that the playback device is in the standard orientation, the playback device may proceed to render audio content at operation 406. If the device is inverted, however, the method may include operation(s) 408 of adjusting various configurations, parameters, and/or settings of the playback device (and optionally one or more other playback devices in a bonded zone (or another group) with the playback device, if applicable) such that the listener may perceive the same audio “image” (or a similar audio image) as if the playback device were in the standard orientation. In some instances, the left and right audio channels may behave very similarly when the playback device 310 is inverted compared to when the playback device is in the standard orientation. Accordingly, little or no adjustment to parameters controlling playback of these audio channels may be needed. In some examples, however, the center audio channel(s) and height channel(s) behave very differently when the playback device 310 is inverted. Accordingly, the inventors have developed the techniques described below for adjusting parameters controlling playback of these channels to compensate for inversion of the playback device 310.
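For illustration only, the Python sketch below mirrors this branch in pseudocode-like form (render directly in the standard orientation, otherwise adjust playback parameters first); the stand-in device object and its method names are hypothetical, not an actual API of the playback device.

```python
class _ExamplePlaybackDevice:
    """Stand-in object so the sketch below runs; its methods are hypothetical."""

    def render(self, content):
        print(f"rendering: {content}")

    def apply_inverted_parameters(self):
        print("re-assigning channels and adjusting DSP settings for inversion")

def handle_multichannel_audio(audio_content, orientation, device):
    # Sketch of the branch described above (corresponding loosely to operations
    # 402-408 of method 400): adjust parameters only when the device is inverted.
    if orientation != "standard":
        device.apply_inverted_parameters()   # operation(s) 408
    device.render(audio_content)             # operation 406

handle_multichannel_audio("multi-channel audio content", "inverted", _ExamplePlaybackDevice())
```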
[0137] As described above, due to the arrangement of individual transducer assemblies 314 within the playback device 310, the playback device can be configured to output one or more audio channels of the multi-channel audio along a number of sound axes (e.g., along the forward axis 302 wherein the sound propagates in a direction generally perpendicular to a front face of the playback device 310, along one or more side axes wherein the sound propagates at a lateral angle with respect to the forward axis 302, and optionally along one or more vertical axes wherein the sound propagates at a vertical angle with respect to the forward axis 302, optionally along or parallel to the vertical axis 306) to achieve a desired acoustic effect. In some instances, a single audio channel can be mapped to a particular sound axis, while in other instances a single audio channel can be output via two or more sound axes, and moreover two or more audio channels can be output via the same sound axis. Accordingly, perceived characteristics of audio playback can be modified by selectively distributing the audio output among the different transducer assemblies 314 and thus the different sound axes of the playback device 310. These techniques can be applied to control and/or adjust the center and/or height audio channel(s) to maintain an overall audio image that provides a positive listening experience even when the playback device 310 is inverted. In some examples, it may be desirable to adjust the sound field, or audio image, corresponding to one or more audio channels of the multi-channel audio content to shift a perceived location of the source of that audio channel. For example, referring again to Figure 3B, if the forward-firing transducer assembly 314a outputs a center audio channel along the forward axis 302, a listener positioned in front of the playback device 310 may perceive the center audio channel to originate at (or proximate to) a center of the playback device 310. Depending on the vertical placement of the playback device 310 relative to other features of the listening environment and configuration of the playback device 310, this may or may not represent an optimal listening experience for the listener. For example, if the playback device 310 is part of a home theater group, it may be preferable for the perceived source of the center audio channel to correspond generally with a location of a television (or other video device) outputting video content corresponding to the audio content output by the playback devices of the home theater group. Thus, if the playback device 310 is placed in the standard orientation (right-side up) and below the video device, it may be preferable to elevate the perceived source of the center audio channel. Similarly, if the playback device 310 is mounted above the video device, it may be desirable to “de-elevate” the center audio channel to relocate the perceived source to be coincident with the video device. A variety of other scenarios in which it may be desirable to reposition a perceived source of one or more audio channels of the multi-channel audio content can be envisioned.
[0138] Figure 7A illustrates an example of the pair of playback devices 710a, 710b positioned such that elevation of the center audio channel may be preferred. In the example shown in Figure 7A, the playback devices 710a, 710b are approximately aligned with one another such that a horizontal axis A passes roughly through a center of the playback devices 710a, 710b, and both playback devices 710a, 710b are in the standard orientation, such that the vertically-firing transducers are directed towards the ceiling, as indicated by arrows O. Accordingly, in this example, the horizontal axis A may correspond to the horizontal axis 304 of the playback device 310 described above. The playback device 710a is positioned a distance D1 from the floor of the environment, and the playback device 710b is positioned a distance D2 from the floor. In the example illustrated in Figure 7A, D1 is approximately equal to D2. As shown, in this example, the playback devices 710a, 710b are positioned closer to the floor than to the ceiling. The ceiling has a height H measured from the floor. Examples of techniques for determining ceiling heights and/or relative heights of devices in a room or zone are described in Jones ‘4856, which is incorporated by reference above.
[0139] Figure 7A illustrates a configuration in which it may be preferable to elevate the sound field corresponding to the center audio channel. For example, the playback devices 710a, 710b can be configured to position the sound field corresponding to the center audio channel at a height represented by horizontal axis C. As shown, the axis C is vertically offset from the axis A (e.g., the horizontal axes 304 of the playback devices 710a, 710b) by a distance D3, and is positioned a distance D4 below the ceiling.
[0140] As described above, in some instances, the playback device 310 is inverted when the playback device 310 is placed relatively high up in a listening environment (e.g., closer to a room ceiling than to the room floor), as illustrated in Figure 7B, for example. In such instances, it may also be desirable to de-elevate the center audio channel output by the playback device 310 so as to reposition the sound field corresponding to the center audio channel (C’) to a desired vertical offset (D3’) relative to the horizontal axis 304 of the playback device (e.g., relative to the horizontal axis A shown in Figure 7B). This can be accomplished by distributing responsibility for rendering of the center channel among at least one lateral-firing transducer (e.g., the forward transducer assembly 314a) and at least one vertically-firing transducer (e.g., the transducer assembly 314d).
[0141] In the example shown in Figure 7B, the playback device 710a is positioned a distance D1' from the ceiling, and the playback device 710b is positioned a distance D2' from the ceiling. In the illustrated example, D1' is approximately equal to D2'. The horizontal axis C' corresponding to the sound field of the center audio channel is positioned a distance D4' from the floor of the environment. The vertical offset distance D3' can be selected based on a desired distance D4'.
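As a minimal worked example (not taken from the disclosure), if the ceiling height H and the distance D1' of the device below the ceiling are known, the vertical offset D3' that places the sound field a desired distance D4' above the floor follows from simple geometry, as sketched below in Python. The specific heights used in the comment are illustrative only.

def de_elevation_offset(ceiling_height_m, drop_from_ceiling_m, desired_field_height_m):
    """Hypothetical geometry sketch: offset D3' of the sound-field axis C'
    below the device axis (positive values mean the field sits below the device).
    device_center = H - D1'; D3' = device_center - D4'."""
    device_center_m = ceiling_height_m - drop_from_ceiling_m
    return device_center_m - desired_field_height_m

# Example: a 2.7 m ceiling, a device mounted 0.3 m below it, and a sound field
# desired 1.2 m above the floor gives D3' = (2.7 - 0.3) - 1.2 = 1.2 m.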
[0142] In the examples of Figures 7A and 7B, the playback devices 710a, 710b are positioned at approximately the same height. However, in other examples, the playback devices 710a, 710b may be positioned at different heights. Alternatively, the playback devices 710a, 710b may be positioned at approximately the same height, but offset from one another in a horizontal plane perpendicular to the line of separation between the two playback devices 710a, 710b. An example of such an arrangement, in which the playback devices 710a, 710b are offset from one another in the vertical dimension (e.g., positioned at different heights) and/or in the horizontal dimension (perpendicular to the line of separation between them), is illustrated in Figure 7C. In such an example, the playback devices 710a, 710b can be configured to position the sound field corresponding to the center audio channel (axis C') at a vertical offset D3' that may be averaged between the two playback devices.
[0143] Referring again to Figure 4A, operation(s) 408 may include an operation 410 of determining a desired sound field position of one or more audio channels to be output by the playback device 310. The one or more audio channels may include the center audio channel, for example, and/or one or more other audio channels (e.g., left and/or right audio channels). In some examples, operation 410 includes determining a vertical offset (e.g., D3 or D3') relative to the horizontal axis 304 that corresponds to at least one perceived source location of an audio channel, such as the center audio channel, for example. Since this offset D3, D3' will differ based on whether or not the playback device 310 is inverted, determining the vertical offset may be based at least in part on the indication of the orientation of the playback device obtained at operation 404. In some examples, operation 410 further includes determining, based on the vertical offset, proportions of the audio channel to be assigned for playback via various sound axes (e.g., a forward sound axis and a vertical sound axis) so as to generate a sound field corresponding to the audio channel at a height with respect to a center of the playback device 310 that corresponds to the vertical offset. This can be accomplished by controlling the digital signal processing chains for one or more transducers that are capable of outputting audio content along the selected sound axes. For example, the center audio content corresponding to the center audio channel can be directed to the forward-firing transducer assembly 314a and to the vertically-firing transducer assembly 314d, and the digital signal processing chains for these transducers can be configured with appropriate filtering, amplification, equalization, and/or other settings or parameters so as to cause the forward-firing transducer assembly 314a and the vertically-firing transducer assembly 314d to output proportions of the center audio channel that generate the desired audio image.
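One way operation 410 could be prototyped, purely as an illustrative sketch and not as the specific implementation disclosed herein, is a simple pan law that splits an audio channel between a forward sound axis and a vertical sound axis in proportion to the desired vertical offset. The maximum offset value and the linear interpolation are assumptions made for the example.

def axis_proportions(vertical_offset_m, max_offset_m=1.0, inverted=False):
    """Hypothetical sketch: split one channel between a forward axis and a
    vertical axis based on a desired vertical offset (a linear pan law is
    assumed). When the device is inverted the vertical axis fires downward,
    so routing energy to it de-elevates rather than elevates the image."""
    x = max(-1.0, min(1.0, vertical_offset_m / max_offset_m))  # normalize
    if inverted:
        x = -x  # the vertical axis now points toward the floor
    vertical = 0.5 * (1.0 + x)   # share of the channel on the vertical axis
    forward = 1.0 - vertical     # remainder stays on the forward axis
    return {"forward": forward, "vertical": vertical}

# Example: axis_proportions(0.5) -> {'forward': 0.25, 'vertical': 0.75}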
[0144] As described above, in some examples, one or more playback devices may be tilted relative to the vertical or horizontal axes, rather than being purely inverted. An example is illustrated in Figure 7D. In the illustrated example, the playback device 710a is positioned at a distance D1" from the ceiling, and the playback device 710b is positioned at a distance D2" from the ceiling. In some examples, D1" and D2" may be approximately equal such that the playback devices 710a, 710b are positioned approximately along the same plane corresponding to axis A', as shown in Figure 7D. In such an example, the digital signal processing chains for the vertically-firing and/or one or more laterally-firing transducers of the playback devices 710a, 710b can be configured to position the sound field corresponding to the center audio channel (represented by the axis C') at a distance D4' from the floor and vertically offset by a distance D3" from the axis A', as shown.
[0145] It will be appreciated that, depending on the relative placement and/or orientation of the playback devices 710a, 710b (e.g., positioned at the same or different heights, inverted or not, etc.), the parameters of the individual playback devices 710a, 710b may be configured differently so as to position the overall sound field for a given audio channel (e.g., the center audio channel) produced from a combination of all the playback devices in a group (e.g., playback devices 710a and 710b collectively) at a desired location, such as a desired height relative to the ceiling or floor of an environment or a desired height (vertical offset) relative to the horizontal axis (e.g., 304, A, A') of one or more playback devices.
[0146] Thus, referring to Figure 8, in some examples, a method 800 may include an operation 802 of determining the positions of one or more playback devices. The method 800 may further include, at operation 804, determining a desired height of the sound field (or audio image) corresponding to the center audio channel, and at operation 806, adjusting audio output parameters (e.g., filtering, amplification, equalization, etc. of one or more digital signal processing chains associated with transducers of the playback device(s)) of one or more playback devices to position the sound field of the center audio channel at the desired height.
[0147] Further, referring to Figure 4B, in some examples, the method 412 includes operation 413a of determining an updated transducer arrangement (e.g., that the playback device 310 is inverted, or that the configuration of playback devices 710a, 710b has changed from the arrangement shown in Figure 7A to an arrangement shown in any of Figures 7B-7E, for example). Based on the determination in operation 413a, the method 412 may include, at operation 413b, determining which audio channels are to be output by the playback device 310, and at operation 413c, assigning responsibility for rendering of the audio channel(s) among one or more of the transducers 314.
[0148] Referring again to Figure 4A, operations 408 may include, at operation 412, adjusting audio channel assignment among the plurality of transducer assemblies 314 of the playback device 310, and at operation 414, adjusting one or more audio playback parameters, such as characteristic phase response and/or magnitude response. Modifications to the characteristic phase response and/or magnitude response for a given audio channel can be achieved by using suitable signal processing techniques, for example, subjecting the audio to appropriate filtering operations (e.g., using finite impulse response (FIR) filters, infinite impulse response (IIR) filters, or other suitable filters), as described further below.
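For illustration, the kind of filtering referred to in operation 414 could be prototyped with standard signal-processing tools. The following Python sketch uses SciPy to build one FIR filter and one IIR filter and apply them to a block of channel audio; the sample rate, filter orders, and band edges are assumptions made for the example, not values taken from the figures.

import numpy as np
from scipy import signal

FS = 48_000  # assumed sample rate in Hz

# Linear-phase FIR bandstop (adjusts magnitude with constant group delay).
fir_bandstop = signal.firwin(255, [500, 700], fs=FS)  # stopband near 600 Hz

# Second-order IIR lowpass (compact, but introduces phase shift).
iir_b, iir_a = signal.butter(2, 3000, btype="lowpass", fs=FS)

def shape_channel(block):
    """Apply the magnitude/phase shaping to one block of audio samples."""
    y = signal.lfilter(fir_bandstop, [1.0], block)
    return signal.lfilter(iir_b, iir_a, y)

# Example: shape one second of test noise.
out = shape_channel(np.random.default_rng(0).standard_normal(FS))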
[0149] As described above, a problem associated with inverting the orientation of the playback device 310 is that, without adjustment, the height audio information would be directed downward towards the floor, rather than upward toward the ceiling. As a result, the listener may no longer perceive the audio as originating from above the listener, as intended. Accordingly, to address this problem, operation 412 may include reassigning some or all responsibility for rendering the height audio channel to one or more transducer assemblies 314 other than the vertically-firing transducer assembly 314d. That is, as a result of operation 412, the height audio channel may be output via a lateral transducer assembly, rather than a conventional vertically-oriented transducer assembly.
[0150] Operation 414 may include adjusting one or more parameters or settings in the digital signal processing chains associated with the transducer assemblies 314 involved in the reassignment to accommodate output of the height audio channel while maintaining a desired audio image (e.g., one that is very similar to the audio image corresponding to acoustic output by the playback device 310 if not inverted). Accordingly, operations 412 and 414 may be performed to reconfigure the height audio channel (and optionally one or more other audio channels) based on the playback device 310 being inverted, regardless of whether operation 410 is performed.
[0151] Referring to Figure 4C, in some examples, the method 416 includes, at operation 417a, determining an updated transducer arrangement (e.g., that the playback device 310 is inverted). At operation 417b, the center audio channel can be assigned to the vertically-firing transducer assembly, and at operation 417c, the height audio channel can be assigned to a lateral-firing transducer assembly. Figure 4C may represent one example of the method 412 described above with reference to Figure 4B. In some examples, the methods 412, 416, and/or 800 may be part of the method 400 of Figure 4 A.
[0152] Referring again to Figure 4A, once the playback device 310 is configured with desired settings and audio channel distribution, the playback device 310 may render the multi-channel audio content at operation 406.
[0153] As described above, in some examples, inversion of the playback device 310 may not significantly impact the acoustic effects associated with the left and right audio channels. However, significant impact may be experienced or perceived with respect to the center audio channel(s) and height audio channel(s). According to certain examples, at operation 412, the following audio channel assignment modifications are made to allow the playback device 310 to properly render spatial audio, such as DOLBY ATMOS, for example, when inverted. In one example, the height audio channels are reassigned from the vertically-firing transducer assembly 314d to a combination of the left woofer 314b, the right woofer 314f, and/or the center tweeter 314a. In addition, the center audio channel is redistributed to a combination of the left and right woofers 314b, 314f, the center tweeter 314a, and the vertically-firing tweeter 314d. Figures 5A and 5B illustrate an example of modifications to the digital signal processing chains for the center audio channel, and Figures 6A and 6B illustrate an example of modifications to the digital signal processing chains for the height audio channel corresponding to these changes.
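The reassignments described in this example can be summarized, purely as an illustrative sketch, by a routing table keyed on orientation. The dictionary structure and the simplified standard-orientation entries below are assumptions for readability; the reference numerals mirror those used above.

# Hypothetical routing sketch keyed on orientation; the inverted entries
# mirror the reassignments described above, while the standard entries are
# a simplified assumption.
ROUTING = {
    "standard": {
        "height": ["tweeter_314d"],        # up-firing tweeter (primary)
        "center": ["tweeter_314a"],        # forward-firing tweeter (primary)
    },
    "inverted": {
        "height": ["woofer_314b", "woofer_314f", "tweeter_314a"],
        "center": ["woofer_314b", "woofer_314f", "tweeter_314a", "tweeter_314d"],
    },
}

def transducers_for(channel, inverted):
    """Return the transducers assigned to a channel for the given orientation."""
    return ROUTING["inverted" if inverted else "standard"][channel]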
[0154] Referring to Figure 5A, there is illustrated an example of a digital signal processing chain for the center audio channel, forming part of an audio driver signal chain for transducer assemblies 314 of the playback device 310 (Figures 3A-3D). Figure 5A shows an example of the digital signal processing chain for the center audio channel with the playback device 310 in the standard orientation. The circuitry 500A includes a common portion 502, along with signal branches for the center tweeter 314a and for the vertically-firing tweeter 314d, and a shared signal branch for the left and right woofers 314b, 314f. The common portion 502 performs phase equalization, and includes an equalization transfer function block 504, a FIR filter 506, and an amplification stage 508. The signal chain for the center tweeter 314a includes a FIR filter 510, an amplification block 514, and equalization transfer function blocks 512 and 516. Further, in the illustrated example, the signal chain for the vertically-firing tweeter 314d includes equalization transfer function blocks 518, 520, a highpass FIR filter 522, a highpass IIR filter 524, a bandpass IIR filter 526, a bandpass FIR filter 528, and an amplification block 530.
[0155] According to certain examples, the circuitry 500A is configured to support a mid-band frequency range of 500 Hz to 2 kHz and a high frequency range above 2 kHz. For the center audio channel, the circuitry 500A can be configured with an optimized cross-over in the higher frequency range using both the signal chain for the center tweeter 314a and the signal chain for the vertically-firing tweeter 314d such that the sound field, or audio image, is projected a certain distance (e.g., one or more feet) above (when the playback device 310 is in the standard orientation) or below (when the playback device 310 is inverted) the playback device 310. This cross-over can be accomplished through selection of the pass and/or stop bands of the various filters.
[0156] In some examples, the FIR filter 506 is a bandstop filter with a stopband centered around a frequency of 600 Hz. In one example, the FIR filter 510 is a highpass filter having a cut-off point at 600 Hz. Accordingly, in this example, the forward-firing and upward-firing center tweeter 314a is essentially full range and can be primarily responsible for outputting the center audio channel of the multi-channel audio content. In one example, the highpass FIR filter 522 has a cut-off frequency of 1800 Hz, and the highpass IIR filter 524 has a cut-off frequency of 8000 Hz. In one example, the bandpass IIR filter 526 has a passband with a center frequency at 2500 Hz, and the bandpass FIR filter 528 has a passband with a center frequency at 2756 Hz. In one example, the bandpass IIR filter 526 is for parametric equalization and the bandpass FIR filter 528 is for phase equalization.
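The example filter values recited for the tweeter branches of the circuitry 500A could be prototyped as follows. This Python/SciPy sketch is only an approximation of Figure 5A: the sample rate, FIR lengths, IIR orders, and the bandwidths chosen around the stated center frequencies are assumptions, and the equalization transfer function blocks and amplification stages are omitted.

from scipy import signal

FS = 48_000    # assumed sample rate (Hz)
TAPS = 511     # assumed FIR length

# Common portion: FIR filter 506, bandstop centered near 600 Hz.
fir_506 = signal.firwin(TAPS, [550, 650], fs=FS)

# Center tweeter 314a branch: FIR filter 510, highpass at 600 Hz.
fir_510 = signal.firwin(TAPS, 600, pass_zero=False, fs=FS)

# Vertically-firing tweeter 314d branch.
fir_522 = signal.firwin(TAPS, 1800, pass_zero=False, fs=FS)          # highpass FIR
iir_524 = signal.butter(2, 8000, btype="highpass", fs=FS)            # highpass IIR
iir_526 = signal.butter(2, [2300, 2700], btype="bandpass", fs=FS)    # bandpass IIR near 2500 Hz
fir_528 = signal.firwin(TAPS, [2650, 2860], pass_zero=False, fs=FS)  # bandpass FIR near 2756 Hz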
[0157] In the illustrated example, the signal chain for the left and right woofers 314b, 314f includes an equalization transfer function block 532, a pair of IIR filters 534, 536, a lowpass FIR filter 538, a bandpass FIR filter 540, and an amplification block 542. In one example, the IIR filter 534 is a bandpass filter with a passband centered around 65 Hz, and the IIR filter 536 is a bandpass filter with a passband centered around 50 Hz. The IIR filters 534, 536 implement parametric equalization. In one example, the lowpass FIR filter 538 has a cut-off frequency of 600 Hz, and the bandpass FIR filter 540 has a passband centered around 500 Hz.
[0158] Referring to Figure 5B, there is illustrated circuitry 500B corresponding to a modified version of the digital signal processing chain for the center audio channel for the playback device 310 being inverted. As shown, relative to the circuitry 500A, an IIR filter 544 is added in the signal chain for the forward-firing tweeter 314a. In one example, the IIR filter 544 is a lowpass filter with a cut-off frequency of 3483 Hz. In the signal chain for the vertically-firing tweeter 314d, the highpass IIR filter 524, the bandpass IIR filter 526, and the bandpass FIR filter 528 of the circuitry 500A are replaced by a FIR filter 546. In one example, the FIR filter 546 is a bandstop filter having a stopband centered around a frequency of 1613 Hz. In this example, the filters 544 and 546 modify the array of transducer assemblies 314 to shift the audio image lower for the higher frequencies while still using the forward-firing tweeter 314a for the mid-band frequencies. This avoids the problem of the center audio channel sound field being recessed behind the playback device 310, which could occur if the vertically-firing tweeter 314d alone were used to render the center audio channel. In addition, it also creates a center channel that renders with a projected sound field below the playback device 310, as described above.
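For comparison, the two modifications that distinguish the inverted-orientation circuitry 500B from the circuitry 500A could be sketched as below. Again, the filter order, FIR length, stopband width, and sample rate are assumptions; only the 3483 Hz and 1613 Hz values come from the example above.

from scipy import signal

FS = 48_000   # assumed sample rate (Hz)
TAPS = 511    # assumed FIR length

# Added to the forward-firing tweeter 314a branch when inverted:
iir_544 = signal.butter(2, 3483, btype="lowpass", fs=FS)   # lowpass IIR at 3483 Hz

# Replaces filters 524, 526, and 528 in the vertically-firing tweeter 314d branch:
fir_546 = signal.firwin(TAPS, [1500, 1730], fs=FS)         # bandstop centered near 1613 Hz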
[0159] Figures 6A and 6B illustrate circuitry 600A and 600B, respectively, corresponding to examples of the digital signal processing chain for the height audio channel for the playback device 310 in the standard (Figure 6A) and inverted (Figure 6B) orientations. The circuitry 600A, 600B may form part of an audio driver signal chain for transducer assemblies 314 of the playback device 310.
[0160] Referring to Figure 6A, in this example, the circuitry 600A includes a common signal chain portion, including an equalization transfer function block 602 and an amplification block 604. The circuitry 600A further includes a pre-mix stage, including an IIR filter 606 and another amplification block 608. In one example, the IIR filter 606 is a highpass filter having a cut-off frequency of 600 Hz.
[0161] In the illustrated example, the signal chain for the vertically-firing tweeter 314d includes a pair of equalization transfer function blocks 610, 612, a highpass FIR filter 614, a bandstop FIR filter 616, and an amplification block 618. In one example, the highpass FIR filter 614 has a cut-off frequency of 1800 Hz, and the bandstop FIR filter 616 has a stopband with a center frequency of 1800 Hz. In the illustrated example, the signal chain for the forward-firing tweeter 314a, the side tweeter 314c, and the left and right woofers 314b, 314f includes a second pre-mix stage including an amplification block 620 and a pair of second-order lowpass IIR filters 622, 624. In one example, the IIR filter 622 has a cut-off frequency of 2300 Hz and the IIR filter 624 has a cut-off frequency of 1400 Hz.
[0162] Continuing with the example of Figure 6A, the signal path for the side-firing tweeter 314c includes an equalization transfer function block 626, a first bandstop FIR filter 628, a highpass FIR filter 630, a second bandstop FIR filter 632, and an amplification block 634. In one example, the first bandstop FIR filter 628 has a stopband with a center frequency of 1800 Hz, the highpass FIR filter 630 has a cut-off frequency of 1400 Hz, and the second bandstop FIR filter 632 has a stopband with a center frequency of 2200 Hz. The signal chain for the left woofer 314b and the forward-firing tweeter 314a further includes an equalization transfer function block 636 and a lowpass FIR filter 638. In one example, the lowpass FIR filter 638 has a cut-off frequency of 1350 Hz. The signal chain for the right woofer 314f further includes a series of FIR filters 640, 642, 644. In one example, the FIR filter 640 is a lowpass filter having a cut-off frequency of 400 Hz, the FIR filter 642 is a bandpass filter having a passband with a center frequency of 800 Hz, and the FIR filter 644 is a bandpass filter having a passband centered around 853 Hz.
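The example filter values recited for the standard-orientation height-channel circuitry 600A can be collected, for reference only, into the following sketch. The grouping into branches is an assumption made for readability; equalization blocks and amplification stages are omitted, and all values are in Hz.

# Hypothetical summary of the stated example filter values (Hz) for the
# standard-orientation height-channel chain of Figure 6A.
HEIGHT_CHAIN_600A = {
    "premix":                {"iir_606_highpass": 600},
    "tweeter_314d":          {"fir_614_highpass": 1800,
                              "fir_616_bandstop_center": 1800},
    "second_premix":         {"iir_622_lowpass": 2300,
                              "iir_624_lowpass": 1400},
    "tweeter_314c":          {"fir_628_bandstop_center": 1800,
                              "fir_630_highpass": 1400,
                              "fir_632_bandstop_center": 2200},
    "woofer_314b_and_314a":  {"fir_638_lowpass": 1350},
    "woofer_314f":           {"fir_640_lowpass": 400,
                              "fir_642_bandpass_center": 800,
                              "fir_644_bandpass_center": 853},
}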
[0163] As shown in Figure 6A, in this example, the vertically-firing tweeter 314d is primarily responsible for rendering the height channel. In addition, a beamforming circuit 646 for an antiphase front lobe at the forward-firing tweeter 314a is present in the circuitry 600A. Examples of the beamforming circuit 646 are described in Peace '105 referenced above. Overall, the circuitry 600A is configured (or tuned with the values noted above) to gradually shift the audio image for the height audio channel from the side-firing woofers 314b, 314f, to the side-firing tweeters 314c, 314e, to the up-firing tweeter 314d based on wavelength. In some examples, the smaller wavelengths that provide the majority of the directional imaging are rendered/played via the up-firing tweeter 314d, while stereo imaging is maintained in the mid-band playing via the side-firing tweeters 314c, 314e.
[0164] Referring to Figure 6B, for the inverted configuration of the playback device 310, circuitry 600B includes an IIR filter 648 in the common portion of the signal chain. In one example, the IIR filter 648 is a bandpass filter having a passband centered around 1756 Hz. The signal chain for the vertically-firing tweeter 314d is not illustrated in Figure 6B since, in at least some examples, with the playback device 310 in the inverted orientation, the vertically-firing tweeter 314d does not render any proportion of the height channel. Rather, a significant proportion of the height channel is rendered via the forward-firing tweeter 314a. As a result, the complex circuitry for producing the antiphase front lobe at the forward-firing tweeter 314a can be eliminated. Instead, in the illustrated example, the circuitry 600B includes, in the signal chain for the forward-firing tweeter 314a, equalization transfer function blocks 650, 652, amplification blocks 654, 672, a pair of highpass FIR filters 656, 658, and a bandstop FIR filter 670. In one example, the highpass FIR filter 656 has a cut-off frequency of 1581 Hz, the highpass FIR filter 658 has a cut-off frequency of 2000 Hz, and the bandstop FIR filter 670 has a stopband with a center frequency of 1494 Hz. Overall, the circuitry 600B is configured (or tuned with the values noted above) to gradually shift the audio image for the height channel from the side-firing woofers, to the side-firing tweeters, to the front-firing tweeter based on wavelength. In some examples, the smaller wavelengths that provide the majority of the directional imaging are played via the front-firing tweeter 314a, while stereo imaging is maintained in the mid-band playing via the side-firing tweeters 314c, 314e. In some examples, since the playback device 310 in the inverted orientation may also be mounted high in a room (e.g., closer to the ceiling than to the floor), the center tweeter 314a may provide a better soundbeam for localization than the vertically-firing tweeter 314d.
[0165] By reconfiguring the digital signal processing circuitry and signal chains associated with the center audio channel and the height audio channel, as shown, for example, in Figures 5B (relative to Figure 5A) and Figure 6B (relative to Figure 6A), responsibility for rendering these channels can be redistributed among the transducer assemblies 314 of the playback device 310. In this manner, the overall audio image can be maintained, even when the playback device 310 is inverted.
[0166] Further examples of adjusting audio drivers for one or more audio channels to distribute responsibility for audio channel rendering among different transducers and along different sound axes in different scenarios are described in Chamness ‘9937 referenced above.
[0167] Thus, aspects and embodiments provide techniques for tuning playback device parameters related to multi-channel audio playback so as to compensate for a playback device being in an inverted (upside-down) orientation and provide an improved listening experience.
V. Conclusion
[0168] The above discussions relating to playback devices, controller devices, playback zone configurations, and media content sources provide only some examples of operating environments within which functions and methods described below may be implemented. Other operating environments and configurations of media playback systems, playback devices, and network devices not explicitly described herein may also be applicable and suitable for implementation of the functions and methods.
[0169] The description above discloses, among other things, various example systems, methods, apparatus, and articles of manufacture including, among other components, firmware and/or software executed on hardware. It is understood that such examples are merely illustrative and should not be considered as limiting. For example, it is contemplated that any or all of the firmware, hardware, and/or software aspects or components can be embodied exclusively in hardware, exclusively in software, exclusively in firmware, or in any combination of hardware, software, and/or firmware. Accordingly, the examples provided are not the only ways to implement such systems, methods, apparatus, and/or articles of manufacture.
[0170] Additionally, references herein to "embodiment" mean that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one example embodiment of an invention. The appearances of this phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. As such, the embodiments described herein, explicitly and implicitly understood by one skilled in the art, can be combined with other embodiments.
[0171] The specification is presented largely in terms of illustrative environments, systems, procedures, steps, logic blocks, processing, and other symbolic representations that directly or indirectly resemble the operations of data processing devices coupled to networks. These process descriptions and representations are typically used by those skilled in the art to most effectively convey the substance of their work to others skilled in the art. Numerous specific details are set forth to provide a thorough understanding of the present disclosure. However, it is understood to those skilled in the art that certain embodiments of the present disclosure can be practiced without certain, specific details. In other instances, well known methods, procedures, components, and circuitry have not been described in detail to avoid unnecessarily obscuring aspects of the embodiments. Accordingly, the scope of the present disclosure is defined by the appended claims rather than the foregoing description of embodiments.
[0172] When any of the appended claims are read to cover a purely software and/or firmware implementation, at least one of the elements in at least one example is hereby expressly defined to include a tangible, non-transitory medium such as a memory, DVD, CD, Blu-ray, and so on, storing the software and/or firmware.
VI. Additional Examples
[0173] The following examples pertain to further embodiments, from which numerous permutations and configurations will be apparent.
[0174] Example 1 provides a method for a playback device comprising a plurality of audio transducers configured to output audio along a plurality of sound axes including at least a first lateral sound axis and a vertical sound axis, the first lateral sound axis being aligned, within an inclination angle of less than 30 degrees, with a first plane extending along a horizontal axis of the playback device and the vertical sound axis being angled with respect to the horizontal axis by between 50 and 90 degrees. The method comprises receiving, at the playback device, multichannel audio content including a first audio channel, obtaining an indication of an orientation of the playback device, determining, based at least in part on the indication of the orientation of the playback device, a vertical offset relative to the horizontal axis that corresponds to at least one perceived source location of the first audio channel, determining, based on the vertical offset, proportions of the first audio channel assigned for playback via the first lateral sound axis and the vertical sound axis that generate a sound field corresponding to the first audio channel at a height with respect to a center of the playback device that corresponds to the vertical offset, and playing back the first audio channel via the first lateral sound axis and the vertical sound axis.
[0175] Example 2 includes the method of Example 1, wherein determining the proportions of the first audio channel is performed so as to de-elevate the sound field relative to the center of the playback device.
[0176] Example 3 includes the method of Example 2, wherein determining the proportions of the first audio channel includes assigning a first proportion of the first audio channel for playback via the first lateral sound axis, and assigning a second proportion of the first audio channel for playback via the vertical sound axis; wherein the second proportion is smaller than the first proportion.
[0177] Example 4 includes the method of Example 3, wherein the first audio channel is a center audio channel.
[0178] Example 5 includes the method of any one of Examples 1-4, wherein the multichannel audio content further includes a second audio channel. The method further comprises determining, based at least in part on the indication of the orientation of the playback device, proportions of the second audio channel assigned for playback via the first lateral sound axis and the vertical sound axis, and playing back the second audio channel via at least one of the first lateral sound axis or the vertical sound axis.
[0179] Example 6 includes the method of Example 5, wherein the second audio channel is a height channel, and wherein playing back the second audio channel includes playing back none of the height channel via the vertical sound axis.
[0180] Example 7 includes the method of one of Examples 5 or 6, wherein the plurality of sound axes includes a second lateral sound axis that is aligned, within the inclination angle, with the first plane, and wherein playing back the first and second audio channels includes playing back at least a portion of the first audio channel or the second audio channel via the second lateral sound axis.
[0181] Example 8 includes the method of any one of Examples 1-7, wherein obtaining the indication of the orientation of the playback device includes obtaining information corresponding to a proximity of the playback device to at least one structure in the environment.
[0182] Example 9 includes the method of Example 8, wherein obtaining the indication of the orientation of the playback device includes obtaining information corresponding to a proximity of the playback device to a ceiling in the environment.
[0183] Example 10 provides a playback device comprising the plurality of audio transducers, one or more processors, and data storage having instructions thereon that, when executed by the one or more processors, cause the playback device to perform the method of one of Examples 1-9.
[0184] Example 11 provides a method for a playback device comprising a plurality of audio transducers configured to output audio along a plurality of sound axes including at least a lateral sound axis and a vertical sound axis, the lateral sound axis being aligned, within an inclination angle of less than 30 degrees, with a plane extending along a horizontal axis of the playback device and the vertical sound axis being angled with respect to the horizontal axis by between 50 and 90 degrees. The method comprises receiving, at the playback device, multichannel audio content including a first audio channel and a second audio channel, the second audio channel being a height channel, detecting an orientation of the playback device, assigning, based on the orientation of the playback device, proportions of the height channel for playback via at least one of the vertical sound axis or the lateral sound axis, playing back the first audio channel via at least one of the lateral sound axis or the vertical sound axis, and playing back the height channel via at least one of the lateral sound axis and the vertical sound axis.
[0185] Example 12 includes the method of Example 11, wherein detecting the orientation of the playback device includes determining that the orientation of the playback device is an inverted orientation, and wherein assigning proportions of the height channel includes assigning a first proportion of the height channel to be output via the lateral sound axis and a second proportion of the height channel to be output via the vertical sound axis, the first proportion being larger than the second proportion.
[0186] Example 13 includes the method of Example 12, wherein the second proportion is zero, and wherein playing back the height channel includes playing back the height channel via the lateral sound axis.
[0187] Example 14 includes the method of one of Examples 12 or 13, further comprising assigning, based on the orientation of the playback device, a first proportion of the first audio channel to be output via the vertical sound axis and assigning a second proportion of the first audio channel to be output via the lateral sound axis, wherein the first and second proportions of the first audio channel are assigned to generate a sound field corresponding to the first audio channel that is de-elevated relative to a center of the playback device.
[0188] Example 15 includes the method of Example 14, wherein the first audio channel is a center channel.
[0189] Example 16 includes the method of any one of Examples 11-15, wherein detecting the orientation of the playback device includes detecting a proximity of the playback device to a ceiling of the environment.
[0190] Example 17 provides a playback device comprising the plurality of audio transducers, one or more processors, and data storage having instructions thereon that, when executed by the one or more processors, cause the playback device to perform the method of any one of Examples 11-16.
[0191] Example 18 provides a playback device comprising a plurality of audio transducers configured to output audio along a plurality of sound axes including at least a lateral sound axis and a vertical sound axis, wherein the lateral sound axis is angled with respect to a horizontal axis of the playback device by less than 30 degrees and wherein the vertical sound axis is angled with respect to the horizontal axis by between 50 and 90 degrees, one or more processors, and at least one tangible computer-readable storage medium storing program instructions that, when executed by the one or more processors, cause the playback device to receive multichannel audio content including first and second audio channels, the first audio channel being a height channel, detect an orientation of the playback device, determine, based at least in part on the orientation of the playback device, a vertical offset relative to the horizontal axis that corresponds to at least one perceived source location of the second audio channel, assign, based on the vertical offset, proportions of the second audio channel for playback via the lateral sound axis and the vertical sound axis that generate a sound field corresponding to the second audio channel at a height with respect to a center of the playback device that corresponds to the vertical offset, and play back, via one or more of the plurality of audio transducers, the first and second audio channels.
[0192] Example 19 includes the playback device of Example 18, wherein the at least one tangible computer-readable storage medium further stores program instructions that, when executed by the one or more processors, cause the playback device to assign a first proportion of the first audio channel to be output via the vertical sound axis, and assign a second proportion of the first audio channel to be output via the lateral sound axis.
[0193] Example 20 includes the playback device of Example 19, wherein to assign the first and second proportions of the first audio channel, the at least one tangible computer-readable storage medium further stores program instructions that, when executed by the one or more processors, cause the playback device to determine that the orientation of the playback device is an inverted orientation, and assign the second proportion of the first audio channel to be greater than the first proportion of the first audio channel based on determining that the orientation of the playback device is the inverted orientation.
[0194] Example 21 includes the playback device of Example 20, wherein the first proportion of the first audio channel is zero.
[0195] Example 22 includes the playback device of any one of Examples 18-21, wherein the second audio channel is a center channel.
[0196] Example 23 includes the playback device of any one of Examples 18-22, further comprising an accelerometer configured to detect the orientation of the playback device.
[0197] Example 24 provides a media playback system, comprising a first playback device that includes a first set of transducers, including a first lateral transducer and a first vertical transducer, a second playback device that includes a second set of transducers, including a second lateral transducer and a second vertical transducer, one or more processors, and a memory storing instructions that, when executed by the one or more processors, cause the media playback system to perform operations. The operations comprise receiving multichannel audio content that includes a first audio channel, a second audio channel, and a third audio channel, receiving an indication that the first playback device is in a first orientation, and playing back, via the first and second playback devices, the multichannel audio. Playing back the multichannel audio comprises outputting, based on the received indication that the first playback device is in the first orientation, a first portion of the third audio channel via the first vertical transducer, and outputting a second portion of the third audio channel via the second lateral transducer.
[0198] Example 25 includes the media playback system of Example 24, wherein the operations further include determining a first vertical offset and a second vertical offset associated with the third audio channel, wherein the first vertical offset is a first vertical distance from a center of the first playback device, wherein the second vertical offset is a second vertical distance from a center of the second playback device, wherein the first and second vertical offsets correspond to a perceived vertical separation of the third audio channel from the first and second playback devices, respectively, during playback of the third audio channel.
[0199] Example 26 includes the media playback system of one of Examples 24 or 25, wherein the first audio channel is a left vertical audio channel and wherein the second audio channel is a right vertical audio channel, wherein playing back the multichannel audio further comprises outputting, based on the received indication that the first playback device is in the first orientation, the first audio channel via the first lateral transducer in substantial synchrony with output of the second audio channel via the second vertical transducer.
[0200] Example 27 provides a method for a media playback system comprising a first playback device having a first plurality of audio transducers configured to output audio along a first plurality of sound axes including a first lateral sound axis and a first vertical sound axis, the first lateral sound axis being angled with respect to a first horizontal axis of the first playback device by less than 30 degrees and the first vertical sound axis being angled with respect to the first horizontal axis by between 50 and 90 degrees. The method comprises receiving, at the first playback device, multichannel audio content including a center audio channel and a height audio channel, detecting an orientation of the first playback device, the orientation of the first playback device being one of a first orientation or a second orientation in which the first playback device is inverted relative to the first orientation, determining, based on the orientation of the first playback device, a first vertical offset relative to the first horizontal axis that corresponds to a perceived source location of the center audio channel, wherein the first vertical offset is different based on the first playback device being in the first orientation or the second orientation, determining, based on the first vertical offset, proportions of the center audio channel assigned for playback via the first lateral sound axis and the first vertical sound axis that generate a sound field corresponding to the center audio channel at the first vertical offset, playing back the center audio channel via the first lateral sound axis and the first vertical sound axis, and playing back the height audio channel via at least one of the first lateral sound axis or the first vertical sound axis.
[0201] Example 28 includes the method of Example 27, further comprising assigning, based on the orientation of the first playback device, a first proportion of the height audio channel for playback via the first lateral sound axis and a second proportion of the height audio channel for playback via the first vertical sound axis.
[0202] Example 29 includes the method of Example 28, comprising determining that the orientation of the first playback device is the second orientation, wherein the first proportion of the height audio channel is greater than the second proportion of the height audio channel.
[0203] Example 30 includes the method of Example 29, wherein the second proportion of the height audio channel is zero.
[0204] Example 31 includes the method of any one of Examples 27-30, wherein the media playback system further comprises a second playback device having a second plurality of audio transducers configured to output audio along a second plurality of sound axes including a second lateral sound axis and a second vertical sound axis, the second lateral sound axis being angled with respect to a second horizontal axis of the second playback device by less than 30 degrees and the second vertical sound axis being angled with respect to the second horizontal axis by between 50 and 90 degrees. The method further comprises receiving, at the second playback device, the multichannel audio content, detecting an orientation of the second playback device, determining that the orientation of the second playback device is opposite to the orientation of the first playback device, determining, based on the orientation of the first playback device, a second vertical offset relative to the second horizontal axis that corresponds to the perceived source location of the center audio channel, determining, based on the second vertical offset, proportions of the center audio channel assigned for playback via the second lateral sound axis and the second vertical sound axis that generate a sound field corresponding to the center audio channel at a height relative to the second horizontal axis that corresponds to the first vertical offset, and playing back the center audio channel via the second lateral sound axis and the second vertical sound axis.