WO2025208094A1

Movatterモバイル変換

Info

Publication number: WO2025208094A1
Application number: PCT/US2025/022141
Authority: WO
Inventors: Jeffrey David; Steve Beckhardt; Edwin Fitzpatrick
Original assignee: Sonos Inc
Current assignee: Sonos Inc
Priority date: 2024-03-28
Filing date: 2025-03-28
Publication date: 2025-10-02
Anticipated expiration: 2026-09-28

Abstract

Disclosed embodiments relate to Multi-Player Playback Devices. In some embodiments, the Multi-Player Playback Device is configured to selectively operate in one of a plurality of operating modes. In each operating mode, the Multi-Player Playback Device is configured to implement a multi-stage audio processing procedure comprising: (i) routing incoming audio streams to mixer inputs of a mixer based on a current operating mode in which the Multi-Player Playback Device is operating; (ii) for one or more audio streams routed to an individual mixer input, (a) generating a mixed stream based on the one or more audio streams routed to the mixer input, and (b) routing the mixed stream to at least one audio output of the Multi-Player Playback Device based at least in part on a channel map associated with the current operating mode in which the Multi-Player Playback Device is operating.

Description

MULTI-STREAM AUDIO ROUTING FOR MULTI-PLAYER PLAYBACK DEVICE

CROSS REFERENCE TO RELATED APPLICATIONS

[0001] This application claims priority to U.S. Provisional App. 63/571,313, titled “MultiStream Audio Routing for Multi-Player Playback Device,” filed on March 28, 2024, and currently pending. The entire contents of U.S. Provisional App. 63/571,313 are incorporated herein by reference.

[0002] This application is also related to and incorporates by reference the entire contents of the following applications: (i) Patent Cooperation Treaty (PCT) App.

PCT/US2023/034170, titled “Playback System Architectures and Area Zone Configurations,” referred to as Docket No. 21-0703-PCT, filed on Sep. 29, 2023, and currently pending; (ii) U.S. Provisional App. 63/377,948, titled “Playback System Architecture,” referred to as Docket No. 21-0703p, filed on Sep. 30, 2022; (iii) U.S. Provisional App. 63/377,899, titled “Multichannel Content Distribution,” referred to as Docket No. 22-0207p (0400042), filed on Sep. 30 2022; (iv) U.S. Provisional App. 63/377,967, titled “Playback Systems with Dynamic Forward Error Correction,” referred to as Docket No. 22-0401p (0403973), filed on Sep. 30, 2022; (v) U.S. Provisional App. 63/377,978, titled “Broker/Subscriber Model for Information Sharing and Management Among Connected Devices,” referred to as Docket No. 22-0606Ap (0404192), filed on Sep. 30, 2022; (vi) U.S. Provisional App. 63/377,979, titled “Multiple Broker Deployment for Information Sharing and Management Among Connected Devices,” referred to as Docket No. 22-0606Bp (0404193), filed on Sep. 30, 2022; (vii) U.S.

Provisional App. 63/502,347, titled “Area Zones,” referred to as Docket No. 22-1002p, filed on May 15, 2023; (viii) U.S. Provisional App. 63/513,735, titled “State Information Exchange Among Connected Devices,” referred to as Docket No. 23-0306p, filed on Jul. 14, 2023; (ix) U.S. App. 18/181,727, titled “Playback of Generative Media Content,” referred to as Docket No. 20-0703, filed on Mar. 10, 2023, and issued on May 14, 2024, as U.S. Patent 11,985,376; (x) U.S. App. 18/671,824, titled “Generating Digital Media Based on Blockchain Data,” referred to as Docket No. 22-0402, filed on May 22, 2024, and published on Sep. 19, 2024, as U.S. Pub. 2024/0311416; (xi) U.S. App. 18/636,089, titled “Generative Audio Playback Via Wearable Playback Devices,” referred to as Docket No. 22-0403, filed on Apr. 15, 2024, and issued on Dec. 24, 2024, as U.S. Patent 12,175,161; and (xii) PCT App. PCT/US2024/039870, titled “Systems and Methods for Maintaining Distributed Media Content History and Preferences,” referred to as Docket No. 24-0602-PCT, filed on July 26, 2024, published on Feb. 6, 2025, as WO/2025/029673, and currently pending. The entire contents of Apps. PCT/US2023/034170; 63/377,948; 63/377,899; 63/377,967; 63/377,978; 63/377,979; 63/502,347; 63/513,735; 18/181,727; 18/671,824; 18/636,089; and PCT/US2024/039870 are incorporated herein by reference.

[0003] Aspects of the features and functions disclosed and described in the above-identified applications can be used in combination with the examples disclosed and described herein (and with each other in some instances) to improve the functionality and performance of playback devices, both individually and configured into playback systems comprising multiple playback devices, including but not limited to Multi-Player Playback Devices and systems comprising Multi-Player Playback Devices.

FIELD OF THE DISCLOSURE

[0004] The present disclosure is related to consumer goods and, in some more particular examples, to methods, systems, products, features, services, and other elements relating to media playback systems, media playback devices, methods of operating media playback systems and devices, and various features and aspects thereof.

BACKGROUND

[0005] Options for accessing and listening to digital audio in an out-loud setting were limited until in 2002, when SONOS, Inc. began development of a new type of playback system.

Sonos then filed one of its first patent applications in 2003, titled “Method for Synchronizing Audio Playback between Multiple Networked Devices,” and began offering its first media playback systems for sale in 2005. The Sonos Wireless Home Sound System enables people to experience music from many sources via one or more networked playback devices.

Through a software control application installed on a controller (e.g., smartphone, tablet, computer, voice input device), individuals can play most any music they like in any room having a networked playback device. Media content (e.g., songs, podcasts, video sound) can be streamed to playback devices such that each room with a playback device can play back corresponding different media content. In addition, rooms can be grouped together for synchronous playback of the same media content, and/or the same media content can be heard in all rooms synchronously. BRIEF DESCRIPTION OF THE DRAWINGS

[0006] Features, aspects, and advantages of the presently disclosed technology may be better understood with regard to the following description, appended claims, and accompanying drawings, as listed below. A person skilled in the relevant art will understand that the features shown in the drawings are for purposes of illustrations, and variations, including different and/or additional features and arrangements thereof, are possible.

[0007] Figure 1 A shows a partial cutaway view of an environment having a media playback system configured in accordance with aspects of the disclosed technology.

[0008] Figure IB shows a schematic diagram of the media playback system of Figure 1A and one or more networks.

[0009] Figure 1C shows a block diagram of a playback device.

[0010] Figure ID shows a block diagram of a playback device.

[0011] Figure IE shows a block diagram of a network microphone device.

[0012] Figure IF shows a block diagram of a network microphone device.

[0013] Figure 1G shows a block diagram of a playback device.

[0014] Figure 1H shows a partially schematic diagram of a control device.

[0015] Figures 1-1 through IL show schematic diagrams of corresponding media playback system zones.

[0016] Figure IM shows a schematic diagram of media playback system areas.

[0017] Figure 2A shows a front isometric view of a playback device configured in accordance with aspects of the disclosed technology.

[0018] Figure 2B shows a front isometric view of the playback device of Figure 3 A without a grille.

[0019] Figure 2C shows an exploded view of the playback device of Figure 2A.

[0020] Figure 3 A shows a front view of a network microphone device configured in accordance with aspects of the disclosed technology.

[0021] Figure 3B shows a side isometric view of the network microphone device of Figure 3A.

[0022] Figure 3C shows an exploded view of the network microphone device of Figures 3A and 3B.

[0023] Figure 3D shows an enlarged view of a portion of Figure 3B.

[0024] Figure 3E shows a block diagram of the network microphone device of Figures 3A- 3D

[0025] Figure 3F shows a schematic diagram of an example voice input. [0026] Figures 4A-4D show schematic diagrams of a control device in various stages of operation in accordance with aspects of the disclosed technology.

[0027] Figure 5 shows front view of a control device.

[0028] Figure 6 shows a message flow diagram of a media playback system.

[0029] Figure 7 shows a functional block diagram of an example Multi-Player Playback Device according to some embodiments.

[0030] Figure 8 shows a block diagram of an example multi-stream audio routing architecture for an example Multi-Player Playback Device according to some embodiments. [0031] Figure 9 shows an example method implemented by a Multi-Player Playback Device according to some embodiments.

[0032] The drawings are for the purpose of illustrating example configurations, but those of ordinary skill in the art will understand that the technology disclosed herein is not limited to the arrangements and/or instrumentality shown in the drawings.

DETAILED DESCRIPTION

I. Overview

[0033] Existing playback devices, examples of which include Sonos’ s intelligent playback devices, can be used to implement different types of groupings of playback devices configured to play audio content together with each other in a groupwise fashion. Playback groupings implemented with intelligent playback devices such as the playback devices available from Sonos are very flexible in terms of configuration options and the ability to play many different types of audio content, thereby enabling many different groupwise configurations for listening to many different types of audio content in many different types of listening environments.

[0034] The present application discloses and describes aspects of a new type of physical playback device referred to herein as a “Multi-Player Playback Device.” The Multi-Player Playback Devices disclosed herein provide enhanced flexibility and scalability for individual and groupwise playback as compared to existing playback devices via, among other features, implementation of “logical players.”

[0035] Aspects of Multi-Player Playback Devices and the logical players implemented by Multi-Player Playback Devices are particularly useful when implementing a new type of groupwise playback known as “Area Zones.” Some aspects of Area Zones that are relevant to Multi-Player Playback Devices and logical players are disclosed herein. Further aspects of Area Zones are disclosed and described in U.S. Provisional App. 63/502,347, titled “Area Zones,” referred to as Docket No. 22-1002p, filed on May 15, 2023, the entire contents of which are incorporated herein by reference. Although Multi-Player Playback Devices and logical players implemented by Multi-Player Playback Devices provide several advantages for Area Zone configurations, Multi-Player Playback Devices and the logical players implemented by Multi-Player Playback Devices provide similar and additional advantages for individual playback configurations and other grouped configurations, too.

[0036] However, implementing logical players via Multi-Player Playback Devices presents several technical challenges that do not typically exist with traditional playback devices including but not necessarily limited to, advertisement and discovery, control, and audio playback.

[0039] To support the complexities arising when a single physical Multi-Player Playback Device supports multiple logical players, some Multi-Player Playback Device embodiments disclosed herein implement a multi-stream audio routing architecture that differs from the audio routing architecture implemented by conventional playback devices. In some instances, multi-stream audio may be referred to a “multi-zone audio” and multi-stream audio routing may be referred to as “multi-zone audio” routing.

[0040] Some conventional playback devices are configured to control speakers associated with a single zone (or output) that includes left and right channels. In conventional playback devices equipped with integrated loudspeaker(s), the left and right channels of audio are played via the integrated loudspeaker(s) of the playback device. In conventional playback devices equipped with audio outputs configured to drive external passive loudspeakers separate from the playback device (similar to an amplifier connected to external speakers), the left and right channels of audio are played via the passive loudspeaker(s) connected to the audio outputs of the playback device. One such example of playback device equipped with audio outputs configured to drive external passive speakers is the Sonos Amp manufactured and sold by Sonos, Inc. of Santa Barbara, CA. The Sonos Amp includes an audio output for a single pair of left and right passive loudspeakers. Even in instances where multiple passive loudspeakers can be wired to a single audio output of any amplifier-type playback device, all of the multiple passive loudspeakers connected to the audio output play the same audio stream (or zone).

[0041] Some embodiments disclosed herein provide improvements over conventional playback devices. For example, playback device embodiments disclosed herein (including playback devices that play audio via integrated loudspeakers and playback devices that play audio via external passive loudspeakers) are able to support multiple loudspeakers and flexible zone configurations and reconfigurations. Such playback devices are sometimes referred to herein as Multi-Player Playback Devices in part because of their ability to support multiple loudspeakers and flexible zone configurations and reconfigurations.

[0042] This flexible zone configuration and reconfiguration feature is particularly desirable in amplifier style Multi-Player Playback Devices because flexible zone playback configuration and reconfiguration enables the Multi-Player Playback Device to playback any audio stream (or streams) of audio content via any passive loudspeaker (or set of loudspeakers) regardless of how the passive loudspeakers are wired or otherwise connected to the playback device. As a result, in a listening area having speakers distributed throughout the listening area that are connected to one or more Multi-Player Playback Devices, different audio content (and/or different portions or channels of audio content) can be flexibly routed to each speaker for playback to support different listening scenarios, and flexibly reconfigured from configuration to configuration to enable quick and easy changes between different configurations and corresponding listening scenarios without having to disconnect, re-connect, or otherwise physical change the connections between the connected speakers and the Multi-Player Playback Device.

[0043] Additionally, two or more such Multi-Player Playback Devices can be configured to operate in concert with each other to flexibly route different audio content to different speakers or groups of speakers connected to the Multi-Player Playback Devices. In operation, the group of two or more Multi-Player Playback Devices play the audio content routed to their connected speakers / groups of speakers in a groupwise fashion with each other.

[0044] For example, the two or more Multi-Player Playback Devices in the playback group in some listening scenarios are configured to play at least some of the audio content in synchrony with each other. For instance, in some listening scenarios, if a group of two or more Multi-Player Playback Devices are configured to play four different audio streams (sometimes referred to as playing four different zones of audio), the Multi-Player Playback Devices can be configured to play all four audio streams in synchrony with each other.

[0045] In other listening scenarios, the same group of two or more Multi-Player Playback Devices can instead be configured to play two of the four audio streams in synchrony while playing the other two audio streams independently of the two synchronized streams, i.e., the group of Multi-Player Playback Devices plays all four audio streams concurrently including: (i) playing two of the four audio streams in synchrony with each other and (ii) playing the other two audio streams independently of each other and independently of the two streams being played in synchrony with each other.

[0046] In still further listening scenarios, the same group of two or more Multi-Player Playback Devices can instead be configured to play all four of the audio streams independently of each other, i.e., the group of Multi-Player Playback Devices plays all four audio streams concurrently, but each of the audio streams is played independently of the others. Advantageously, the Multi-Player Playback Devices can switch between different operating modes or configurations without connecting or reconnecting physical links between the Multi-Player Playback Devices and without connecting or reconnecting the physical connections between the Multi-Player Playback Devices and the external loudspeakers, thereby enabling playback networks built with Multi-Player Playback Devices to quickly and easily switch between different listening scenarios.

[0047] Accordingly, in some embodiments, a physical Multi-Player Playback Device comprises: (i) one or more network interfaces configured to transmit and receive audio streams comprising audio information; (ii) one or more audio input ports configured to receive audio streams comprising audio information; (iii) one or more audio outputs configured to output analog audio signals to one or more loudspeakers; (iv) one or more processors; and (iii) tangible, non-transitory computer-readable media having program instructions stored therein, wherein the program instructions, when executed by the one or more processors, cause the physical Multi-Player Playback Device to perform any of the Multi-Player Playback Device functions disclosed herein.

[0048] For example, in some embodiments, the Multi-Player Playback Device is configured to selectively operate in one of a plurality of operating modes. In each operating mode, the Multi-Player Playback Device is configured to implement a multi-stage audio processing procedure to implement “multi-stream audio routing” (sometimes referred to as multi-zone audio routing). In some embodiments, the multi-stage audio processing procedure includes, among other features: (i) routing each audio stream to at least one mixer input of one or more mixer inputs based at least in part on a current operating mode in which the Multi-Player Playback Device is operating; (ii) for the one or more audio streams routed to an individual mixer input of the one or more mixer inputs, (a) generating a mixed stream based on the one or more audio streams routed to the mixer input, and (b) routing the mixed stream to at least one audio output of the one or more audio outputs based at least in part on a channel map associated with the current operating mode in which the Multi-Player Playback Device is operating. In some examples, generating a mixed stream based on the one or more audio streams routed to the mixer input includes generating the mixed stream by mixing a first audio stream with a second audio stream.

[0049] In operation, multi-stream audio routing in the disclosed embodiments enables a Multi-Player Playback Device to selectively route audio content from any one or more of the audio input ports of the Multi-Player Playback Device to any of the one or more audio outputs (or zones) of the Multi-Player Playback Device regardless of how the individual loudspeakers in the listening area are arranged within the listening area and/or connected to the Multi-Player Playback Device. In some embodiments, multi-stream audio routing enables the Multi-Player Playback Device to implement sophisticated configurations of logical players (described further herein) implemented by individual Multi-Player Playback Devices and groups of Multi-Player Playback Devices.

[0050] To support multi-stream audio routing (also referred to as multi-zone audio routing as mentioned above), Multi-Player Playback Device embodiments are configured to dynamically mix input audio streams (comprising one or more channels) based at least in part on the operating state of the Multi-Player Playback Device.

[0051] For example, one operating state may include a typical 1: 1 mapping of audio input ports to audio outputs. Another example operating state may comprise a first plurality of input streams (e.g., various audio sources such as streaming media, television, or perhaps another local source) routed to a set of audio outputs configured to play the first plurality of input streams, and a separate audio input stream (e.g., voice assistant, intercom, doorbell) that the Multi-Player Playback Device (i) selectively mixes with some or all the other input streams and (ii) routes to the audio outputs configured to play the first plurality of input streams.

[0052] In another example operating state, the Multi-Player Playback Device is configured to implement a home theater listening scenario (or state) in which 6 channels (e.g., left, center, right, left rear, right rear, and low frequency effects (LFE) channels) are mapped to a first set of Multi-Player Playback Device audio outputs (and/or to other playback devices in a home theater configuration), and two channels of the same home theater audio (e.g., either a downmix of the 5.1 audio or a dedicated stereo stream) are sent to a different set of Multi- Player Playback Device audio outputs (and/or to other playback devices).

[0053] In one example use case, the Multi-Player Playback Device is configured for operation in a retail store. Different areas (or zones) of the retail store could have tailored playlists to match the products being sold in the different areas or the target demographics shopping in the different areas. For example, the Multi-Player Playback Device (or a network of Multi-Player Playback Devices) can be configured to play pop music in a clothing section (or zone) for young adults, play soft rock in a home goods section (or zone), and play music from playlist created by employees in the stock room, break room, or other employee- only areas (or zones). The Multi-Player Playback Device (or network of Multi-Player Playback Devices) can be further configured to mix storewide paging (e.g., announcements, safety alarms, or other storewide information) into all locations (i.e., all the zones in which the Multi-Player Playback Device or network of Multi-Player Playback Devices is playing audio).

[0057] Additional aspects of the disclosed systems and methods and disclosed and described in further detail herein.

A. Relevant Nomenclature

[0058] Given the new concepts described herein, certain terminology is introduced and used for explaining various example features and embodiments. However, it should be understood that such terminology and the use thereof may be uniquely applicable in at least some respects to the examples described herein, such as when describing both existing concepts and new concepts. However, in other situations, the terminology may be more broadly applicable to other examples, depending on the context. i. Playback Device

[0059] To help illustrate aspects of some embodiments, a playback device as used herein sometimes refers to a single, physical hardware device that is configured to play audio. Such a playback device includes one or more network interfaces, one or more processors, and tangible, non-transitory computer-readable media storing program instructions that are executed by the one or more processors to cause the playback device to perform certain playback device features and functions.

[0060] In some embodiments, a playback device includes integrated speakers. In other embodiments, a playback device includes speaker outputs that connect to external speakers. In still further embodiments, a playback device includes a combination of integrated speakers and speaker outputs that connect to external speakers.

[0061] In some instances, a playback device may be (or may include) a set of headphones. In some instances, a playback device may be (or may include) a smartphone, tablet computer, laptop / desktop computer, smart television, or other type of device configurable to play audio content.

[0062] In some embodiments, a playback device comprises one or more microphones configured to receive voice commands. In some instances, playback devices with microphones are referred to herein as Networked Microphone Devices (NMDs). In some NMD embodiments, the NMD is configured to perform any (or all) of the playback device functions disclosed herein.

[0063] Additional details about playback devices consistent with some example embodiments are disclosed and described herein. ii. Logical Player

[0064] A logical player as used herein sometimes refers to a logical playback entity implemented by one or more physical playback devices to act as a single entity to play one stream of audio. In some instances where the one stream of audio comprises multichannel audio having two or more channels and the player includes two or more channel outputs, playing the multichannel audio includes the player playing the two or more channels via two or more corresponding channel outputs. In some instances, a single physical playback device implements a single logical player. In other instances, several physical playback devices may be configured to implement a single logical player. In still further instances, one physical playback device (i.e., a Multi-Player Playback Device) may implement several logical players.

[0065] Additional details about logical players consistent with some example embodiments are disclosed and described herein. iii. Multi-Player Playback Device

[0066] A Multi-Player Playback Device as used herein refers to a type of physical playback device comprising multiple configurable audio outputs, one or more processors, and tangible, non-transitory computer readable media storing program instructions that are executed by the one or more processors to cause the Multi-Player Playback Device to perform the Multi- Player Playback Device features and functions described herein. In scenarios where a playback device might be referred to as a zone player, the Multi-Player Playback Device may sometimes be referred to as a Multi-Zone Player.

[0067] In operation, the multiple audio outputs can be grouped together in different combinations to implement one or more logical players. In some embodiments, a single Multi-Player Playback Device is configurable to implement from one to eight logical players. Multi-Player Playback Devices according to some embodiments include eight configurable audio outputs. Multi-Player Playback Devices according to other embodiments include fewer than eight configurable audio outputs or more than eight configurable audio outputs. For example, in some embodiments, a Multi-Player Playback Device may include anywhere from two to six, eight, twelve, sixteen, eighteen, twenty-four, or more configurable audio outputs. Multi-Player Playback Devices with more or fewer audio outputs than those specifically identified herein are possible as well.

[0068] Additional details about Multi-Player Playback Devices consistent with some example embodiments are disclosed and described herein and also in U.S. Provisional App. 63/502,347, titled “Area Zones,” referred to as Docket No. 22-1002p, filed on May 15, 2023, the contents of which are incorporated herein by reference. iv. Playback Entity

[0069] A playback entity as used herein sometimes refers to a logical or physical entity configured to play audio. Playback entities include physical playback devices, physical Multi-Player Playback Devices, and logical players that are implemented via one or more Multi-Player Playback Devices.

[0070] Additional details about playback entities consistent with some example embodiments are disclosed and described herein and also in U.S. Provisional App. 63/502,347, titled “Area Zones,” referred to as Docket No. 22-1002p, filed on May 15, 2023, the contents of which are incorporated herein by reference. v. Zone

[0071] As another example, a zone (sometimes referred to herein as a playback zone or a bonded zone) as used herein sometimes refers to a logical container of one or more physical playback devices that are managed together. But in some instances, and in some of the examples described herein, a zone may include only a single, physical playback device. Thus, in operation, a zone may include any one or more playback devices managed as a logical zone entity, including, for example, (i) a single playback device managed as a logical zone entity, (ii) a group of playback devices managed as a logical zone entity, including but not limited to any of (a) a bonded zone that includes two or more playback devices configured to play the same audio, (b) a bonded pair of two playback devices configured to play the same audio, (c) a stereo pair of two playback devices where one of the playback devices is configured to play a left channel of stereo audio content and the other playback device is configured to play a right channel of stereo audio content, or (d) a home theater zone that includes two or more playback devices configured to play home theater and/or surround sound audio content. [0072] In this manner, in some examples, a zone is a type of logical entity implemented by one or more playback devices. When the zone includes two or more playback devices, the two or more playback devices play one stream of audio content. In some instances, the one stream of audio content played by the zone comprises multichannel audio. In some zone scenarios that include two or more playback devices where the one stream of audio content comprises multichannel audio, each playback device (of the two or more physical playback devices) may be configured to play a different channel of the multichannel audio content. [0073] For example, a first physical playback device in the zone may be configured to play a left channel of the audio content and a second physical playback device in the zone may be configured to play a right channel of the audio content. This type of example zone configuration is sometimes referred to as a stereo pair.

[0074] In another example, a first playback device in the zone may play a left channel, a second playback device in the zone may play a right channel, and a third playback device in the zone may play a subwoofer channel. This type of example zone configuration is sometimes referred to as a home theater zone.

[0075] In some existing zone configurations, each of the individual playback devices within the zone communicate with each other in a fully connected control plane configuration to exchange commands, configuration information, events, and state information to each other via dedicated websocket connections between each pair of playback devices within the zone.

[0076] Additional details about zones consistent with some example embodiments are disclosed and described herein and in U.S. Provisional App. 63/502,347, titled “Area Zones,” referred to as Docket No. 22-1002p, filed on May 15, 2023, the contents of which are incorporated herein by reference. vi. Playback Group

[0077] A playback group (sometimes referred to herein simply as a group) is a logical container of two or more logical or physical playback entities. Logical and physical entities that can grouped into a playback group include: (i) a zone, (ii) a playback device, (iii) a Multi-Player Playback Device, and/or (iii) a logical player.

[0078] A playback group can differ from a zone in a few ways. First, a zone can include one or more playback devices, whereas a playback group can include two or more logical or physical playback entities (i.e., zones, playback devices, Multi-Player Playback Device, or logical players). Second, when one or more playback devices are configured into a zone, the playback system treats the zone as a single logical entity even though the zone may include several physical playback devices. By contrast, when two or more playback devices are configured into a group, the playback system manages each playback device separately even though each of the playback entities within the playback group are playing the same audio stream.

[0082] Additional details about playback groups consistent with some example embodiments are disclosed and described herein and in U.S. Provisional App. 63/502,347, titled “Area Zones,” referred to as Docket No. 22-1002p, filed on May 15, 2023, the contents of which are incorporated herein by reference. vii. Area Zone

[0083] An Area Zone includes a set of two or more logical or physical playback entities grouped together. The playback entities that can be grouped together into an Area Zone include: (i) a zone, (ii) a playback device, (iii) a Multi-Player Playback Device, (iv) a logical player, and/or (iv) a playback group.

[0085] In some embodiments, and similar to some zone configurations (described above), the playback system manages all of the playback entities within an Area Zone as a single logical playback entity. And similar to some playback group configurations (described above), in some embodiments, the individual playback entities can be managed and configured independently of each other.

[0086] One difference between some zone implementations (described above) on the one hand, and an Area Zone on the other, is that in some existing systems, prior zone configurations cannot be saved, and then activated or deactivated during operation of the playback system. With some prior zone implementations, the zone is configured typically when the playback devices forming the zone are added to the playback system. In contrast to those prior zone implementations, with some Area Zone embodiments, an individual playback entity can save several different Area Zone configurations and switch between operating in each of the different saved Area Zone configurations.

[0087] Another difference between zones and playback groups (described above) on the one hand, and an Area Zone on the other, is that unlike prior zone and playback group configurations, the playback entities within an Area Zone do not all communicate with each other in a fully-connected control plane configuration to exchange commands, configuration information, events, and state information with each other. In some examples, this fully connected control plane is implemented via dedicated websocket connections between each pair of playback entities within the Area Zone.

[0088] Instead, the Area Zone Primary communicates with each Area Zone Secondary (and each Area Zone Secondary communicates with the Area Zone Primary) to exchange commands, configuration information, events, and state information in a hierarchical control plane. In contrast to how group members within a playback group maintain a fully-connected mesh between each other (e.g., via dedicated websocket connections in some instances) to facilitate communication between the different group members, in normal operation, the Area Zone Secondaries within the same Area Zone typically do not communicate with each other, and Area Zone Secondaries typically do not communicate with any other playback entity in a playback system other than their corresponding Area Zone Primary. Instead, the Area Zone Secondaries communicate with the Area Zone Primary which can, in turn, facilitate any exchange of commands, configuration information, events, or state information that may need to occur between two Area Zone Secondaries.

[0089] Additional details about Area Zones consistent with some example embodiments are disclosed and described herein and in (i) U.S. Provisional App. 63/502,347, titled “Area Zones,” referred to as Docket No. 22-1002p, filed on May 15, 2023, and (ii) U.S. Provisional 63/601,155, titled “Calibrating Playback Entities in Playback Networks Comprising Multiple Playback Entities,” referred to as Docket No. 23-0308p, filed on Nov. 20, 2023. The contents of Apps. 63/502,347 and 63/601,155 are incorporated herein by reference. viii. Area Zone Primary

[0090] An Area Zone Primary is a player (e.g., a playback device or Multi-Player Playback Device) that is configured to handle audio sourcing and control signaling on behalf of itself and all of the Area Zone Secondaries within an Area Zone. In some embodiments, the Area Zone Primary may also perform one or more (or all) functions of a Global State Aggregator as described in U.S. Provisional App. 63/377,978, titled “Global State Service,” referred to as Docket No. 23-0306p, filed on Sep. 30, 2022.

[0091] For media, in some implementations, the Area Zone Primary is configured to function as an audio sourcing device for itself and all of the Area Zone Secondaries within the Area Zone.

[0092] For control signaling, in some implementations, the Area Zone Primary communicates with each Area Zone Secondary to exchange commands, configuration information, events, and state information to implement playback, configuration, and control functions (e.g., volume, mute, playback start/stop, queue management, configuration management and updates) for the Area Zone.

[0093] In addition to exchanging commands, configuration information, events, and state information with each Area Zone Secondary within the Area Zone, each Area Zone Primary in some implementations is also configured to exchange commands, configuration information, events, and state information with (i) each (and every) other Area Zone Primary in the playback system and (ii) any controller device(s) or controller system(s) configured for controlling operation of the playback system. However, rather than exchanging commands, configuration information, events, and state information with each (and every) other Area Zone Primary in the playback system, in some embodiments each Area Zone Primary in some implementations is configured to exchange commands, configuration information, events, and state information with one or more Brokers (not every other Area Zone Primary) in the playback system. The use of Brokers in this and other manners is described in more detail in U.S. Provisional App. 63/377,978, titled “Global State Service,” referred to as Docket No. 23-0306p, filed on Sep. 30, 2022.

[0094] One example scenario that illustrates how an Area Zone Primary exchanges commands, configuration information, events, and state information with Area Zone Secondaries and/or controller device(s) and/or controller system(s) is where, after a playback entity configured as the Area Zone Primary receives a request for configuration or operational information about one or more playback entities within the Area Zone from a requesting device (e.g., a controller device, a controller system, or perhaps another playback entity in the playback system), the playback entity configured as the Area Zone Primary provides the requested information to the requesting device on behalf of the Area Zone. [0095] For example, in response to a request for a listing of playback entities in the Area Zone received from a requesting device, the Area Zone Primary transmits a listing of playback entities within the Area Zone to the requesting device. In another example, in response to a request for configuration information about a particular Area Zone Secondary received from a requesting device, and to the extent that the Area Zone Primary does not already have the requested configuration information for the particular Area Zone Secondary, the Area Zone Primary obtains the requested configuration information from the Area Zone Secondary. And regardless of whether the Area Zone Primary already had the configuration information for the particular Area Zone Secondary or acquired the requested configuration information from the particular Area Zone Secondary, the Area Zone Primary provides the requested configuration information for the particular Area Zone Secondary to the requesting device.

[0096] Additional details about Area Zone primaries consistent with some example embodiments are disclosed and described herein and in U.S. Provisional App. 63/502,347, titled “Area Zones,” referred to as Docket No. 22-1002p, filed on May 15, 2023, the contents of which are incorporated herein by reference. ix. Area Zone Secondary

[0097] An Area Zone Secondary is a logical or physical entity in the Area Zone that is not the Area Zone Primary for the Area Zone. In operation, each Area Zone Secondary within an Area Zone is configured to communicate with the Area Zone Primary to exchange commands, configuration information, events, and state information to implement playback, configuration, and control functions (e.g., volume, mute, playback start/stop, queue management, configuration management and updates) for the Area Zone.

[0098] As mentioned earlier, in normal operation, an Area Zone Secondary typically does not communicate with any other Area Zone Secondary within the Area Zone or any other playback entity within a playback system. However, an Area Zone Secondary in some instances may receive commands and/or inquiries from a controller device, and in some instances can establish a communication session with another Area Zone Secondary to exchange data in some circumstances.

[0099] One example scenario that illustrates how an Area Zone Secondary exchanges commands, configuration information, events, and state information with its Area Zone Primary is where, after a playback entity configured as an Area Zone Secondary receives a request for configuration or operational information about one or more playback entities within the Area Zone from a requesting device (e.g., a controller device, a controller system, or perhaps another playback entity in the playback system), the playback entity configured as the Area Zone Secondary forwards the received request to the Area Zone Primary. In some instances, the Area Zone Primary responds to the requesting device to provide the requested information.

[0100] For example, in response to a request for a listing of playback entities in the Area Zone received from a requesting device, the Area Zone Secondary forwards the request to the Area Zone Primary, and the Area Zone Primary transmits a listing of playback entities within the Area Zone to the requesting device. In another example, in response to a request for configuration information about a particular Area Zone Secondary received from a requesting device (including a request about itself), the Area Zone Secondary forwards the request to the Area Zone Primary, and the Area Zone Primary provides the requested information to the requesting device.

[0101] Additional details about Area Zone secondaries consistent with some example embodiments are disclosed and described herein and in U.S. Provisional App. 63/502,347, titled “Area Zones,” referred to as Docket No. 22-1002p, filed on May 15, 2023, the contents of which are incorporated herein by reference. X. Controller Device

[0102] A controller device (sometimes referred to herein as a controller or a control device) is a computing device or computing system with one or more processors and tangible, non- transitory computer readable media storing program instructions executable by the one or more processors to execute the controller device features and functions described herein. In some scenarios, a controller device is a smartphone, tablet computer, laptop computer, desktop computer, smartwatch, or similar computing device configured to execute a software user interface for configuring and controlling playback entities within a playback system. In some scenarios, a controller device may include one or more cloud server systems configured to communicate with one or more playback entities to configure and control the playback system.

[0103] In operation, user inputs associated with commands for configuring and controlling playback entities within a playback system can take a variety of forms, including but not limited to (i) physical inputs (e.g., actuating physical controls like knobs, sliders, buttons and so forth) (ii) software user interface inputs (e.g., inputs on a touch screen or similar graphical user interface), (iii) voice inputs, (iv) inputs received from another playback entity in the playback system (e.g., in the form of signaling from the playback entity to the controller in connection with effectuating configuration and control commands) and/or (v) any other type of input in any other form now known or later developed that is sufficient for conveying commands for configuring and controlling playback entities.

[0104] Additional details about area controller devices consistent with some example embodiments are disclosed and described herein and in U.S. Provisional App. 63/502,347, titled “Area Zones,” referred to as Docket No. 22-1002p, filed on May 15, 2023, the contents of which are incorporated herein by reference.

B. Hierarchical Control Plane for Command and Control Signaling

[0105] As mentioned above, one aspect of the Area Zone embodiments disclosed herein is a hierarchical control plane.

[0106] Exchanging command and control information via control plane implementations disclosed herein differs from prior command and control distribution schemes such as the ones disclosed in U.S. App. 13/489,674, titled “Device Playback Failure Recovery and Redistribution,” filed on Jun. 6, 2012, and issued on Dec. 2, 2014, as U.S. Pat. 8,903,526, the entire contents of which are incorporated herein by reference. The ‘674 application describes a command-and-control distribution scheme where a confirmed communication list (CCL) is be generated to facilitate communication between playback devices within a playback system. In one example, the CCL is a list of all playback devices in a zone configuration, where the CCL is ordered according to an optimal routing using the least number of hops or transition points through the network between the playback devices. In another case, the CCL is generated without consideration of network routing metrics. In either case, command and control data is passed from playback device to playback device within the zone configuration following the order in the CCL in a linear or serial manner. In one example, a first playback device sends a command to a second playback device in the CCL, and the second playback device in the CCL sends the command to a third playback device in the CCL, and so on until the command reaches its destination (i.e., a playback device in the CCL). For commands to be processed by all playback devices in the zone configuration, the commands are routed from playback device to playback device in the order specified in the CCL until every playback device in the CCL has received the commands. This arrangement is simple to execute, provides reliable transmission of information from playback device to playback device, and tends to work quite well for zones with a few playback devices.

[0107] However, for Area Zones with a lot of playback entities, the CCL-based approach can become impractical because it can take too long to distribute commands to a large number of playback entities or even to send a command to a single playback entity since every command is routed through the group on a serial, hop-by-hop basis according the CCL. [0108] In contrast to the above-described CCL approach and flat control plane implementations where many (or all) playback entities within a playback system communicate directly with each other to exchange configuration and control information throughout the playback system, some Area Zone embodiments disclosed herein employ a hierarchical control plane where commands, configuration information, events, and state information are exchanged only between the Area Zone Primary and each Area Zone Secondary within the Area Zone. For playback systems that may include several Area Zones, the Area Zone Primaries may exchange commands, configuration information, events, and state information with each other.

[0109] In the Area Zone embodiments disclosed herein, and in contrast to flat, fully meshed control plane implementations, the Area Zone Secondaries typically do not exchange commands, configuration information, events, and state information to each other via dedicated websocket connections (or similar communications links or sessions) with each other, except perhaps in a few rare instances described herein. [0110] In some scenarios, any commands, configuration information, events, and state information to be sent from a first Area Zone Secondary to a second Area Zone Secondary are sent from the first Area Zone Secondary to the Area Zone Primary. The Area Zone Primary then, in turn, (i) processes the command(s), configuration information, event(s), and/or state information received from the first Area Zone Secondary and instructs and/or updates the second Area Zone Secondary accordingly, or (ii) forwards the command(s), configuration information, event(s), and/or state information to the second Area Zone Secondary, as necessary.

[OHl] Some Area Zone embodiments additionally or alternatively employ a Configuration and Command (C&C) group (e.g., a multicast group) for distributing commands, configuration information, events, and state information to playback entities within the Area Zone.

[0112] In some instances, the Area Zone Primary creates the C&C group for the Area Zone (or joins the C&C group as a publisher), and provides information required to join and/or subscribe to the C&C group to each Area Zone Secondary. The Area Zone Secondaries join and/or subscribe to the C&C group to receive control information. In such embodiments, the Area Zone Primary publishes commands, configuration information, events, and state information to the C&C group, and each Area Zone Secondary subscribed to the C&C group receives the commands, configuration information, events, and state information that the Area Zone Primary publishes to the C&C group. In some embodiments, Area Zone Secondaries may also publish certain command, configuration, event, and state information to the Area Zone C&C group. In operation, each when an Area Zone Secondary receives a command via the C&C group that requires some action on behalf of the Area Zone Secondary, the Area Zone Secondary executes the received command.

[0113] Additional details about hierarchical control plane features and functionality for Area Zone configurations consistent with some example embodiments are disclosed and described herein and in U.S. Provisional App. 63/502,347, titled “Area Zones,” referred to as Docket No. 22-1002p, filed on May 15, 2023, the contents of which are incorporated herein by reference.

C. Media and Timing Distribution

[0114] Another aspect of some Area Zone embodiments disclosed herein is how audio content and playback timing are distributed to individual playback entities within the Area Zone. In some existing zone or playback group configurations, the playback device designated as the audio sourcing device is configured to transmit audio content and playback timing for the audio content to each other playback device in the zone or playback group via unicast transmissions from the audio sourcing device to each other playback device in the zone or playback group.

[0115] However, in some Area Zone embodiments disclosed herein, the Area Zone Primary is configured to transmit audio content and playback timing for the audio content to a media distribution group (e.g., a multicast group), and Area Zone Secondaries subscribe to the media group to receive the audio content and playback timing, e.g., each Area Zone Secondary joins the media multicast group and receives audio content and playback timing via the media multicast group.

[0116] Some embodiments employ a hybrid unicast / multicast approach where the Area Zone Primary is configured to (i) distribute audio content and playback timing via unicast transmissions to Area Zone Secondaries that are wirelessly connected to the playback system, e.g., via WiFi, Bluetooth, or other suitable wireless connection, and (ii) distribute audio content and playback timing via multicast transmissions to Area Zone Secondaries that are connected to the playback system via wired connections, e.g., Ethernet, Power over Ethernet (PoE), Universal Serial Bus (USB), or other suitable wired connection.

[0117] Some Area Zone embodiments employ a Media and Timing (M&T) group (e.g., a multicast group) for distributing audio content and playback timing to playback entities within the Area Zone. In some Area Zone embodiments, clock timing is also distributed to the playback entities within the Area Zone via the M&T group. In some instances, the Area Zone Primary creates the M&T group for the Area Zone (or joins the M&T group as a publisher), and provides information required to join and/or subscribe to the M&T group to each Area Zone Secondary. The Area Zone Secondaries join and/or subscribe to the M&T group to receive audio content, playback timing, and in some instances, clock timing. In such embodiments, the Area Zone Primary publishes the audio content, playback timing, and clock timing to the M&T group, and each Area Zone Secondary subscribed to the M&T group receives the audio content, playback timing, and clock timing that the Area Zone Primary publishes to the M&T group.

[0118] Additional details about media and timing data for Area Zone configurations consistent with some example embodiments are disclosed and described herein and in U.S. Provisional App. 63/502,347, titled “Area Zones,” referred to as Docket No. 22-1002p, filed on May 15, 2023, the contents of which are incorporated herein by reference. D. Storing and Recalling Area Zone Configuration Information

[0119] Another aspect of some Area Zone embodiments disclosed herein includes the Area Zone configuration data and how the Area Zone configuration data is stored and recalled to activate an Area Zone configuration.

[0120] In contrast to some prior playback group configurations and zone configurations where every playback device maintains an up-to-date version of the group/zone configuration data, Area Zone configurations according to some embodiments include storing the Area Zone configuration information in a single Area Zone configuration package. In some instances, the Area Zone configuration package includes separate Area Zone configuration files for each playback entity in the Area Zone. In some instances, the Area Zone configuration package is stored at one or more of (i) the Area Zone Primary, (ii) a controller device, and/or (iii) a cloud server system.

[0121] The Area Zone configuration information in some embodiments includes information about the Area Zone configuration, including but not limited to one or more (or all) of (i) a name that identifies the Area Zone, (ii) the name of the playback entity that is to function as the Area Zone Primary once the Area Zone is activated, (iii) the name of each playback entity in the Area Zone (i.e., the UUID of each playback device, the UUID of each Multi-Player Playback Device, the ULPI of each logical player, the zone name of each zone, and group name of each playback group in the Area Zone, as applicable), (iv) playback calibration settings for each playback entity in the Area Zone, (v) for Area Zones configured to play multichannel audio, a channel map that defines the channel or channels that each playback entity is configured to play once the Area Zone has been activated, including at least in some instances, which channel each output port of each playback entity is configured to play while the Area Zone is active, and (vi) for individual playback entities, one or more configuration setting(s) that define certain Area Zone specific behavior of the playback entity while the Area Zone is active.

[0122] Examples of Area Zone specific behavior while the Area Zone is active include: (i) behavior of the playback entity in response to receiving a volume control command, a playback control command (e.g., play/pause/skip/etc.), and/or other user command via a physical interface on (or associated with) the playback entity, and (ii) behavior of the playback entity in response to receiving a voice command via an associated microphone. [0123] Additional details about storing and recalling Area Zone configuration consistent with some example embodiments information are disclosed and described herein and in U.S. Provisional App. 63/502,347, titled “Area Zones,” referred to as Docket No. 22-1002p, filed on May 15, 2023, the contents of which are incorporated herein by reference.

[0125] For example, with respect to streaming audio content and playback timing to individual playback devices, U.S. App. 10/816,217, titled “System And Method For Synchronizing Operations Among A Plurality Of Independently Clocked Digital Data Processing Devices,” referred to as Docket No. 04-0401, filed on Apr. 1, 2004, and issued on Jul. 31, 2012, as U.S. Pat. 8,234,395, describes, inter alia, independently-clocked playback devices that are configured to play audio content in synchrony with each other based on playback timing and clock timing information. However, U.S. App. 10/816,217 does not describe the Area Zone configurations, Multi-Player Playback Devices, playback entity implementations, logical players, and/or the multi-stream audio routing architecture or features disclosed herein. The entire contents of U.S. App. 10/816,217 are incorporated herein by reference.

[0127] Other aspects of groupwise control of groups of playback devices are described in U.S. App. 13/910,608, titled “Satellite Volume Control,” referred to as Docket No. 13-0413, filed on June. 5, 2013, and issued on Sep. 6, 2016, as U.S. Pat. 9,438,193. For example, U.S. App. 13/910,608 discloses, inter alia, controlling playback volume of grouped playback devices, including propagating a volume adjustment received via a first playback device to other playback devices that have been grouped with the first playback device. However, U.S. App. 13/910,608 does not describe does not describe the Area Zone configurations, Multi- Player Playback Devices, playback entity implementations, logical players, and/or the multistream audio routing architecture or features disclosed herein. The entire contents of U.S. App. 13/910,608 are incorporated herein by reference.

[0128] Further, several earlier-filed applications describe aspects of specialized groupings of playback devices within a playback system. For example, U.S. App. 13/013,740, titled “Controlling and grouping in a multi -zone media system,” referred to as Docket No. 11-0101, filed on Jan. 25, 2011, and issued on Dec. 1, 2015, as U.S. Pat. 9,202,509, describes, inter alia, configuring and operating two playback devices in a “paired” configuration such as a “stereo pair” configuration, where one playback device is configured to play a right stereo channel of audio content and the other playback device is configured to play a left stereo channel of audio content.

[0130] Further, U.S. App. 13/632,731, titled, “Providing A Multi-Channel And A MultiZone Audio Environment,” referred to as Docket No. 12-0802, filed on Oct. 1, 2012, and issued on Dec. 6, 2016, as U.S. Pat. 9,516,440, describes, inter alia, configuring playback devices to play different types of audio content (e.g., home theater audio vs. music) according to different playback timing arrangements (e.g., with low-latency vs. with ordinary latency). [0131] Additionally, U.S. App. 14/731,119, titled, “Dynamic Bonding of Playback Devices,” referred to as Docket No. 15-0301, filed on Jun. 4, 2015, and issued on Jan. 9, 2018, as U.S. Pat. 9,864,571, discloses, inter alia, dynamic bonding scenarios and playback devices that are “sharable” among different zones. And U.S. App. 14/997,269, titled, “System Limits Based on Known Triggers,” referred to as Docket 15-1104, filed on Jan. 15, 2016, and issued on Feb. 20, 2018, as U.S. Pat. 9,898,245, describes, inter alia, methods of setting up multiple playback devices.

[0132] Further still, U.S. Provisional App. 63/377,948, titled “Playback System Architecture,” referred to as Docket No. 21-0703p, filed on Sep. 30, 2022, describes, inter alia, multi-tier hierarchical playback systems comprising several playback devices, including methods and processes of distributing audio and control signaling between and among playback devices with the multi-tier hierarchical playback system.

[0133] However, U.S. Apps. 13/013,740; 13/083,499; 13/632,731; 14/731,119; 14/997,269; and 63/377,948 do not describe the Area Zone configurations, Multi-Player Playback Devices, playback entity implementations, logical players, and/or the multi-stream audio routing architecture or features disclosed herein. The entire contents of U.S. Apps. 13/013,740; 13/083,499; 13/632,731; 14/731,119; 14/997,269; and 63/377,948 are incorporated herein by reference.

[0134] Additionally, several other advancements over time have improved the overall functionality and usability of playback devices configured in groups for synchronous playback of audio content.

[0135] With respect to managing groups of playback devices configured for groupwise playback of audio content, U.S. App. 14/042,001, titled “Coordinator Device for Paired or Consolidated Players,” referred to as Docket No. 13-0812, filed on Dec. 30, 2013, and issued on Mar. 15, 2016, as U.S. Pat. 9,288,596, and U.S. App. 14/041,989, titled “Group Coordinator Device Selection,” referred to as Docket No. 13-0815, filed on Sep. 30, 2013, and issued on May 16, 2017, as U.S. Pat. 9,654,545 disclose, inter alia, certain techniques whereby individual playback devices configured to operate in a groupwise manner decide which playback device should function as a group coordinator for the group of playback devices.

[0136] U.S. App. 14/988,524, titled, “Multiple-Device Setup,” referred to as Docket No. 15-1103, filed Jan. 5, 2016, and issued May 28, 2019, as U.S. Pat. 10,303,422, discloses, inter alia, techniques for adding several new playback devices to a playback system at the same time, including, in scenarios when two or more of the same type of playback devices are detected during setup, causing one of the two or more playback devices to emit a sound that enables a user to identify the one playback device in connection with playback system configuration and setup.

[0137] U.S. App. 16/119,516, titled, “Media Playback System with Virtual Line-In,” referred to as Docket No. 18-0406, filed on Aug. 31, 2018, and issued on Oct. 22, 2019, as U.S. Pat. 10,452,345, and U.S. App. 16/119,642, titled, “Interoperability Of Native Media Playback System With Virtual Line-In,” referred to as Docket No. 18-0503, filed on Aug. 31, 2018, and issued on May 12, 2020, as U.S. Pat. 10,649,718, (both of which claim priority to U.S. Prov. App. 62/672,020, titled “Media Playback System with Virtual Line-In,” referred to as Docket No. 18-0406p, filed on May 15, 2018, and now expired) describe, inter alia, scenarios where one playback device in one playback system coordinates aspects of playback and control of another playback device in a different playback system in connection with facilitating interoperability between to two different playback systems.

[0138] U.S. App. 16/415,783, titled, “Wireless Multi-Channel Headphone Systems and Methods,” referred to as Docket No. 19-0303, filed on May 17, 2019, and issued on Nov. 16, 2021, as U.S. Pat. 11,178,504, describes, inter alia, a surround sound controller and one or more wireless headphones that switch between operating in various modes that have different latency characteristics. For example, in a first mode, the surround sound controller uses a first Modulation and Coding Scheme (MCS) to transmit first surround sound audio information to a first pair of headphones, and in a second mode, the surround sound controller uses a second MCS to transmit (a) the first surround sound audio information to the first pair of headphones and (b) second surround sound audio information to a second pair of headphones.

[0139] However, U.S. Apps. 13/489,674; 14/042,001; 14/041,989; 14/988,524; 16/119,516; 16/119,642; 62/672,020; and 16/415,783 do not describe the Area Zone configurations, Multi-Player Playback Devices, playback entity implementations, logical players, and/or the multi-stream audio routing architecture or features disclosed herein. The entire contents of U.S. Apps. 13/489,674; 14/042,001; 14/041,989; 14/988,524; 16/119,516; 16/119,642; 62/672,020; and 16/415,783 are incorporated herein by reference.

[0140] Further, some advancements over time have improved the sound quality of audio content played by playback devices via methods of calibrating playback settings (e.g., audio playback settings) for playback devices based on acoustic characteristics of the listening environment in which the playback devices are situated. At a high level, calibrating playback settings includes, inter alia, determining one or more acoustic characteristics of the listening environment, and adjusting one or more audio playback settings (e.g., equalization settings, relative loudness, playing timing delays, and/or perhaps other settings) based on the acoustic characteristics.

[0141] For example, U.S. App. 15/211,822, titled, “Spatial Audio Correction,” referred to as Docket No. 16-0402, filed on Jul. 15, 2016, and issued on Oct. 17, 2017, as U.S. Pat. 9,794,710 describes, inter alia, determining a spatial and/or spectral calibration for one or more playback devices within a listening area. Similarly, U.S. App. 16/115,524, titled “Playback Device Calibration,” referred to as Docket No. 18-0401, filed on Aug. 28, 2018, and issued on May 21, 2019, as U.S. Pat. 10,299,061 describes, inter alia, calibrating a playback device within a room so that the audio output by the playback device accounts for (e.g., offsets) acoustic characteristics of that room, thereby improving sound of the audio playback experienced by a listener within the room.

[0142] Additionally, U.S. App. 15/630,214, titled, “Immersive Audio in a Media Playback System,” referred to as Docket No. 16-0504, filed on Jun. 22, 2017, and issued on Jul. 17, 2018, as U.S. Pat. 10,028,069 describes, inter alia, processes that include obtaining audio responses from several different playback devices in a media playback system. For example, a first playback device at a first time plays back calibration audio while a microphone device records the calibration audio being played back. A second playback device at a second time plays back the calibration audio while the microphone device records the calibration audio being played. The process is repeated until every playback device in the media playback system has played the calibration audio and had its response recorded.

[0143] Further, IntT App. PCT/US22/77233, titled “Audio Parameter Adjustment Based on Playback Device Separation Distance,” referred to as Docket No. 21-0605-PCT, and published as WO 2023/056336 on Apr. 6, 2023, disclosed, inter alia, applying a low frequency filter to two or more devices in a zone based on a distance between the devices. [0144] However, Apps. 15/211,822; 16/115,524; 15/630,214; and PCT/US22/77233 do not describe the Area Zone configurations, Multi-Player Playback Devices, playback entity implementations, and/or the multi-stream audio routing architecture or features disclosed herein. The entire contents of Apps. 15/211,822; 16/115,524; 15/630,214; and PCT/US22/77233 are incorporated herein by reference.

[0145] While some examples described herein may refer to functions performed by given actors such as “users,” “listeners,” and/or other entities, it should be understood that this is for purposes of explanation only. The claims should not be interpreted to require action by any such example actor unless explicitly required by the language of the claims themselves.

[0146] In the Figures, identical reference numbers identify generally similar, and/or identical, elements. To facilitate the discussion of any particular element, the most significant digit or digits of a reference number refers to the Figure in which that element is first introduced. For example, element 110a is first introduced and discussed with reference to Figure 1 A. Many of the details, dimensions, angles and other features shown in the Figures are merely illustrative of particular example configurations of the disclosed technology. Accordingly, other example configurations can have other details, dimensions, angles and features without departing from the spirit or scope of the disclosure. In addition, those of ordinary skill in the art will appreciate that further example configurations of the various disclosed technologies can be practiced without several of the details described below. II. Suitable Operating Environment

[0147] Figure 1 A is a partial cutaway view of a media playback system 100 distributed in an environment 101 (e.g., a house). The media playback system 100 comprises one or more playback devices 110 (identified individually as playback devices 1 lOa-n), one or more network microphone devices (“NMDs”), 120 (identified individually as NMDs 120a-c), and one or more control devices 130 (identified individually as control devices 130a and 130b). [0148] As used herein the term “playback device” can generally refer to a network device configured to receive, process, and output data of a media playback system. For example, a playback device can be a network device that receives and processes audio content. In some example configurations, a playback device includes one or more transducers or speakers powered by one or more amplifiers. In other example configurations, however, a playback device includes one of (or neither of) the speaker and the amplifier. For instance, a playback device can comprise one or more amplifiers configured to drive one or more speakers external to the playback device via a corresponding wire or cable.

[0149] Moreover, as used herein the term NMD (i.e., a “network microphone device”) can generally refer to a network device that is configured for audio detection. In some example configurations, an NMD is a stand-alone device configured primarily for audio detection. In other example configurations, an NMD is incorporated into a playback device (or vice versa). [0150] The term “control device” can generally refer to a network device configured to perform functions relevant to facilitating user access, control, and/or configuration of the media playback system 100.

[0151] Each of the playback devices 110 is configured to receive audio signals or data from one or more media sources (e.g., one or more remote servers, one or more local devices) and play back the received audio signals or data as sound. The one or more NMDs 120 are configured to receive spoken word commands, and the one or more control devices 130 are configured to receive user input. In response to the received spoken word commands and/or user input, the media playback system 100 can play back audio via one or more of the playback devices 110. In certain example configurations, the playback devices 110 are configured to commence playback of media content in response to a trigger. For instance, one or more of the playback devices 110 can be configured to play back a morning playlist upon detection of an associated trigger condition (e.g., presence of a user in a kitchen, detection of a coffee machine operation). In some example configurations, for example, the media playback system 100 is configured to play back audio from a first playback device (e.g., the playback device 100a) in synchrony with a second playback device (e.g., the playback device 100b). Interactions between the playback devices 110, NMDs 120, and/or control devices 130 of the media playback system 100 configured in accordance with the various example configurations of the disclosure are described in greater detail below with respect to Figures 1B-1L.

[0152] In the illustrated embodiment of Figure 1A, the environment 101 comprises a household having several rooms, spaces, and/or playback zones, including (clockwise from upper left) a master bathroom 101a, a master bedroom 101b, a second bedroom 101c, a family room or den 101 d, an office lOle, a living room 10 If, a dining room 101g, a kitchen lOlh, and an outdoor patio lOli. While certain example configurations and examples are described below in the context of a home environment, the technologies described herein may be implemented in other types of environments. In some example configurations, for example, the media playback system 100 can be implemented in one or more commercial settings (e.g., a restaurant, mall, airport, hotel, a retail or other store), one or more vehicles (e.g., a sports utility vehicle, bus, car, a ship, a boat, an airplane), multiple environments (e.g., a combination of home and vehicle environments), and/or another suitable environment where multi-zone audio may be desirable.

[0153] The media playback system 100 can comprise one or more playback zones, some of which may correspond to the rooms in the environment 101. The media playback system 100 can be established with one or more playback zones, after which additional zones may be added, or removed to form, for example, the configuration shown in Figure 1 A. Each zone may be given a name according to a different room or space such as the office lOle, master bathroom 101a, master bedroom 101b, the second bedroom 101c, kitchen lOlh, dining room 101g, living room 10 If, and/or the patio lOli. In some aspects, a single playback zone may include multiple rooms or spaces. In certain aspects, a single room or space may include multiple playback zones.

[0154] In the illustrated embodiment of Figure 1A, the master bathroom 101a, the second bedroom 101c, the office lOle, the living room 10 If, the dining room 101g, the kitchen lOlh, and the outdoor patio lOli each include one playback device 110, and the master bedroom 101b and the den 101 d include a plurality of playback devices 110. In the master bedroom 101b, the playback devices 1101 and 110m may be configured, for example, to play back audio content in synchrony as individual ones of playback devices 110, as a bonded playback zone, as a consolidated playback device, and/or any combination thereof. Similarly, in the den 101 d, the playback devices 1 lOh-j can be configured, for instance, to play back audio content in synchrony as individual ones of playback devices 110, as one or more bonded playback devices, and/or as one or more consolidated playback devices. Additional details regarding bonded and consolidated playback devices are described below with respect to, for example, Figures IB and IE and 1I-1M.

[0155] In some aspects, one or more of the playback zones in the environment 101 may each be playing different audio content. For instance, a user may be grilling on the patio lOli and listening to hip hop music being played by the playback device 110c while another user is preparing food in the kitchen lOlh and listening to classical music played by the playback device 110b. In another example, a playback zone may play the same audio content in synchrony with another playback zone. For instance, the user may be in the office lOle listening to the playback device 1 lOf playing back the same hip hop music being played back by playback device 110c on the patio lOli. In some aspects, the playback devices 110c and 1 lOf play back the hip hop music in synchrony such that the user perceives that the audio content is being played seamlessly (or at least substantially seamlessly) while moving between different playback zones. Additional details regarding audio playback synchronization among playback devices and/or zones can be found, for example, in U.S. Patent No. 8,234,395 entitled, “System and method for synchronizing operations among a plurality of independently clocked digital data processing devices,” which is incorporated herein by reference in its entirety. a. Suitable Media Playback System

[0156] Figure IB is a schematic diagram of the media playback system 100 and a cloud network 102. For ease of illustration, certain devices of the media playback system 100 and the cloud network 102 are omitted from Figure IB. One or more communications links 103 (referred to hereinafter as “the links 103”) communicatively couple the media playback system 100 and the cloud network 102.

[0157] The links 103 can comprise, for example, one or more wired networks, one or more wireless networks, one or more wide area networks (WAN), one or more local area networks (LAN), one or more personal area networks (PAN), one or more telecommunication networks (e.g., one or more Global System for Mobiles (GSM) networks, Code Division Multiple Access (CDMA) networks, Long-Term Evolution (LTE) networks, 5G communication network networks, and/or other suitable data transmission protocol networks), etc. The cloud network 102 is configured to deliver media content (e.g., audio content, video content, photographs, social media content) to the media playback system 100 in response to a request transmitted from the media playback system 100 via the links 103. In some example configurations, the cloud network 102 is further configured to receive data (e.g., voice input data) from the media playback system 100 and correspondingly transmit commands and/or media content to the media playback system 100.

[0158] The cloud network 102 comprises computing devices 106 (identified separately as a first computing device 106a, a second computing device 106b, and a third computing device 106c). The computing devices 106 can comprise individual computers or servers, such as, for example, a media streaming service server storing audio and/or other media content, a voice service server, a social media server, a media playback system control server, etc. In some example configurations, one or more of the computing devices 106 comprise modules of a single computer or server. In certain example configurations, one or more of the computing devices 106 comprise one or more modules, computers, and/or servers. Moreover, while the cloud network 102 is described above in the context of a single cloud network, in some example configurations the cloud network 102 comprises a plurality of cloud networks comprising communicatively coupled computing devices. Furthermore, while the cloud network 102 is shown in Figure IB as having three of the computing devices 106, in some example configurations, the cloud network 102 comprises fewer (or more than) three computing devices 106.

[0159] The media playback system 100 is configured to receive media content from the networks 102 via the links 103. The received media content can comprise, for example, a Uniform Resource Identifier (URI) and/or a Uniform Resource Locator (URL). For instance, in some examples, the media playback system 100 can stream, download, or otherwise obtain data from a URI or a URL corresponding to the received media content. A network 104 communicatively couples the links 103 and at least a portion of the devices (e.g., one or more of the playback devices 110, NMDs 120, and/or control devices 130) of the media playback system 100. The network 104 can include, for example, a wireless network (e.g., a WiFi network, a Bluetooth, a Z-Wave network, a ZigBee, and/or other suitable wireless communication protocol network) and/or a wired network (e.g., a network comprising Ethernet, Universal Serial Bus (USB), and/or another suitable wired communication). As those of ordinary skill in the art will appreciate, as used herein, “WiFi” can refer to several different communication protocols including, for example, Institute of Electrical and Electronics Engineers (IEEE) 802.11a, 802.11b, 802.11g, 802.1 In, 802.1 lac, 802.1 lac, 802. Had, 802.11af, 802.11ah, 802.1 lai, 802.1 laj, 802.11aq, 802.11ax, 802.11ay, 802.15, etc. transmitted at 2.4 Gigahertz (GHz), 5 GHz, and/or another suitable frequency. [0160] In some example configurations, the network 104 comprises a dedicated communication network that the media playback system 100 uses to transmit messages between individual devices and/or to transmit media content to and from media content sources (e.g., one or more of the computing devices 106). In certain example configurations, the network 104 is configured to be accessible only to devices in the media playback system 100, thereby reducing interference and competition with other household devices. In other example configurations, however, the network 104 comprises an existing household communication network (e.g., a household WiFi network). In some example configurations, the links 103 and the network 104 comprise one or more of the same networks. In some aspects, for example, the links 103 and the network 104 comprise a telecommunication network (e.g., an LTE network, a 5G network). Moreover, in some example configurations, the media playback system 100 is implemented without the network 104, and devices comprising the media playback system 100 can communicate with each other, for example, via one or more direct connections, PANs, telecommunication networks, and/or other suitable communications links.

[0161] In some example configurations, audio content sources may be regularly added or removed from the media playback system 100. In some example configurations, for example, the media playback system 100 performs an indexing of media items when one or more media content sources are updated, added to, and/or removed from the media playback system 100. The media playback system 100 can scan identifiable media items in some or all folders and/or directories accessible to the playback devices 110, and generate or update a media content database comprising metadata (e.g., title, artist, album, track length) and other associated information (e.g., URIs, URLs) for each identifiable media item found. In some example configurations, for example, the media content database is stored on one or more of the playback devices 110, network microphone devices 120, and/or control devices 130. [0162] In the illustrated embodiment of Figure IB, the playback devices 1101 and 110m comprise a group 107a. The playback devices 1101 and 110m can be positioned in different rooms in a household and be grouped together in the group 107a on a temporary or permanent basis based on user input received at the control device 130a and/or another control device 130 in the media playback system 100. When arranged in the group 107a, the playback devices 1101 and 110m can be configured to play back the same or similar audio content in synchrony from one or more audio content sources. In certain example configurations, for example, the group 107a comprises a bonded zone in which the playback devices 1101 and 110m comprise left audio and right audio channels, respectively, of multi- channel audio content, thereby producing or enhancing a stereo effect of the audio content. In some example configurations, the group 107a includes additional playback devices 110. In other example configurations, however, the media playback system 100 omits the group 107a and/or other grouped arrangements of the playback devices 110. Additional details regarding groups and other arrangements of playback devices are described in further detail below with respect to Figures 1-1 through IM.

[0163] The media playback system 100 includes the NMDs 120a and 120d, each comprising one or more microphones configured to receive voice utterances from a user. In the illustrated embodiment of Figure IB, the NMD 120a is a standalone device and the NMD 120d is integrated into the playback device 1 lOn. The NMD 120a, for example, is configured to receive voice input 121 from a user 123. In some example configurations, the NMD 120a transmits data associated with the received voice input 121 to a voice assistant service (VAS) configured to (i) process the received voice input data and (ii) transmit a corresponding command to the media playback system 100. In some aspects, for example, the computing device 106c comprises one or more modules and/or servers of a VAS (e.g., a VAS operated by one or more of SONOS®, AMAZON®, GOOGLE® APPLE®, MICROSOFT®). The computing device 106c can receive the voice input data from the NMD 120a via the network 104 and the links 103. In response to receiving the voice input data, the computing device 106c processes the voice input data (i.e., “Play Hey Jude by The Beatles”), and determines that the processed voice input includes a command to play a song (e.g., “Hey Jude”). The computing device 106c accordingly transmits commands to the media playback system 100 to play back “Hey Jude” by the Beatles from a suitable media service (e.g., via one or more of the computing devices 106) on one or more of the playback devices 110. b. Suitable Playback Devices

[0164] Figure 1C is a block diagram of the playback device 110a comprising an input/output 111. The input/output 111 can include an analog I/O I l la (e.g., one or more wires, cables, and/or other suitable communications links configured to carry analog signals) and/or a digital I/O 111b (e.g., one or more wires, cables, or other suitable communications links configured to carry digital signals). In some example configurations, the analog I/O 11 la is an audio line-in input connection comprising, for example, an auto-detecting 3.5mm audio line-in connection. In some example configurations, the digital I/O 111b comprises a Sony/Philips Digital Interface Format (S/PDIF) communication interface and/or cable and/or a Toshiba Link (TOSLINK) cable. In some example configurations, the digital I/O 111b comprises a High-Definition Multimedia Interface (HDMI) interface and/or cable. In some example configurations, the digital I/O 111b includes one or more wireless communications links comprising, for example, a radio frequency (RF), infrared, WiFi, Bluetooth, or another suitable communication protocol. In certain example configurations, the analog VO I l la and the digital VO 111b comprise interfaces (e.g., ports, plugs, jacks) configured to receive connectors of cables transmitting analog and digital signals, respectively, without necessarily including cables.

[0165] The playback device 110a, for example, can receive media content (e.g., audio content comprising music and/or other sounds) from a local audio source 105 via the input/output 111 (e.g., a cable, a wire, a PAN, a Bluetooth connection, an ad hoc wired or wireless communication network, and/or another suitable communications link). The local audio source 105 can comprise, for example, a mobile device (e.g., a smartphone, a tablet, a laptop computer) or another suitable audio component (e.g., a television, a desktop computer, an amplifier, a phonograph, a Blu-ray player, a memory storing digital media files). In some aspects, the local audio source 105 includes local music libraries on a smartphone, a computer, a networked-attached storage (NAS), and/or another suitable device configured to store media files. In certain example configurations, one or more of the playback devices 110, NMDs 120, and/or control devices 130 comprise the local audio source 105. In other example configurations, however, the media playback system omits the local audio source 105 altogether. In some example configurations, the playback device 110a does not include an input/output 111 and receives all audio content via the network 104.

[0166] The playback device 110a further comprises electronics 112, a user interface 113 (e.g., one or more buttons, knobs, dials, touch-sensitive surfaces, displays, touchscreens), and one or more transducers 114 (referred to hereinafter as “the transducers 114”). The electronics 112 is configured to receive audio from an audio source (e.g., the local audio source 105) via the inpu output 111, one or more of the computing devices 106a-c via the network 104 (Figure IB)), amplify the received audio, and output the amplified audio for playback via one or more of the transducers 114. In some example configurations, the playback device 110a optionally includes one or more microphones 115 (e.g., a single microphone, a plurality of microphones, a microphone array) (hereinafter referred to as “the microphones 115”). In certain example configurations, for example, the playback device 110a having one or more of the optional microphones 115 can operate as an NMD configured to receive voice input from a user and correspondingly perform one or more operations based on the received voice input. [0167] In the illustrated embodiment of Figure 1C, the electronics 112 comprise one or more processors 112a (referred to hereinafter as “the processors 112a”), memory 112b, software components 112c, a network interface 112d, one or more audio processing components 112g (referred to hereinafter as “the audio components 112g”), one or more audio amplifiers 112h (referred to hereinafter as “the amplifiers 112h”), and power 112i (e.g., one or more power supplies, power cables, power receptacles, batteries, induction coils, Power-over Ethernet (POE) interfaces, and/or other suitable sources of electric power). In some example configurations, the electronics 112 optionally include one or more other components 112j (e.g., one or more sensors, video displays, touchscreens, battery charging bases).

[0168] The processors 112a can comprise clock-driven computing component(s) configured to process data, and the memory 112b can comprise a computer-readable medium (e.g., a tangible, non-transitory computer-readable medium, data storage loaded with one or more of the software components 112c) configured to store instructions for performing various operations and/or functions. The processors 112a are configured to execute the instructions stored on the memory 112b to perform one or more of the operations. The operations can include, for example, causing the playback device 110a to retrieve audio information from an audio source (e.g., one or more of the computing devices 106a-c (Figure IB)), and/or another one of the playback devices 110. In some example configurations, the operations further include causing the playback device 110a to send audio information to another one of the playback devices 110a and/or another device (e.g., one of the NMDs 120). Certain example configurations include operations causing the playback device 110a to pair with another of the one or more playback devices 110 to enable a multi-channel audio environment (e.g., a stereo pair, a bonded zone).

[0172] In the illustrated embodiment of Figure 1C, the network interface 112d comprises one or more wireless interfaces 112e (referred to hereinafter as “the wireless interface 112e”). The wireless interface 112e (e.g., a suitable interface comprising one or more antennae) can be configured to wirelessly communicate with one or more other devices (e.g., one or more of the other playback devices 110, NMDs 120, and/or control devices 130) that are communicatively coupled to the network 104 (Figure IB) in accordance with a suitable wireless communication protocol (e.g., WiFi, Bluetooth, LTE). In some example configurations, the network interface 112d optionally includes a wired interface 112f (e.g., an interface or receptacle configured to receive a network cable such as an Ethernet, a USB-A, USB-C, and/or Thunderbolt cable) configured to communicate over a wired connection with other devices in accordance with a suitable wired communication protocol. In certain example configurations, the network interface 112d includes the wired interface 112f and excludes the wireless interface 112e. In some example configurations, the electronics 112 excludes the network interface 112d altogether and transmits and receives media content and/or other data via another communication path (e.g., the input/output 111).

[0173] The audio processing components 112g are configured to process and/or filter data comprising media content received by the electronics 112 (e.g., via the input/output 111 and/or the network interface 112d) to produce output audio signals. In some example configurations, the audio processing components 112g comprise, for example, one or more digital-to-analog converters (DAC), audio preprocessing components, audio enhancement components, digital signal processors (DSPs), and/or other suitable audio processing components, modules, circuits, etc. In certain example configurations, one or more of the audio processing components 112g can comprise one or more subcomponents of the processors 112a. In some example configurations, the electronics 112 omits the audio processing components 112g. In some aspects, for example, the processors 112a execute instructions stored on the memory 112b to perform audio processing operations to produce the output audio signals.

[0174] The amplifiers 112h are configured to receive and amplify the audio output signals produced by the audio processing components 112g and/or the processors 112a. The amplifiers 112h can comprise electronic devices and/or components configured to amplify audio signals to levels sufficient for driving one or more of the transducers 114. In some example configurations, for example, the amplifiers 112h include one or more switching or class-D power amplifiers. In other example configurations, however, the amplifiers include one or more other types of power amplifiers (e.g., linear gain power amplifiers, class-A amplifiers, class-B amplifiers, class-AB amplifiers, class-C amplifiers, class-D amplifiers, class-E amplifiers, class-F amplifiers, class-G and/or class H amplifiers, and/or another suitable type of power amplifier). In certain example configurations, the amplifiers 112h comprise a suitable combination of two or more of the foregoing types of power amplifiers. Moreover, in some example configurations, individual ones of the amplifiers 112h correspond to individual ones of the transducers 114. In other example configurations, however, the electronics 112 includes a single one of the amplifiers 112h configured to output amplified audio signals to a plurality of the transducers 114. In some other example configurations, the electronics 112 omits the amplifiers 112h.

[0175] The transducers 114 (e.g., one or more speakers and/or speaker drivers) receive the amplified audio signals from the amplifier 112h and render or output the amplified audio signals as sound (e.g., audible sound waves having a frequency between about 20 Hertz (Hz) and 20 kilohertz (kHz)). In some example configurations, the transducers 114 can comprise a single transducer. In other example configurations, however, the transducers 114 comprise a plurality of audio transducers. In some example configurations, the transducers 114 comprise more than one type of transducer. For example, the transducers 114 can include one or more low frequency transducers (e.g., subwoofers, woofers), mid-range frequency transducers (e.g., mid-range transducers, mid-woofers), and one or more high frequency transducers (e.g., one or more tweeters). As used herein, “low frequency” can generally refer to audible frequencies below about 500 Hz, “mid-range frequency” can generally refer to audible frequencies between about 500 Hz and about 2 kHz, and “high frequency” can generally refer to audible frequencies above 2 kHz. In certain example configurations, however, one or more of the transducers 114 comprise transducers that do not adhere to the foregoing frequency ranges. For example, one of the transducers 114 may comprise a mid-woofer transducer configured to output sound at frequencies between about 200 Hz and about 5 kHz.

[0176] By way of illustration, SONOS, Inc. presently offers (or has offered) for sale certain playback devices including, for example, a “SONOS ONE,” “PLAY:1,” “PLAY:3,” “PLAYA,” “PLAYBAR,” “PLAYBASE,” “CONNECT: AMP,” “CONNECT,” and “SUB.” Other suitable playback devices may additionally or alternatively be used to implement the playback devices of example configurations disclosed herein. Additionally, one of ordinary skilled in the art will appreciate that a playback device is not limited to the examples described herein or to SONOS product offerings. In some example configurations, for example, one or more playback devices 110 comprises wired or wireless headphones (e.g., over-the-ear headphones, on-ear headphones, in-ear earphones). In other example configurations, one or more of the playback devices 110 comprise a docking station and/or an interface configured to interact with a docking station for personal mobile media playback devices. In certain example configurations, a playback device may be integral to another device or component such as a television, a lighting fixture, or some other device for indoor or outdoor use. In some example configurations, a playback device omits a user interface and/or one or more transducers. For example, FIG. ID is a block diagram of a playback device 1 lOp comprising the input/output 111 and electronics 112 without the user interface 113 or transducers 114.

[0178] Figure IF is a block diagram of the NMD 120a (Figures 1 A and IB). The NMD 120a includes one or more voice processing components 124 (hereinafter “the voice components 124”) and several components described with respect to the playback device 110a (Figure 1C) including the processors 112a, the memory 112b, and the microphones 115. The NMD 120a optionally comprises other components also included in the playback device 110a (Figure 1C), such as the user interface 113 and/or the transducers 114. In some example configurations, the NMD 120a is configured as a media playback device (e.g., one or more of the playback devices 110), and further includes, for example, one or more of the audio processing components 112g (Figure 1C), the transducers 114, and/or other playback device components. In certain example configurations, the NMD 120a comprises an Internet of Things (loT) device such as, for example, a thermostat, alarm panel, fire and/or smoke detector, etc. In some example configurations, the NMD 120a comprises the microphones 115, the voice processing 124, and only a portion of the components of the electronics 112 described above with respect to Figure IB. In some aspects, for example, the NMD 120a includes the processor 112a and the memory 112b (Figure IB), while omitting one or more other components of the electronics 112. In some example configurations, the NMD 120a includes additional components (e.g., one or more sensors, cameras, thermometers, barometers, hygrometers). [0179] In some example configurations, an NMD can be integrated into a playback device. Figure 1G is a block diagram of a playback device 1 lOr comprising an NMD 120d. The playback device 1 lOr can comprise many or all of the components of the playback device 110a and further include the microphones 115 and voice processing 124 (Figure IF). The playback device 1 lOr optionally includes an integrated control device 130c. The control device 130c can comprise, for example, a user interface (e.g., the user interface 113 of Figure IB) configured to receive user input (e.g., touch input, voice input) without a separate control device. In other example configurations, however, the playback device 1 lOr receives commands from another control device (e.g., the control device 130a of Figure IB). Additional NMD example configurations are described in further detail below with respect to Figures 3A-3F.

[0180] Referring again to Figure IF, the microphones 115 are configured to acquire, capture, and/or receive sound from an environment (e.g., the environment 101 of Figure 1 A) and/or a room in which the NMD 120a is positioned. The received sound can include, for example, vocal utterances, audio played back by the NMD 120a and/or another playback device, background voices, ambient sounds, etc. The microphones 115 convert the received sound into electrical signals to produce microphone data. The voice processing 124 receives and analyzes the microphone data to determine whether a voice input is present in the microphone data. The voice input can comprise, for example, an activation word followed by an utterance including a user request. As those of ordinary skill in the art will appreciate, an activation word is a word or other audio cue that signifying a user voice input. For instance, in querying the AMAZON® VAS, a user might speak the activation word "Alexa." Other examples include "Ok, Google" for invoking the GOOGLE® VAS and "Hey, Siri" for invoking the APPLE® VAS.

[0181] After detecting the activation word, voice processing 124 monitors the microphone data for an accompanying user request in the voice input. The user request may include, for example, a command to control a third-party device, such as a thermostat (e.g., NEST® thermostat), an illumination device (e.g., a PHILIPS HUE ® lighting device), or a media playback device (e.g., a Sonos® playback device). For example, a user might speak the activation word “Alexa” followed by the utterance “set the thermostat to 68 degrees” to set a temperature in a home (e.g., the environment 101 of Figure 1 A). The user might speak the same activation word followed by the utterance “turn on the living room” to turn on illumination devices in a living room area of the home. The user may similarly speak an activation word followed by a request to play a particular song, an album, or a playlist of music on a playback device in the home. Additional description regarding receiving and processing voice input data can be found in further detail below with respect to Figures 3 A- 3F. d. Suitable Control Devices

[0182] Figure 1H is a partially schematic diagram of the control device 130a (Figures 1 A and IB). As used herein, the term “control device” can be used interchangeably with “controller” or “control system.” Among other features, the control device 130a is configured to receive user input related to the media playback system 100 and, in response, cause one or more devices in the media playback system 100 to perform an action(s) or operation(s) corresponding to the user input. In the illustrated embodiment, the control device 130a comprises a smartphone (e.g., an iPhone™ an Android phone) on which media playback system controller application software is installed. In some example configurations, the control device 130a comprises, for example, a tablet (e.g., an iPad™), a computer (e.g., a laptop computer, a desktop computer), and/or another suitable device (e.g., a television, an automobile audio head unit, an loT device). In certain example configurations, the control device 130a comprises a dedicated controller for the media playback system 100. In other example configurations, as described above with respect to Figure 1G, the control device 130a is integrated into another device in the media playback system 100 (e.g., one more of the playback devices 110, NMDs 120, and/or other suitable devices configured to communicate over a network).

[0183] The control device 130a includes electronics 132, a user interface 133, one or more speakers 134, and one or more microphones 135. The electronics 132 comprise one or more processors 132a (referred to hereinafter as “the processors 132a”), a memory 132b, software components 132c, and a network interface 132d. The processor 132a can be configured to perform functions relevant to facilitating user access, control, and configuration of the media playback system 100. The memory 132b can comprise data storage that can be loaded with one or more of the software components executable by the processor 302 to perform those functions. The software components 132c can comprise applications and/or other executable software configured to facilitate control of the media playback system 100. The memory 112b can be configured to store, for example, the software components 132c, media playback system controller application software, and/or other data associated with the media playback system 100 and the user. [0184] The network interface 132d is configured to facilitate network communications between the control device 130a and one or more other devices in the media playback system 100, and/or one or more remote devices. In some example configurations, the network interface 132d is configured to operate according to one or more suitable communication industry standards (e.g., infrared, radio, wired standards including IEEE 802.3, wireless standards including IEEE 802.11a, 802.11b, 802.11g, 802.11n, 802.11ac, 802.15, 4G, LTE). The network interface 132d can be configured, for example, to transmit data to and/or receive data from the playback devices 110, the NMDs 120, other ones of the control devices 130, one of the computing devices 106 of Figure IB, devices comprising one or more other media playback systems, etc. The transmitted and/or received data can include, for example, playback device control commands, state variables, playback zone and/or zone group configurations. For instance, based on user input received at the user interface 133, the network interface 132d can transmit a playback device control command (e.g., volume control, audio playback control, audio content selection) from the control device 304 to one or more of playback devices. The network interface 132d can also transmit and/or receive configuration changes such as, for example, adding/removing one or more playback devices to/from a zone, adding/removing one or more zones to/from a zone group, forming a bonded or consolidated player, separating one or more playback devices from a bonded or consolidated player, among others. Additional description of zones and groups can be found below with respect to Figures 1-1 through IM.

[0185] The user interface 133 is configured to receive user input and can facilitate 'control of the media playback system 100. The user interface 133 includes media content art 133a (e.g., album art, lyrics, videos), a playback status indicator 133b (e.g., an elapsed and/or remaining time indicator), media content information region 133c, a playback control region 133d, and a zone indicator 133e. The media content information region 133c can include a display of relevant information (e.g., title, artist, album, genre, release year) about media content currently playing and/or media content in a queue or playlist. The playback control region 133d can include selectable (e.g., via touch input and/or via a cursor or another suitable selector) icons to cause one or more playback devices in a selected playback zone or zone group to perform playback actions such as, for example, play or pause, fast forward, rewind, skip to next, skip to previous, enter/exit shuffle mode, enter/exit repeat mode, enter/exit cross fade mode, etc. The playback control region 133d may also include selectable icons to modify equalization settings, playback volume, and/or other suitable playback actions. In the illustrated embodiment, the user interface 133 comprises a display presented on a touch screen interface of a smartphone (e.g., an iPhone™ an Android phone). In some example configurations, however, user interfaces of varying formats, styles, and interactive sequences may alternatively be implemented on one or more network devices to provide comparable control access to a media playback system.

[0186] The one or more speakers 134 (e.g., one or more transducers) can be configured to output sound to the user of the control device 130a. In some example configurations, the one or more speakers comprise individual transducers configured to correspondingly output low frequencies, mid-range frequencies, and/or high frequencies. In some aspects, for example, the control device 130a is configured as a playback device (e.g., one of the playback devices 110). Similarly, in some example configurations the control device 130a is configured as an NMD (e.g., one of the NMDs 120), receiving voice commands and other sounds via the one or more microphones 135.

[0187] The one or more microphones 135 can comprise, for example, one or more condenser microphones, electret condenser microphones, dynamic microphones, and/or other suitable types of microphones or transducers. In some example configurations, two or more of the microphones 135 are arranged to capture location information of an audio source (e.g., voice, audible sound) and/or configured to facilitate filtering of background noise. Moreover, in certain example configurations, the control device 130a is configured to operate as playback device and an NMD. In other example configurations, however, the control device 130a omits the one or more speakers 134 and/or the one or more microphones 135. For instance, the control device 130a may comprise a device (e.g., a thermostat, an loT device, a network device) comprising a portion of the electronics 132 and the user interface 133 (e.g., a touch screen) without any speakers or microphones. Additional control device example configurations are described in further detail below with respect to Figures 4A-4D and 5. e. Suitable Playback Device Configurations

[0188] Figures 1-1 through IM show example configurations of playback devices in zones and zone groups. Referring first to Figure IM, in one example, a single playback device may belong to a zone. For example, the playback device 110g in the second bedroom 101c (FIG.

1 A) may belong to Zone C. In some implementations described below, multiple playback devices may be “bonded” to form a “bonded pair” which together form a single zone. For example, the playback device 1101 (e.g., a left playback device) can be bonded to the playback device 1101 (e.g., a left playback device) to form Zone A. Bonded playback devices may have different playback responsibilities (e.g., channel responsibilities). In another implementation described below, multiple playback devices may be merged to form a single zone. For example, the playback device 1 lOh (e.g., a front playback device) may be merged with the playback device 1 lOi (e.g., a subwoofer), and the playback devices 1 lOj and 110k (e.g., left and right surround speakers, respectively) to form a single Zone D. In another example, the playback devices 110g and 1 lOh can be merged to form a merged group or a zone group 108b. The merged playback devices 110g and 1 lOh may not be specifically assigned different playback responsibilities. That is, the merged playback devices 1 lOh and 1 lOi may, aside from playing audio content in synchrony, each play audio content as they would if they were not merged.

[0189] Each zone in the media playback system 100 may be provided for control as a single user interface (UI) entity. For example, Zone A may be provided as a single entity named Master Bathroom. Zone B may be provided as a single entity named Master Bedroom. Zone C may be provided as a single entity named Second Bedroom.

[0190] Playback devices that are bonded may have different playback responsibilities, such as responsibilities for certain audio channels. For example, as shown in Figure 1-1, the playback devices 1101 and 110m may be bonded so as to produce or enhance a stereo effect of audio content. In this example, the playback device 1101 may be configured to play a left channel audio component, while the playback device 110k may be configured to play a right channel audio component. In some implementations, such stereo bonding may be referred to as “pairing.”

[0191] Additionally, bonded playback devices may have additional and/or different respective speaker drivers. As shown in Figure 1 J, the playback device 1 lOh named Front may be bonded with the playback device 1 lOi named SUB. The Front device 1 lOh can be configured to render a range of mid to high frequencies and the SUB device 1 lOi can be configured render low frequencies. When unbonded, however, the Front device 1 lOh can be configured render a full range of frequencies. As another example, Figure IK shows the Front and SUB devices 1 lOh and 1 lOi further bonded with Left and Right playback devices 1 lOj and 110k, respectively. In some implementations, the Right and Left devices 1 lOj and 102k can be configured to form surround or “satellite” channels of a home theater system. The bonded playback devices 1 lOh, 1 lOi, 1 lOj, and 110k may form a single Zone D (FIG. IM). [0192] Playback devices that are merged may not have assigned playback responsibilities, and may each render the full range of audio content the respective playback device is capable of. Nevertheless, merged devices may be represented as a single UI entity (i.e., a zone, as discussed above). For instance, the playback devices 110a and 11 On in the master bathroom have the single UI entity of Zone A. In one embodiment, the playback devices 110a and 1 lOn may each output the full range of audio content each respective playback devices 110a and 1 lOn are capable of, in synchrony.

[0193] In some example configurations, an NMD is bonded or merged with another device so as to form a zone. For example, the NMD 120b may be bonded with the playback device 1 lOe, which together form Zone F, named Living Room. In other example configurations, a stand-alone network microphone device may be in a zone by itself. In other example configurations, however, a stand-alone network microphone device may not be associated with a zone. Additional details regarding associating network microphone devices and playback devices as designated or default devices may be found, for example, in previously referenced U.S. Patent Application No. 15/438,749.

[0194] Zones of individual, bonded, and/or merged devices may be grouped to form a zone group. For example, referring to Figure IM, Zone A may be grouped with Zone B to form a zone group 108a that includes the two zones. Similarly, Zone G may be grouped with Zone H to form the zone group 108b. As another example, Zone A may be grouped with one or more other Zones C-I. The Zones A-I may be grouped and ungrouped in numerous ways. For example, three, four, five, or more (e.g., all) of the Zones A-I may be grouped. When grouped, the zones of individual and/or bonded playback devices may play back audio in synchrony with one another, as described in previously referenced U.S. Patent No. 8,234,395. Playback devices may be dynamically grouped and ungrouped to form new or different groups that synchronously play back audio content.

[0195] In various implementations, the zones in an environment may be the default name of a zone within the group or a combination of the names of the zones within a zone group. For example, Zone Group 108b can be assigned a name such as “Dining + Kitchen”, as shown in Figure IM. In some example configurations, a zone group may be given a unique name selected by a user.

[0196] Certain data may be stored in a memory of a playback device (e.g., the memory 112b of Figure 1C) as one or more state variables that are periodically updated and used to describe the state of a playback zone, the playback device(s), and/or a zone group associated therewith. The memory may also include the data associated with the state of the other devices of the media system, and shared from time to time among the devices so that one or more of the devices have the most recent data associated with the system.

[0197] In some example configurations, the memory may store instances of various variable types associated with the states. Variables instances may be stored with identifiers (e.g., tags) corresponding to type. For example, certain identifiers may be a first type “al” to identify playback device(s) of a zone, a second type “bl” to identify playback device(s) that may be bonded in the zone, and a third type “cl” to identify a zone group to which the zone may belong. As a related example, identifiers associated with the second bedroom 101c may indicate that the playback device is the only playback device of the Zone C and not in a zone group. Identifiers associated with the Den may indicate that the Den is not grouped with other zones but includes bonded playback devices 11 Oh- 110k. Identifiers associated with the Dining Room may indicate that the Dining Room is part of the Dining + Kitchen zone group 108b and that devices 110b and 1 lOd are grouped (FIG. IL). Identifiers associated with the Kitchen may indicate the same or similar information by virtue of the Kitchen being part of the Dining + Kitchen zone group 108b. Other example zone variables and identifiers are described below.

[0198] In yet another example, the media playback system 100 may variables or identifiers representing other associations of zones and zone groups, such as identifiers associated with Areas, as shown in Figure IM. An area may involve a cluster of zone groups and/or zones not within a zone group. For instance, Figure IM shows an Upper Area 109a including Zones A- D, and a Lower Area 109b including Zones E-I. In one aspect, an Area may be used to invoke a cluster of zone groups and/or zones that share one or more zones and/or zone groups of another cluster. In another aspect, this differs from a zone group, which does not share a zone with another zone group. Further examples of techniques for implementing Areas may be found, for example, in U.S. Application No. 15/682,506 filed August 21, 2017, and titled “Room Association Based on Name,” and U.S. Patent No. 8,483,853 filed September 11, 2007, and titled “Controlling and manipulating groupings in a multi-zone media system.” Each of these applications is incorporated herein by reference in its entirety. In some example configurations, the media playback system 100 may not implement Areas, in which case the system may not store variables associated with Areas.

III. Example Systems and Devices

[0199] Figure 2A is a front isometric view of a playback device 210 configured in accordance with aspects of the disclosed technology. Figure 2B is a front isometric view of the playback device 210 without a grille 216e. Figure 2C is an exploded view of the playback device 210. Referring to Figures 2A-2C together, the playback device 210 comprises a housing 216 that includes an upper portion 216a, a right or first side portion 216b, a lower portion 216c, a left or second side portion 216d, the grille 216e, and a rear portion 216f. A plurality of fasteners 216g (e.g., one or more screws, rivets, clips) attaches a frame 216h to the housing 216. A cavity 216j (Figure 2C) in the housing 216 is configured to receive the frame 216h and electronics 212. The frame 216h is configured to carry a plurality of transducers 214 (identified individually in Figure 2B as transducers 214a-f). The electronics 212 (e.g., the electronics 112 of Figure 1C) is configured to receive audio content from an audio source and send electrical signals corresponding to the audio content to the transducers 214 for playback.

[0200] The transducers 214 are configured to receive the electrical signals from the electronics 112, and further configured to convert the received electrical signals into audible sound during playback. For instance, the transducers 214a-c (e.g., tweeters) can be configured to output high frequency sound (e.g., sound waves having a frequency greater than about 2 kHz). The transducers 214d-f (e.g., mid-woofers, woofers, midrange speakers) can be configured output sound at frequencies lower than the transducers 214a-c (e.g., sound waves having a frequency lower than about 2 kHz). In some example configurations, the playback device 210 includes a number of transducers different than those illustrated in Figures 2A-2C. For example, as described in further detail below with respect to Figures 3 A- 3C, the playback device 210 can include fewer than six transducers (e.g., one, two, three). In other example configurations, however, the playback device 210 includes more than six transducers (e.g., nine, ten). Moreover, in some example configurations, all or a portion of the transducers 214 are configured to operate as a phased array to desirably adjust (e.g., narrow or widen) a radiation pattern of the transducers 214, thereby altering a user’s perception of the sound emitted from the playback device 210.

[0201] In the illustrated embodiment of Figures 2A-2C, a filter 216i is axially aligned with the transducer 214b. The filter 216i can be configured to desirably attenuate a predetermined range of frequencies that the transducer 214b outputs to improve sound quality and a perceived sound stage output collectively by the transducers 214. In some example configurations, however, the playback device 210 omits the filter 216i. In other example configurations, the playback device 210 includes one or more additional filters aligned with the transducers 214b and/or at least another of the transducers 214.

[0202] Figures 3 A and 3B are front and right isometric side views, respectively, of an NMD 320 configured in accordance with example configurations of the disclosed technology. Figure 3C is an exploded view of the NMD 320. Figure 3D is an enlarged view of a portion of Figure 3B including a user interface 313 of the NMD 320. Referring first to Figures 3 A- 3C, the NMD 320 includes a housing 316 comprising an upper portion 316a, a lower portion 316b and an intermediate portion 316c (e.g., a grille). A plurality of ports, holes or apertures 316d in the upper portion 316a allow sound to pass through to one or more microphones 315 (Figure 3C) positioned within the housing 316. The one or more microphones 315 are configured to received sound via the apertures 316d and produce electrical signals based on the received sound. In the illustrated embodiment, a frame 316e (Figure 3C) of the housing 316 surrounds cavities 316f and 316g configured to house, respectively, a first transducer 314a (e.g., a tweeter) and a second transducer 314b (e.g., a mid-woofer, a midrange speaker, a woofer). In other example configurations, however, the NMD 320 includes a single transducer, or more than two (e.g., two, five, six) transducers. In certain example configurations, the NMD 320 omits the transducers 314a and 314b altogether.

[0203] Electronics 312 (Figure 3C) includes components configured to drive the transducers 314a and 314b, and further configured to analyze audio information corresponding to the electrical signals produced by the one or more microphones 315. In some example configurations, for example, the electronics 312 comprises many or all of the components of the electronics 112 described above with respect to Figure 1C. In certain example configurations, the electronics 312 includes components described above with respect to Figure IF such as, for example, the one or more processors 112a, the memory 112b, the software components 112c, the network interface 112d, etc. In some example configurations, the electronics 312 includes additional suitable components (e.g., proximity or other sensors).

[0204] Referring to Figure 3D, the user interface 313 includes a plurality of control surfaces (e.g., buttons, knobs, capacitive surfaces) including a first control surface 313a (e.g., a previous control), a second control surface 313b (e.g., a next control), and a third control surface 313c (e.g., a play and/or pause control). A fourth control surface 313d is configured to receive touch input corresponding to activation and deactivation of the one or microphones 315. A first indicator 313e (e.g., one or more light emitting diodes (LEDs) or another suitable illuminator) can be configured to illuminate only when the one or more microphones 315 are activated. A second indicator 313f (e.g., one or more LEDs) can be configured to remain solid during normal operation and to blink or otherwise change from solid to indicate a detection of voice activity. In some example configurations, the user interface 313 includes additional or fewer control surfaces and illuminators. In one embodiment, for example, the user interface 313 includes the first indicator 313e, omitting the second indicator 313f Moreover, in certain example configurations, the NMD 320 comprises a playback device and a control device, and the user interface 313 comprises the user interface of the control device. [0205] Referring to Figures 3 A-3D together, the NMD 320 is configured to receive voice commands from one or more adjacent users via the one or more microphones 315. As described above with respect to Figure IB, the one or more microphones 315 can acquire, capture, or record sound in a vicinity (e.g., a region within 10m or less of the NMD 320) and transmit electrical signals corresponding to the recorded sound to the electronics 312. The electronics 312 can process the electrical signals and can analyze the resulting audio data to determine a presence of one or more voice commands (e.g., one or more activation words). In some example configurations, for example, after detection of one or more suitable voice commands, the NMD 320 is configured to transmit a portion of the recorded audio data to another device and/or a remote server (e.g., one or more of the computing devices 106 of Figure IB) for further analysis. The remote server can analyze the audio data, determine an appropriate action based on the voice command, and transmit a message to the NMD 320 to perform the appropriate action. For instance, a user may speak “Sonos, play Michael Jackson.” The NMD 320 can, via the one or more microphones 315, record the user’s voice utterance, determine the presence of a voice command, and transmit the audio data having the voice command to a remote server (e.g., one or more of the remote computing devices 106 of Figure IB, one or more servers of a VAS and/or another suitable service). The remote server can analyze the audio data and determine an action corresponding to the command. The remote server can then transmit a command to the NMD 320 to perform the determined action (e.g., play back audio content related to Michael Jackson). The NMD 320 can receive the command and play back the audio content related to Michael Jackson from a media content source. As described above with respect to Figure IB, suitable content sources can include a device or storage communicatively coupled to the NMD 320 via a LAN (e.g., the network 104 of Figure IB), a remote server (e.g., one or more of the remote computing devices 106 of Figure IB), etc. In certain example configurations, however, the NMD 320 determines and/or performs one or more actions corresponding to the one or more voice commands without intervention or involvement of an external device, computer, or server. [0206] Figure 3E is a functional block diagram showing additional features of the NMD 320 in accordance with aspects of the disclosure. The NMD 320 includes components configured to facilitate voice command capture including voice activity detector component s) 312k, beam former components 3121, acoustic echo cancellation (AEC) and/or self-sound suppression components 312m, activation word detector components 312n, and voice/speech conversion components 312o (e.g., voice-to-text and text-to-voice). In the illustrated embodiment of Figure 3E, the foregoing components 312k-312o are shown as separate components. In some example configurations, however, one or more of the components 312k-312o are subcomponents of the processors 112a.

[0207] The beamforming and self-sound suppression components 3121 and 312m are configured to detect an audio signal and determine aspects of voice input represented in the detected audio signal, such as the direction, amplitude, frequency spectrum, etc. The voice activity detector activity components 312k are operably coupled with the beamforming and AEC components 3121 and 312m and are configured to determine a direction and/or directions from which voice activity is likely to have occurred in the detected audio signal. Potential speech directions can be identified by monitoring metrics which distinguish speech from other sounds. Such metrics can include, for example, energy within the speech band relative to background noise and entropy within the speech band, which is measure of spectral structure. As those of ordinary skill in the art will appreciate, speech typically has a lower entropy than most common background noise.

[0208] The speech/text conversion components 312o may facilitate processing by converting speech in the voice input to text. In some example configurations, the electronics 312 can include voice recognition software that is trained to a particular user or a particular set of users associated with a household. Such voice recognition software may implement voice-processing algorithms that are tuned to specific voice profile(s). Tuning to specific voice profiles may require less computationally intensive algorithms than traditional voice activity services, which typically sample from a broad base of users and diverse requests that are not targeted to media playback systems.

[0209] Figure 3F is a schematic diagram of an example voice input 328 captured by the NMD 320 in accordance with aspects of the disclosure. The voice input 328 can include an activation word portion 328a and a voice utterance portion 328b. In some example configurations, the activation word 557a can be a known activation word, such as “Alexa,” which is associated with AMAZON'S ALEXA®. In other example configurations, however, the voice input 328 may not include an activation word. In some example configurations, a network microphone device may output an audible and/or visible response upon detection of the activation word portion 328a. In addition, or alternately, an NMD may output an audible and/or visible response after processing a voice input and/or a series of voice inputs.

[0210] The voice utterance portion 328b may include, for example, one or more spoken commands (identified individually as a first command 328c and a second command 328e) and one or more spoken keywords (identified individually as a first keyword 328d and a second keyword 328f). In one example, the first command 328c can be a command to play music, such as a specific song, album, playlist, etc. In this example, the keywords may be one or words identifying one or more zones in which the music is to be played, such as the Living Room and the Dining Room shown in Figure 1 A. In some examples, the voice utterance portion 328b can include other information, such as detected pauses (e.g., periods of nonspeech) between words spoken by a user, as shown in Figure 3F. The pauses may demarcate the locations of separate commands, keywords, or other information spoke by the user within the voice utterance portion 328b.

[0211] In some example configurations, the media playback system 100 is configured to temporarily reduce the volume of audio content that it is playing while detecting the activation word portion 557a. The media playback system 100 may restore the volume after processing the voice input 328, as shown in Figure 3F. Such a process can be referred to as ducking, examples of which are disclosed in U.S. Patent Application No. 15/438,749, incorporated by reference herein in its entirety.

[0212] Figures 4A-4D are schematic diagrams of a control device 430 (e.g., the control device 130a of Figure 1H, a smartphone, a tablet, a dedicated control device, an loT device, and/or another suitable device) showing corresponding user interface displays in various states of operation. A first user interface display 431a (Figure 4A) includes a display name 433a (i.e., “Rooms”). A selected group region 433b displays audio content information (e.g., artist name, track name, album art) of audio content played back in the selected group and/or zone. Group regions 433c and 433d display corresponding group and/or zone name, and audio content information audio content played back or next in a playback queue of the respective group or zone. An audio content region 433 e includes information related to audio content in the selected group and/or zone (i.e., the group and/or zone indicated in the selected group region 433b). A lower display region 433f is configured to receive touch input to display one or more other user interface displays. For example, if a user selects “Browse” in the lower display region 433f, the control device 430 can be configured to output a second user interface display 431b (Figure 4B) comprising a plurality of music services 433g (e.g., Spotify, Radio by Tunein, Apple Music, Pandora, Amazon, TV, local music, line-in) through which the user can browse and from which the user can select media content for play back via one or more playback devices (e.g., one of the playback devices 110 of Figure 1A). Alternatively, if the user selects “My Sonos” in the lower display region 433f, the control device 430 can be configured to output a third user interface display 431c (Figure 4C). A first media content region 433h can include graphical representations (e.g., album art) corresponding to individual albums, stations, or playlists. A second media content region 433i can include graphical representations (e.g., album art) corresponding to individual songs, tracks, or other media content. If the user selections a graphical representation 433j (Figure 4C), the control device 430 can be configured to begin play back of audio content corresponding to the graphical representation 433j and output a fourth user interface display 43 Id fourth user interface display 43 Id includes an enlarged version of the graphical representation 433j , media content information 433k (e.g., track name, artist, album), transport controls 433m (e.g., play, previous, next, pause, volume), and indication 433n of the currently selected group and/or zone name.

[0213] Figure 5 is a schematic diagram of a control device 530 (e.g., a laptop computer, a desktop computer). The control device 530 includes transducers 534, a microphone 535, and a camera 536. A user interface 531 includes a transport control region 533a, a playback status region 533b, a playback zone region 533c, a playback queue region 533d, and a media content source region 533e. The transport control region comprises one or more controls for controlling media playback including, for example, volume, previous, play/pause, next, repeat, shuffle, track position, crossfade, equalization, etc. The audio content source region 533e includes a listing of one or more media content sources from which a user can select media items for play back and/or adding to a playback queue. [0214] The playback zone region 533b can include representations of playback zones within the media playback system 100 (Figures 1A and IB). In some example configurations, the graphical representations of playback zones may be selectable to bring up additional selectable icons to manage or configure the playback zones in the media playback system, such as a creation of bonded zones, creation of zone groups, separation of zone groups, renaming of zone groups, etc. In the illustrated embodiment, a “group” icon is provided within each of the graphical representations of playback zones. The “group” icon provided within a graphical representation of a particular zone may be selectable to bring up options to select one or more other zones in the media playback system to be grouped with the particular zone. Once grouped, playback devices in the zones that have been grouped with the particular zone can be configured to play audio content in synchrony with the playback device(s) in the particular zone. Analogously, a “group” icon may be provided within a graphical representation of a zone group. In the illustrated embodiment, the “group” icon may be selectable to bring up options to deselect one or more zones in the zone group to be removed from the zone group. In some example configurations, the control device 530 includes other interactions and implementations for grouping and ungrouping zones via the user interface 531. In certain example configurations, the representations of playback zones in the playback zone region 533b can be dynamically updated as playback zone or zone group configurations are modified.

[0215] The playback status region 533c includes graphical representations of audio content that is presently being played, previously played, or scheduled to play next in the selected playback zone or zone group. The selected playback zone or zone group may be visually distinguished on the user interface, such as within the playback zone region 533b and/or the playback queue region 533d. The graphical representations may include track title, artist name, album name, album year, track length, and other relevant information that may be useful for the user to know when controlling the media playback system 100 via the user interface 531.

[0216] The playback queue region 533d includes graphical representations of audio content in a playback queue associated with the selected playback zone or zone group. In some example configurations, each playback zone or zone group may be associated with a playback queue containing information corresponding to zero or more audio items for playback by the playback zone or zone group. For instance, each audio item in the playback queue may comprise a uniform resource identifier (URI), a uniform resource locator (URL) or some other identifier that may be used by a playback device in the playback zone or zone group to find and/or retrieve the audio item from a local audio content source or a networked audio content source, possibly for playback by the playback device. In some example configurations, for example, a playlist can be added to a playback queue, in which information corresponding to each audio item in the playlist may be added to the playback queue. In some example configurations, audio items in a playback queue may be saved as a playlist. In certain example configurations, a playback queue may be empty, or populated but “not in use” when the playback zone or zone group is playing continuously streaming audio content, such as Internet radio that may continue to play until otherwise stopped, rather than discrete audio items that have playback durations. In some example configurations, a playback queue can include Internet radio and/or other streaming audio content items and be “in use” when the playback zone or zone group is playing those items.

[0217] When playback zones or zone groups are “grouped” or “ungrouped,” playback queues associated with the affected playback zones or zone groups may be cleared or reassociated. For example, if a first playback zone including a first playback queue is grouped with a second playback zone including a second playback queue, the established zone group may have an associated playback queue that is initially empty, that contains audio items from the first playback queue (such as if the second playback zone was added to the first playback zone), that contains audio items from the second playback queue (such as if the first playback zone was added to the second playback zone), or a combination of audio items from both the first and second playback queues. Subsequently, if the established zone group is ungrouped, the resulting first playback zone may be re-associated with the previous first playback queue, or be associated with a new playback queue that is empty or contains audio items from the playback queue associated with the established zone group before the established zone group was ungrouped. Similarly, the resulting second playback zone may be re-associated with the previous second playback queue, or be associated with a new playback queue that is empty, or contains audio items from the playback queue associated with the established zone group before the established zone group was ungrouped.

[0218] Figure 6 is a message flow diagram illustrating data exchanges between devices of the media playback system 100 (Figures 1A-1M).

[0219] At step 650a, the media playback system 100 receives an indication of selected media content (e.g., one or more songs, albums, playlists, podcasts, videos, stations) via the control device 130a. The selected media content can comprise, for example, media items stored locally on or more devices (e.g., the audio source 105 of Figure 1C) connected to the media playback system and/or media items stored on one or more media service servers (one or more of the remote computing devices 106 of Figure IB). In response to receiving the indication of the selected media content, the control device 130a transmits a message 651a to the playback device 110a (Figures 1 A-1C) to add the selected media content to a playback queue on the playback device 110a.

[0220] At step 650b, the playback device 110a receives the message 651a and adds the selected media content to the playback queue for play back.

[0221] At step 650c, the control device 130a receives input corresponding to a command to play back the selected media content. In response to receiving the input corresponding to the command to play back the selected media content, the control device 130a transmits a message 651b to the playback device 110a causing the playback device 110a to play back the selected media content. In response to receiving the message 651b, the playback device 110a transmits a message 651c to the first computing device 106a requesting the selected media content. The first computing device 106a, in response to receiving the message 651c, transmits a message 65 Id comprising data (e.g., audio data, video data, a URL, a URI) corresponding to the requested media content.

[0222] At step 650d, the playback device 110a receives the message 65 Id with the data corresponding to the requested media content and plays back the associated media content. [0223] At step 650e, the playback device 110a optionally causes one or more other devices to play back the selected media content. In one example, the playback device 110a is one of a bonded zone of two or more players (Figure IM). The playback device 110a can receive the selected media content and transmit all or a portion of the media content to other devices in the bonded zone. In another example, the playback device 110a is a coordinator of a group and is configured to transmit and receive timing information from one or more other devices in the group. The other one or more devices in the group can receive the selected media content from the first computing device 106a, and begin playback of the selected media content in response to a message from the playback device 110a such that all of the devices in the group play back the selected media content in synchrony.

IV. Example Multi-Player Playback Device

[0224] Figure 7 shows an example block diagram of a Multi-Player Playback Device 700 configurable to implement from one to eight logical players according to some example embodiments.

[0227] Multi-Player Playback Device 700 also includes one or more network interfaces 706. The network interfaces may include any one or more (i) wired network interfaces, e.g., Ethernet, Universal Serial Bus, Firewire, Power-over-Ethernet, or any other type of wired network interface now known or later developed that is suitable for transmitting and receiving the types of data described herein and/or (ii) wireless network interfaces, e.g., WiFi, Bluetooth, 4G/5G, or any other type of wireless interface now known or later developed that is suitable for transmitting and receive the types of data described herein.

[0228] Multi-Player Playback Device 700 includes amplifiers 710-1, 710-2, 710-3, and 710-4, where each amplifier is connected to two audio outputs. In particular, (i) amplifier 710-1 is connected to audio outputs 712-1 and 712-2, (ii) amplifier 710-2 is connected to audio outputs 712-3 and 712-4, (iii) amplifier 710-3 is connected to audio outputs 712-5 and 712-6; and (iv) amplifier 710-4 is connected to audio outputs 712-7 and 712-8. Multi-Player Playback Device 700 includes four amplifiers 710-1 through 710-4 and eight audio outputs 712-1 through 712-8 as an illustrative example. Other embodiments may have a different number of amplifiers, a different number of audio outputs, and/or a different ratio of amplifiers to audio outputs.

[0229] In operation, the Multi-Player Playback Device 700 is configurable to implement from one to eight “logical” players. Each logical player (sometimes referred to herein as simply a player) can be addressed and managed within a playback system as a distinct playback entity.

[0230] In some embodiments, the Multi-Player Playback Device 700 is configured to operate within any of several different operating modes. In some embodiments, the Multi- Player Playback Device 700 is reconfigurable to switch between operating in one mode to operating in a different mode.

[0231] For example, in some embodiments, the Multi-Player Playback Device 700 is configurable to operate in a first mode where the Multi-Player Playback Device 700 is configured to implement eight single-channel logical players, where each logical player is configured to output single-channel audio from one of the eight audio outputs.

[0232] In this first mode of operation, each of the audio outputs 712-1 through 712-8 plays the same channel of audio. In some embodiments, the single-channel audio comprises a mono audio stream. In some embodiments, the single-channel audio comprises one channel from a multi-channel audio stream. In some scenarios, each of the eight logical players can be connected to a different speaker (e.g., free-standing speakers, speakers mounted in walls or ceilings, and so on) so that each connected speaker plays the same channel of audio. In some embodiments, all eight of the logical players are configured to output the same audio stream. In other embodiments, one or more of the logical players may be configured to output different audio than one or more other logical players.

[0233] In some embodiments, the Multi-Player Playback Device 700 is configurable to operate in a second mode where the Multi-Player Playback Device 700 is configured to implement four two-channel logical players, wherein each logical player is configured to output two-channel audio from two of the eight audio outputs.

[0234] In this second mode of operation, (i) a first logical player uses amp 710-1 to drive audio outputs 712-1 and 712-2 to output two channels of audio, (ii) a second logical player uses amp 710-2 to drive audio outputs 712-3 and 712-4 to output two channels of audio, (ii) a third logical player uses amp 710-3 to drive audio outputs 712-5 and 712-6 to output two channels of audio, and (iv) a fourth logical player uses amp 710-4 to drive audio outputs 712- 7 and 712-8 to output two channels of audio. The two channels of audio could be left and right channels of a stereo audio stream. Alternatively, the two channels of audio could be any two different channels of a multichannel audio stream. In some embodiments, all four of the logical players are configured to output the same audio stream. In other embodiments, one or more of the logical players may be configured to output different audio than one or more other logical players. [0235] In some embodiments, the Multi-Player Playback Device 700 is configurable to operate in a third mode where the Multi-Player Playback Device 700 is configured to implement two four-channel logical players, wherein each logical player is configured to output four-channel audio from four of the eight audio outputs.

[0236] In this third mode of operation, (i) a first logical player uses amps 710-1 and 710-2 to drive audio outputs 712-1, 712-2, 712-3, and 712-4 to output four channels of audio, and (ii) a second logical player uses amps 710-3 and 710-4 to drive audio outputs 712-5, 712-6, 712-7, and 712-8 to output four channels of audio. The four channels of audio could be front left, front right, rear left, and rear right channels of a surround sound audio stream. Alternatively, the four channels of audio could be any four different channels of a multichannel audio stream. In some embodiments, both of the logical players are configured to output the same audio stream. In other embodiments, one of the logical players may be configured to output different audio than the other logical player.

[0237] In some embodiments, the Multi-Player Playback Device 700 is configurable to operate in a fourth mode where the Multi-Player Playback Device 700 is configured to implement one eight-channel logical player, where the logical player is configured to output eight-channel audio from the eight audio outputs.

[0238] In this fourth mode of operation, a single logical player uses amps 710-1 through 710-4 to drive audio outputs 712-1 through 712-8 to output eight channels of audio. The eight channels of audio could be front left, front center, front right, rear left, rear center, rear right, left subwoofer, and right subwoofer channels of a surround sound audio stream.

Alternatively, the eight channels of audio could be any eight different channels of a multichannel audio stream. In some embodiments, the logical player is configured to output the same channel of audio via each of the eight audio outputs.

[0239] Although four operating modes are described here for illustrative purposes, the Multi-Player Playback Device 700 in some embodiments is configurable to operate in other operating modes that include combinations of differently configured players. For example, in some embodiments, the Multi-Player Playback Device 700 is configurable to operate in a mode that includes (i) a first logical player that uses amps 710-1, 710-2, and 710-3 to drive audio outputs 712-1, 712-2, and 712-3 to output six channels of audio, and (ii) a second logical player uses amp 710-4 to drive audio output 712-8 to output one channel of audio.

The Multi-Player Playback Device 700 can implement other combinations and configurations of amps and audio outputs to output different channels of audio content. [0240] In some embodiments, several Multi-Player Playback Devices (e.g., several Multi- Player Playback Devices like Multi-Player Playback Device 700) can be bonded together to implement more sophisticated configurations of one or more logical players.

[0241] For example, in some embodiments, the Multi-Player Playback Device 700 is configured to selectively operate in any of a plurality of bonded modes comprising a first bonded mode and a second bonded mode.

[0242] In the first bonded mode, the Multi-Player Playback Device 700 is configured to operate in a bonded configuration with a second Multi-Player Playback Device that has eight audio outputs similar to Multi-Player Playback Device 700. In the first bonded mode, the Multi-Player Playback Device 700 and the second Multi-Player Playback Device are configured to implement a single logical player configured to output any of (i) one set of sixteen channels of audio, (ii) two sets of eight channels of audio, (iii) four sets of four channels of audio, (iv) eight sets of two channels of audio, or (v) sixteen single channels of audio.

V. Example Multi-Stream Audio Routing Architectures

[0244] Figure 8 shows a block diagram of an example multi-stream audio routing architecture 800 for an example Multi-Player Playback Device 801 according to some embodiments. The Multi-Player Playback Device 801 may be the same as or similar to any of the example Multi-Player Playback Device embodiments disclosed and described herein, including but not limited to Multi-Player Playback Device 700 shown and described with reference to Figure 7. [0245] Figure 8 shows certain aspects of the multi-stream audio routing architecture 800 implemented in Multi-Player Playback Device 801 for ease of illustration. However, persons of skill in the art will understand that Multi-Player Playback Device 801 and the multi-stream audio routing architecture 800 include additional components not depicted in Figure 8. In some instances, the Multi-Player Playback Device 801 can be configured to play different audio streams (or different combinations of audio streams) in different playback zones, and thus, the multi-stream audio routing architecture 800 is sometimes referred to a multi-zone routing architecture.

[0246] As described in more detail in this section, the multi-steam audio routing architecture 800 enables the Multi-Player Playback Device 801 to flexibly mix (via mixer 808) incoming audio streams 802 received from any (or all) of the audio input ports 804, process the mixed audio streams via one or more DSP processing stages 810 and 816, and route (via a stream selector 812) the mixed and processed audio streams to any (or all) of the audio outputs 818 and/or other playback entities 850.

[0247] In some embodiments, the multi-steam audio routing architecture 800 enables the Multi-Player Playback Device 801 to be switched between and among different routing/mixing/processing/output configurations based on a particular operating mode (or operating state). In one example, the Multi-Player Playback Device 801 can be configured for several different operating modes (e.g., 2, 3, 4, or more operating modes), where each operating mode has a different routing/mixing/processing/output configuration implemented by the multi-steam audio routing architecture 800. In operation, the Multi-Player Playback Device 801 routes, mixes, processes, and outputs audio streams according to a first configuration of the multi-steam audio routing architecture 800 while operating in the first operating mode. And while the Multi-Player Playback Device 801 is operating in the first operating mode, the Multi-Player Playback Device 801 can switch to operating in a second operating mode in which the Multi-Player Playback Device 801 routes, mixes, processes, and outputs audio streams according to a second configuration of the multi-steam audio routing architecture 800.

[0248] For example, one operating mode (or operating state) may include a typical 1: 1 mapping of audio input ports 804 to audio outputs 818, i.e., where a first input audio stream 802a received on a first audio input port 804a is routing to a corresponding first audio output 818a, a second input audio stream 802b received on a second audio input port 804b is routing to a corresponding second audio output 818b, and so on. [0249] Another example operating mode (or operating state) may comprise a first plurality of input audio streams (e.g., various audio sources such as streaming media, television, or perhaps another local source) routed to a set of audio outputs 818 configured to play the first plurality of input audio streams, and a separate audio input stream (e.g., voice assistant, intercom, doorbell) that the Multi-Player Playback Device 801 (i) selectively mixes with some or all the other input audio streams and (ii) routes to the audio outputs 818 configured to play the first plurality of input audio streams.

[0250] In another example operating mode (or operating state), the Multi-Player Playback Device 801 is configured to implement a home theater listening scenario (or state) in which six channels (e.g., left, center, right, left rear, right rear, and low frequency effects (LFE) channels) are mapped to a first set of Multi-Player Playback Device 801 audio outputs 818 (and/or to other playback entities 850 in a home theater configuration), and two channels of the same home theater audio (e.g., either a downmix of the 5.1 audio or a dedicated stereo stream) are sent to a different set of Multi-Player Playback Device 801 audio outputs 818 (and/or to other playback entities 850).

[0251] Features and aspects of the multi-steam audio routing architecture 800 are described further below with reference to Figure 8 and Figure 9, where Figure 8 illustrates functional components of the multi-steam audio routing architecture 800 and Figure 9 illustrates aspects of a method 900 implemented by a Multi-Player Playback Device 801 incorporating the multi-steam audio routing architecture 800 of Figure 8.

[0252] The example Multi-Player Playback Device 801 depicted in Figure 8 includes (i) one or more network interfaces 814 configured to transmit and receive audio streams 802 comprising audio information, (ii) one or more audio input ports 804 configured to receive audio streams 802 comprising audio information, and (iii) one or more audio outputs 818 configured to output analog audio signals to one or more loudspeakers (not shown).

[0253] The one or more network interfaces 814 may be similar to or the same as any of the network interfaces disclosed herein, including but not limited to the one or more network interfaces 706 described with reference to Figure 7. For example, the one or more network interfaces 814 may any one or more (i) wired network interfaces, e.g., Ethernet, Universal Serial Bus, Firewire, Power-over-Ethernet, or any other type of wired network interface now known or later developed that is suitable for transmitting and receiving the types of data described herein and/or (ii) wireless network interfaces, e.g., WiFi, Bluetooth, 4G/5G, or any other type of wireless interface now known or later developed that is suitable for transmitting and receive the types of data described herein. Network interface(s) 814 is depicted as transmitting an audio stream to another playback entity 850 (which may be a playback device, another Multi-Player Playback Device, a logical player) for illustration purposes. Persons of skill in the art will understand that network interfaces 814 are bi-directional and can both transmit and receive audio streams to/from other network devices, e.g., audio sources and/or other playback entities.

[0254] The one or more audio input ports 804a, 804b, 804c, and 804d (collectively referred to as audio input ports 804) may be similar to or the same as any of the audio input ports disclosed and described herein, including but not limited to digital line-in inputs, analog line- in inputs, optical line-in inputs, wireless and/or wired network inputs (e.g., when audio data is routed from the network interfaces 814 to one of the audio input ports 804), or any other type of audio input now known or later developed that is suitable for receiving audio content from an audio source for processing and/or playback by the Multi-Player Playback Device 801. In some embodiments, data from the one or more network interfaces 814 is routed to one or more of the audio input ports 804 so that the audio stream 802 received at the audio input port 804 is an audio stream from one of the one or more network interfaces 814.

[0255] For illustration purposes only, Figure 8 depicts four audio input ports 804a, 804b, 804c, and 804d, where each audio input is configured to receive two channels of a single audio stream. Thus, each audio stream 802a, 802b, 802c, and 802d is depicted as a two- channel audio stream. However, the Multi-Player Playback Device 801 may include more or fewer audio input ports than the four illustrated in Figure 8. Further, each audio input may be configured to receive an audio stream having more or fewer than two channels, e.g., a single channel (mono) audio stream, a three-channel audio stream, a five-channel audio stream, and so on.

[0256] The one or more audio outputs 818a, 818b, 818c, and 818d (collectively referred to audio outputs 818) may be the same as or similar to any of the audio outputs disclosed and described herein, including but not limited to audio outputs 712-1, 712-2, 712-3, and 712-4 shown and described with reference to Figure 7 or any other type of audio output now known or later developed that is suitable for providing an audio signal capable of driving either (or both) of (i) one or more loudspeakers integrated with the Multi-Player Playback Device 801 and/or (ii) one or more external loudspeakers (active or passive) connected to the Multi- Player Playback Device 801.

[0257] The multi-stream audio routing architecture 800 of Multi-Player Playback Device 801 also includes a router 806 that is configured to route incoming audio streams 802a, 802b, and 802c, and 802d (collectively referred to as audio streams 802) containing audio content for processing and/or playback by the Multi-Player Playback Device 801 from the one or more audio input ports 804 to a mixer 808. In some instances, because the Multi-Player Playback Device 801 can be configured to play different audio streams (or different combinations of audio streams) in different playback zones, the router 806 is sometimes referred to herein as a zone router.

[0258] The router 806 in some embodiments includes a plurality of ports 807a, 807b, 807c, and 807d (collectively referred to as router ports 807). Each router port is configured to route one incoming audio stream 802 from one of the audio input ports 804 to the mixer 808 (i.e., to one or more mixer inputs of the mixer 808). When two or more audio streams 802 from the audio input ports are routed to the same input port of the mixer 808, the mixer 808 (i) mixes the two or more incoming audio streams into a single mixed audio stream, and (ii) passes the mixed audio stream to a stream-specific digital signal processing (DSP) stage 810. [0259] For example, in the scenario shown in Figure 8, the router 806 routes audio stream 802a and audio stream 802b to the first input of the mixer 808, and the router 806 routes audio stream 802c and audio stream 802d to the fourth input of the mixer 808. In the example shown in Figure 8, the router 806 does not route any of the audio streams to the second input or the third input of the mixer 808. However, the routing of the audio streams 802 from the router 806 to the mixer 808 depicted in Figure 8 is shown for illustration purposes only. In operation, the router 806 is configured to route any one or more of the audio streams 802 from any of the audio input ports 804 to any one or more input ports of the mixer 808.

[0260] As mentioned earlier, the mixer 808 is configured to (i) mix the audio streams 802 routed by the router 806 to each input of the mixer 808, and (ii) pass the mixed audio stream to a stream-specific digital signal processing (DSP) stage 810.

[0261] For example, in Figure 8, the router 806 routes audio stream 802a and audio stream 802b to the first input of the mixer 808, where mixer 808 (i) mixes audio stream 802a and audio stream 802b to generate a mixed audio stream 802a+b, and (ii) passes the mixed audio stream 802a+b to the stream-specific DSP stage 810 for processing. Also, in Figure 8, the router 806 routes audio stream 802c and audio stream 802d to the fourth input of the mixer 808, where mixer 808 (i) mixes audio stream 802c and audio stream 802d to generate a mixed audio stream 802c+d, and (ii) passes the mixed audio stream 802c+d to the streamspecific DSP stage 810 for processing.

[0262] For each mixed stream received from the mixer 808, the stream-specific DSP stage 810 performs stream-specific DSP processing on the mixed stream and outputs a processed stream to a stream selector 812. The stream-specific DSP stage 810 includes DSP block 810a, DSP block 810b, DSP block 810c, and DSP block 810d. In some embodiments, the stream-specific DSP stage 810 may include more or fewer DSP blocks than shown in Figure 8. In some instances, because the Multi-Player Playback Device 801 can be configured to play different audio streams (or different combinations of audio streams) in different playback zones, the stream-specific DSP stage 810 is sometimes referred to a zone-specific DSP stage.

[0263] In some embodiments, each DSP block of the DSP blocks 810a through 810d is a separate hardware DSP chip. However, in other embodiments, stream-specific DSP stage 810 includes a single DSP chip configured to process multiple audio streams. In other configurations, the stream-specific DSP stage 810 is implemented with software configured to preform DSP functions. Any other combination and/or configuration of hardware DSP chips and/or DSP software that is suitable for performing stream-specific DSP processing could be used as well.

[0264] In the example shown in Figure 8, DSP block 810a processes mixed stream 802a+b, and DSP block 810d processes mixed stream 802c+d.

[0267] The stream selector 812 selects and sends one or more processed streams received from the stream-specific DSP stage 810 to one or more audio outputs 818a, 818b, 818c, and 818d (collectively referred to as audio outputs 818) via a second DSP processing stage 816. However, in some embodiments, stream selector 812 selects and sends the one or more processed streams received from the stream-specific DSP stage 810 to one or more of the audio outputs 818 without any second stage DSP processing. In some configurations, the stream selector 812 (i) selects and sends some of the processed audio streams received from the stream-specific DSP stage 810 to one or more audio outputs 818 via the second DSP processing stage 816 and (ii) selects and sends other processed audio streams received from the stream-specific DSP stage 810 to the one or more audio outputs 818 without second stage DSP processing. In some instances, because the Multi-Player Playback Device 801 can be configured to play different audio streams (or different combinations of audio streams) in different playback zones, the selector 812 is sometimes referred to a zone selector.

[0268] In Figure 8, stream selector 812 sends processed audio stream 802a+b received from DSP block 810a of the stream-specific DSP stage 810 to audio outputs 818a, 818b, and 818c via second DSP processing stage 816. Stream selector also sends processed audio stream 802c+d received from DSP block 810d of the stream-specific DSP stage 810 to (i) audio outputs 818c, and 818d via second DSP processing stage 816 and (ii) to playback entity 850 via network interface(s) 814.

[0269] In contrast to the stream-specific DSP stage 810 that is configurable to apply different DSP processing to each audio stream individually, in some embodiments, the second DSP processing stage 816 is configured to apply the same DSP processing to all of the processed audio streams output via the audio outputs 818 of the Multi-Player Playback Device 801. Accordingly, in some embodiments, the second DSP processing stage 816 is sometimes referred to as the device-specific DSP stage (or the Multi-Player Playback Device specific DSP stage). However, in other embodiments, the second stage DSP processing performed by the second DSP processing stage 816 is on a stream-by-stream basis similar to the stream-specific DSP stage 810.

[0270] In some instances, the second stage DSP processing performed by the second DSP processing stage 816 can be advantageous in scenarios where the stream selector 812 routes two or more processed streams from two or more different inputs of the stream selector 812 to the same audio output. For example, as mentioned above, in Figure 8, stream selector 812 sends processed audio stream 802a+b and processed audio stream 802c+d to audio output 818c. However, when both processed audio stream 802a+b and processed audio stream 802c+d are output from the same audio output 818c, aspects of the audio streams may combine

[0271] In some instances, combining several processed streams from the stream-specific DSP stage 810 for output via the same audio output 818 can potentially cause equalization problems where audio content at certain frequencies add up at certain points during playback to cause poor sound quality. Similarly, combining several processed streams can in some instances cause phase cancellations where certain frequencies out of phase from each other at the same time may cancel each other out, which can also cause poor sound quality.

[0272] Second stage DSP processing performed by the second DSP processing stage 816 avoids or at least ameliorates the problems that potentially arise when the stream selector 812 sends several processed streams to the same audio output by, for example, analyzing and processing the streams to eliminate or reduce any undesirable audio effects appearing in the combined audio streams.

[0273] In some embodiments, one or both of the stream-specific DSP processing performed by the stream-specific DSP stage 810 and/or the second-stage DSP processing performed by the second DSP processing stage 816 are based at least in part on one or more microphone inputs 820 from one or more microphones positioned within the listening area. In operation, the microphone inputs 820 include audio data collected from (i) one or more microphones integrated with the Multi-Player Playback Device 801, (ii) one or more microphones positioned in a listening area where the Multi-Player Playback Device 801 is configured to play audio, and/or (iii) one or more microphones integrated with other playback devices and/or Multi-Player Playback Devices that are configured to provide audio data to the Multi- Player Playback Device 801 implementing the stream-specific DSP processing at the streamspecific DSP stage 810 and/or the second-stage DSP processing at the second DSP processing stage 816.

[0274] For example, after detecting certain acoustic characteristics within a listening area via the one or more microphone inputs 820, some embodiments include the Multi-Player Playback Device 801 using the detected acoustic characteristics to control aspects of the stream-specific DSP processing performed by the stream-specific DSP stage 810 and/or the second-stage DSP processing performed by the second DSP processing stage 816. For instance, if the listening area tends to amplify or attenuate certain frequencies, the Multi- Player Playback Device 801 can correct for the detected amplification or attenuation in DSP stage 810 and/or DSP stage 816. Similarly, in some embodiments, the Multi-Player Playback Device 801 can use DSP stage 810 and/or DSP 816 to alter playback timing and/or playback timing delays based on detected acoustic characteristics (e.g., arrival times of acoustic signals detected by microphones within the listening area) to implement surround sound or soundscape effects and/or eliminate or ameliorate echo effects in different parts of the listening area resulting from loudspeaker placement or playback volumes.

VI. Example Methods

[0275] Figure 9 shows an example method implemented by a Multi-Player Playback Device according to some embodiments. Method 900 may be performed by any of the Multi- Player Playback Device embodiments disclosed herein, including but not limited to Multi- Player Playback Device 700 shown and described with reference to Figure 7. In operation, any of the Multi-Player Playback Device embodiments disclosed herein may be configured to implement one or more (or all) of the features and functions of method 900 via a multi-stream audio routing architecture such as the multi-stream audio routing architecture examples shown and described with reference to Figure 8.

[0276] Method 900 is illustrated in different functional blocks for ease of explanation. In operation, the functions described in each block can be performed in any order and/or concurrently with each other, without limitation. Additionally, one or more (or all) of the additional/altemative embodiments can be performed in any combination, any order, and/or concurrently with each other, without limitation.

[0277] Method 900 begins at block 902, which includes selectively operating the Multi- Player Playback Device in one of a plurality of operating modes. The operating modes of block 902 include any of the operating modes disclosed herein.

[0278] At block 904, method 900 includes routing each audio stream to at least one mixer input of one or more mixer inputs based at least in part on a current operating mode in which the Multi-Player Playback Device is operating.

[0279] At block 906, method 900 includes for the one or more audio streams routed to an individual mixer input of the one or more mixer inputs, (a) generating a mixed stream based on the one or more audio streams routed to the mixer input, and (b) routing the mixed stream to at least one audio output of the one or more audio outputs based at least in part on a channel map associated with the current operating mode in which the Multi-Player Playback Device is operating. In some embodiments, generating a mixed stream based on the one or more audio streams routed to the mixer input in block 906 comprises generating the mixed stream by mixing a first audio stream with a second audio stream to generate the mixed stream.

[0280] At block 908, method 900 includes playing first audio corresponding to a first audio stream via at least one first loudspeaker connected to a first audio output in synchrony with at least one of (i) playback of second audio corresponding to a second audio stream via at least one second loudspeaker connected to a second audio output or (ii) playback of the first audio stream or playback of the second audio stream by a second playback entity (e.g., a playback device, a playback entity, and/or another Multi-Player Playback Device).

[0281] In some embodiments, Multi-Player Playback Device is configured to implement synchronous playback according to any of the synchronous playback approaches described in: (i) U.S. App. 10/816,217, titled “System And Method For Synchronizing Operations Among A Plurality Of Independently Clocked Digital Data Processing Devices,” referred to as Docket No. 04-0401, filed on Apr. 1, 2004, and issued on Jul. 31, 2012, as U.S. Pat. 8,234,395; (ii) U.S. App. 10/816,217, titled “Distributed Synchronization,” referred to as Docket No. 18-0603, filed Oct. 15, 2018, and issued on Aug. 16, 2022, as U.S. Pat.

11,416,209; and/or (iii) U.S. App. 18/478,063, titled “Multichannel Content Distribution,” referred to as Docket 22-0207, filed Sep. 29, 2023, and currently pending. The entire contends of U.S. Apps. 10/816,217; 10/816,217; and 18/478,063 are incorporated herein by reference

[0282] At block 910, method 900 includes while operating in a first operating mode, the Multi-Player Playback Device switching to operating in one of a second operating mode or a third operating mode. In some embodiments, block 910 includes the Multi-Player Playback Device any one or more (or all) of (i) while operating in a first operating mode, switching to operating in one of a second operating mode or a third operating mode, (ii) while operating in the second operating mode, switching to operating in one of the first operating mode or the second operating mode; and/or (iii) while operating in the third operating mode, switching to operating in one of the first operating mode or the second operating mode.

[0283] In some embodiments, selectively operating the Multi-Player Playback Device in one of a plurality of operating modes at block 902 includes operating the Multi-Player Playback Device in a first operating mode.

[0284] In some embodiments where selectively operating the Multi-Player Playback Device in one of a plurality of operating modes at block 902 includes operating the Multi- Player Playback Device in the first operating mode: (i) the block 904 step of routing each audio stream to at least one mixer input of one or more mixer inputs includes routing each audio stream to a common mixer input; (ii) the block 906 step of generating at least one mixed stream based on one or more audio streams routed to the mixer input includes generating one mixed stream based on the audio streams routed to the common mixer input; and (iii) the block 906 step of routing the mixed stream to at least one audio output of the one or more audio outputs based at least in part on a channel map associated with the first operating mode includes routing the one mixed stream to one or more separate audio outputs of the one or more of audio outputs.

[0285] In some embodiments, selectively operating the Multi-Player Playback Device in one of a plurality of operating modes at block 902 includes operating the Multi-Player Playback Device in a second operating mode.

[0286] In some embodiments where selectively operating the Multi-Player Playback Device in one of a plurality of operating modes at block 902 includes operating the Multi- Player Playback Device in the second operating mode: (i) the block 904 step of routing each audio stream to at least one mixer input of one or more mixer inputs includes (a) routing each audio stream in a first set of one or more audio streams to a separate mixer input of the plurality of mixer inputs, and (b) routing each audio stream in a second set of one or more audio streams to every mixer input of the plurality of mixer inputs; and (ii) the block 906 step of generating a mixed stream based on the one or more audio streams routed to the mixer input includes generating a plurality of mixed audio streams, wherein each mixed audio stream comprises (a) at least one audio stream in the first set of one or more audio streams and (b) each audio stream in the second set of one or more audio streams.

[0287] In some embodiments, selectively operating the Multi-Player Playback Device in one of a plurality of operating modes at block 902 includes operating the Multi-Player Playback Device in a third operating mode.

[0288] In some embodiments where selectively operating the Multi-Player Playback Device in one of a plurality of operating modes at block 902 includes operating the Multi- Player Playback Device in the third operating mode: (i) the block 904 step of routing each audio stream to at least one mixer input of one or more mixer inputs includes (a) routing three or more audio streams to a first set of three or more corresponding mixer inputs and (b) routing the three or more audio streams to a separate mixer input; (ii) the block 906 step of generating a mixed stream based on the one or more audio streams routed to the mixer input includes generating mixed stream based on the three or more audio streams routed to the separate mixer input; and (iii) the block 906 step of routing the mixed stream to at least one audio output of the one or more audio outputs based at least in part on a channel map associated with the third operating mode includes (i) routing each of the three or more streams to a separate corresponding audio output, and (ii) routing the mixed stream to a separate audio output.

[0289] In some embodiments, in the third operating mode, the three or more audio streams include (i) a first audio stream comprising left front channel audio information, (ii) a second audio stream comprising right front channel audio information, (iii) a third audio stream comprising center channel audio information, and (iv) a fourth audio stream comprising subwoofer audio information. In some embodiments, in the third operating mode, the three or more audio streams further comprise (i) a fifth audio stream comprising left rear channel audio information, (ii) a sixth audio stream comprising right rear channel audio information. [0290] In some examples, the Multi-Player Playback Device 700 includes one or more generative artificial intelligence (Al) models (e.g., stored on the memory(s) 704) and/or separate generative Al components (e.g., one or modules, processors, GPUs) (not shown). In these examples, one or more steps of the method 900, for instance, are performed via a machine learning model such as one of the one or more generative Al models. In some cases, the one or more generative Al models may perform all of the steps of the method 900 in fewer or more operations than described. Additional details can be found in, for example (i) U.S. App. 18/181,727, titled “Playback of Generative Media Content,” referred to as Docket No. 20-0703, filed on Mar. 10, 2023, and issued on May 14, 2024, as U.S. Patent 11,985,376; and (ii) U.S. App. 18/636,089, titled “Generative Audio Playback Via Wearable Playback Devices,” referred to as Docket No. 22-0403, filed on Apr. 15, 2024, and issued on Dec. 24, 2024, as U.S. Patent 12,175,161. The entire contents of Apps. 18/181,727 and 18/636,089 are incorporated herein by reference.

[0291] In some examples, the Multi-Player Playback Device 700 interacts with one or more distributed ledgers to retrieve data therefrom and/or store data thereon. For instance, the Multi-Player Playback Device can store information related to the content consumption data on a content experience record set (CERS) and/or information related to device and/or user contextual data on a content network record set (CNRS). In some examples, the Multi-Player Playback Devices is a node on a local distributed ledger that comprises one or more additional network devices. In some examples, the data stored on the local distributed ledger is used as input to the one or more generative Al models described above. Additional details can be found in, for example (i) U.S. App. 18/671,824, titled “Generating Digital Media Based on Blockchain Data,” referred to as Docket No. 22-0402, filed on May 22, 2024, and published on Sep. 19, 2024, as U.S. Pub. 2024/0311416; and (ii) PCT App.

PCT/US2024/039870, titled “Systems and Methods for Maintaining Distributed Media Content History and Preferences,” referred to as Docket No. 24-0602-PCT, filed on July 26, 2024, published on Feb. 6, 2025, as WO/2025/029673, and currently pending. The entire contents of Apps. 18/671,824 and PCT/US2024/039870 are incorporated herein by reference.

VII. Example Embodiments

[0292] The following section summarizes several examples. The examples (and features thereof) summarized in this section are for illustration purposes. The invention(s) disclosed and described herein are not limited to the examples summarized in this section or to any other example disclosed elsewhere herein. Any of the examples disclosed in this section, and any features of any of the examples, may be used together with each other in any combination, so long as the example (or feature(s) thereof) are not mutually exclusive. Further, any example (or feature(s) thereof) disclosed in any other section of this disclosure may be combined with any other example (or feature(s) thereof) disclosed in this section and/or any other section, in any combination, so long as the example (or feature(s) thereof) are not mutually exclusive.

[0293] Example 1: A playback device comprising: (i) one or more network interfaces configured to transmit and receive audio streams comprising audio information; (ii) one or more audio input ports configured to receive audio streams comprising audio information; (iii) one or more audio outputs configured to output analog audio signals to one or more loudspeakers; (iv) one or more processors; and (v) tangible, non-transitory computer-readable media having program instructions stored therein. In some examples, the program instructions, when executed by the one or more processors, cause the playback device to perform functions comprising selectively operating in one of a plurality of operating modes, wherein, in each operating mode, the playback device is configured to implement a multistage audio processing procedure comprising: (i) routing each audio stream to at least one mixer input of one or more mixer inputs based at least in part on a current operating mode in which the playback device is operating; (ii) for the one or more audio streams routed to an individual mixer input of the one or more mixer inputs, (a) generating a mixed stream based on the one or more audio streams routed to the mixer input, and (b) routing the mixed stream to at least one audio output of the one or more audio outputs based at least in part on a channel map associated with the current operating mode in which the playback device is operating.

[0294] Example 2: The playback device of Example 1, wherein generating a mixed stream based on the one or more audio streams routed to the mixer input comprises generating the mixed stream by mixing a first audio stream with a second audio stream.

[0295] Example 3: The playback device of Example 1 (individually or in combination with any other suitable preceding Example), wherein the plurality of operating modes comprises a first operating mode, and wherein in the first operating mode: (i) routing each audio stream to at least one mixer input of one or more mixer inputs based at least in part on a current operating mode in which the playback device is operating comprises routing each audio stream to a common mixer input; (ii) generating at least one mixed stream based on one or more audio streams routed to the mixer input comprises, generating one mixed stream based on the audio streams routed to the common mixer input; and (iii) routing the mixed stream to at least one audio output of the one or more audio outputs based at least in part on a channel map associated with the current operating mode in which the playback device is operating comprises routing the one mixed stream to one or more separate audio outputs of the one or more of audio outputs.

[0296] Example 4: The playback device of Example 1 (individually or in combination with any other suitable preceding Example), wherein the plurality of operating modes comprises a second operating mode, and wherein in the second operating mode: (A) routing each audio stream to at least one mixer input of one or more mixer inputs based at least in part on a current operating mode in which the playback device is operating comprises (i) routing each audio stream in a first set of one or more audio streams to a separate mixer input of the plurality of mixer inputs, and (ii) routing each audio stream in a second set of one or more audio streams to every mixer input of the plurality of mixer inputs; and (B) generating a mixed stream based on the one or more audio streams routed to the mixer input comprises generating a plurality of mixed audio streams, wherein each mixed audio stream comprises (i) at least one audio stream in the first set of one or more audio streams and (ii) each audio stream in the second set of one or more audio streams.

[0297] Example 5: The playback device of Example 1 (individually or in combination with any other suitable preceding Example), wherein the plurality of operating modes comprises a third operating mode, and wherein in the third operating mode: (A) routing each audio stream to at least one mixer input of one or more mixer inputs based at least in part on a current operating mode in which the playback device is operating comprises (i) routing three or more audio streams to a first set of three or more corresponding mixer inputs and (ii) routing the three or more audio streams to a separate mixer input; (B) generating a mixed stream based on the one or more audio streams routed to the mixer input comprises generating mixed stream based on the three or more audio streams routed to the separate mixer input; and (C) routing the mixed stream to at least one audio output of the one or more audio outputs based at least in part on a channel map associated with the current operating mode in which the playback device is operating comprises (i) routing each of the three or more streams to a separate corresponding audio output, and (ii) routing the mixed stream to a separate audio output.

[0298] Example 6: The playback device of Example 5 (individually or in combination with any other suitable preceding Example), wherein the three or more audio streams comprise (i) a first audio stream comprising left front channel audio information, (ii) a second audio stream comprising right front channel audio information, (iii) a third audio stream comprising center channel audio information, and (iv) a fourth audio stream comprising subwoofer audio information. [0299] Example 7: The playback device of Example 6 (individually or in combination with any other suitable preceding Example), wherein the three or more audio streams further comprise (i) a fifth audio stream comprising left rear channel audio information, (ii) a sixth audio stream comprising right rear channel audio information.

[0300] Example 8: The playback device of Example 1 (individually or in combination with any other suitable preceding Example), wherein the program instructions comprise program instructions that, when executed by the one or more processors, cause the playback device to perform further functions comprising: (i) while operating in a first operating mode, switch to operating in one of a second operating mode or a third operating mode; (ii) while operating in the second operating mode, switch to operating in one of the first operating mode or the second operating mode; and (iii) while operating in the third operating mode, switch to operating in one of the first operating mode or the second operating mode.

[0301] Example 9: The playback device of Example 1 (individually or in combination with any other suitable preceding Example), further comprising: (i) at least one router configured to perform one or more aspects of routing each audio stream to at least one mixer input of one or more mixer inputs based at least in part on a current operating mode in which the playback device is operating; (ii) at least one mixer configured to perform one or more aspects of, for the one or more audio streams routed to an individual mixer input of the one or more mixer inputs, generating a mixed stream based on the one or more audio streams routed to the mixer input; and (iii) at least one zone selector configured to perform one or more aspects of, for the one or more audio streams routed to an individual mixer input of the one or more mixer inputs, routing the mixed stream to at least one audio output of the one or more audio outputs based at least in part on a channel map associated with the current operating mode in which the playback device is operating.

[0302] Example 10: The playback device of Example 1 (individually or in combination with any other suitable preceding Example), wherein the program instructions comprise program instructions that, when executed by the one or more processors, cause the playback device to perform further functions comprising playing first audio corresponding to a first audio stream via at least one first loudspeaker connected to a first audio output in synchrony with at least one of (i) playback of second audio corresponding to a second audio stream via at least one second loudspeaker connected to a second audio output or (ii) playback of the first audio stream or playback of the second audio stream by a second playback device.

[0303] Example 11: Tangible, non-transitory computer-readable media comprising program instructions, wherein the program instructions, when executed by one or more processors, cause a playback device to perform functions comprising: (A) transmitting and receiving audio streams comprising audio information via one or more network interfaces;

(B) receiving audio streams comprising audio information via one or more audio interfaces;

(C) output analog audio signals to one or more loudspeakers via one or more audio outputs;

(D) selectively operating in one of a plurality of operating modes, wherein, in each operating mode, the playback device is configured to implement a multi-stage audio processing procedure comprising: (i) routing each audio stream to at least one mixer input of one or more mixer inputs based at least in part on a current operating mode in which the playback device is operating; (ii) for the one or more audio streams routed to an individual mixer input of the one or more mixer inputs, (a) generating a mixed stream based on the one or more audio streams routed to the mixer input, and (b) routing the mixed stream to at least one audio output of the one or more audio outputs based at least in part on a channel map associated with the current operating mode in which the playback device is operating.

[0304] Example 12: The tangible, non-transitory computer-readable media of Example 11, wherein generating a mixed stream based on the one or more audio streams routed to the mixer input comprises generating the mixed stream by mixing a first audio stream with a second audio stream.

[0305] Example 13: The tangible, non-transitory computer-readable media of Example 11 (individually or in combination with any other suitable preceding Example), wherein the plurality of operating modes comprises a first operating mode, and wherein in the first operating mode, the playback device is configured to perform functions comprising: (i) routing each audio stream to at least one mixer input of one or more mixer inputs based at least in part on a current operating mode in which the playback device is operating comprises routing each audio stream to a common mixer input; (ii) generating at least one mixed stream based on one or more audio streams routed to the mixer input comprises, generating one mixed stream based on the audio streams routed to the common mixer input; and (iii) routing the mixed stream to at least one audio output of the one or more audio outputs based at least in part on a channel map associated with the current operating mode in which the playback device is operating comprises routing the one mixed stream to one or more separate audio outputs of the one or more of audio outputs.

[0306] Example 14: The tangible, non-transitory computer-readable media of Example 11 (individually or in combination with any other suitable preceding Example), wherein the plurality of operating modes comprises a second operating mode, and wherein in the second operating mode, the playback device is configured to perform functions comprising: (A) routing each audio stream to at least one mixer input of one or more mixer inputs based at least in part on a current operating mode in which the playback device is operating comprises (i) routing each audio stream in a first set of one or more audio streams to a separate mixer input of the plurality of mixer inputs, and (ii) routing each audio stream in a second set of one or more audio streams to every mixer input of the plurality of mixer inputs; and (B)generating a mixed stream based on the one or more audio streams routed to the mixer input comprises generating a plurality of mixed audio streams, wherein each mixed audio stream comprises (i) at least one audio stream in the first set of one or more audio streams and (ii) each audio stream in the second set of one or more audio streams.

[0307] Example 15: The tangible, non-transitory computer-readable media of Example 11 (individually or in combination with any other suitable preceding Example), wherein the plurality of operating modes comprises a third operating mode, and wherein in the third operating mode, the playback device is configured to perform functions comprising: (A) routing each audio stream to at least one mixer input of one or more mixer inputs based at least in part on a current operating mode in which the playback device is operating comprises (i) routing three or more audio streams to a first set of three or more corresponding mixer inputs and (ii) routing the three or more audio streams to a separate mixer input; (B) generating a mixed stream based on the one or more audio streams routed to the mixer input comprises generating mixed stream based on the three or more audio streams routed to the separate mixer input; and (C) routing the mixed stream to at least one audio output of the one or more audio outputs based at least in part on a channel map associated with the current operating mode in which the playback device is operating comprises (i) routing each of the three or more streams to a separate corresponding audio output, and (ii) routing the mixed stream to a separate audio output.

[0308] Example 16: The tangible, non-transitory computer-readable media of Example 15 (individually or in combination with any other suitable preceding Example), wherein the three or more audio streams comprise (i) a first audio stream comprising left front channel audio information, (ii) a second audio stream comprising right front channel audio information, (iii) a fourth audio stream comprising center channel audio information, and (iv) a fourth audio stream comprising subwoofer audio information.

[0309] Example 17: The tangible, non-transitory computer-readable media of Example 16 (individually or in combination with any other suitable preceding Example), wherein the three or more audio streams further comprise (i) a fifth audio stream comprising left rear channel audio information, (ii) a sixth audio stream comprising right rear channel audio information.

[0310] Example 18: The tangible, non-transitory computer-readable media of Example 11 (individually or in combination with any other suitable preceding Example), wherein the functions further comprise: (i) while operating in a first operating mode, switch to operating in one of a second operating mode or a third operating mode; (ii) while operating in the second operating mode, switch to operating in one of the first operating mode or the second operating mode; and (iii) while operating in the third operating mode, switch to operating in one of the first operating mode or the second operating mode.

[0311] Example 19: The tangible, non-transitory computer-readable media of Example 11 (individually or in combination with any other suitable preceding Example), wherein the functions further comprise playing first audio corresponding to a first audio stream via at least one first loudspeaker connected to a first audio output in synchrony with at least one of (i) playback of second audio corresponding to a second audio stream via at least one second loudspeaker connected to a second audio output or (ii) playback of the first audio stream or playback of the second audio stream by a second playback device.

[0312] Example 20: A method performed by a playback device, the method comprising: (A) transmitting and receiving audio streams comprising audio information via one or more network interfaces; (B) receiving audio streams comprising audio information via one or more audio interfaces; (C) output analog audio signals to one or more loudspeakers via one or more audio outputs; (D) selectively operating in one of a plurality of operating modes, wherein, in each operating mode, the playback device is configured to implement a multistage audio processing procedure comprising: (i) routing each audio stream to at least one mixer input of one or more mixer inputs based at least in part on a current operating mode in which the playback device is operating; (ii) for the one or more audio streams routed to an individual mixer input of the one or more mixer inputs, (a) generating a mixed stream based on the one or more audio streams routed to the mixer input, and (b) routing the mixed stream to at least one audio output of the one or more audio outputs based at least in part on a channel map associated with the current operating mode in which the playback device is operating.

[0313] Example 21: The playback device of Example 1 (individually or in combination with any other suitable preceding Example), wherein each operating mode of the plurality of operating modes comprises a distinct channel map that defines routing of mixed streams to audio outputs differently than channel maps of other operating modes. [0314] Example 22: The tangible, non-transitory computer-readable media of Example 11 (individually or in combination with any other suitable preceding Example), wherein each operating mode of the plurality of operating modes comprises a distinct channel map that defines routing of mixed streams to audio outputs differently than channel maps of other operating modes.

[0315] Example 23: The method of Example 20, wherein each operating mode of the plurality of operating modes comprises a distinct channel map that defines routing of mixed streams to audio outputs differently than channel maps of other operating modes.

VIII. Conclusions

[0316] The above discussions relating to Multi-Player Playback Devices, playback devices, controller devices, playback zone configurations, and media/audio content sources provide only some examples of operating environments within which functions and methods described below may be implemented. Other operating environments and configurations of media playback systems, Multi-Player Playback Devices, playback devices, and network devices not explicitly described herein may also be applicable and suitable for implementation of the functions and methods.

[0317] The description above discloses, among other things, various example systems, methods, apparatus, and articles of manufacture including, among other components, firmware and/or software executed on hardware. It is understood that such examples are merely illustrative and should not be considered as limiting. For example, it is contemplated that any or all of the firmware, hardware, and/or software aspects or components can be embodied exclusively in hardware, exclusively in software, exclusively in firmware, or in any combination of hardware, software, and/or firmware. Accordingly, the examples provided are not the only ways) to implement such systems, methods, apparatus, and/or articles of manufacture.

[0318] Additionally, references herein to “embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one example embodiment of an invention. The appearances of this phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative example configurations mutually exclusive of other example configurations. As such, the example configurations described herein, explicitly and implicitly understood by one skilled in the art, can be combined with other example configurations. [0319] The specification is presented largely in terms of illustrative environments, systems, procedures, steps, logic blocks, processing, and other symbolic representations that directly or indirectly resemble the operations of data processing devices coupled to networks. These process descriptions and representations are typically used by those skilled in the art to most effectively convey the substance of their work to others skilled in the art. Numerous specific details are set forth to provide a thorough understanding of the present disclosure. However, it is understood to those skilled in the art that certain example configurations of the present disclosure can be practiced without certain, specific details. In other instances, well known methods, procedures, components, and circuitry have not been described in detail to avoid unnecessarily obscuring aspects of the example configurations. Accordingly, the scope of the present disclosure is defined by the appended claims rather than the foregoing description of example configurations.

[0320] When any of the appended claims are read to cover a purely software and/or firmware implementation, at least one of the elements in at least one example is hereby expressly defined to include a tangible, non-transitory medium such as a memory, DVD, CD, Blu-ray, and so on, storing the software and/or firmware.

Claims

CLAIMS What is claimed is:

1. A method performed by a playback device, the method comprising: transmitting and receiving audio streams comprising audio information via one or more network interfaces of the playback device; receiving audio streams comprising audio information via one or more audio interfaces of the playback device; outputting analog audio signals to one or more loudspeakers via one or more audio outputs of the playback device; selectively operating the playback device in one of a plurality of operating modes, wherein, in each operating mode, the playback device is configured to implement a multistage audio processing procedure comprising: (i) routing each audio stream to at least one mixer input of one or more mixer inputs of the playback device based at least in part on a current operating mode in which the playback device is operating; (ii) for the one or more audio streams routed to an individual mixer input of the one or more mixer inputs, (a) generating a mixed stream based on the one or more audio streams routed to the mixer input, and (b) routing the mixed stream to at least one audio output of the one or more audio outputs based at least in part on a channel map associated with the current operating mode in which the playback device is operating.

2. The method of claim 1, wherein generating a mixed stream based on the one or more audio streams routed to the mixer input comprises: generating the mixed stream by mixing a first audio stream with a second audio stream.

3. The method of any preceding claim, wherein the plurality of operating modes comprises a first operating mode, and wherein in the first operating mode: routing each audio stream to at least one mixer input of one or more mixer inputs based at least in part on a current operating mode in which the playback device is operating comprises routing each audio stream to a common mixer input; generating at least one mixed stream based on one or more audio streams routed to the mixer input comprises, generating one mixed stream based on the audio streams routed to the common mixer input; and routing the mixed stream to at least one audio output of the one or more audio outputs based at least in part on a channel map associated with the current operating mode in which the playback device is operating comprises routing the one mixed stream to one or more separate audio outputs of the one or more of audio outputs.

4. The method of any preceding claim, wherein the plurality of operating modes comprises a second operating mode, and wherein in the second operating mode: routing each audio stream to at least one mixer input of one or more mixer inputs based at least in part on a current operating mode in which the playback device is operating comprises (i) routing each audio stream in a first set of one or more audio streams to a separate mixer input of the plurality of mixer inputs, and (ii) routing each audio stream in a second set of one or more audio streams to every mixer input of the plurality of mixer inputs; and generating a mixed stream based on the one or more audio streams routed to the mixer input comprises generating a plurality of mixed audio streams, wherein each mixed audio stream comprises (i) at least one audio stream in the first set of one or more audio streams and (ii) each audio stream in the second set of one or more audio streams.

5. The method of any preceding claim, wherein the plurality of operating modes comprises a third operating mode, and wherein in the third operating mode: routing each audio stream to at least one mixer input of one or more mixer inputs based at least in part on a current operating mode in which the playback device is operating comprises (i) routing three or more audio streams to a first set of three or more corresponding mixer inputs and (ii) routing the three or more audio streams to a separate mixer input; generating a mixed stream based on the one or more audio streams routed to the mixer input comprises generating mixed stream based on the three or more audio streams routed to the separate mixer input; and routing the mixed stream to at least one audio output of the one or more audio outputs based at least in part on a channel map associated with the current operating mode in which the playback device is operating comprises (i) routing each of the three or more streams to a separate corresponding audio output, and (ii) routing the mixed stream to a separate audio output.

6. The method of claim 5, wherein the three or more audio streams comprise (i) a first audio stream comprising left front channel audio information, (ii) a second audio stream comprising right front channel audio information, (iii) a third audio stream comprising center channel audio information, and (iv) a fourth audio stream comprising subwoofer audio information.

7. The method of claim 6, wherein the three or more audio streams further comprise (i) a fifth audio stream comprising left rear channel audio information, (ii) a sixth audio stream comprising right rear channel audio information.

8. The method of any preceding claim, further comprising: while operating in a first operating mode, switch to operating in one of a second operating mode or a third operating mode; while operating in the second operating mode, switch to operating in one of the first operating mode or the second operating mode; and while operating in the third operating mode, switch to operating in one of the first operating mode or the second operating mode.

9. The method of any preceding claim, wherein routing each audio stream to at least one mixer input of one or more mixer inputs based at least in part on a current operating mode in which the playback device is operating comprises routing each audio stream via at least one router component of the playback device.

10. The method of any preceding claim, wherein for the one or more audio streams routed to an individual mixer input of the one or more mixer inputs, generating a mixed stream based on the one or more audio streams routed to the mixer input comprises generating the mixed audio stream via at least one mixer component of the playback device.

11. The method of any preceding claim, wherein for the one or more audio streams routed to an individual mixer input of the one or more mixer inputs, routing the mixed stream to at least one audio output of the one or more audio outputs based at least in part on a channel map associated with the current operating mode in which the playback device is operating comprises routing the mixed stream via at least one zone selector component of the playback device.

12. The method of any preceding claim, further comprising: playing first audio corresponding to a first audio stream via at least one first loudspeaker connected to a first audio output in synchrony with at least one of (i) playback of second audio corresponding to a second audio stream via at least one second loudspeaker connected to a second audio output or (ii) playback of the first audio stream or playback of the second audio stream by a second playback device.

13. The method of any preceding claim, wherein each operating mode of the plurality of operating modes comprises a distinct channel map that defines routing of mixed streams to audio outputs differently than channel maps of other operating modes.

14. Tangible, non-transitory computer-readable media comprising program instructions, wherein the program instructions, when executed by one or more processors, cause a playback device to perform the method of any of claims 1 through 13.

15. A playback device comprising: one or more processors; tangible, non-transitory computer-readable memory; and program instructions stored on the tangible, non-transitory computer-readable memory, wherein the program instructions, when executed by the one or more processors, cause the playback device to perform the method of any of claims 1 through 13.

16. A system comprising: one or more playback devices according to claim 15; and one or more loudspeakers.