CN116668658A

Movatterモバイル変換

Info

Publication number: CN116668658A
Application number: CN202310174552.3A
Authority: CN
Inventors: C·A·史密斯; B·H·博伊塞尔; D·H·黄; J·S·诺里斯; J·佩伦; J·A·卡泽米亚斯; 任淼; 邱诗善
Original assignee: Apple Inc
Current assignee: Apple Inc
Priority date: 2022-02-28
Filing date: 2023-02-27
Publication date: 2023-08-29

Abstract

The present disclosure relates to systems and methods for three-dimensional placement and refinement in multi-user communication sessions. Some examples of the present disclosure relate to methods for spatial placement of avatars in communication sessions. In some examples, the first electronic device may receive an input corresponding to a request to enter a communication session with the second electronic device while the first electronic device is presenting a three-dimensional environment. In some examples, in response to receiving the input, the first electronic device may scan an environment surrounding the first electronic device. In some examples, the first electronic device may identify a placement location in the three-dimensional environment where a virtual object representing a user of the second electronic device is displayed. In some examples, the first electronic device displays a virtual object representing a user of the second electronic device at a placement location in the three-dimensional environment. Some examples of the present disclosure relate to a method for spatial refinement of a communication session.

Description

System and method for three-dimensional placement and refinement in multi-user communication sessions

Cross Reference to Related Applications

The present application claims the benefit of U.S. provisional application No. 63/268,692 filed on 28 at 2022 and U.S. patent application No. 18/174,403 filed on 24 at 2023, the contents of both of which are incorporated herein by reference in their entirety for all purposes.

Technical Field

The present disclosure relates generally to systems and methods for three-dimensional placement and refinement of objects in a multi-user communication session.

Background

Some computer graphics environments provide two-dimensional and/or three-dimensional environments in which at least some objects displayed for viewing by a user are virtual and computer-generated. In some examples, the three-dimensional environment is presented by a plurality of devices communicating in a multi-user communication session. In some examples, an avatar (e.g., a representation) of each user participating in the multi-user communication session (e.g., via a computing device) is displayed in a three-dimensional environment of the multi-user communication session. In some examples, content may be shared in a three-dimensional environment for viewing and interaction by multiple users participating in a multi-user communication session. In some examples, shared content and/or avatars corresponding to users participating in a multi-user communication session may be moved in a three-dimensional environment.

Disclosure of Invention

In some examples, the first set of criteria includes a first criterion that is met when the identified placement location does not include any objects. In some examples, the first electronic device identifies an updated placement location in the three-dimensional environment presented at the first electronic device based on a determination that the identified placement location does not satisfy the first set of criteria because the identified placement location includes the object. In some examples, the updated placement location may be a location that does not include any objects in the three-dimensional environment. In some examples, when the first electronic device identifies an updated placement location that meets the first set of criteria, the first electronic device displays a virtual object representing a user of the second electronic device at the updated placement location in the three-dimensional environment.

Some examples of the present disclosure relate to systems and methods for spatial refinement of multi-user communication sessions. In some examples, the first electronic device and the second electronic device may be communicatively linked in a multi-user communication session. In some examples, a first electronic device may present a three-dimensional environment including a first shared object and an avatar corresponding to a user of a second electronic device. In some examples, the first electronic device may receive the first input while the first electronic device is rendering the three-dimensional environment. In some examples, in accordance with a determination that the first input corresponds to movement of an avatar corresponding to a user of the second electronic device, the first electronic device may move the avatar and the first shared object in the three-dimensional environment in accordance with the first input. In some examples, in accordance with a determination that the first input corresponds to movement of the first shared object, the first electronic device may move the first shared object in the three-dimensional environment in accordance with the first input without moving an avatar corresponding to the user of the second electronic device.

A full description of these examples is provided in the accompanying drawings and detailed description, and it is to be understood that this summary is not in any way limiting the scope of the disclosure.

Drawings

For a better understanding of the various examples described herein, reference should be made to the following detailed description and the accompanying drawings. Like reference numerals generally refer to corresponding parts throughout the drawings.

Fig. 1 illustrates an electronic device presenting an augmented reality environment according to some examples of the present disclosure.

Fig. 2 illustrates a block diagram of an exemplary architecture of a system in accordance with some examples of this disclosure.

FIG. 3 illustrates a flow chart showing an exemplary process of facilitating spatial placement of an avatar in a multi-user communication session in a computer-generated environment, according to some examples of the disclosure.

Fig. 4A-4I illustrate an exemplary process of spatial placement of avatars in a multi-user communication session according to some examples of the present disclosure.

Fig. 5A-5I illustrate exemplary interactions involving spatial refinement of a multi-user communication session according to some examples of the present disclosure.

Fig. 6A-6D illustrate exemplary interactions involving spatial refinement of a multi-user communication session according to some examples of the present disclosure.

Fig. 7A-7B illustrate a flowchart illustrating an exemplary process of spatial placement of an avatar in a multi-user communication session at an electronic device, according to some examples of the present disclosure.

Fig. 8 illustrates a flow chart showing an exemplary process of spatial refinement in a multi-user communication session at an electronic device, according to some examples of the disclosure.

Fig. 9 illustrates a flow chart showing an exemplary process of spatial refinement in a multi-user communication session at an electronic device, according to some examples of the disclosure.

Detailed Description

Fig. 1 illustrates an electronic device 101 presenting an augmented reality (XR) environment (e.g., a computer-generated environment) according to some examples of the present disclosure. In some examples, the electronic device 101 is a handheld device or mobile device, such as a tablet, laptop, smart phone, or head mounted display. An example of device 101 is described below with reference to the architecture block diagram of fig. 2. As shown in fig. 1, the electronic device 101, table 106, and coffee cup 152 are located in a physical environment 100. In some examples, the electronic device 101 may be configured to capture an image (shown in the field of view of the electronic device 101) of the physical environment 100 including the table 106 and the coffee cup 152. In some examples, in response to the trigger, the electronic device 101 may be configured to display a virtual object 114 (e.g., two-dimensional virtual content) in a computer-generated environment (e.g., represented by the rectangle shown in fig. 1) that is not present in the physical environment 100, but is displayed in a computer-generated environment that is located (e.g., anchored to) on top of the computer-generated representation 106' of the real-world table 106. For example, in response to detecting a flat surface of the table 106 in the physical environment 100, the virtual object 114 may be displayed on a surface of the computer-generated representation 106 'of the table in the computer-generated environment next to the computer-generated representation 152' of the real-world coffee cup 152 displayed via the device 101.

It should be appreciated that virtual object 110 is a representative virtual object and may include and render one or more different virtual objects (e.g., virtual objects having various dimensions, such as two-dimensional or three-dimensional virtual objects) in a three-dimensional computer-generated environment. For example, the virtual object may represent an application or user interface displayed in a computer-generated environment. In some examples, the virtual object may represent content corresponding to an application and/or displayed via a user interface in a computer-generated environment. In some examples, virtual object 114 is optionally configured to be interactive and responsive to user input such that a user may virtually touch, tap, move, rotate, or otherwise interact with the virtual object. In some implementations, the virtual object 114 can be displayed in a three-dimensional computer-generated environment within a multi-user communication session ("multi-user communication session", "communication session"). In some such embodiments, as described in more detail below, virtual object 114 may be viewable and/or configured to be interactive and responsive to a plurality of users represented by a virtual representation (e.g., an avatar, such as avatar 115) and/or user inputs provided by a plurality of users, respectively. Additionally, it should be appreciated that the 3D environment (or 3D virtual object) described herein may be a representation of a 3D environment (or three-dimensional virtual object) projected or rendered at an electronic device.

In the following discussion, an electronic device is described that communicates with a display generation component and one or more input devices. It should be appreciated that the electronic device optionally communicates with one or more other physical user interface devices, such as a touch-sensitive surface, a physical keyboard, a mouse, a joystick, a hand tracking device, an eye tracking device, a stylus, and the like. Further, as noted above, it should be understood that the described electronic device, display, and touch-sensitive surface are optionally distributed between two or more devices. Thus, as used in this disclosure, information displayed on or by an electronic device is optionally used to describe information output by the electronic device for display on a separate display device (touch-sensitive or non-touch-sensitive). Similarly, as used in this disclosure, input received on an electronic device (e.g., touch input received on a touch-sensitive surface of the electronic device, or touch input received on a surface of a stylus) is optionally used to describe input received on a separate input device from which the electronic device receives input information.

The device typically supports a variety of applications, such as one or more of the following: drawing applications, presentation applications, word processing applications, website creation applications, disk editing applications, spreadsheet applications, gaming applications, telephony applications, video conferencing applications, email applications, instant messaging applications, fitness support applications, photo management applications, digital camera applications, digital video camera applications, web browsing applications, digital music player applications, television channel browsing applications, and/or digital video player applications.

Fig. 2 illustrates a block diagram of an exemplary architecture of a system 201, according to some examples of the present disclosure. In some examples, system 201 includes multiple devices. For example, the system 201 includes a first electronic device 260 and a second electronic device 270, wherein the first electronic device 260 and the second electronic device 270 are in communication with each other. In some embodiments, the first electronic device 260 and the second electronic device 270 are portable devices, such as mobile phones, smart phones, tablets, laptops, auxiliary devices that communicate with another device, and the like, respectively.

As shown in fig. 2, the first device 260 optionally includes various sensors (e.g., one or more hand tracking sensors 202A, one or more position sensors 204A, one or more image sensors 206A, one or more touch-sensitive surfaces 209A, one or more motion and/or orientation sensors 210A, one or more eye tracking sensors 212A, one or more microphones 213A or other audio sensors, etc.), one or more display generating components 214A, one or more speakers 216A, one or more processors 218A, one or more memories 220A, and/or communication circuitry 222A. In some embodiments, the second device 270 optionally includes various sensors (e.g., one or more hand tracking sensors 202B, one or more position sensors 204B, one or more image sensors 206B, one or more touch-sensitive surfaces 209B, one or more motion and/or orientation sensors 210B, one or more eye tracking sensors 212B, one or more microphones 213B or other audio sensors, etc.), one or more display generating components 214B, one or more speakers 216, one or more processors 218B, one or more memories 220B, and/or communication circuitry 222B. One or more communication buses 208A and 208B are optionally used for communication between the above-described components of devices 260 and 270, respectively. The first device 260 and the second device 270 optionally communicate via a wired or wireless connection between the two devices (e.g., via communication circuits 222A-222B).

The communication circuits 222A, 222B optionally include circuitry for communicating with electronic devices, networks, such as the internet, intranets, wired and/or wireless networks, cellular networks, and wireless Local Area Networks (LANs). The communication circuits 222A, 222B optionally include circuitry for using Near Field Communication (NFC) and/or short range communication such asAnd a circuit for performing communication.

The processors 218A, 218B include one or more general purpose processors, one or more graphics processors, and/or one or more digital signal processors. In some embodiments, the memories 220A, 220B are non-transitory computer-readable storage media (e.g., flash memory, random access memory, or other volatile or non-volatile memory or storage devices) storing computer-readable instructions configured to be executed by the processors 218A, 218B to perform the techniques, processes, and/or methods described below. In some implementations, the memory 220A, 220B may include more than one non-transitory computer-readable storage medium. A non-transitory computer-readable storage medium may be any medium (e.g., excluding signals) that can tangibly contain or store computer-executable instructions for use by or in connection with an instruction execution system, apparatus, and device. In some embodiments, the storage medium is a transitory computer readable storage medium. In some embodiments, the storage medium is a non-transitory computer readable storage medium. The non-transitory computer readable storage medium may include, but is not limited to, magnetic storage devices, optical storage devices, and/or semiconductor storage devices. Examples of such storage devices include magnetic disks, optical disks based on CD, DVD, or blu-ray technology, and persistent solid state memories such as flash memory, solid state drives, etc.

In some embodiments, the display generation component 214A, 214B includes a single display (e.g., a Liquid Crystal Display (LCD), an Organic Light Emitting Diode (OLED), or other type of display). In some embodiments, the display generation component 214A, 214B includes a plurality of displays. In some implementations, the display generation component 214A, 214B can include a display with touch capabilities (e.g., a touch screen), a projector, a holographic projector, a retinal projector, etc. In some implementations, the devices 260 and 270 include touch-sensitive surfaces 209A and 209B, respectively, for receiving user inputs such as tap inputs and swipe inputs or other gestures. In some implementations, the display generation component 214A, 214B and the touch-sensitive surface 209A, 209B form a touch-sensitive display (e.g., a touch screen integrated with the devices 260 and 270, respectively, or a touch screen external to the devices 260 and 270 in communication with the devices 260 and 270, respectively).

Devices 260 and 270 optionally include image sensors 206A and 206B, respectively. The image sensors 206A/206B optionally include one or more visible light image sensors, such as Charge Coupled Device (CCD) sensors, and/or Complementary Metal Oxide Semiconductor (CMOS) sensors operable to obtain images of physical objects from a real world environment. The image sensors 206A/206B also optionally include one or more Infrared (IR) sensors, such as passive IR sensors or active IR sensors, for detecting infrared light from the real world environment. For example, active IR sensors include an IR emitter for emitting infrared light into the real world environment. The image sensor 206A/206B also optionally includes one or more cameras configured to capture movement of the physical object in the real world environment. The image sensors 206A/206B also optionally include one or more depth sensors configured to detect the distance of the physical object from the device 260/270. In some embodiments, information from one or more depth sensors may allow a device to identify and distinguish objects in a real-world environment from other objects in the real-world environment. In some embodiments, one or more depth sensors may allow the device to determine the texture and/or topography of objects in the real world environment.

In some embodiments, devices 260 and 270 use a CCD sensor, an event camera, and a depth sensor in combination to detect the physical environment surrounding devices 260 and 270. In some embodiments, the image sensor 206A/206B includes a first image sensor and a second image sensor. The first image sensor and the second image sensor work in tandem and are optionally configured to capture different information of a physical object in a real world environment. In some embodiments, the first image sensor is a visible light image sensor and the second image sensor is a depth sensor. In some embodiments, the device 260/270 uses the image sensor 206A/206B to detect the position and orientation of the device 260/270 and/or the display generating component 214A/214B in a real-world environment. For example, the device 260/270 uses the image sensor 206A/206B to track the position and orientation of the display generation component 214A/214B relative to one or more stationary objects in the real-world environment.

In some embodiments, the device 260/270 includes a microphone 213A/213B or other audio sensor. The device 260/270 uses the microphone 213A/213B to detect sound from the user and/or the user's real world environment. In some embodiments, microphones 213A/213B include a microphone array (plurality of microphones) that optionally operate in tandem to identify ambient noise or locate sound sources in space of the real world environment.

The device 260/270 includes a position sensor 204A/204B for detecting the position of the device 260/270 and/or the display generating component 214A/214B. For example, the location sensor 204A/204B may include a GPS receiver that receives data from one or more satellites and allows the device 260/270 to determine the absolute location of the device in the physical world.

The device 260/270 includes an orientation sensor 210A/210B for detecting the orientation and/or movement of the device 260/270 and/or the display generating component 214A/214B. For example, the device 260/270 uses the orientation sensor 210A/210B to track changes in the position and/or orientation of the device 260/270 and/or the display generation component 214A/214B, such as changes relative to physical objects in the real-world environment. Orientation sensor 210A/210B optionally includes one or more gyroscopes and/or one or more accelerometers.

In some embodiments, the device 260/270 includes a hand tracking sensor 202A/202B and/or an eye tracking sensor 212A/212B. The hand tracking sensors 202A/202B are configured to track the position/location of one or more portions of the user's hand, and/or the movement of one or more portions of the user's hand relative to the augmented reality environment, relative to the display generating component 214A/214B, and/or relative to another defined coordinate system. The eye tracking sensors 212A/212B are configured to track the position and movement of a user's gaze (more generally, eyes, face, or head) relative to the real world or augmented reality environment and/or relative to the display generating component 214A/214B. In some embodiments, the hand tracking sensor 202A/202B and/or the eye tracking sensor 212A/212B are implemented with the display generation component 214A/214B. In some embodiments, the hand tracking sensor 202A/202B and/or the eye tracking sensor 212A/212B are implemented separately from the display generation component 214A/214B.

In some embodiments, the hand tracking sensor 202A/202B may use an image sensor 206A/206B (e.g., one or more IR cameras, 3D cameras, depth cameras, etc.) that captures three-dimensional information from the real world including one or more hands (e.g., one or more hands of a human user). In some embodiments, the hands may be resolved with sufficient resolution to distinguish between fingers and their respective locations. In some embodiments, one or more image sensors 206A/206B are positioned relative to the user to define a field of view of the image sensors 206A/206B and an interaction space in which finger/hand positions, orientations, and/or movements captured by the image sensors are used as input (e.g., to distinguish from the user's hands that are idle or other hands of other people in the real-world environment). Tracking the finger/hand (e.g., gesture, touch, tap, etc.) for input may be advantageous because it does not require the user to touch, hold, or wear any type of beacon, sensor, or other indicia.

In some embodiments, the eye-tracking sensor 212A/212B includes at least one eye-tracking camera (e.g., an Infrared (IR) camera) and/or an illumination source (e.g., an IR light source, such as an LED) that emits light toward the user's eye. The eye tracking camera may be directed at the user's eye to receive reflected IR light from the light source directly or indirectly from the eye. In some embodiments, both eyes are tracked separately by respective eye tracking cameras and illumination sources, and focus/gaze may be determined by tracking both eyes. In some embodiments, one eye (e.g., the dominant eye) is tracked by a corresponding eye tracking camera/illumination source.

The device 260/270 and the system 201 are not limited to the components and configurations of fig. 2, but may include fewer, additional, or additional components in various configurations. In some embodiments, the system 201 may be implemented in a single device. One or more persons using system 201 are optionally referred to herein as one or more users of the device.

Attention is now turned to an exemplary simultaneous display of a three-dimensional environment on a first electronic device (e.g., corresponding to device 260) and a second electronic device (e.g., corresponding to device 270). As described below, a first electronic device may communicate with a second electronic device in a multi-user communication session. In some examples, an avatar (e.g., a representation) of a user of a first electronic device may be displayed in a three-dimensional environment at a second electronic device, and an avatar of a user of a second electronic device may be displayed in a three-dimensional environment at the first electronic device. In some examples, the content may be shared within a three-dimensional environment while the first electronic device and the second electronic device are in a multi-user communication session.

FIG. 3 illustrates a flow chart showing an exemplary process 300 that facilitates spatial placement of an avatar in a multi-user communication session in a computer-generated environment, according to some examples of the disclosure. As described herein, when in a multi-communication session, a plurality of electronic devices may present a shared three-dimensional environment that may include one or more shared virtual objects, one or more non-shared virtual objects (e.g., representations thereof), and/or a three-dimensional representation (e.g., depiction) of a user of the plurality of electronic devices. As described below, at a first electronic device, a user may initiate a multi-user communication session with one or more remote electronic devices, optionally including placement of one or more three-dimensional representations of the user of the one or more remote electronic devices within a three-dimensional environment presented at the first electronic device. The exemplary process shown in fig. 3 and described below will be illustrated and described in more detail when referring to fig. 4A-4I below.

In some examples, at 302, a first electronic device may receive a request to initiate a multi-user communication session with one or more remote electronic devices. For example, a user of a first electronic device may provide user input at the first electronic device corresponding to a request to enter a multi-user communication session with a second electronic device. As shown in fig. 3, at 304, in response to receiving the request, the first electronic device may scan a physical environment surrounding the first electronic device. For example, the first electronic device may be located within a physical room that includes one or more physical objects (e.g., a chair, a table, a sofa, a light, etc.). In some examples, the first electronic device may generate an occupancy map of a portion of the physical environment within a field of view of a user of the first electronic device (e.g., a portion of the physical environment visible to the user due to the presence of a display blocking the user's view of the physical environment surrounding the user). In some examples, as described in more detail below, the occupancy map optionally identifies a unique location of the physical object within the field of view of the user.

In some embodiments, the first electronic device may scan a three-dimensional environment within a field of view of a user of the first electronic device. For example, the three-dimensional environment within the field of view of the user of the first electronic device may include one or more virtual objects (e.g., three-dimensional application windows, three-dimensional and/or two-dimensional content, virtual models, three-dimensional representations of physical objects, etc.). In some examples, the occupancy map generated by the first electronic device optionally identifies a unique location of the virtual object within the field of view of the user, as described in more detail below with reference to fig. 4A-4I.

In some examples, the determination of the placement location is based on satisfaction of a first set of criteria, as discussed in more detail with reference to fig. 4A-4I. As described above, the first electronic device and the second electronic device may each generate an occupancy map identifying locations of physical and/or virtual objects within a field of view of a user of the first electronic device and the second electronic device, respectively. In some examples, the first set of criteria includes a first criterion that is met when the placement location does not include an object (e.g., a physical object or a virtual object). In some such examples, the first electronic device and the second electronic device may each determine whether the first criterion is met using respective occupancy maps generated by scanning respective environments within fields of view of the first electronic device and the second electronic device. As described in more detail below, if a location within the three-dimensional environment at the electronic device that is at the center of the user's field of view and/or a predefined distance from the user's point of view includes an object (e.g., determined using an occupancy map), the placement location may instead be updated (e.g., modified) to a location within the three-dimensional environment that is different from the center of the user's field of view and/or that does not include an object that is not a predefined distance from the user's point of view. In some examples, the determined placement location is optionally provided to the user as a suggestion (e.g., as a visual element such as a virtual needle display), and the user can confirm or change the placement location (e.g., by selecting and moving the visual element). In some examples, the distance between the selected placement location and the viewpoint of the initiating user is used as the distance between the viewpoint of another user and the placement location at the electronic device of the other user, or vice versa, as described in more detail later herein. In some examples, the other user has the option to move the placement locations radially about their respective viewpoints by a defined distance to maintain spatial authenticity between the user's viewpoints (e.g., because spatial authenticity requires the same offset distance, but radial changes to the user's placement locations may be accommodated by causing the other user's avatar to turn/orient in a corresponding direction, as discussed herein).

Fig. 4A-4I illustrate an exemplary process of spatial placement of avatars in a multi-user communication session according to some examples of the present disclosure. In some examples, the first electronic device 460 may present the three-dimensional environment 450A and the second electronic device 470 may present the three-dimensional environment 450B. The first electronic device 460 and the second electronic device 470 may be similar to the devices 101 or 260/270, and/or may be a head-mountable system/device and/or a projection-based system/device (including hologram-based systems/devices) configured to generate and render a three-dimensional environment, such as a heads-up display (HUD), a head-mounted display (HMD), a window with integrated display capabilities, a display formed as a lens (e.g., similar to a contact lens) designed to be placed on a person's eye, respectively. In the example of fig. 4A-4I, a first user optionally wears an electronic device 460 and a second user optionally wears an electronic device 470 such that the three-dimensional environment 450A/450B may be defined by X, Y and a Z-axis viewed from the perspective of the electronic device (e.g., a viewpoint associated with the user of the electronic device 460/470).

As described above, in some examples, the first electronic device 460 may enter into a multi-user communication session with the second electronic device 470. For example, in the multi-user communication session, the first electronic device 460 and the second electronic device 470 are configured (e.g., via communication circuitry 222A/222B) to present a shared three-dimensional environment that includes one or more shared virtual objects (e.g., content such as images, video, audio, etc., representations of user interfaces of applications, etc.). As used herein, the term "shared three-dimensional environment" refers to a three-dimensional environment that is independently presented, displayed, and/or visible at two or more electronic devices via which content, applications, data, etc., may be shared and/or presented to users of the two or more electronic devices.

As shown in fig. 4A, a user of the first electronic device 460 is optionally providing a selection input 472A relating to the first virtual object 410A. For example, while the user's gaze is directed toward the first virtual object 410A, the user may provide pinch input (e.g., where the index finger and thumb of the user's hand make contact) (or other suitable input such as tap input, gaze exceeding a threshold period of time, etc.). In some examples, the second electronic device 470 optionally receives an indication of a request from the first electronic device 460 to enter into a multi-user communication session with the first electronic device 460 in response to receiving the selection input 472A related to the first virtual object 410A. For example, as shown in fig. 4A, in response to receiving an indication from the first electronic device 460, the second electronic device 470 optionally displays a first user interface element 418 (e.g., a notification) corresponding to a request to enter a multi-communication session with the first electronic device 460. In some examples, first user interface element 418 includes option 419A that can be selected to accept the invitation and thereby initiate a process of initiating a multi-user communication session between first electronic device 460 and second electronic device 470.

In some examples, when two or more electronic devices are communicatively linked in a multi-user communication session, avatars corresponding to users of the two or more electronic devices are optionally displayed within a shared three-dimensional environment presented at the two or more electronic devices. Thus, as described above with reference to FIG. 3, initiating a multi-user communication session between the first electronic device 460 and the second electronic device 470 includes identifying a suitable placement location within the three-dimensional environment 450A/450B for placement of avatars corresponding to users of the first electronic device 460 and the second electronic device 470. For example, the first electronic device optionally identifies a suitable placement location within the three-dimensional environment 450A for placement of an avatar corresponding to the user of the second electronic device 470, and the second electronic device optionally identifies a placement location within the three-dimensional environment 450B for placement of an avatar corresponding to the user of the first electronic device 460.

In some examples, it may be advantageous to automatically control the placement of avatars corresponding to users of electronic devices communicatively linked in a multi-user communication session. As described above, the three-dimensional environment optionally includes an avatar corresponding to a user of the electronic device in a communication session, wherein the avatar provides a spatial reference point of relative distance and orientation between virtual locations of users within the shared environment of the multi-user communication session. In some cases, the physical environment surrounding the respective electronic device may include physical objects within a field of view of a user of the respective electronic device. Placement of an avatar at a location within a three-dimensional environment occupied by physical (and/or virtual) objects may result in portions of the avatar being obscured, cut off, or otherwise rendered unrealistic from the perspective of a user of an electronic device in a multi-user communication session. In some such examples, allowing users to manually select placement locations of avatars in a shared three-dimensional environment in a multi-user communication session may avoid avatar placement in locations intersecting physical objects, but coordinating such placement locations among multiple users may increase the cognitive burden of the users. Thus, as discussed in more detail below, when placing an avatar in a three-dimensional environment in a multi-user communication session, the electronic device may automatically select a placement location that displays the avatar that avoids physical and/or virtual objects within the field of view of the user and maintains spatial consistency between users of the electronic device.

In some examples, when in a multi-user communication session, avatars 415/417 are displayed in three-dimensional environment 450A/450B in respective orientations that correspond to and/or are based on the orientation of electronic device 460/470 in the physical environment surrounding electronic device 460/470. For example, as shown in fig. 4C, in three-dimensional environment 450A avatar 415 optionally faces the viewpoint of the user of first electronic device 460 (e.g., viewpoint 418A), and in three-dimensional environment 450B avatar 417 optionally faces the viewpoint of the user of second electronic device 470 (e.g., viewpoint 418B). When a particular user moves the electronic device in a physical environment, the user's point of view changes according to the movement, which may thus also change the orientation of the user's avatar in a three-dimensional environment. For example, referring to fig. 4C, if the user of the first electronic device 460 would look left in the three-dimensional environment 450A such that the first electronic device 460 rotates left (e.g., counter-clockwise) by a corresponding amount, the user of the second electronic device 470 would see that the avatar 417 corresponding to the user of the first electronic device 460 rotates right (e.g., clockwise) according to the movement of the first electronic device 460.

Additionally, in some examples, the position of the viewpoint of the three-dimensional environment 450A/450B and/or the viewpoint of the three-dimensional environment 450A/450B when in a multi-user communication session optionally changes according to movement of the electronic device 450A/450B (e.g., by a user of the electronic device 450A/450B). For example, while in a communication session, if the electronic device 460 is moved closer toward the representation 406' of the table and/or the avatar 415 (e.g., because the user of the electronic device 460 is moving forward in the physical environment surrounding the electronic device 460), the viewpoint (e.g., viewpoint 418A) of the user of the first electronic device 460 will change accordingly such that the representation 406' of the table, the representation 409' of the window, and the avatar 415 appear larger in the field of view of the three-dimensional environment 450A.

In some examples, the avatar 415/417 is a representation (e.g., a whole body depiction) of each of the users of the electronic devices 470/460. In some examples, avatars 415/417 are each representations (e.g., depictions of head, face, head, torso, etc.) of a portion of a user of electronic device 470/460. In some examples, the avatar 415/417 is a user-personalized, user-selected, and/or user-created representation displayed in the three-dimensional environment 450A/450B that represents the user of the electronic device 470/460. It should be appreciated that while the avatars 415/417 shown in FIG. 4C correspond to a simplified representation of the whole body of each of the users of the electronic devices 470/460, respectively, alternative avatars may be provided, such as those described above.

It should be appreciated that in some examples, more than two electronic devices may be communicatively linked in a multi-user communication session. For example, in the case where three electronic devices are communicatively linked in a multi-user communication session, the first electronic device will display two avatars corresponding to users of the other two electronic devices, rather than just one avatar. Accordingly, it should be appreciated that the various processes and exemplary interactions described herein with reference to initiating and facilitating a multi-user communication session between the first electronic device 460 and the second electronic device 470 are optionally applicable to situations in which more than two electronic devices are communicatively linked in the multi-user communication session. For example, where more than two electronic devices (e.g., three, four, five, eight, ten, etc.) are initiating a multi-communication session, the electronic devices optionally scan the environment surrounding each electronic device separately to generate an occupancy map at each electronic device. A spatial arrangement for placing an avatar corresponding to a user of the electronic device may be determined based on an occupancy map generated by the electronic device. In some examples, particularly where more than two electronic devices are joining a multi-communication session, alternative placement arrangements may be utilized when placing avatars corresponding to users of the electronic devices in a three-dimensional environment. For example, to accommodate a greater number of avatars to be displayed in a three-dimensional environment (e.g., due to a greater number of electronic devices in a multi-user communication session), the electronic devices may employ a circular/elliptical (or some other shape) placement arrangement such that avatars corresponding to users of the electronic devices are displayed in a circular/elliptical (or some other shape) in the three-dimensional environment. In some such examples, the electronic device may utilize the generated occupancy map to identify placement locations (e.g., points) along a circle/oval (or other shape) where the avatar is placed.

In some examples, the identified first placement location 425A and/or the identified second placement location 425B may not meet the first set of criteria described above. For example, the identified first placement location 425A and/or the identified second placement location 425B may include one or more objects (e.g., physical and/or virtual objects) without optionally meeting the first set of criteria. Accordingly, as discussed in detail below with reference to fig. 4D-4I, the first electronic device 460 and/or the second electronic device 470 may change the placement locations within the three-dimensional environment and identify one or more candidate placement locations that satisfy the first set of criteria.

As shown in fig. 4D, in some examples, the first electronic device 460 may present a three-dimensional environment 450A that optionally includes a plurality of virtual objects 410, as similarly described above with reference to fig. 4A. In the example of fig. 4D, the three-dimensional environment 450B may include a representation of a captured portion of the physical environment surrounding the first electronic device 460, such as a representation 406 'of a table and a representation 409' of a window. As shown in fig. 4D, a representation 406' of the table is optionally located behind the plurality of virtual objects 410. Similarly, the second electronic device 470 may present a three-dimensional environment 450B, optionally including representations of captured portions of the physical environment surrounding the second electronic device, such as a representation 407 'of a floor light and a representation 408' of a coffee table.

In some examples, first electronic device 460 and second electronic device 470 may initiate initiation of the multi-user communication session in response to receiving input accepting an invitation to join the multi-user communication session with first electronic device 460 (e.g., via a selection input received at accept option 419B). As described above, in some examples, the first electronic device 460 and the second electronic device 470 may each scan the physical environment surrounding the electronic device to identify corresponding locations (e.g., locations that do not include objects) of physical objects and/or open spaces within the field of view of the user of the electronic device. For example, the first electronic device 460 may scan the physical environment surrounding the first electronic device 460 to generate an occupancy map characterizing the environment within the field of view of the user of the first electronic device 460, such as identifying the location of a physical object (e.g., the table 406) within the field of view of the user of the first electronic device 460. Similarly, the second electronic device 470 may scan the physical environment surrounding the second electronic device 470 to generate an occupancy map that characterizes the environment within the field of view of the user of the second electronic device 470, such as identifying the location of physical objects (e.g., the floor lights 407 and/or the coffee table 408) within the field of view of the user of the second electronic device 470.

As shown in fig. 4E, in some examples, the first electronic device 460 may utilize an occupancy map generated at the first electronic device 460 to identify updated placement locations 425C within the three-dimensional environment 450A. As shown, the updated placement location 425C is optionally offset (e.g., angularly offset 3, 5, 10, 15, 20, 25, 30 degrees, etc.) from the viewpoint 418A of the user of the first electronic device 460 to the left of the preferred placement location 425A and/or a predefined distance 426A from the viewpoint 418A. It should be appreciated that although the relative lengths of the arrows representing predefined distance 426A may appear different in fig. 4E, the distance between viewpoint 418A and each of placement locations 425A and 425C is optionally the same.

In some examples, identifying the updated placement location may include changing a distance between the identified placement location (e.g., 425A) and a point of view (e.g., 418A) of a user of the electronic device. In some examples, the relative boundaries determined by the environment surrounding the electronic device may limit the extent to which the distance increases or decreases. For example, as shown in fig. 4E, at the first electronic device 460, the first boundary 421A may impose a minimum distance between the candidate placement location and the viewpoint 418A of the user of the first electronic device 460. In some examples, the minimum distance may be a predetermined value (e.g., 0.5m, 0.8m, 1m, 1.2m, etc.) and may limit how close an avatar corresponding to the user of another electronic device is initially placed in the three-dimensional environment (e.g., from the user's point of view). In addition, for example, at the first electronic device 460, the second boundary 422A may impose a maximum distance between the candidate placement location and the viewpoint 418A of the user of the first electronic device 460. In some examples, the maximum distance may be a predetermined value (e.g., 4m, 4.5m, 5m, 7m, 8m, 10m, etc.), and/or may be determined based on a limit in a physical environment surrounding the electronic device. For example, the second boundary 422A in fig. 4E may be determined based on a far wall in the physical environment in the field of view of the user of the first electronic device 460, and thus may limit how far an avatar corresponding to the user of the second electronic device is initially placed in the three-dimensional environment 450A (e.g., from the viewpoint 418A). Because there is open space to the left of the representation 406' of the table at the predefined distance 426A, the first electronic device 460 optionally foregoes adjusting the distance between the identified placement location 425C and the viewpoint 418A. As shown in fig. 4E, the updated placement location 425C is not occupied by any object, which meets the first criteria at the first electronic device 460, and thus meets the first set of criteria.

As shown in fig. 4E, the second electronic device 470 may also identify a placement location 425B within the three-dimensional environment 450B that is centered within the field of view of the user of the second electronic device 470 and/or a predefined distance 426B from the viewpoint 418B of the user of the second electronic device 470. In some examples, because the identified placement location 425B within the three-dimensional environment 450B is not occupied by any objects, the identified placement location 425B optionally meets the first criteria at the second electronic device 470 and, thus, meets the first set of criteria. In some examples, the predefined distance 426A at the first electronic device 460 may be equal to the predefined distance 426B at the second electronic device 470.

As described above with reference to fig. 4C, avatars 415 and 417 may be displayed in three dimensional environments 450A and 450B in respective orientations corresponding to the orientations of electronic devices 460 and 470 rendering three dimensional environments 450A and 450B. In some examples, when avatars 415 and 417 are initially placed at determined placement locations within three dimensional environments 450A and 450B, the respective orientations of avatars 415 and 417 may indicate the determined off-center (e.g., relative to the center of the field of view of the user of the electronic device) placement locations. For example, as shown in fig. 4F, at the second electronic device 470, since the determined placement location 425B in the three-dimensional environment 450B is at the center of the field of view of the user of the second electronic device 470, and the orientation of the user of the second electronic device 470 optionally corresponds to the center of the field of view of the user of the second electronic device 470, the first electronic device 460 may display the avatar 415 corresponding to the user of the second electronic device 470 in an orientation (e.g., shown by the orientation of the face of the avatar 415) facing the viewpoint (e.g., viewpoint 418A) of the user of the first electronic device 460. At the first electronic device 460, for example, because the determined placement location 425C in the three-dimensional environment 450A is to the left of the center of the field of view of the user of the first electronic device 460, and the orientation of the user of the first electronic device 460 optionally corresponds to the center of the field of view of the user of the first electronic device 460, the second electronic device 470 may display the avatar 417 of the user of the first electronic device 460 in an orientation facing to the left (e.g., shown by the orientation of the face of the avatar 417) relative to the viewpoint (e.g., viewpoint 418B) of the user of the second electronic device 470, as shown.

It should be appreciated that while the orientation of the face of the avatar 415/417 is utilized in FIG. 4F to indicate the respective orientation of the avatar 415/417 (e.g., a change thereof) of the user within the three-dimensional environment 450A/450B, additional or alternative characteristics of the avatar 415/417 may be utilized to convey the change to the respective orientation of the avatar 415/417. For example, for avatars that include a whole body and/or upper body depiction of a user of the electronic device, the torso (e.g., including shoulders, arms, and/or chest) of each of the avatars may indicate the respective orientation of the avatar within the three-dimensional environment 450A/450B. Similarly, for avatars that include a whole-body depiction of a user of the electronic device, the lower body (e.g., including the hips, legs, and/or feet) of each of the avatars may indicate the respective orientation of the avatar within the three-dimensional environment 450A/450B.

As described above, the first electronic device 460 and the second electronic device 470 are communicatively linked in a multi-user communication session as the avatars 415 and 417 are placed within the three dimensional environments 450A and 450B, respectively. Thus, as outlined above, one advantage of the disclosed method for avatar placement in a multi-user communication session is that it facilitates the automatic placement of avatars individually at each electronic device based on the location of objects in the environment surrounding the electronic device, thereby reducing the cognitive burden on the user when initiating and/or operating in the multi-user communication session. Focusing now on an example where the object is located at a preferred placement location at both the first electronic device and the second electronic device, a space tradeoff between the first electronic device and the second electronic device is required, as described below.

As shown in fig. 4G, in some examples, the first electronic device 460 may present a three-dimensional environment 450A that optionally includes a plurality of virtual objects 410, as similarly described above with reference to fig. 4A and 4D. Additionally, as shown, the three-dimensional environment 450A may include respective user interface elements 424 corresponding to respective applications (e.g., application a) running on the first electronic device 460. In the example of fig. 4G, the three-dimensional environment 450A may include a representation of a captured portion of the physical environment surrounding the first electronic device 460, such as a representation 406 'of a table and a representation 409' of a window. As shown in fig. 4G, a representation 406' of the table is optionally located behind the plurality of virtual objects 410. Similarly, the second electronic device 470 may present a three-dimensional environment 450B, optionally including a representation of a captured portion of the physical environment surrounding the second electronic device, such as a representation 407' of a floor light, a representation 408' of a coffee table, and a representation 411' of a sofa.

In some examples, first electronic device 460 and second electronic device 470 may initiate initiation of the multi-user communication session in response to receiving input accepting an invitation to join the multi-user communication session with first electronic device 460 (e.g., via a selection input received at accept option 419C). As described above, in some examples, the first electronic device 460 and the second electronic device 470 may each scan the physical environment surrounding the electronic device to identify corresponding locations (e.g., locations that do not include objects) of physical objects and/or open spaces within the field of view of the user of the electronic device. For example, the first electronic device 460 may scan the physical environment surrounding the first electronic device 460 to generate an occupancy map characterizing the environment within the field of view of the user of the first electronic device 460, such as identifying the location of a physical object (e.g., the table 406) within the field of view of the user of the first electronic device 460. Similarly, the second electronic device 470 may scan the physical environment surrounding the second electronic device 470 to generate an occupancy map that characterizes the environment within the field of view of the user of the second electronic device 470, such as identifying the locations of physical objects (e.g., the floor lights 407, the coffee table 408, and/or the sofa 411) within the field of view of the user of the second electronic device 470.

As shown in fig. 4H, in some examples, the first electronic device 460 may utilize an occupancy map generated at the first electronic device 460 to identify updated placement locations 425C within the three-dimensional environment 450A. As shown, the updated placement location 425C is optionally a location offset (e.g., angularly offset 3, 5, 10, 15, 20, 25, or 30 degrees) from the viewpoint 418A of the user of the first electronic device 460 to the left of the preferred placement location 425A and/or an updated distance 426C from the viewpoint 418A. As described above with reference to fig. 4E, in some examples, identifying the updated placement location may include changing a distance between the identified placement location (e.g., 425A) and a point of view (e.g., 418A) of a user of the electronic device. As shown in fig. 4H, the updated distance 426C is different from the predefined distance 426A at the first electronic device 460 and is selected from a range defined by a first boundary 421A (e.g., a minimum distance) and a second boundary 422A (e.g., a maximum distance determined by physical limitations of a distant wall) at the first electronic device 460. In the example of fig. 4H, since the respective user interface element 424 at least partially occupies a position to the left of the preferred placement location 425A that is a predefined distance 426A, the updated distance 426C between the updated placement location 425C and the viewpoint 418A is less than the predefined distance 426A (e.g., if the predefined distance 426A is 1.5m, the updated distance 426C is 1.4 m) and within the tolerable range defined by the first and second boundaries 421A and 422A. As shown in fig. 4H, the updated placement location 425C is not occupied by any object and, thus, meets a first criterion of a first set of criteria at the first electronic device 460.

As shown in fig. 4H, the second electronic device 470 may also identify a preferred placement location 425B within the three-dimensional environment 450B that is centered within the field of view of the user of the second electronic device 470 and/or a predefined distance 426B from the viewpoint 418B of the user of the second electronic device 470. As shown in fig. 4H, the identified placement location 425B within the three-dimensional environment 450B at least partially intersects a portion of the object within the three-dimensional environment 450B (i.e., a portion of the coffee table 408 in the physical environment surrounding the second electronic device 470). Because a portion of the representation 408' of the coffee table intersects/overlaps with the preferred placement location 425B, the preferred placement location 425B fails to meet a first criterion of the first set of criteria at the second electronic device. Thus, in some examples, the second electronic device 470 may proceed to identify an updated/candidate placement location within the three-dimensional environment 450B that is different from the preferred placement location 425B.

As shown in fig. 4H, in some examples, the second electronic device 470 may identify an updated placement location 425D within the three-dimensional environment 450B using an occupancy map generated at the second electronic device 470. As shown, the updated placement location 425C is optionally a location that is an updated distance 426D after the preferred placement location 425B and/or from the viewpoint 418B of the user of the second electronic device 470. In some examples, the updated distance 426D is different from the predefined distance 426B at the second electronic device 470 and is selected from a range defined by the first boundary 421B (e.g., minimum distance) and the second boundary 422B (e.g., maximum distance determined by physical limitations of a distant wall) at the second electronic device 470. In the example of fig. 4H, because the space between the representation 408' of the coffee table and the representation 411' of the sofa may be insufficient (e.g., below a threshold amount, such as 2, 3, or 5 square meters), the updated placement position 425D is selected such that the updated distance 426D between the updated placement position 425D and the viewpoint 418B is greater than the predefined distance 426B (e.g., if the predefined distance 426B is 1.5m, the updated distance 426D is 2 m) after the representation 408' of the coffee table and is within the tolerable range defined by the first boundary 421B and the second boundary 422B. Thus, as shown in fig. 4H, the updated placement location 425D is not occupied by any object and therefore meets a first criterion of the first set of criteria at the second electronic device 470.

As described above with reference to fig. 4C and 4F, avatars 415 and 417 may be displayed in three dimensional environments 450A and 450B in respective orientations corresponding to the orientations of electronic devices 460 and 470 rendering three dimensional environments 450A and 450B. In some examples, when avatars 415 and 417 are initially placed at determined placement locations within three dimensional environments 450A and 450B, the respective orientations of avatars 415 and 417 may indicate the determined off-center (e.g., relative to the center of the field of view of the user of the electronic device) placement locations. For example, as shown in fig. 4I, at the second electronic device 470, since the determined placement location 425E in the three-dimensional environment 450B is to the left of the center of the field of view of the user of the second electronic device 470, and the orientation of the user of the second electronic device 470 optionally corresponds to the center of the field of view of the user of the second electronic device 470, the first electronic device 460 may display the avatar 415 corresponding to the user of the second electronic device 470 in a leftward orientation (e.g., shown by the orientation of the face of the avatar 415) facing toward the viewpoint (e.g., viewpoint 418A) of the user of the first electronic device 460. At the first electronic device 460, for example, because the determined placement location 425C in the three-dimensional environment 450A is to the left of the center of the field of view of the user of the first electronic device 460, and the orientation of the user of the first electronic device 460 optionally corresponds to the center of the field of view of the user of the first electronic device 460, the second electronic device 470 may display an avatar 417 corresponding to the user of the first electronic device 460 in an orientation (e.g., shown by the orientation of the face of the avatar 417) facing to the right (e.g., a point in the three-dimensional environment 450B behind the viewpoint of the user of the second electronic device 470) relative to the viewpoint of the user of the second electronic device 470, as shown.

As described above, when in a multi-user communication session, content may be shared between the first electronic device and the second electronic device such that the content may be interacted with (e.g., viewed, moved, modified, etc.) by users of the first electronic device and the second electronic device. In some examples, the shared content may be moved within a shared three-dimensional environment presented by the first electronic device and the second electronic device by directly or indirectly interacting with the shared content. However, in some such examples, moving the shared content closer to the viewpoint of one user optionally moves the shared content farther from the viewpoint of another user in the multiple communication session. Accordingly, it may be advantageous to provide a method for spatially refining (e.g., movement and/or repositioning of an avatar and/or shared object) in a shared three-dimensional environment when multiple devices are in a multi-communication session that will allow for moving content at one electronic device but not another electronic device. Attention is now directed to exemplary interactions involving spatial refinement of shared content in a multi-user communication session between a first electronic device and a second electronic device.

In some examples, the three-dimensional environment shared between the first electronic device 560 and the second electronic device 570 may include one or more shared virtual objects. For example, as shown in fig. 5A, the first electronic device 560 and the second electronic device 570 may each display a virtual tray 514 that includes a virtual mug 552, which may be shared between the electronic devices 560/570. As shown, the shared virtual object may be displayed with a crawler or handle affordance 535 that is optionally selectable to initiate movement of the shared virtual object (e.g., virtual tray 514 and virtual mug 552) within the three-dimensional environment 550A/550B. As shown in fig. 5A, in some examples, the shared virtual object may be positioned closer to the viewpoint of one user than another user in the three-dimensional environment 550A/550B (e.g., when the shared virtual object is initially displayed in the three-dimensional environment 550A/550B). For example, in fig. 5A, at the second electronic device 570, the shared virtual objects 514 and 552 are displayed in the three-dimensional environment 550B at a first location that is optionally a first distance (e.g., "near," or within a threshold distance, such as 0.2, 0.4, 0.5m, 0.7m, 1m, 1.2m, etc.) from the viewpoint of the user of the second electronic device 570. Because the objects in the three-dimensional environment 550A/550B maintain spatial realism while the first electronic device 560 and the second electronic device 570 are in the multi-user communication session, the shared virtual objects 514 and 552 are optionally displayed at a second location in the three-dimensional environment 550A different from the first location, at a second distance (e.g., "away," or greater than a threshold distance, such as greater than 1m, 1.4m, 1.5m, 2m, etc.) from the viewpoint of the user of the first electronic device 560, at the first electronic device 560.

Additionally, in some examples, the position of avatars 515 and 517 within three dimensional environment 550A/550B may reflect/indicate the relative distance between shared virtual objects 514 and 552 and the point of view of the user of electronic device 560/570. For example, as shown in fig. 5A, because shared virtual objects 552 and 514 are located a first distance from the viewpoint of the user of the second electronic device 570 in the three-dimensional environment 550B, shared virtual objects 552 and 514 are displayed at the first electronic device 560 at a first distance from avatar 515 corresponding to the user of the second electronic device 570 in the three-dimensional environment 550A. Similarly, as shown, because shared virtual objects 552 and 514 are located a second distance from the viewpoint of the user of first electronic device 560 in three-dimensional environment 550A, shared virtual objects 552 and 514 are displayed at second electronic device 570A second distance from avatar 517 corresponding to the user of first electronic device 560 in three-dimensional environment 550B.

In some examples, because shared virtual objects 514 and 552 are positioned away from the viewpoint of the user of first electronic device 560, the user of first electronic device 560 may wish to move shared virtual objects 514 and 552 closer to the viewpoint of the user of first electronic device 560. As shown in fig. 5B, a user of the first electronic device 560 may provide a selection input 572A directed to a gripper or handle affordance 535. For example, when a user of the first electronic device 560 is looking at the grab or handle affordance 535 or sharing a portion of the virtual object 514/552, the user provides a pinch gesture, a tap or touch gesture, a verbal command, or the like, that points to and/or is directed. As shown, the selection input 572A provided by one or more fingers of the user's hand may be followed by a drag/move input 574A towards the user's point of view of the first electronic device 560. For example, while maintaining the selection input (e.g., the user's hand continues to pinch on the gesture), the user of the first electronic device moves the hand closer to a portion of the user's body (e.g., the chest). In some examples, the selection input 572A may alternatively be provided directly to the virtual tray 514 or virtual mug 552 in the three-dimensional environment 550A.

As shown in fig. 5C, when shared virtual objects 514 and 552 are displayed in three-dimensional environment 550A at a location closer to the viewpoint of the user of first electronic device 560 than the location before the first electronic device 560 received the selection and movement input (e.g., as shown in fig. 5B), shared virtual objects 514 and 552 are now positioned in three-dimensional environment 550B at a location farther from the viewpoint of the user of second electronic device 570 than the location before the first electronic device 560 received the selection and movement input (e.g., after the end of the selection and/or movement input is detected, as the user's hand is released from the pinch gesture). Thus, in some examples, it may be advantageous to allow a user of the first electronic device and/or the second electronic device to spatially refine a virtual object shared between the first electronic device and the second electronic device without moving the virtual object to an undesirable location within the three-dimensional environment (as demonstrated above). The following discussion relates to exemplary interactions of spatial refinements of shared virtual objects 514 and 552 in a multi-user communication session.

In some examples, instead of moving the shared virtual object 514/552 individually, an avatar corresponding to another user may be moved to spatially refine the shared virtual object 514/552 at both electronic devices (which may move the shared virtual object 514/552 closer to the viewpoint of one user but farther from the viewpoint of the other user). For example, as shown in FIG. 5D, instead of providing input directed to the shared virtual object 514/552, the user of the first electronic device 560 may provide a selection input 572B directed to the avatar 515 corresponding to the user of the second electronic device 570. For example, optionally while the user's gaze is directed toward the avatar 515, the user of the first electronic device 560 is providing a pinch gesture with one hand, a two-handed pinch gesture (e.g., a gesture with the index finger and thumb of the hand touching, separating and second touching), a two-handed pinch and hold gesture (e.g., a gesture with the index finger and thumb of the hand touching and holding touching for a threshold amount of time (e.g., 1s, 1.5s, 2s, 2.5s, 3s, 4s, etc.), a selection of a spatially refined affordance (not shown) displayed in a predetermined area of the three-dimensional environment 550A (e.g., at or near the top of the field of view of the three-dimensional environment 550A), or a verbal command, among other possibilities. Subsequently, the user of the first electronic device 560 may provide a drag/move input 574B towards the viewpoint of the user of the first electronic device 560, as shown. In some examples, the avatar 515 may be translated and/or rotated (e.g., about an axis based on the user's point of view) in the three-dimensional environment 550B based on the movement input 574B (e.g., in four degrees of freedom).

In some examples, when the user of the first electronic device 560 is providing the selection input 572B and/or the drag input 574B, the second electronic device 570 may change the display of the avatar 517 corresponding to the user of the first electronic device 560 in the three-dimensional environment 550B. In particular, it may be advantageous to change the appearance of the avatar 517 in the three-dimensional environment 550B to avoid the appearance of physical interactions between the users of the first and second electronic devices, which may be understood as a potentially invasive, socially unacceptable, and/or otherwise offensive gesture performed by the avatar 517 from the perspective of the user of the second electronic device 570 (e.g., a display of a hand such as the avatar 517 within the personal space of the user of the second electronic device 570 and/or in direct contact with the user of the second electronic device). For example, as shown in fig. 5D, when the user of the first electronic device 560 provides a selection input 572B and/or a movement input 574B at the first electronic device 560, the second electronic device 570 optionally changes the appearance of an avatar 517 corresponding to the user of the first electronic device 460, as shown by the dashed outline 576B. In some examples, for an avatar that includes a whole body depiction of the user, changing the appearance of the avatar may include fading, blurring, or stopping the display of a portion of the avatar (e.g., the avatar's hands, arms, and/or torso). Additionally or alternatively, in some examples, changing the appearance of the avatar may include stopping the avatar's animation such that the input provided by the user (e.g., pinch and drag gestures) is also not performed by the avatar corresponding to the user. In some examples, as described in more detail below, changing the appearance of the avatar may include replacing the display of the avatar with a second representation, e.g., an abstract representation.

In some examples, in response to receiving the selection input 572B and/or the movement input 574B, the first electronic device 560 can change the display of the avatar 515 corresponding to the user of the second electronic device 570 in the three-dimensional environment 550A. For example, as shown in fig. 5D, at the first electronic device 560, in response to receiving the selection input 572B and/or the drag input 574B, the first electronic device 560 changes the appearance of the avatar 515 corresponding to the user of the second electronic device 570. For example, the first electronic device 560 may display the avatar 515 with a lighting/highlighting effect 578, as shown in fig. 5D, to provide feedback regarding successful selection of the avatar 515. In some examples, in response to receiving the selection input 572B and/or the movement input 574B, the first electronic device 560 may fade, obscure, and/or cease display of a portion of the avatar 515 (e.g., a portion of the avatar targeted by the user of the first electronic device 560).

Additionally, in some examples, in response to receiving selection input 572B and/or movement input 574B, first electronic device 560 optionally displays a planar element (e.g., a disk or disc-shaped element) 537 (and optionally a representation of other users' private content and/or applications) under the shared object in three-dimensional environment 550A. For example, as shown in fig. 5D, a disc 537 may be displayed under an avatar 515 and shared virtual objects (e.g., virtual tray 514 and virtual mug 552) corresponding to a user of the second electronic device 570. In some examples, the center of the disc 537 may be positioned at the viewpoint of the user of the first electronic device 560, and the edge of the disc 537 extends into the three-dimensional environment 550A to include all objects selected for spatial refinement. Thus, the disc 537 may be used as a reference point for subsequent movements of the objects selected for spatial refinement at the first electronic device 560 (i.e., the avatar 515 and the shared virtual objects 514 and 552). For example, the disc 537 extends within the three-dimensional environment 550A to include (e.g., display underneath) all objects selected for spatial refinement. It should be appreciated that while a disk is shown in fig. 5D and described herein, in some examples, alternative user interface elements, such as a platform or tray of rectangular, square, triangle, octagon, etc., may be displayed under the avatar 515 and shared object in the three-dimensional environment 550A. As discussed in more detail below, the first electronic device may move the disc 537 in the three-dimensional environment 550A as the object selected for refinement moves in the three-dimensional environment 550A.

In some examples, movement input directed to the avatar 515 corresponding to the user of the second electronic device 570 causes the avatar 515 and any shared objects to move in the three-dimensional environment 550A in accordance with the movement input. For example, as shown in FIG. 5E, moving the avatar 515 corresponding to the user of the second electronic device 570 in the direction of the movement input 574B may cause the first electronic device 560 to also move the shared virtual objects 514 and 552 in the three-dimensional environment 550A to move with the avatar 515. As shown, as the user of the first electronic device 560 moves the avatar 515 in the three-dimensional environment 550A, the first electronic device 560 moves the virtual tray 514 and virtual mug 552 along with the avatar 515. In addition, according to the movement of avatar 515, first electronic device 560 moves disk 537 displayed underneath avatar 515 and shared virtual objects 514 and 552. In some examples, the selection input 572B (e.g., pinch gesture) is maintained as the object moves within the three-dimensional environment 550A.

In some examples, movement of avatar 515 corresponding to the user of second electronic device 570 optionally does not cause shared virtual objects 514 and 552 to move in three-dimensional environment 550B displayed at second electronic device 570. For example, unlike the scenario described above with reference to fig. 5C, when the shared virtual objects 514 and 552 move in the three-dimensional environment 550A according to the movement of the avatar 515, the second electronic device 570 discards displaying the movement of the shared virtual objects 514 and 552 displayed in the three-dimensional environment 550B. Instead, in some examples, the second electronic device 570 displays movements of an avatar 517 corresponding to the user of the first electronic device 560 in the three-dimensional environment 550B. For example, when the user of the first electronic device 560 moves the avatar 515 a first amount (e.g., according to the respective magnitude of the movement input 574B) in the three-dimensional environment 550A, the second electronic device 570 displays a second amount of movement of the avatar 517 in the three-dimensional environment 550B in the direction of the movement input 574B, corresponding to the first amount. In some examples, the second amount is optionally equal to the first amount. In some examples, the second amount is optionally proportional to the first amount. As shown in fig. 5E, since the direction of the movement input 574B is toward the viewpoint of the user of the first electronic device 560, the second electronic device 570 optionally displays movement of the avatar 517 corresponding to the user of the first electronic device 560 toward the viewpoint of the user of the second electronic device 570. In some examples, the distance between the user of the first electronic device and the object is the same as the distance between the avatar 517 corresponding to the user of the first electronic device and the object presented to the user of the second electronic device.

As shown in fig. 5E, as the user of the first electronic device 560 continues to provide the selection input 572B and/or the movement input 574B, the appearance of the avatar 517 corresponding to the user of the first electronic device 560 remains altered in the three-dimensional environment 550B, as shown by the dashed outline 576B. Additionally, in some examples, as the user of the first electronic device 560 continues to provide the selection input 572B and/or the movement input 574B (or until a deselect input is received), the appearance of the avatar 515 corresponding to the user of the second electronic device 570 remains changed in the three-dimensional environment 550A, as indicated by the highlight 578.

As shown in fig. 5F, the user of the first electronic device 560 optionally no longer provides a selection input 572B and a movement input 574B (or receives a deselect input) directed to the avatar 515 in the three-dimensional environment 550A. In some examples, in response to detecting the end of the movement input directed to the avatar 515, the first electronic device 560 displays the avatar 515 and the shared virtual objects 552 and 514 at a new location in the three-dimensional environment 550A determined based on the end of the movement input. Similarly, in some examples, the second electronic device 570 displays the avatar 517 at a new position in the three-dimensional environment 550B determined based on the end of the movement input at the first electronic device 560. As shown in fig. 5F, the first electronic device 560 may restore the appearance of the avatar 515 in the three-dimensional environment 550A (e.g., such that the avatar 515 is no longer displayed with the lighting/highlighting effect 578). Similarly, the second electronic device 570 may restore the appearance of the avatar 517 in the three-dimensional environment 550B (e.g., such that the avatar 517 is no longer displayed in a faded or blurred manner, as previously shown by the dashed outline 576B).

As outlined above and shown in fig. 5F, by providing a movement input at the first electronic device 560 that points to the avatar 515, the avatar 515 and the shared virtual objects 514 and 552 may be spatially refined within the three-dimensional environment 550A, which enables the shared virtual objects 514 and 552 to be positioned at advantageous locations within the three-dimensional environments 550A and 550B at both electronic devices 560 and 570. Thus, one advantage of the disclosed method for spatial refinement in a multi-user communication session is that a shared object and an avatar corresponding to a user of an electronic device may be positioned at a favorable location to easily interact with the shared object in the multi-user communication session by the user of the electronic device. An additional advantage of the disclosed method is that the spatial refinement of the shared object and avatar is intuitive from the perspective of the user providing the spatial refinement input, and the resulting spatial refinement is intuitive from the perspective of the other users, because the electronic device displays the movement of the avatar corresponding to the user providing the spatial refinement input, rather than displaying the movement of the shared object, while the shared content remains stationary. Attention is now directed to further exemplary interactions involving spatial refinement in multi-user communication sessions between multiple electronic devices.

In some examples, the avatar 515 corresponding to the user of the second electronic device 570 may alternatively be translated laterally within the three-dimensional environment 550A. Additionally, in some examples, the three-dimensional environment 550A may include one or more virtual objects (e.g., private application windows) that are not shared with the second electronic device 570 in the multi-user communication session. As shown in fig. 5G, the three-dimensional environment 550A can include a respective user interface element 524, which can be a non-shared application window corresponding to a respective application (e.g., application a) running on the first electronic device 560. Because the respective user interface element 524 is not shared, the second electronic device 570 optionally displays a representation 524 of the respective user interface element in the three-dimensional environment 550B. As described above, in some examples, the representation 524 "of the respective user interface element may be a fade, occlusion, color change, and/or semi-transparent representation of the respective user interface element 524 that prevents a user of the second electronic device 570 from viewing the content of the respective user interface element 524.

As shown in fig. 5G, in some examples, the user of the first electronic device 560 may provide a selection input 572C directed to an avatar 515 corresponding to the user of the second electronic device 570, followed by a movement input 574C. For example, while the user's gaze of the first electronic device 560 is directed toward the avatar 515, the user may provide a pinch gesture (e.g., using the user's hand), followed by movement of the user's hand while maintaining the pinch gesture. In some examples, the selection input 572C corresponds to the selection input 572B described above with reference to fig. 5D. As shown in fig. 5G, the movement input 574C optionally corresponds to movement of the avatar 515 to the right in the three-dimensional environment 550A from the perspective of the user of the first electronic device 560.

In some examples, the second electronic device 570 optionally changes the display of an avatar 517 corresponding to the user of the first electronic device 560 in the three-dimensional environment 550B in response to receiving the selection input 572C and/or the movement input 574C at the first electronic device 560. For example, as discussed above with reference to fig. 5D, in response to receiving the selection input 572C and/or the movement input 574C at the first electronic device 560, the second electronic device 570 changes the appearance of the avatar 517, e.g., replacing the display of the avatar 517 with the abstract representation 576C, as shown in fig. 5G. As discussed above, changing the appearance of the avatar 517 in the three-dimensional environment 550B at the second electronic device 570 avoids the situation where input provided by the user of the first electronic device causes the avatar 517 to perform potentially offensive interactions directed to the user of the second electronic device 570.

In some examples, as shown in fig. 5H, in response to receiving a movement input 574C directed to an avatar 515 corresponding to a user of the second electronic device 570, the first electronic device 560 moves the avatar 515 in the three-dimensional environment 550A in accordance with the movement input 574C. Additionally, as shown, as avatar 515 moves according to movement input 574C, shared virtual objects 514 and 552 and disk 537 move in three-dimensional environment 550A. As described above, the respective user interface element 524 is not selected for spatial refinement because the respective user interface element 524 is a non-shared object in the three-dimensional environment 550A. Thus, as the avatar 515 and the shared virtual objects 514 and 552 move in the three-dimensional environment 550A according to the movement input 574C, the first electronic device 560 optionally relinquishes moving the corresponding user interface element 524, as shown in FIG. 5H.

It should be appreciated that while forward and lateral movement of the avatar 515/517 and the shared virtual objects 514 and 552 is shown and described herein, additional or alternative movement may be provided based on movement of the user's hand. For example, the electronic device may move the avatar and the shared virtual object forward and sideways in a three-dimensional environment according to forward and sideways movements of the user's hand. Further, it should be appreciated that in some examples, additional or alternative options for initiating spatial refinement at the electronic device may be provided. For example, a user of the electronic device may select a spatially refined affordance displayed in the three-dimensional environment that allows the user to individually select objects and/or avatars that the user wishes to move in the three-dimensional environment. Additionally, in some examples, the electronic device may display a list of options including options to initiate spatial refinement when an object (e.g., an avatar or shared object) is selected.

In some examples, when performing scene refinement in a multi-user communication session, it may be advantageous to change the orientation of one or more avatars corresponding to a user of the electronic device. For example, when the content shared between the first electronic device and the second electronic device in the multi-communication session is video content, it may be advantageous to provide functionality that enables the first user to redirect an avatar corresponding to the second user to face the shared video content (e.g., to simulate a real viewing experience) rather than to the first user's point of view (e.g., as discussed above). Attention is now directed to exemplary interactions involving spatial refinement of the orientation of an avatar in a multi-user communication session.

It should be appreciated that while the spatial refinement illustrated in fig. 5A-5I includes translational motion, in some examples, the spatial refinement may include translation, rotation, and/or both translation and rotation. In some such examples, the rotational movement may be performed relative to any desired reference point, such as a reference point at a point of view of the user, a reference point at a position of a shared object in the three-dimensional environment, a reference point at a position of an avatar of the user in the three-dimensional environment, and/or a reference point at a position selected by the user (e.g., based on a gaze of the user and/or in response to receiving a two-hand pinch and rotate gesture).

As shown in fig. 6A, a user of the second electronic device 670 is optionally providing a selection input 672A directed to a play option 627 in an application window 632 in the three-dimensional environment 650B. For example, optionally, when the user of the second electronic device 670 is looking at the play option 627, the user may provide a pinch gesture, a tap or touch gesture, a verbal command, or the like. In some examples, in response to accepting selection input 672A, second electronic device 670 may display video content 625 within three-dimensional environment 650B, as shown in fig. 6B.

In a multi-user communication session, a user of the second electronic device 670 may share video content 625 with a user of the first electronic device 660 such that the video content 625 may be displayed within the application window 632 in the three-dimensional environment 650A. However, prior to sharing video content 625, it may be advantageous to allow users of electronic devices 670/660 to change their orientation within three-dimensional environment 650A/650B in a multi-user communication session to target particular objects or content items displayed in three-dimensional environment 650A/650B. For example, as shown in FIG. 6A, the orientation of avatar 615 in three-dimensional environment 650A corresponding to the user of second electronic device 670 is optionally forward facing with respect to the viewpoint of the user of first electronic device 660, and the orientation of avatar 617 in three-dimensional environment 650B corresponding to the user of first electronic device 660 is optionally forward facing with respect to the viewpoint of the user of second electronic device 670. It may be desirable for the user of the first electronic device 660 to redirect himself in the three-dimensional environment 650A to instead face the front side of the representation 632 "of the front side of the application window and to be positioned beside the avatar 615 to simulate a more realistic shared viewing experience between the user of the first electronic device 660 and the user of the second electronic device 670 when sharing the video content 625.

Accordingly, the user of the first electronic device 660 may provide a selection input 672B directed to the avatar 615 in the three-dimensional environment 650A, followed by a move/drag input 674A, as shown in fig. 6B. For example, while the user of the first electronic device 660 is looking at the avatar 615, the user may provide a pinch gesture (e.g., using the user's hand) followed by movement of the user's hand while maintaining the pinch gesture (or other selection input). In some examples, selection input 672B corresponds to selection input 572B and/or selection input 572C described above with reference to fig. 5D and 5G. As shown in fig. 6B, movement input 674A optionally corresponds to movement of avatar 615 to the right in three-dimensional environment 650A and closer to the viewpoint of the user of first electronic device 660.

In some examples, in response to receiving selection input 672B and/or movement input 674A at first electronic device 660, second electronic device 670 optionally changes the display of avatar 617 corresponding to the user of first electronic device 660 in three-dimensional environment 650B. For example, as discussed above with reference to fig. 5D and 5G, in response to receiving selection input 672B and/or movement input 674A at first electronic device 660, second electronic device 670 changes the appearance of avatar 617, e.g., fades, obscures, changes color, or stops the display of a portion of avatar 617 (e.g., hand, arm, shoulder, and/or chest) as shown by dashed outline 676A in fig. 6B. As discussed above, changing the appearance of the avatar 617 in the three-dimensional environment 650B at the second electronic device 670 avoids the situation where input provided by the user of the first electronic device causes the avatar 617 to perform potentially offensive interactions directed to the user of the second electronic device 670.

In some examples, the electronic device 660/670 may implement attractive field performance (e.g., similar in function to gravity or a magnetic field) for movement of an avatar falling within a threshold distance from a user viewpoint and/or a predefined location within a three-dimensional environment in a multi-user communication session. In some examples, the attractive field performance optionally causes the orientation of the avatar to change such that the orientation may point to a particular object or direction in the three-dimensional environment. For example, the movement input 674A optionally corresponds to movement of the avatar 615 within a threshold distance (e.g., 0.2m, 0.4m, 0.5m, 0.8m, 1m, or 1.2 m) of the viewpoint of the user of the first electronic device 660. Additionally or alternatively, in some examples, the movement input 674A optionally corresponds to movement of the avatar 615 in the three-dimensional environment 650A to a predefined location near/beside the viewpoint of the user of the first electronic device 660. Thus, as shown in FIG. 6C, in response to detecting movement of the avatar 615 within a threshold distance of the viewpoint of the user of the first electronic device 660 and/or to a predefined location in the three-dimensional environment 650A, the first electronic device 660 achieves attractive field performance and changes the orientation of the avatar 615 and the representation 632 "of the application window (e.g., rotates them 180 degrees relative to the viewpoint of the user of the second electronic device 670, as indicated by arrow 675).

In some examples, as the avatar 615 is moved closer to a predefined position and/or across a threshold distance, the orientation of the avatar may gradually change according to the greater magnitude of movement until the avatar is redirected to the user-specified orientation. For example, once the avatar crosses the threshold distance, the orientation of the avatar may gradually change such that the avatar's face rotates counter-clockwise by a corresponding amount (e.g., 5 degrees or 10 degrees) as it is moved closer to the user's point of view and/or predefined location. Thus, as described above, by providing attractive field-like capabilities for movement of an avatar in a multi-communication session, the avatar may be redirected to face a particular object, rather than to the point of view of a user in the multi-communication session.

It should be understood that the examples shown and described herein are merely exemplary, and that additional and/or alternative elements for interacting with content and/or avatars may be provided within a three-dimensional environment. It should be understood that the appearance, shape, form, and size of each of the various user interface elements and objects shown and described herein are exemplary and that alternative appearances, shapes, forms, and/or sizes may be provided. For example, virtual objects (e.g., 524 and 632) representing application windows may be provided in alternative shapes other than rectangular shapes (such as circular shapes, triangular shapes, etc.). In some examples, various selectable options described herein (e.g., option 419A/419B/419C, option 627, or option 623) may be verbally selected via a user verbal command (e.g., a "select option" verbal command). Additionally or alternatively, in some examples, various options, user interface elements, control elements, etc. described herein may be selected and/or manipulated via user input received through one or more separate input devices in communication with an electronic device. For example, the selection input may be received via a physical input device (such as a mouse, touch pad, keyboard, etc.) in communication with the electronic device.

Additionally, it should be appreciated that although the above-described method is described with reference to two electronic devices, the above-described method is optionally applicable to two or more electronic devices communicatively linked in a communication session. For example, when a multi-user communication session is initiated between three, four, five, or more electronic devices, each electronic device may individually scan the environment surrounding the electronic device to generate an occupancy map, identify preferred placement locations within the three-dimensional environment presented at the electronic device based on the occupancy map, identify one or more updated placement locations if needed (e.g., if the preferred placement locations include objects), and/or display avatars corresponding to users of other electronic devices at the placement locations determined in the three-dimensional environment (e.g., as described with reference to fig. 4A-4I). In some examples, when three, four, five, or more electronic devices are communicatively linked in a multi-user communication session, when a user of one electronic device provides a movement input at the electronic device, if the movement input is directed to a shared object in the multi-user communication session, the movement input moves the shared object at the electronic device, and if the movement input is directed to an avatar in the multi-user communication session, the movement input moves avatars and shared objects at the electronic device that correspond to users of other electronic devices (e.g., as described with reference to fig. 5A-5I). In some examples, when three, four, five, or more electronic devices are communicatively linked in a multi-user communication session, when a user of one electronic device provides a movement input directed to an avatar in the multi-communication session, an orientation of the avatar changes in the three-dimensional environment if the movement input moves the avatar within a threshold distance of a viewpoint of the user and/or to a predefined location in the three-dimensional environment presented at the electronic device (e.g., as described with reference to fig. 6A-6D).

Fig. 7A-7B illustrate a flowchart illustrating an exemplary process of spatial placement of an avatar in a multi-user communication session at an electronic device, according to some examples of the present disclosure. In some examples, process 700 begins at a first electronic device in communication with a display, one or more input devices, and a second electronic device. In some examples, the first electronic device and the second electronic device are optionally head mounted displays similar to or corresponding to devices 260/270 of fig. 2, respectively. As shown in fig. 7A, in some examples, at 702, a first electronic device may present, via the display, a computer-generated environment including a captured portion of a physical environment surrounding the first electronic device. For example, as discussed above with reference to fig. 4A, the first electronic device 460 may display a three-dimensional environment 450A that includes a representation 406 'of a table and a representation 409' of a window. In some examples, at 704, when displaying the computer-generated environment, the first electronic device receives, via the one or more input devices, a first input corresponding to a request to enter a communication session with the second electronic device. For example, as discussed above with reference to fig. 4A, the first electronic device 460 may receive a selection input 472A corresponding to a request to enter a multi-user communication session with the second electronic device 470.

As shown in fig. 7A, in some examples, at 706, in response to receiving the first input, the first electronic device scans a physical environment surrounding the first electronic device at 708. For example, as described above with reference to fig. 4B, the first electronic device 460 may scan the physical environment in the field of view of the user of the first electronic device 460 to generate an occupancy map that identifies the locations of objects and/or open space in the three-dimensional environment 450A presented at the first electronic device 460. In some examples, at 710, a first electronic device determines a first location in a computer-generated environment. For example, as shown in fig. 4B, the first electronic device 460 may identify (e.g., using an occupancy map) a placement location 425A that is located at a center of a field of view of a user of the first electronic device 460 and/or a predefined distance 426A from a viewpoint 418A of the user.

As shown in fig. 7B, in some examples, a first electronic device may enter into a communication session with a second electronic device at 712. In some examples, at 714, the first electronic device may display, via the display, a virtual object representing a user of the second electronic device at a first location in the computer-generated environment in accordance with a determination that the first set of criteria is satisfied. For example, as shown in fig. 4C, because the identified placement location 425A does not include any objects, the first set of criteria is met and the first electronic device 460 displays an avatar 415 corresponding to the user of the second electronic device 470 at the determined placement location 425A in the three-dimensional environment 450A. In some examples, at 716, the first electronic device displays a virtual object representing a user of the second electronic device at a second location different from the first location in the computer-generated environment based on the determination that the first set of criteria is not met. For example, as shown in fig. 4E-4F, because the identified placement location 425A includes an object (e.g., a portion of the representation 406' of a table), the first set of criteria is not met and the first electronic device 460 displays an avatar 415 corresponding to the user of the second electronic device 470 at the updated placement location 425C in the three-dimensional environment 450A.

It should be appreciated that process 700 is an example and that more, fewer, or different operations may be performed in the same or different order. In addition, the operations in process 700 described above are optionally implemented by running one or more functional modules in an information processing device, such as a general purpose processor (e.g., as described with respect to fig. 2) or a dedicated chip, and/or by other components of fig. 2.

Fig. 8 illustrates a flow chart showing an exemplary process of spatial refinement in a multi-user communication session at an electronic device, according to some examples of the disclosure. In some examples, process 800 begins at a first electronic device in communication with a display, one or more input devices, and a second electronic device. In some examples, the first electronic device and the second electronic device are optionally head mounted displays similar to or corresponding to devices 260/270 of fig. 2, respectively. As shown in fig. 8, in some examples, a first electronic device presents, via a display, a computer-generated environment including an avatar and a first shared object corresponding to a user of a second electronic device while in a communication session with the second electronic device, 802. For example, as shown in fig. 5A, the first electronic device 560 may present a three-dimensional environment 550A that includes an avatar 515 corresponding to the user of the second electronic device 570 and the shared virtual tray 514. In some examples, at 804, the first electronic device receives a first input via one or more input devices while displaying a computer-generated environment that includes an avatar and a first shared object corresponding to a user of the second electronic device. For example, as shown in fig. 5B, the first electronic device 560 receives a selection input 572A, followed by a movement input 574A.

In some examples, at 810, the first electronic device moves the first shared object according to the first input without moving the avatar in accordance with a determination that the first input corresponds to a request to move the first shared object in the computer-generated environment. For example, as shown in fig. 5B, in response to receiving the selection input 572A and the movement input 574A directed to the shared virtual tray 514, the first electronic device 560 moves the shared virtual tray 514 in the three-dimensional environment 550A in accordance with the movement input 574A without moving the avatar 515, as shown in fig. 5C. In some examples, the first shared object is moved in the computer-generated environment without moving representations of other shared objects, avatars, and other users' private applications.

It should be appreciated that process 800 is an example and that more, fewer, or different operations may be performed in the same or different order. In addition, the operations in process 800 described above are optionally implemented by running one or more functional modules in an information processing device, such as a general purpose processor (e.g., as described with respect to FIG. 2) or a dedicated chip, and/or by other components of FIG. 2.

Fig. 9 illustrates a flow chart showing an exemplary process of spatial refinement in multi-user communications at an electronic device, according to some examples of the present disclosure. In some examples, process 900 begins at a first electronic device in communication with a display, one or more input devices, and a second electronic device. In some examples, the first electronic device and the second electronic device are optionally head mounted displays similar to or corresponding to devices 260/270 of fig. 2, respectively. As shown in fig. 9, in some examples, at 902, a first electronic device presents, via a display, a computer-generated environment including an avatar and a first shared object corresponding to a user of a second electronic device while in a communication session with the second electronic device. For example, as shown in fig. 5A, the electronic device 570 may present a three-dimensional environment 550B that includes an avatar 517 corresponding to a user of the electronic device 560 and the shared virtual tray 514.

In some examples, at 904, the first electronic device detects a first indication from the second electronic device via one or more input devices while displaying a computer-generated environment that includes an avatar and a first shared object corresponding to a user of the second electronic device. For example, the first electronic device receives an indication from the second electronic device that the second electronic device has received user input (e.g., selection input 572A and/or movement input 574A in fig. 5B). In some examples, in response to detecting the first indication, at 906, the first electronic device moves the avatar in the computer-generated environment according to the first movement input without moving the first shared object according to the first movement input in accordance with a determination that the avatar corresponding to the user of the second electronic device corresponds to movement of the first movement input received at the second electronic device according to the first indication at 908. For example, as shown in fig. 5D, in response to electronic device 560 receiving selection input 572B and/or movement input 574B directed to avatar 515 corresponding to the user of electronic device 570, electronic device 570 moves avatar 517 corresponding to the user of electronic device 560 in three-dimensional environment 550B based on movement input 574B without moving shared virtual tray 514, as shown in fig. 5E.

As shown in fig. 9, in some examples, at 910, in accordance with a determination that the first indication corresponds to movement of the first shared object in accordance with a second movement input received at the second electronic device, the first electronic device moves the first shared object in accordance with the second movement input in the computer-generated environment without moving the avatar. For example, as shown in FIG. 5B, in response to the electronic device 560 detecting the selection input 572A and/or the movement input 574A directed to the shared virtual tray 514, the electronic device 570 moves the shared virtual tray 514 in the three-dimensional environment 550B based on the movement input 574A without moving the avatar 517, as shown in FIG. 5C.

It should be appreciated that process 900 is an example and that more, fewer, or different operations may be performed in the same or different order. In addition, the operations in the above-described process 900 are optionally implemented by running one or more functional modules in an information processing device, such as a general purpose processor (e.g., as described with respect to fig. 2) or a dedicated chip, and/or by other components of fig. 2.

Thus, in accordance with the foregoing, some examples of the disclosure relate to a method comprising: at a first electronic device in communication with a display, one or more input devices, and a second electronic device: presenting, via a display, a computer-generated environment comprising a portion of a physical environment surrounding a first electronic device; when presenting a computer-generated environment, receiving, via one or more input devices, a first input corresponding to a request to enter a communication session with a second electronic device; and responsive to receiving the first input, scanning at least a portion of a physical environment surrounding the first electronic device, determining a first location in the computer-generated environment, and entering a communication session with the second electronic device, comprising: in accordance with a determination that the first set of criteria is met based on the scan of at least a portion of the physical environment, displaying, via the display, a virtual object representing a user of the second electronic device at a first location in the computer-generated environment, and in accordance with a determination that the first set of criteria is not met based on the scan of at least a portion of the physical environment, displaying, at a second location in the computer-generated environment different from the first location, a virtual object representing a user of the second electronic device.

Additionally or alternatively, in some examples, displaying the virtual object representing the user of the second electronic device includes displaying an avatar corresponding to the user of the second electronic device.

Additionally or alternatively, in some examples, the first electronic device and the second electronic device each include a head mounted display.

Additionally or alternatively, in some examples, determining the first location in the computer-generated environment includes determining a location toward a center of a field of view of a user of the first electronic device.

Additionally or alternatively, in some examples, the first set of criteria includes criteria that are met when the first location does not include an object, the second location does not include an object, and the second location is within a field of view of the first electronic device.

Additionally or alternatively, in some examples, determining the first location in the computer-generated environment includes determining a location a predetermined distance from a point of view of the first electronic device.

Additionally or alternatively, in some examples, the first set of criteria includes criteria that are met when the first location does not include an object, the second location does not include an object, and the second location is a respective distance from a viewpoint of the first electronic device that is different than the predetermined distance.

Additionally or alternatively, in some examples, scanning at least a portion of the physical environment surrounding the first electronic device includes identifying one or more physical objects in a field of view of the first electronic device, and the first set of criteria includes criteria that are met when the first location does not include a physical object.

Additionally or alternatively, in some examples, the method further comprises: in response to receiving the first input, one or more virtual objects in the computer-generated environment are identified. Additionally or alternatively, in some examples, the first set of criteria includes criteria that are met when the first location does not include a virtual object.

Additionally or alternatively, in some examples, the method further comprises: in response to receiving the first input, a request to enter a communication session with the first electronic device is transmitted to the second electronic device. Additionally or alternatively, in some examples, the second electronic device scans at least a portion of a physical environment surrounding the second electronic device, and the second electronic device determines a third location in the computer-generated environment.

Additionally or alternatively, in some examples, the virtual object representing the user of the first electronic device is displayed at a third location in the computer-generated environment at the second electronic device in accordance with a determination that the first set of criteria is met, and the virtual object representing the user of the first electronic device is displayed at a fourth location in the computer-generated environment at the second electronic device different from the third location in accordance with a determination that the first set of criteria is not met.

Additionally or alternatively, in some examples, the first location is a predefined distance from the third location in the computer-generated environment based on a determination that the virtual object representing the user of the second electronic device is displayed at the first location and the virtual object representing the user of the first electronic device is displayed at the third location, and the second location is a predefined distance from the fourth location in the computer-generated environment based on a determination that the virtual object representing the user of the second electronic device is displayed at the second location and the virtual object representing the user of the first electronic device is displayed at the fourth location.

Additionally or alternatively, in some examples, the first set of criteria includes a first criterion that is met when the first location does not include an object and a second criterion that is met when the third location does not include an object. Additionally or alternatively, in some examples, the method further comprises: in response to receiving the first input, prior to entering a communication session with the second electronic device: transmitting a first indication to the second electronic device that the first criterion is met because the first location does not include the object, wherein the first location is a predefined distance from a point of view of the first electronic device; and in response to receiving, from the second electronic device via the one or more input devices, a second indication that the second criterion is not met due to the third location including the object in the computer-generated environment at the second electronic device, transmitting, to the second electronic device, a third indication that the first criterion is met due to the second location not including the object, wherein the second location is a first distance from the point of view of the first electronic device that is different than the predefined distance.

Additionally or alternatively, in some examples, the method further comprises: in response to receiving, via the one or more input devices, a fourth indication from the second electronic device that the second criterion is met because the fourth location does not include the object, entering a communication session with the second electronic device, wherein the fourth location is a first distance from a point of view of the second electronic device, comprising: a virtual object representing a user of the second electronic device is displayed in the computer-generated environment at a second location a first distance from a point of view of the first electronic device. Additionally or alternatively, in some examples, a virtual object representing a user of the first electronic device is displayed at a fourth location a first distance from a point of view of the second electronic device in a computer-generated environment at the second electronic device.

Additionally or alternatively, in some examples, the computer-generated environment at the second electronic device includes a first object, and entering the communication session with the second electronic device further includes displaying a virtual object representing a user of the second electronic device with the representation of the first object.

Some examples of the present disclosure relate to a method comprising: at a first electronic device in communication with a display, one or more input devices, and a second electronic device: while in a communication session with the second electronic device, presenting, via the display, a computer-generated environment including an avatar and a first shared object corresponding to a user of the second electronic device; receiving a first input via one or more input devices while displaying a computer-generated environment that includes an avatar and a first shared object corresponding to a user of a second electronic device; and in response to receiving the first input, moving the avatar and the first shared object in accordance with the first input in accordance with a determination that the first input corresponds to a request to move an avatar corresponding to a user of the second electronic device in the computer-generated environment, and moving the first shared object without moving the avatar in accordance with the first input in accordance with a determination that the first input corresponds to a request to move the first shared object in the computer-generated environment.

Additionally or alternatively, in some examples, the computer-generated environment further includes a second shared object.

Additionally or alternatively, in some examples, in response to receiving the first input, in accordance with a determination that the first input corresponds to a request to move the second shared object in the computer-generated environment, the second shared object is moved in accordance with the first input without moving the avatar and the first shared object.

Additionally or alternatively, in some examples, the method further comprises: in response to receiving the first input, moving the avatar, the first shared object, and the second shared object in accordance with the first input in accordance with a determination that the first input corresponds to a request to move an avatar corresponding to a user of the second electronic device in the computer-generated environment.

Additionally or alternatively, in some examples, the first input includes a pinch gesture provided by a hand of a user of the first electronic device, and movement of the hand of the user while the pinch gesture is held by the hand.

Additionally or alternatively, in some examples, the first input includes a first pinch gesture provided at least partially simultaneously by a first hand of a user of the first electronic device and a second pinch gesture provided by a second hand, and movement of the first hand or the second hand of the user while maintaining the pinch gesture with the first hand or the second hand.

Additionally or alternatively, in some examples, the method further comprises: responsive to receiving the first input, determining from the first input that corresponds to a request to move an avatar corresponding to a user of the second electronic device in the computer-generated environment: displaying, via a display, a planar element in a computer-generated environment under the avatar and the first shared object; and moving the planar element with the avatar and the first shared object according to the first input.

Additionally or alternatively, in some examples, the method further comprises: responsive to receiving the first input, the planar element is abandoned from being displayed in the computer-generated environment under the avatar and the first shared object in accordance with a determination that the first input corresponds to a request to move the second shared object in the computer-generated environment.

Additionally or alternatively, in some examples, the method further comprises: upon detecting an end of the first input, receiving, via the one or more input devices, a second input corresponding to a request to move an avatar corresponding to the user of the second electronic device to a respective location in the computer-generated environment; and in response to receiving the second input, restricting movement of the avatar and the first shared object to respective positions in the computer-generated environment in accordance with a determination that the respective positions include the object, and moving the avatar and the first shared object to respective positions in the computer-generated environment in accordance with a determination that the respective positions do not include the object.

Additionally or alternatively, in some examples, an avatar corresponding to a user of the second electronic device has a first orientation relative to the first shared object in the computer-generated environment. Additionally or alternatively, in some examples, the method further comprises: in response to receiving the first input, a display of the avatar is maintained in a first orientation when the avatar and the first shared object are moved in accordance with a determination that the first input corresponds to a request to move an avatar corresponding to a user of the second electronic device in the computer-generated environment.

Additionally or alternatively, in some examples, the computer-generated environment further includes a first non-shared object prior to receiving the first input.

Additionally or alternatively, in some examples, the method further comprises: in response to receiving the first input, moving the avatar and the first shared object without moving the first non-shared object in accordance with the first input in accordance with a determination that the first input corresponds to a request to move an avatar corresponding to a user of the second electronic device in the computer-generated environment.

Additionally or alternatively, in some examples, the first shared object includes video content configured to be played in a computer-generated environment.

Additionally or alternatively, in some examples, an avatar corresponding to a user of the second electronic device has a first orientation in the computer-generated environment. Additionally or alternatively, in some examples, the method further comprises: upon detecting an end of the first input, receiving, via the one or more input devices, a second input corresponding to a request to move an avatar corresponding to the user of the second electronic device to a respective location in the computer-generated environment; and in response to receiving the second input, moving an avatar corresponding to the user of the second electronic device to a respective location in the computer-generated environment in accordance with a determination that the respective location is within a threshold distance and within a threshold angle from the viewpoint of the user of the first electronic device, and displaying the avatar corresponding to the user of the second electronic device in a second orientation that is different from the first orientation and that faces the shared first object.

Some examples of the present disclosure relate to a method comprising: at a first electronic device in communication with a display, one or more input devices, and a second electronic device: while in a communication session with the second electronic device, presenting, via the display, a computer-generated environment including an avatar and a first shared object corresponding to a user of the second electronic device; receiving, via one or more input devices, a first indication from a second electronic device while displaying a computer-generated environment that includes an avatar and a first shared object corresponding to a user of the second electronic device; and in response to receiving the first indication, moving the avatar in the computer-generated environment without moving the first shared object in accordance with the first movement input in accordance with a determination that the avatar corresponding to the user of the second electronic device corresponds to movement of the first movement input received at the second electronic device in accordance with the first movement input, and moving the first shared object in the computer-generated environment without moving the avatar in accordance with the second movement input in accordance with a determination that the first shared object corresponds to movement of the first shared object in accordance with the second movement input received at the second electronic device.

Additionally or alternatively, in some examples, prior to receiving the first indication, the computer-generated environment includes a representation of the first non-shared object.

Additionally or alternatively, in some examples, the method further comprises: in response to receiving the first indication, moving the avatar and the representation of the first non-shared object in the computer-generated environment without moving the first shared object in accordance with the first movement input in accordance with a determination that the avatar corresponding to the user of the second electronic device corresponds to movement of the first movement input received at the second electronic device in accordance with the first indication.

Some examples of the present disclosure relate to an electronic device, comprising: one or more processors; a memory; and one or more programs stored in the memory and configured to be executed by the one or more processors, the one or more programs comprising instructions for performing any of the methods described above.

Some examples of the present disclosure relate to a non-transitory computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by one or more processors of an electronic device, cause the electronic device to perform any of the above-described methods of the claims.

Some examples of the present disclosure relate to an electronic device, comprising: one or more processors; a memory; and means for performing any of the above methods.

Some examples of the present disclosure relate to an information processing apparatus for use in an electronic device, the information processing apparatus including means for performing any one of the methods described above.

The foregoing description, for purposes of explanation, has been described with reference to specific embodiments. However, the illustrative discussions above are not intended to be exhaustive or to limit the invention to the precise forms disclosed. Many modifications and variations are possible in light of the above teaching. The examples were chosen and described in order to best explain the principles of the invention and its practical application, to thereby enable others skilled in the art to best utilize the invention and various described examples with various modifications as are suited to the particular use contemplated.

Claims

1. A method, comprising:

at a first electronic device in communication with a display, one or more input devices, and a second electronic device:

presenting, via the display, a computer-generated environment, the computer-generated environment comprising a portion of a physical environment surrounding the first electronic device;

Receiving, via the one or more input devices, a first input corresponding to a request to enter a communication session with the second electronic device while the computer-generated environment is presented; and

in response to receiving the first input:

scanning at least a portion of the physical environment surrounding the first electronic device;

determining a first location in the computer-generated environment; and

entering the communication session with the second electronic device, comprising:

in accordance with a determination that a first set of criteria is met based on the scanning of at least a portion of the physical environment, displaying, via the display, a virtual object representing a user of the second electronic device at the first location in the computer-generated environment; and

in accordance with a determination that the first set of criteria is not met based on the scanning of at least a portion of the physical environment, the virtual object representing the user of the second electronic device is displayed at a second location different from the first location in the computer-generated environment.

2. The method of claim 1, wherein determining the first location in the computer-generated environment comprises determining a location a predetermined distance from a viewpoint of the first electronic device, and wherein the first set of criteria comprises criteria that are met when the first location does not include an object.

3. The method according to claim 1, wherein:

scanning at least a portion of the physical environment surrounding the first electronic device includes identifying one or more physical objects in a field of view of the first electronic device; and is also provided with

The first set of criteria includes criteria that are met when the first location does not include a physical object.

4. The method of claim 1, further comprising:

in response to receiving the first input:

identifying one or more virtual objects in the computer-generated environment;

wherein the first set of criteria includes criteria that are met when the first location does not include a virtual object.

5. The method of claim 1, further comprising:

in response to receiving the first input:

transmitting a request to the second electronic device to enter a communication session with the first electronic device, wherein:

the second electronic device scans for at least a portion of a physical environment surrounding the second electronic device;

the second electronic device determining a third location in the computer-generated environment;

displaying a virtual object representing a user of the first electronic device at the third location in the computer-generated environment at the second electronic device in accordance with the determination that the first set of criteria is satisfied; and is also provided with

In accordance with the determination that the first set of criteria is not met, displaying the virtual object representing the user of the first electronic device at a fourth location different from the third location in the computer-generated environment at the second electronic device.

6. The method according to claim 5, wherein:

in accordance with a determination that the virtual object representing the user of the second electronic device is displayed at the first location and the virtual object representing the user of the first electronic device is displayed at the third location, the first location is a predefined distance from the third location in the computer-generated environment; and is also provided with

In accordance with a determination that the virtual object representing the user of the second electronic device is displayed at the second location and the virtual object representing the user of the first electronic device is displayed at the fourth location, the second location is the predefined distance from the fourth location in the computer-generated environment.

7. The method of claim 6, wherein the first set of criteria includes a first criterion that is met when the first location does not include an object and a second criterion that is met when the third location does not include an object, the method further comprising:

In response to receiving the first input:

prior to entering the communication session with the second electronic device:

transmitting a first indication to the second electronic device that the first criterion is met because the first location does not include an object, wherein the first location is a predefined distance from a point of view of the first electronic device; and

in response to receiving, via the one or more input devices, a second indication from the second electronic device that the second criterion was not met due to the third location including an object in the computer-generated environment at the second electronic device, transmitting, to the second electronic device, a third indication that the first criterion was met due to the second location not including an object, wherein the second location is a first distance from the point of view of the first electronic device that is different than the predefined distance.

8. The method of claim 7, further comprising:

in response to receiving, via the one or more input devices, a fourth indication from the second electronic device that the second criterion is met because the fourth location does not include an object, wherein the fourth location is the first distance from the point of view of the second electronic device:

displaying the virtual object representing the user of the second electronic device at the second location in the computer-generated environment at the first distance from the point of view of the first electronic device;

wherein the virtual object representing the user of the first electronic device is displayed in the computer-generated environment at the second electronic device at the fourth location that is the first distance from the point of view of the second electronic device.

9. A first electronic device, comprising:

one or more processors;

a memory; and

one or more programs stored in the memory and configured to be executed by the one or more processors, the one or more programs comprising instructions for performing a method comprising:

presenting, via a display, a computer-generated environment, the computer-generated environment comprising a portion of a physical environment surrounding the first electronic device;

when the computer-generated environment is presented, receiving, via one or more input devices, a first input corresponding to a request to enter a communication session with a second electronic device; and

In response to receiving the first input:

determining a first location in the computer-generated environment; and

10. The first electronic device of claim 9, wherein determining the first location in the computer-generated environment comprises determining a location a predetermined distance from a point of view of the first electronic device, and wherein the first set of criteria comprises criteria that are met when the first location does not include an object.

11. The first electronic device of claim 9, wherein:

12. The first electronic device of claim 9, wherein the method further comprises:

in response to receiving the first input:

identifying one or more virtual objects in the computer-generated environment;

13. The first electronic device of claim 9, wherein the method further comprises:

in response to receiving the first input:

14. The first electronic device of claim 13, wherein:

15. The first electronic device of claim 14, wherein the first set of criteria includes a first criterion that is met when the first location does not include an object and a second criterion that is met when the third location does not include an object, the method further comprising:

In response to receiving the first input:

prior to entering the communication session with the second electronic device:

16. The first electronic device of claim 15, wherein the method further comprises:

17. A non-transitory computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by one or more processors of a first electronic device, cause the first electronic device to perform a method comprising:

In response to receiving the first input:

determining a first location in the computer-generated environment; and

18. The non-transitory computer-readable storage medium of claim 17, wherein determining the first location in the computer-generated environment comprises determining a location a predetermined distance from a viewpoint of the first electronic device, and wherein the first set of criteria comprises criteria that are met when the first location does not include an object.

19. The non-transitory computer-readable storage medium of claim 17, wherein:

20. The non-transitory computer-readable storage medium of claim 17, wherein the method further comprises:

in response to receiving the first input:

identifying one or more virtual objects in the computer-generated environment;