WO2021194487A1


Info

Publication number: WO2021194487A1
Application number: PCT/US2020/024713
Authority: WO
Inventors: Sarthak GHOSH; Sunil Bharitkar; Rafael Antonio Ballagas
Original assignee: Hewlett Packard Development Co LP
Current assignee: Hewlett Packard Development Co LP
Priority date: 2020-03-25
Filing date: 2020-03-25
Publication date: 2021-09-30
Anticipated expiration: 2022-09-25

Abstract

An example headset device includes a frame to fit a head of a wearer and a sensor at the frame. The sensor is to capture data indicative of a curvature of the head. The headset device further includes a speaker at the frame. The speaker is to output sound to an ear of the wearer as modified by a head-related transfer function determined from the data captured by the sensor.

Description

BACKGROUND

[0001] Spatial audio is often used to provide an immersive experience.

Spatial sound may create a sense of presence, for example, by enhancing the believability of a virtual world. In virtual reality, spatial audio can give users the ability to visually locate a virtual object by hearing a sound emanating from the object. This capability can be important in gaming, simulation, training, or other uses of virtual or augmented reality.

BRIEF DESCRIPTION OF THE DRAWINGS

[0002] FIG. 1 is a block diagram of an example headset device to determine a head-related transfer function using curvature data of a user’s head.

[0003] FIG. 2 is a flowchart of an example method to play sound according to a head-related transfer function that is determined from head curvature.

[0004] FIG. 3 is a schematic diagram of an example headset device with a sensor to determine curvature data of a user’s head for determination of a head-related transfer function.

[0005] FIG. 4 is a schematic diagram of another example headset device with sensors to determine a wearer’s anthropometric data including head curvature data for determination of a head-related transfer function.

[0006] FIG. 5 is a schematic diagram of the example headset device of FIG. 4 showing additional anthropometric information.

[0007] FIG. 6 is a block diagram of an example headset system to determine a head-related transfer function from curvature data of a user’s head using machine learning.

[0008] FIG. 7 is a flowchart of a method of training a machine-learning model to obtain a personalized head-related transfer function with respect to head curvature.

DETAILED DESCRIPTION

[0009] The sensation of spatial sound may be created using a head-related transfer function (HRTF) that may capture temporal and spectral detail useful for localization. As head shape varies from user to user, the same level of perceptible presence in an immersive environment is not readily achievable for various different users.

[0010] A personalized HRTF may be used to enhance an audio experience for a particular user. An HRTF may be used in an augmented reality (AR) headset, virtual reality (VR) headset, headphones/earbuds, or similar audio-capable device to increase verisimilitude in the user experience.

[0011] Anthropometric data concerning the curvature of a user’s head may be used to obtain a personalized HRTF. Such data may be collected by a sensor installed at a headset and may be applied to a trained machine-learning model to obtain a personalized HRTF. The sensor may be a strain gauge or conformal stretch sensor installed at a flexible material, such as a fabric cap, forming part of the headset. A sensor or sensors may measure head curvature data by taking measurements along multiple different axes.

[0012] A machine-learning model may be trained using head-curvature data of a representative sample of users captured by the same type of headset or a different type of device, such as a camera system. The machine-learning model may be made accessible to an AR/VR headset or similar personal audio device, so that a personalized HRTF may be determined for a user of the headset.

[0013] Use of head curvature information may provide for increased accuracy when compared to techniques limited to head diameter and a spherical head assumption. Thus, the techniques described in the present disclosure may provide increased accuracy to HRTFs and therefore improved personalized sound output for various users of differing head geometry.

[0014] FIG. 1 shows an example headset device 100 that applies a personalized HRTF based on head curvature. The headset device 100 may be an AR/VR headset and may be a component of an AR/VR system. The headset device may be a set of headphones, earbuds, or similar personal audio device. The headset device 100 may be worn by a user to experience media, such as music, voice communications, a simulation, a training environment, a game, a live or prerecorded entertainment performance, or similar. Such media includes an audio component and may also include a visual component. The headset device 100 may also be used to train a machine-learning model that associates head curvature, as well as potentially other head and ear measurements, to HRTFs.

[0015] The headset device 100 includes a frame 102, a sensor 104, and a speaker 106.

[0016] The frame 102 is shaped and sized to fit the head of a wearer. The frame 102 may be adjustable so as to fit a range of user head sizes and shapes. The frame 102 may include an adjustment mechanism. The frame 102 may include a flexible material, such as an elastic strap or cap, to provide fit to a range of head sizes and shapes.

[0017] The sensor 104 is positioned at the frame 102. The sensor 104 may be attached to components of the frame 102, which are physically adjustable with respect to each other, so as to measure a displacement between the components. The sensor 104 may be attached to a flexible material of the frame 102. As such, displacement of a component or stretching/compression of a flexible material may be measured by the sensor 104. The specific components and/or flexible material and the positioning of the sensor 104 thereon may be selected to allow the sensor 104 to be sensitive to head curvature. The sensor 104 may thus capture anthropometric data 108 indicative of head curvature of the particular wearer of the headset device 100. The sensor 104 or another sensor may additionally be used to capture or derive other anthropometric data, such as head diameter, ear shape, or similar information about the wearer of the headset 100.

[0018] Head curvature data may be explicit, in that head curvature data is a numerical representation of the shape of the user’s head. For example, head curvature data may be a numerical value that defines an instantaneous rate of change of direction of a point that moves along a curve in an outer surface of the user’s head. Alternatively or additionally, curvature may be implicit to a measurement obtained by a sensor that is located on a material, such as flexible material, that tends to conform to the wearer’s head. When machine learning is used, the nature of the head curvature need not be resolved to human-intelligible values.
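By way of illustration only, an explicit curvature value of the kind described above may be estimated from three points sampled along a contour of the head. The sketch below uses the Menger curvature, which is the reciprocal of the radius of the circle passing through the three points. The function name and the sample coordinates are assumptions made for this illustration and do not appear in the disclosure.

```python
import math


def menger_curvature(p1, p2, p3):
    """Return the curvature (1/radius) of the circle through three 2D points.

    Collinear points yield 0.0 (a straight line has no curvature).
    """
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    # Twice the signed area of the triangle formed by the three points.
    twice_area = (x2 - x1) * (y3 - y1) - (y2 - y1) * (x3 - x1)
    side_product = math.dist(p1, p2) * math.dist(p2, p3) * math.dist(p1, p3)
    if side_product == 0.0:
        raise ValueError("points must be distinct")
    # Menger curvature = 4 * area / (a * b * c) = 2 * |twice_area| / (a * b * c)
    return 2.0 * abs(twice_area) / side_product


# Hypothetical samples on a contour of radius 0.09 m (roughly head-sized).
curvature = menger_curvature((0.09, 0.0), (0.0, 0.09), (-0.09, 0.0))
print(round(1.0 / curvature, 3))  # radius in metres
```

A larger curvature value corresponds to a more tightly curved region of the head. A flatter region approaches zero.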

[0019] The data 108 captured by the sensor 104 may be used to select an HRTF 110 that is influenced by head curvature. Selection of an HRTF 110 may include applying head curvature data to a lookup function, in which a particular HRTF 110 is selected from a library that contains a plurality of HRTFs. In other examples, selection of an HRTF 110 may include applying head curvature data to a machine-learning model, such as a neural network, that has been trained against a set of head curvature data obtained for a group of people.
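A lookup function of this kind may, for example, be sketched as a nearest-neighbor search over a small library. The library contents, the two-element feature vectors, and the names used below are hypothetical and serve only to illustrate the selection step. The disclosure does not specify how the library is keyed.

```python
import math

# Hypothetical library: (curvature feature vector, HRTF identifier).
HRTF_LIBRARY = [
    ((0.18, 0.21), "hrtf_a"),
    ((0.20, 0.23), "hrtf_b"),
    ((0.22, 0.25), "hrtf_c"),
]


def select_hrtf(features, library=HRTF_LIBRARY):
    """Return the identifier of the library entry closest to `features`."""
    # Euclidean distance is an assumed similarity measure.
    nearest = min(library, key=lambda entry: math.dist(entry[0], features))
    return nearest[1]


print(select_hrtf((0.205, 0.232)))
```

Euclidean distance is only one possible choice. A deployed system might weight the sensor axes differently.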

[0020] The speaker 106 is positioned at the frame 102. The speaker 106 may be positioned at, near, or in the wearer’s ear when the headset device 100 is worn. A pair of speakers 106 may be provided, one for each ear. The speaker 106 outputs sound to the wearer’s ear, where the sound is modified by the HRTF 110 determined from the data 108 captured by the sensor 104. As such, a personalized audio experience that considers individual head curvature may be provided to the wearer of the headset device 100.

[0021] FIG. 2 shows a method 200 of outputting personalized audio to a user. The method 200 may be performed by any of the devices or systems discussed herein. The method 200 may be implemented with instructions storable on a non-transitory computer-readable medium and executable by a processor.

[0022] At block 202, head curvature data is captured from the user. A headset device, such as the headset device 100, may be used to capture head curvature data. Additional anthropometric data may also be captured, such as head diameter, pinna shape/size, and similar.

[0023] At block 204, an HRTF is determined based on the data captured at block 202. A trained machine-learning model may be used. The machine learning model may take head curvature data, as well as other head geometry and/or ear geometry data, as input and provide a HRTF in response. The HRTF may thus be personalized to the wearer and particularly personalized as to the wearer’s head curvature.

[0024] At block 206, the personalized HRTF is applied to a sound, such as a sound file, a sound stream, or similar digitized sound data.

[0025] At block 208, the sound, as modified by the HRTF, is played to the user, such as via a pair of headset speakers that may be positioned over or inside the user’s ears. Use of a HRTF that considers head curvature may provide the user with a more immersive experience of the sound, as the HRTF is personalized to a further extent than is possible with a spherical head model.
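As a minimal sketch of block 206, an HRTF may be applied in the time domain by convolving a mono signal with a left and a right head-related impulse response (HRIR), the time-domain counterpart of an HRTF. The function names and the short impulse responses below are assumptions for illustration. A practical implementation would likely use much longer responses and a fast convolution routine.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample sequences."""
    out = [0.0] * max(len(signal) + len(impulse_response) - 1, 0)
    for i, sample in enumerate(signal):
        for j, tap in enumerate(impulse_response):
            out[i + j] += sample * tap
    return out


def apply_hrtf(mono, hrir_left, hrir_right):
    """Return (left, right) channels after filtering `mono` per ear."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Hypothetical two-tap responses: the right ear hears the sound later and quieter.
left, right = apply_hrtf([1.0, 0.5], [1.0, 0.0], [0.0, 0.5])
print(left, right)
```

The two resulting channels would then be routed to the respective speakers at block 208.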

[0026] FIG. 3 shows an example headset device 300 with a sensor to measure head curvature. The headset device 300 may be used to train a machine-learning model that associates head curvature, as well as potentially other head and ear measurements, to HRTFs.

[0027] The headset device 300 includes a flexible material 302, a first sensor 304, and a second sensor 306.

[0028] The flexible material 302 conforms to the curvature of the wearer’s head. The flexible material 302 may include fabric. The flexible material 302 may be in the shape of a strip or a cap. In this example, a cap made of synthetic fabric is provided.

[0029] The sensors 304, 306 are attached to the flexible material 302. The sensors 304, 306 may be adhered to the flexible material 302 by an adhesive. Examples of sensors 304, 306 include strain gauges and conformal stretch sensors.

[0030] The sensors 304, 306 are oriented to capture data indicative of a curvature of the head along multiple different sensor axes 308, 310. Each sensor 304, 306 is sensitive to a change in length along a respective axis 308, 310. The sensors 304, 306 may be provided at different locations on the flexible material 302. Any suitable number, positioning, and orientation of sensors 304, 306 may be used.

[0031] The greater the number of sensors 304, 306 and sensor axes 308, 310, the greater the accuracy of head curvature that may be obtained. The number, positioning, and orientation of the sensors 304, 306 may be optimized by providing a maximal difference in the sensor data between subjects over a large subject pool. A curvature distance metric may be used to perform sensor positioning optimization during the sensor-array design process.
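One hedged way to picture the positioning optimization is to rank candidate sensor positions by how much their readings differ across a subject pool. The sketch below uses population variance as a stand-in for the curvature distance metric, which the disclosure does not define. The position names and readings are invented for illustration.

```python
from statistics import pvariance


def rank_sensor_positions(readings_by_position, keep):
    """Return the `keep` positions whose readings vary most across subjects.

    `readings_by_position` maps a position name to one reading per subject.
    """
    ranked = sorted(
        readings_by_position,
        key=lambda name: pvariance(readings_by_position[name]),
        reverse=True,
    )
    return ranked[:keep]


# Hypothetical readings for three subjects at three candidate positions.
candidates = {
    "crown": [1.0, 1.1, 0.9],
    "temple": [1.0, 2.0, 3.0],
    "nape": [1.0, 1.5, 2.0],
}
print(rank_sensor_positions(candidates, keep=2))
```

A position whose reading barely changes from subject to subject contributes little to distinguishing head shapes, so it ranks low under this assumed criterion.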

[0032] Stretch of the flexible material 302 caused by fitting and conformance to the user’s head causes the sensors 304, 306 to output signals that enable the inference of distances or other measurements from the sensors. Such distances may form part of a set of anthropometric feature vectors to develop a model that maps input features, e.g., sensor measurements, to HRTFs.
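For a resistive strain gauge, the inference of a distance from a sensor signal may be sketched with the standard gauge relation, in which the relative change in resistance equals the gauge factor multiplied by the strain. The gauge factor of 2.0, the nominal resistance, and the rest length below are assumed example values. A conformal stretch sensor may follow a different calibration curve.

```python
def strain_from_resistance(r_measured, r_nominal, gauge_factor=2.0):
    """Strain from the gauge relation: (delta R / R) = gauge_factor * strain."""
    return (r_measured - r_nominal) / (r_nominal * gauge_factor)


def stretched_length(rest_length_m, strain):
    """Length of the sensed segment after stretching, in metres."""
    return rest_length_m * (1.0 + strain)


# Hypothetical reading: a 120-ohm gauge measures 122.4 ohms when worn.
strain = strain_from_resistance(122.4, 120.0)
print(round(stretched_length(0.10, strain), 4))  # metres along the sensor axis
```

Lengths obtained this way along each sensor axis could serve as entries in the anthropometric feature vector.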

[0033] The headset device 300 may also be worn by a user to experience media. To facilitate this, the headset device 300 may also include a pair of speakers 312 to emit sound modified by a HRTF that is selected based on measurement data obtained by the sensors 304, 306. The headset device 300 may further include a stereoscopic display 314 to provide images to the wearer for AR/VR functionality. The headset device 300 may further include a frame 316 to mutually attach the flexible material 302, speakers 312, and stereoscopic display 314. The frame 316 may also serve to secure the headset device 300 to a wearer’s head. The speakers 312 and display 314 may be rigidly, slidably, or pivotally connected to the frame 316. The flexible material 302 may be adhered, tied, or attached to the frame by snaps or other type of fabric fastener.

[0034] FIG. 4 shows another example headset device 400 with multiple sensors 402 aligned with multiple different axes 404 to measure head curvature. The headset device 400 may be used for media output or to train a machine learning model.

[0035] Each sensor 402 may be disposed on or in a flexible material 406, such as a fabric cap or strip, at different locations and orientations. Accordingly, each sensor 402 may measure extension of the flexible material 406 along a different axis 404.

[0036] The headset device 400 may further include a frame 408 that includes mutually adjustable components 410, 412. An example component 410 is a headband that secures the headset 400 to the user’s head. The headband 410 may include an adjustment mechanism 414 to adjust the circumference of the headband 410 to fit different particular users. Another example component 412 is a headphone arm that connects a speaker 416 to the headband 410. The headphone arm 412 may be pivotably adjustable with respect to the headband 410 to accommodate ear locations of different users. The headphone arm 412 may be extendible and retractable for the same reason.

[0037] The components 410, 412 may include a sensor, such as a linear or rotational potentiometer. In some examples, the headband 410 includes a linear potentiometer 418 to measure extension of the headband 410 and thus derive circumference of the wearer’s head. Such a measurement can provide sufficient information to compute head diameter. In some examples, the headphone arm 412 includes a rotational potentiometer 420 with respect to the headband to measure ear position.
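As a rough sketch of that computation, the circumference may be taken as the unextended headband circumference plus the extension reported by the potentiometer, and the diameter may then be approximated by dividing by pi. Both the circular cross-section and the example dimensions are simplifying assumptions introduced here.

```python
import math


def headband_circumference(base_circumference_m, extension_m):
    """Circumference of the fitted headband, in metres."""
    return base_circumference_m + extension_m


def head_diameter(circumference_m):
    """Approximate diameter, assuming a circular cross-section."""
    return circumference_m / math.pi


# Hypothetical headband: 0.52 m unextended, extended by 0.04 m when worn.
circumference = headband_circumference(0.52, 0.04)
print(round(head_diameter(circumference), 3))  # metres
```

Because a real head is not circular in cross-section, such a diameter is a coarse feature that the curvature data is intended to supplement.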

[0038] Other example components and sensors include a stereoscopic display 422 pivotably connected to the headband 410 and with a rotary potentiometer 424 therebetween, a top strap 426 with ends connected to the display 422 and the back of the headband 410 and with a length sensor 428 to measure the length of the top strap 426, and a linear potentiometer 430 at the headphone arm 412 to measure headphone and thus ear distance from the headband 410.

[0039] As such, various dimensions may be directly measured or computed, as shown in FIG. 5, such as a distance 502 between the display 422 and the wearer’s eyes, a distance 504 between the top of the head and the top of the headphone arm 412, a distance 506 between the eyes and the top of the headphone arm 412, a distance 508 between the top of the headphone arm 412 and the ear, an angle 510 of the display 422 relative to the headband 410, an angle 512 of the headphone arm 412 relative to the headband 410, and an overall length 514 of the headset.

[0040] Any of these various dimensions 502-514 may be considered anthropometric measurements or may be used to compute anthropometric data of the wearer of the headset 400. Any such anthropometric information may be used to supplement the anthropometric head curvature information obtained with the sensors 402 disposed at the flexible material 406 to determine an HRTF personalized to the wearer of the headset 400.

[0041] FIG. 6 shows an example headset system 600 that may determine a HRTF from curvature data of a user’s head using machine learning.

[0042] The headset system 600 includes a processor 602, memory 604, sensors 606, and a trained machine-learning model 608.

[0043] The sensors 606 are positioned at a headset device 610 that may include various other components, such as a frame, headphones, a stereoscopic display, and/or similar components discussed elsewhere herein. The headset device 610 may be donned by a user to undergo an audio-visual experience.

[0044] The processor 602, memory 604, and trained machine-learning model 608 may each be provided to the headset device 610 or may be provided separate from the headset device 610. If provided separately, any of the processor 602, memory 604, and trained machine-learning model 608 may be provided to a computing device to which the headset device 610 connects, where such connection may be a local connection or a network connection. If provided together, the processor 602, memory 604, and trained machine-learning model 608 may be provided in a frame of the headset device 610.

[0045] The processor 602 may include a central processing unit (CPU), a microcontroller, a microprocessor, a processing core, a field-programmable gate array (FPGA), an application-specific integrated circuit (ASIC), or a similar device capable of executing instructions. The memory 604 may include a non-transitory machine-readable medium that may be an electronic, magnetic, optical, or other physical storage device that encodes instructions. The non-transitory machine-readable medium may include, for example, random access memory (RAM), read-only memory (ROM), flash memory, a storage drive, an optical device, or similar.

[0046] The trained machine-learning model 608 may have been previously trained based on human head curvature data and measured HRTFs. For example, head curvature data may be captured for a plurality of subjects whose HRTFs are obtained from tests in an anechoic chamber. The set of measured curvature data may then be correlated to the measured HRTFs by the machine learning model, so that arbitrary curvature data may be applied to the machine learning model to obtain a corresponding HRTF.

[0047] When the headset device 610 is worn by a particular user, the processor 602 captures anthropometric data from the sensors 606, where such anthropometric data includes curvature information of the headset wearer’s head.

[0048] The processor 602 applies the anthropometric data captured by the sensors 606 to the trained machine-learning model 608 to obtain a HRTF 612 for the particular user. The HRTF 612 may be stored in the memory 604.

[0049] Then, audio media outputted by the headset device 610 may be processed by the selected HRTF 612.

[0050] In another example, the processor 602 selects the HRTF 612 from a plurality of HRTFs, such as may be stored in a library, based on the data captured by the sensors 606. A lookup or interpolation function may be used instead of the machine-learning model 608. A library may store a plurality of HRTFs 612 from which to select.
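An interpolation function of this kind may be sketched as an inverse-distance-weighted blend of the library entries, so that entries whose stored features lie closer to the measured features contribute more. The library layout, the single-element feature vectors, and the use of per-frequency magnitudes are assumptions for illustration. Interpolating full complex-valued HRTFs would require additional care with phase.

```python
import math


def interpolate_hrtf(features, library, eps=1e-9):
    """Blend HRTF magnitude vectors, weighting each entry by 1/distance.

    `library` is a list of (feature_vector, hrtf_magnitudes) pairs.
    """
    weights = []
    for stored_features, magnitudes in library:
        distance = math.dist(stored_features, features)
        if distance < eps:
            # Exact match: return that entry unchanged.
            return list(magnitudes)
        weights.append(1.0 / distance)
    total = sum(weights)
    n_bins = len(library[0][1])
    return [
        sum(w * entry[1][k] for w, entry in zip(weights, library)) / total
        for k in range(n_bins)
    ]


# Hypothetical library with two entries and two frequency bins each.
library = [((0.0,), [1.0, 2.0]), ((1.0,), [3.0, 4.0])]
print(interpolate_hrtf((0.5,), library))
```

Compared with selecting a single nearest entry, blending yields a result that varies smoothly as the measured features change.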

[0051] FIG. 7 is a flowchart of a method 700 of training a machine-learning model to determine a HRTF based on head curvature. The method 700 may be performed using any of the devices or systems discussed herein.

[0052] At block 702, head curvature for a human subject is measured. A sensor located on flexible material may be used. A headset device, such as any of those discussed herein, may be used to measure head curvature. Additional anthropometric data, such as head diameter and pinna size/shape may be measured.

[0053] At block 704, the subject may be situated in an anechoic or pseudo-anechoic chamber. Microphones may be located in or near the subject’s ears and various sounds may be played. The response to the sounds at the microphones may be measured.

[0054] At block 706, the HRTF of the subject may be computed based on the subject’s response to the sounds played in the anechoic or pseudo-anechoic chamber. The HRTF may be associated to the measured head curvature and other anthropometric data.

[0055] Blocks 702-706 may be performed for a plurality of human subjects that form a sample pool.

[0056] The captured data and computed HRTFs for the plurality of human subjects may then be provided to a machine-learning model, at block 710. The machine-learning model may be trained using the captured data, specifically the head curvature data, and the HRTFs. The machine-learning model may be trained such that, during inference, head curvature or other anthropometric data provided to the model yields HRTFs for various angles in 3D space.
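The training step may be illustrated, under strong simplifying assumptions, with a linear model fitted by gradient descent on the mean squared error. The linear model stands in for whatever machine-learning model is actually used, such as a neural network. The feature values and two-bin HRTF targets are synthetic, and the function names are hypothetical.

```python
def train_linear_model(features, targets, lr=0.1, epochs=2000):
    """Fit targets ~ weights * features + bias by batch gradient descent."""
    n_in, n_out, n = len(features[0]), len(targets[0]), len(features)
    weights = [[0.0] * n_in for _ in range(n_out)]
    bias = [0.0] * n_out
    for _ in range(epochs):
        grad_w = [[0.0] * n_in for _ in range(n_out)]
        grad_b = [0.0] * n_out
        for x, y in zip(features, targets):
            for o in range(n_out):
                pred = bias[o] + sum(w * xi for w, xi in zip(weights[o], x))
                err = pred - y[o]
                grad_b[o] += err / n
                for i in range(n_in):
                    grad_w[o][i] += err * x[i] / n
        for o in range(n_out):
            bias[o] -= lr * grad_b[o]
            for i in range(n_in):
                weights[o][i] -= lr * grad_w[o][i]
    return weights, bias


def predict(model, x):
    """Return the predicted HRTF bins for one feature vector."""
    weights, bias = model
    return [b + sum(w * xi for w, xi in zip(ws, x)) for ws, b in zip(weights, bias)]


# Synthetic pool: one curvature feature per subject, two HRTF bins per subject.
subject_features = [[0.0], [0.5], [1.0]]
subject_hrtfs = [[1.0, 0.0], [2.0, -0.5], [3.0, -1.0]]
model = train_linear_model(subject_features, subject_hrtfs)
print([round(v, 3) for v in predict(model, [0.25])])
```

In practice the model would take many features per subject and output HRTF data for many directions. The sketch is meant only to show the mapping from measured features to HRTF values learned from the sample pool.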

[0057] It should be apparent from the above that head curvature may be used to determine a personalized HRTF. As such, a user of a VR/AR/audio system may be provided with audio playback that is enhanced by an HRTF that considers head curvature in addition to other anthropometric data, such as head diameter, ear position, and ear shape/size.

[0058] It should be recognized that features and aspects of the various examples provided above can be combined into further examples that also fall within the scope of the present disclosure. In addition, the figures are not to scale and may have size and shape exaggerated for illustrative purposes.

Claims

1. A headset device comprising: a frame to fit a head of a wearer; a sensor at the frame, the sensor to capture data indicative of a curvature of the head; and a speaker at the frame, the speaker to output sound to an ear of the wearer as modified by a head-related transfer function determined from the data captured by the sensor.

2. The headset device of claim 1, wherein the sensor is to capture data indicative of a curvature of the head along multiple different axes.

3. The headset device of claim 1, further comprising a flexible material attached to the frame, the flexible material to conform to curvature of the head, wherein the sensor is attached to the flexible material.

4. The headset device of claim 3, wherein the flexible material comprises fabric.

5. The headset device of claim 1, wherein the sensor comprises a strain gauge.

6. The headset device of claim 1, wherein the sensor comprises a conformal stretch sensor.

7. The headset device of claim 1, further comprising a processor to select the head-related transfer function from a plurality of head-related transfer functions based on the data captured by the sensor.

8. The headset device of claim 1, further comprising a processor to apply the data captured by the sensor to a trained machine-learning model to obtain the head-related transfer function.

9. A non-transitory computer-readable medium comprising instructions executable by a processor to: receive an anthropometric measurement of a head of a person; apply the anthropometric measurement to a trained machine-learning model, the trained machine-learning model being trained based on human head curvature data and measured head-related transfer functions; receive a head-related transfer function specific to the person from the trained machine-learning model; and apply the head-related transfer function to sound to be outputted to the person.

10. The non-transitory computer-readable medium of claim 9, wherein the human head curvature data is measured along multiple different axes.

11. The non-transitory computer-readable medium of claim 9, wherein the human head curvature data is measured by a sensor sensitive to a stretching or compression of a flexible material.

12. The non-transitory computer-readable medium of claim 9, wherein the anthropometric measurement comprises a head curvature measurement.

13. The non-transitory computer-readable medium of claim 12, wherein the anthropometric measurement further comprises a head diameter or circumference measurement.

14. A method comprising: capturing head curvature data from a user; determining a head-related transfer function of the user based on the head curvature data; applying the head-related transfer function to a sound; and playing the sound to the user.

15. The method of claim 14, wherein capturing the head curvature data from the user comprises using a sensor at a headset to capture data along multiple different axes.