CN105308536A

Movatterモバイル変換

Info

Publication number: CN105308536A
Application number: CN201480014375.1A
Authority: CN
Inventors: D·霍尔兹
Original assignee: Leap Motion Inc
Current assignee: Lmi Clearing Co ltd; Ultrahaptics IP Ltd
Priority date: 2013-01-15
Filing date: 2014-01-15
Publication date: 2016-02-03
Also published as: CN113568506A; DE112014000441T5; WO2014113507A1

Abstract

The disclosed technology relates to distinguishing meaningful gestures from proximate nonsense gestures in a three-dimensional (3D) sensory space. It also relates to consistently responding to a user's gesture input regardless of the user's position in 3D sensory space. It also relates to detecting whether a user intends to interact with a virtual object based on measuring the degree of completion of a gesture and creating interface elements in 3D space. The disclosed technology relates to distinguishing meaningful gestures from proximate nonsense gestures in a three-dimensional (3D) sensory space. It also relates to consistently responding to a user's gesture input regardless of the user's position in 3D sensory space. It also relates to detecting whether a user intends to interact with a virtual object based on measuring the degree of completion of a gesture and creating interface elements in 3D space.

Description

Translated fromChinese

用于显示器控制和定制姿势解释的动态用户交互Dynamic user interaction for display control and custom gesture interpretation

技术领域technical field

所公开的技术总体涉及显示器控制和姿势识别，具体地，涉及基于动态用户交互的显示器控制以及使用自由空间的手势作为用户对机器的输入。The disclosed technology relates generally to display control and gesture recognition, and in particular, to display control based on dynamic user interaction and the use of free-space gestures as user input to machines.

背景技术Background technique

传统上，用户与电子装置(如计算机或电视)或计算应用程序(如计算机游戏、多媒体应用或办公应用程序)经由间接的输入装置进行交互，所述输入装置包括例如键盘、操纵杆或者远程控制器。用户操纵输入装置来执行特定动作，例如从操作菜单中选择特定条目。但是，现代的输入装置往往在复杂的配置中包括多个按钮，以方便将用户的命令传达给电子装置或计算应用程序；这些输入装置的正确操作对用户通常是个挑战。此外，输入装置上执行的动作通常不以任何直观感觉对应于所获得的变化，所述变化例如是由装置控制的屏幕显示器上的变化。输入装置也可能丢失，并且寻找放错地方的装置的频繁经历已经成为现代生活中令人沮丧的常事。Traditionally, users interact with electronic devices (such as computers or televisions) or computing applications (such as computer games, multimedia applications, or office applications) via indirect input devices, such as keyboards, joysticks, or remote controls. device. The user manipulates the input device to perform a specific action, such as selecting a specific item from an operational menu. However, modern input devices often include multiple buttons in complex configurations to facilitate communicating user commands to electronic devices or computing applications; proper operation of these input devices is often a challenge for users. Furthermore, the actions performed on the input device often do not correspond in any intuitive sense to the changes obtained, for example on an on-screen display controlled by the device. Input devices can also be lost, and the frequent experience of finding a misplaced device has become a frustrating routine of modern life.

直接在用户控制装置上实施的触摸屏消除了对独立的输入装置的要求。触摸屏检测用户的手指或其它物体在显示屏上做出的“触摸”动作的存在和位置，使用户能够通过简单地触摸屏幕的适当区域来输入所希望的输入。虽然其适用于诸如平板电脑和无线电话的小型显示器装置，但是触摸屏对用户从远处观看的大型娱乐装置不适用。特别是对于在这些装置上实施的游戏，电子产品制造商已经开发出检测用户的运动或姿势并使得显示器在有限的情形下响应的系统。例如，在电视机附近的用户可以做出滑动手势，所述手势由姿势识别系统检测，电视可响应检测到的姿势，激活控制面板并将控制面板显示在屏幕上，从而允许用户使用随后的手势在其上进行选择，例如，用户可以将其手沿“向上”或“向下”的方向运动，再次被检测和解释所述运动以协助频道选择。A touch screen implemented directly on the user control eliminates the need for a separate input device. Touch screens detect the presence and location of a "touch" action by a user's finger or other object on the display screen, enabling the user to enter desired input by simply touching the appropriate area of the screen. While suitable for small display devices such as tablet computers and wireless phones, touch screens are not suitable for large entertainment devices viewed by the user from a distance. Especially for games implemented on these devices, electronics manufacturers have developed systems that detect the user's motion or gestures and cause the display to respond in limited circumstances. For example, a user near a television can make a swipe gesture, which is detected by a gesture recognition system, and the television can respond to the detected gesture by activating and displaying a control panel on the screen, allowing the user to use subsequent gestures To make a selection, for example, the user may move his hand in an "up" or "down" direction, again the motion is detected and interpreted to assist in channel selection.

虽然这些系统已经产生了大量消费刺激，可能最终取代用户和控制元件之间需要存在物理接触的常规控制方式，但是当前的装置受限于低检测灵敏度。用户被要求做出大幅度的、经常是夸张的且有时是可笑的运动以便从姿势识别系统引发响应。因为低分辨率，检测不到小幅度手势或者将其当作噪音处理。例如，为使光标在电视上运动通过一厘米的距离，用户的手可能必须横过大得多的距离。这种不匹配不仅给用户增加了繁琐的操作负担，特别是当运动受到限制时，而且，再次降低手势和响应之间的直观关系。此外，该系统的响应通常单一，即，手势的物理跨度总是对应相同的屏幕虚拟控制增量，不管用户的意愿是什么。While these systems have generated a large number of consumer stimuli and may eventually replace conventional controls that require physical contact between the user and the control element, current devices are limited by low detection sensitivity. Users are asked to make large, often exaggerated, and sometimes comical movements in order to elicit a response from the gesture recognition system. Because of the low resolution, small gestures are not detected or treated as noise. For example, to move a cursor across a distance of one centimeter on a television, the user's hand may have to traverse a much greater distance. This mismatch not only imposes a cumbersome operational burden on the user, especially when motion is restricted, but, again, degrades the intuitive relationship between gesture and response. Furthermore, the response of this system is usually monolithic, i.e., the physical span of a gesture always corresponds to the same increment of on-screen virtual control, regardless of the user's will.

因此，有机会引入新的姿势识别系统，其实时检测小幅度手势，并允许用户调整物理运动与在屏幕上显示的相应的动作之间的关系。Therefore, there is an opportunity to introduce new gesture recognition systems that detect small gestures in real time and allow users to adjust the relationship between physical movements and corresponding actions displayed on the screen.

为了选择在电子装置的屏幕上显示的期望的虚拟对象，用户可能需要将其手扫动一个大的距离。由于低灵敏度，扫动太短的距离可能不可检测或被视为噪声，从而使所需的虚拟对象保持未被选择。结果，用户可能会发现自己重复做出具有不同运动程度的相同的手势，直到所需的选择被确认。手势的重复性不仅麻烦，而且难以使用户准确地确定什么时候成功地选择了虚拟对象。因此，需要一种指示用户的姿势完成的姿势识别系统。In order to select a desired virtual object displayed on the screen of the electronic device, the user may need to swipe his hand a large distance. Due to low sensitivity, swiping too short a distance may not be detectable or be seen as noise, leaving the desired virtual object unselected. As a result, users may find themselves repeatedly making the same gesture with varying degrees of motion until the desired selection is confirmed. The repetitive nature of gestures is not only cumbersome, but makes it difficult for the user to determine exactly when a virtual object has been successfully selected. Therefore, there is a need for a gesture recognition system that indicates completion of a user's gesture.

另外，旨在作为单一姿势的用户动作可能仍然涉及相互关联的多个运动，所述运动的每一个都可以被视为单独的姿势。其结果是，传统的姿势识别系统可能无法正确地解释用户的意图，因此传送故障信号(或根本无信号)给被控制的电子装置。假设例如用户挥动其手臂，同时不知不觉中弯曲其手指；由于相互关联的运动，姿势识别系统可能无法识别期望的姿势，或者可能会指示做出两种姿势(其可能会发生冲突，使得装置不知所措，或手势中的一个可能会无法响应可允许的输入)。Additionally, a user action intended to be a single gesture may still involve multiple interrelated movements, each of which may be considered a separate gesture. As a result, conventional gesture recognition systems may not correctly interpret the user's intent, thus sending a faulty signal (or no signal at all) to the electronic device being controlled. Suppose, for example, that the user swings his arm while unknowingly flexing his fingers; due to the interrelated movements, the gesture recognition system may not recognize the desired gesture, or may be instructed to make two gestures (which may conflict, making the device overwhelmed, or one of the gestures may fail to respond to allowable input).

但是，现有的系统依靠输入元件(例如，计算机鼠标和键盘)来补充其可以做出的任何手势识别。这些系统缺乏简单的命令之外所需的用户界面元件，并且通常，仅在用户通过键盘和鼠标设置了姿势识别环境之后才会识别这些命令。因此，有进一步的机会引入新的姿势识别系统，让用户能够以更先进的方式与更广泛的应用程序和游戏互动。However, existing systems rely on input elements (eg, computer mice and keyboards) to supplement any gesture recognition they can make. These systems lack the required user interface elements beyond simple commands, and typically only recognize these commands after the user has set up the gesture recognition environment with the keyboard and mouse. As such, there are further opportunities to introduce new gesture recognition systems that allow users to interact with a wider range of applications and games in more advanced ways.

发明内容Contents of the invention

所公开的技术的实施方式涉及具有对用户的姿势的高检测灵敏度以允许用户准确和快速地(即，没有任何不必要的延迟时间)使用小幅度姿势控制电子装置，以及在某些实施方式中控制姿势的物理跨度和所导致的显示的响应之间的关系的方法和系统。在不同实施方式中，用户做出姿势的一个或多个身体部位(以下统称为“姿势”，例如，手指，手，手臂等)的形状和位置首先在被捕获的二维(2D)图像中被检测和识别；然后，一组时间序列的图像中的姿势的时间集合被组装，以便重构三维(3D)空间中的姿势。用户的意图可以通过例如将检测到的姿势与存储在数据库中的一组姿势记录进行对比来识别。每个姿势记录均将一个检测到的手势(例如，编码为矢量)与一个动作、命令或其它输入相关联，由当前运行的应用程序处理—例如，以便调用在应用程序上执行的相应的指令或指令序列，或者提供参数值或其它输入数据。由于所公开的技术中的姿势识别系统提供高检测灵敏度，可以精确地检测和识别身体的部位(例如，手指)的小幅度运动(例如，几毫米的运动)，从而使用户能够准确地与电子装置和/或显示在其上的应用程序进行交互。Embodiments of the disclosed technology relate to having high detection sensitivity to a user's gestures to allow the user to accurately and quickly (i.e., without any unnecessary delay time) control an electronic device using small gestures, and in some embodiments Methods and systems for controlling the relationship between the physical span of a gesture and the resulting displayed response. In various implementations, the shape and position of one or more body parts the user is gesturing (hereinafter collectively referred to as "gestures", e.g., fingers, hands, arms, etc.) are detected and recognized; then, a temporal collection of poses in a set of time-sequenced images is assembled in order to reconstruct the pose in three-dimensional (3D) space. The user's intent can be identified by, for example, comparing a detected gesture to a set of gesture records stored in a database. Each gesture record associates a detected gesture (e.g., encoded as a vector) with an action, command, or other input to be processed by the currently running application—for example, to invoke a corresponding command executed on the application or sequence of instructions, or provide parameter values or other input data. Since the gesture recognition system in the disclosed technology provides high detection sensitivity, it is possible to accurately detect and recognize small movements (for example, movements of a few millimeters) of parts of the body (for example, fingers), thereby enabling users to accurately communicate with electronic devices. devices and/or applications displayed on them.

所公开的技术的一些实施方式实时从均可视为姿势的不关联的运动中鉴别主导姿势，并且可以输出指示主导手势的信号。根据所公开的技术的方法和系统理想地对可视为用户姿势的运动具有高检测灵敏度，该能力当与主导姿势的快速鉴别力相结合时允许用户准确和迅速地(即，没有任何不必要的时间延迟)控制电子装置。Some implementations of the disclosed technology identify dominant gestures in real-time from disjoint movements that may all be considered gestures, and may output a signal indicative of the dominant gesture. Methods and systems in accordance with the disclosed technology ideally have high detection sensitivity to motions that can be considered user gestures, a capability that when combined with rapid discrimination of dominant gestures allows users to accurately and quickly (i.e., without any unnecessary time delay) control electronics.

在不同实施方式中，当检测到多个姿势(例如，手臂挥动姿势和手指弯曲)时，姿势识别系统识别用户的主导姿势。例如，姿势识别系统可以通过计算将挥动姿势标识为挥动轨迹，将手指弯曲姿势标识为五个单独(并且是更小幅度的)的轨迹。每条轨迹可以沿着例如欧拉空间中的六个欧拉自由度转换成矢量。具有最大幅度的矢量代表该运动的主导成分(例如，在这种情况下，挥动)，其余的矢量可以被忽略，或者以与处理主导姿势不同的方式处理。在某些实施方式中，可以使用过滤技术实施的矢量过滤器应用于所述多个矢量，以过滤掉小矢量并识别主导矢量。可以重复、迭代该过程，直到识别出一个矢量—运动的主导成分。然后，所识别的主要分量可以被用于操纵电子装置或其应用程序。In various implementations, the gesture recognition system identifies the user's dominant gesture when multiple gestures are detected (eg, arm swing gesture and finger curl). For example, a gesture recognition system may computationally identify a swipe gesture as a swipe trajectory, and a finger curl gesture as five separate (and smaller-magnitude) trajectories. Each trajectory can be transformed into a vector along, for example, six Euler degrees of freedom in Euler space. The vector with the largest magnitude represents the dominant component of the motion (eg, in this case, the swipe), and the rest of the vectors can be ignored, or treated differently than the dominant pose. In some embodiments, a vector filter, which may be implemented using a filtering technique, is applied to the plurality of vectors to filter out small vectors and identify dominant vectors. This process can be repeated, iteratively, until one vector - the dominant component of the motion - is identified. The identified principal components can then be used to manipulate the electronic device or its applications.

在一些实施方式中，姿势识别系统启用或提供屏幕虚拟(on-screen)显示器，以实时显示姿势完成程度。例如，姿势识别系统可以通过将其与数据库的记录进行匹配来识别姿势，所述数据库包括多个图像，所述多个图像中的每一个都与所做出的姿势的一个完成程度(例如，从1％到100％)相关联。然后，所做出的手势的完成程度被呈现在屏幕上。例如，当用户使其手指运动靠近电子装置以做出点击或触摸姿势时，装置显示器可以示出空心圆形图标，呈现应用程序用颜色填充所述空心圆形图标，以指示用户的运动还有多久完成该姿势。当用户完全做出点击或触摸姿势时，圆形被完全填充，这可能导致例如将期望的虚拟对象标记为被选择的对象。完成程度指示器从而使用户能够识别选择虚拟对象的确切时刻。In some embodiments, the gesture recognition system enables or provides an on-screen display to show the degree of completion of the gesture in real time. For example, a gesture recognition system may recognize a gesture by matching it to records from a database that includes multiple images, each of which corresponds to a degree of completion of the gesture made (e.g., from 1% to 100%) correlate. Then, the degree of completion of the gesture made is presented on the screen. For example, when a user moves his finger close to the electronic device to make a tap or touch gesture, the device display may show a hollow circular icon that the rendering application fills with color to indicate that the user's motion and How long to perform the pose. When the user fully makes a click or touch gesture, the circle is completely filled, which may result, for example, in marking a desired virtual object as the selected object. The level of completion indicator thus enables the user to identify the exact moment when the virtual object is selected.

在其它的实施方式中，虚拟的屏幕虚拟圆盘可以用来通过以下选择变量或其它参数的值：允许用户通过按压其一侧来滑动圆盘。用户可以通过进一步的手势创建其它的用户界面元件，一旦创建该元件，则将其用作针对软件应用程序的输入或控制件。In other embodiments, a virtual on-screen puck may be used to select the value of a variable or other parameter by allowing the user to slide the puck by pressing on its side. The user can create other user interface elements through further gestures and, once created, use them as inputs or controls for the software application.

在一种实施方式中，姿势识别系统提供用于用户静态或动态调整其实际运动以及所导致的响应之间的关系的功能，所述所导致的响应例如是显示在电子装置的屏幕上的对象的运动。在静态操作中，用户通过操纵显示的滑动开关或操纵其它图标手动设置该灵敏度水平，所述其它图标例如使用本文说明的姿势识别系统。在动态操作中，系统自动响应用户和装置之间的距离、被显示的活动的性质、可用的物理空间和/或用户自己的响应的模式(例如，基于用户姿势似乎被限制在其中的空间的体积，按比例缩放该响应)。例如，当可用空间有限时，用户可以将关系调整至小于1的比例(例如，1:10)，使得其实际运动的每个单元(例如，1毫米)导致显示在屏幕上的对象运动的10个单位(例如10个像素或10毫米)。类似地，当用户相对接近电子装置时，其可以将该关系调整(或者感测到用户的距离的装置可以自动调整)至大于1的比例(例如，10:1)以进行补偿。因此，调整用户的实际运动与显示在屏幕上的所导致的动作的比例(例如，对象运动)提供了额外的灵活性，供用户远程指挥电气装置和/或控制显示在其上的在虚拟环境。In one embodiment, the gesture recognition system provides functionality for the user to statically or dynamically adjust the relationship between his actual movement and the resulting response, such as an object displayed on the screen of the electronic device exercise. In static operation, the user manually sets this sensitivity level by manipulating a displayed slide switch or manipulating other icons, for example using the gesture recognition system described herein. In dynamic operation, the system automatically responds to the distance between the user and the device, the nature of the activity being displayed, the available physical space, and/or the pattern of the user's own responses (e.g., based on the volume, scale the response proportionally). For example, when available space is limited, the user can adjust the relationship to a ratio of less than 1 (e.g., 1:10), so that each unit of its actual motion (e.g., 1 mm) results in 10% of the motion of the object displayed on the screen. unit (for example, 10 pixels or 10 millimeters). Similarly, when the user is relatively close to the electronic device, it can adjust this relationship (or the device sensing the user's distance can automatically adjust) to a ratio greater than 1 (eg, 10:1) to compensate. Therefore, adjusting the ratio of the user's actual motion to the resulting action displayed on the screen (e.g., object motion) provides additional flexibility for the user to remotely command the electrical device and/or control the virtual environment displayed on it. .

根据一种实施方式，公开的技术还涉及过滤姿势。具体地，其涉及通过以下将三维(3D)感觉空间中的关注的姿势与非关注姿势区别开：将用户限定的基准姿势的特性与在3D感觉空间做出的实际姿势进行对比。基于该对比，从在3D感觉空间中做出的所有姿势中过滤出一组的关注的姿势。According to one implementation, the disclosed technology also relates to filtering gestures. In particular, it involves distinguishing gestures of interest from non-interest gestures in a three-dimensional (3D) sensory space by comparing properties of a user-defined reference gesture with actual gestures made in the 3D sensory space. Based on this comparison, a set of gestures of interest is filtered out of all gestures made in the 3D sensory space.

根据另一种实施方式，所公开的技术还涉及为特定的用户定制姿势解释。具体地，其涉及通过以下设置识别姿势的参数：提示用户选择姿势的特性值。在一种实施方式中，公开的技术包括执行姿势的特性边界集中示范。它还包括通过提示用户执行完整的姿势示范并接收用户的有关解释的评估来测试姿势解释。According to another implementation, the disclosed technology also relates to customizing gesture interpretation for a particular user. Specifically, it involves setting the parameters of the recognition gesture by prompting the user to select a characteristic value of the gesture. In one implementation, the disclosed techniques include performing characteristic boundary-focused demonstrations of gestures. It also includes testing gesture interpretation by prompting the user to perform a complete gesture demonstration and receiving the user's evaluation of the interpretation.

可以通过查看之后的附图、详细说明和权利要求了解本技术的其它方面和优点。Other aspects and advantages of the technology can be understood by examining the drawings, detailed description, and claims that follow.

附图说明Description of drawings

在附图中，在所有不同的视图中，相同的参考字符通常指类似的部件。此外，附图不一定按比例绘制，相反，重点通常放在例示所公开的技术的原理。在以下说明中，所公开的技术的不同实施方式将参照以下附图进行说明，其中：In the drawings, like reference characters generally refer to like parts throughout the different views. Furthermore, the drawings are not necessarily to scale, emphasis instead generally being placed upon illustrating the principles of the disclosed technology. In the following description, various implementations of the disclosed technology will be described with reference to the following drawings, in which:

图1A示出了根据所公开的技术的一种实施方式的捕获图像数据的系统。Figure 1A illustrates a system for capturing image data according to one implementation of the disclosed technology.

图1B是根据所公开的技术的一种实施方式的实施图像分析装置的姿势识别系统的简化框图。FIG. 1B is a simplified block diagram of a gesture recognition system implementing an image analysis device in accordance with one implementation of the disclosed technology.

图2A描绘了根据所公开的技术的一种实施方式的由用户的姿势控制的电子装置。2A depicts an electronic device controlled by a user's gestures, according to one implementation of the disclosed technology.

图2B描绘了根据所公开的技术的一种实施方式的由姿势识别系统检测的多个姿势。FIG. 2B depicts multiple gestures detected by a gesture recognition system, in accordance with one implementation of the disclosed technology.

图3A和3B描绘了根据所公开的技术的一种实施方式的反映用户的姿势的完成程度的屏幕虚拟指示器。3A and 3B depict an on-screen virtual indicator reflecting the degree of completion of a user's gesture, according to one implementation of the disclosed technology.

图3C是示出根据所公开的技术的一种实施方式的预测用户选择虚拟对象并随后及时操纵所选择的虚拟对象的时间的方法的流程图。3C is a flowchart illustrating a method of predicting when a user selects a virtual object and then manipulates the selected virtual object in time, according to one implementation of the disclosed technology.

图4A和4B描绘根据所公开的技术的一种实施方式的动态调整用户的实际运动和显示在屏幕上的所导致的动作之间的关系。4A and 4B depict dynamically adjusting the relationship between a user's actual motion and the resulting action displayed on the screen, according to one implementation of the disclosed technology.

图4C是示出根据所公开的技术的一种实施方式的动态调整用户的实际运动和显示在屏幕上的所导致的动作之间的关系的方法的流程图。4C is a flowchart illustrating a method of dynamically adjusting the relationship between a user's actual motion and the resulting action displayed on the screen, according to one implementation of the disclosed technology.

图5A和5B描绘根据所公开的技术的一种实施方式的圆盘用户界面元件。5A and 5B depict a puck user interface element in accordance with one implementation of the disclosed technology.

图6是示出根据所公开的技术的一种实施方式的过滤姿势的方法的流程图。6 is a flowchart illustrating a method of filtering gestures according to one implementation of the disclosed technology.

图7是示出根据所公开的技术的一种实施方式的定制姿势解释的方法的流程图。FIG. 7 is a flowchart illustrating a method of customizing gesture interpretations according to one implementation of the disclosed technology.

图8A、8B和8C示出根据所公开的技术的一种实施方式的用户限定姿势的示例性训练指导流程。Figures 8A, 8B, and 8C illustrate an example coaching flow for user-defined gestures, in accordance with one implementation of the disclosed technology.

详细说明Detailed description

所公开的技术的实施方式涉及使用音频信号以降低功耗操作运动捕获系统的方法和系统。例如，可以关联一个序列的图像，以便构造对象的3-D模型，所述3-D模型包括对象的位置和形状。可使用相同的技术分析连续的图像，以便构建对象的运动(例如，自由形式的姿势)的模型。在光线不足的情况下，当不能以足够的可靠度对自由形式的姿势进行光学识别时，音频信号可以提供对象的方向和位置，如本文进一步所述的。Embodiments of the disclosed technology relate to methods and systems for operating a motion capture system using audio signals with reduced power consumption. For example, a sequence of images can be correlated in order to construct a 3-D model of an object, including the object's position and shape. Successive images can be analyzed using the same technique in order to build a model of the subject's motion (eg, free-form pose). In low light conditions, when free-form gestures cannot be optically recognized with sufficient reliability, the audio signal can provide the orientation and position of the object, as further described herein.

如本文所使用的，如果之前的信号、事件或值影响给定的信号、事件或值，则给定的信号、事件或值“依赖于”之前的信号、事件或值。如果有中间处理元件、步骤、动作或时间段，则给定的信号、事件或值仍然可以“依赖于”之前的信号、事件或值。如果中间处理元件或步骤结合多个信号、事件或值，则处理元件或步骤的信号输出被认为是“依赖于”每个信号、事件或值的输入。如果给定信号、事件或值与之前的信号、事件或值相同，则这仅是简并情形，其中给定的信号、事件或值仍然被认为是“依赖于”之前的信号、事件或值。类似地限定给定的信号、事件或值对另一信号、事件或值的“响应”。As used herein, a given signal, event or value "depends on" a previous signal, event or value if the previous signal, event or value affects the given signal, event or value. A given signal, event or value may still "depend on" previous signals, events or values if there are intervening processing elements, steps, actions or time periods. If an intermediate processing element or step incorporates multiple signals, events or values, then the signal output of the processing element or step is said to be "dependent on" the input of each signal, event or value. If a given signal, event or value is the same as a previous signal, event or value, then this is only a degenerate case where the given signal, event or value is still said to be "dependent on" the previous signal, event or value . A "response" of a given signal, event or value to another signal, event or value is similarly defined.

首先参考图1A，图1A示出了一种示例性姿势识别系统100A，所述系统包括一对摄像机102、104，所述摄像机联接至图像分析系统106。摄像机102、104可以是任何类型的摄像机，包括对全部可见光谱敏感的摄像机，或者更典型地，对一定范围内的波长带(例如，红外(IR)或紫外线波长带)具有增强的敏感度的摄像机；更一般地，术语“摄像机”在此是指能够捕获对象的图像并以数字数据的形式表示该图像的任何装置(或装置的组合)。虽然使用两个摄像机的实施方式的示例进行例示，但是也容易实施使用不同数目的摄像机或非摄像机光敏图像传感器或其组合的其它的实施方式。例如，可以使用行传感器(linesensor)或线摄像机(linecamera)而非能够捕获二维(2D)图像的传统的装置。术语“光”通常用来指任何电磁辐射，其可以或可以不在可见光谱内，并且可以是宽带(例如，白光)或窄带(例如，单个波长或窄带波长)。Referring first to FIG. 1A , an exemplary gesture recognition system 100A is shown that includes a pair of cameras 102 , 104 coupled to an image analysis system 106 . Cameras 102, 104 may be any type of camera, including cameras sensitive to the full visible spectrum, or more typically, cameras with enhanced sensitivity to a range of wavelength bands (e.g., infrared (IR) or ultraviolet wavelength bands). Camera; more generally, the term "camera" herein refers to any device (or combination of devices) capable of capturing an image of a subject and representing that image in the form of digital data. Although illustrated using an example of a two camera implementation, other implementations using a different number of cameras or non-camera light sensitive image sensors or combinations thereof are readily implemented. For example, a line sensor or line camera may be used instead of conventional devices capable of capturing two-dimensional (2D) images. The term "light" is generally used to refer to any electromagnetic radiation, which may or may not be within the visible spectrum, and may be broadband (eg, white light) or narrowband (eg, a single wavelength or a narrowband of wavelengths).

虽然没有特别的帧频要求，但是摄像机102、104优选能够捕获视频图像(即，至少每秒15帧的恒定速率的连续图像帧)。摄像机102、104的功能对所公开的技术并非关键，摄像机可以具有不同的帧频、图像分辨率(例如，每幅图像的像素)、颜色或亮度(intensity)分辨率(例如，每像素亮度数据的比特数目)、镜头的焦距、景深等。一般情况下，对于特定的应用程序，可以使用能对关注的空间体积内的物体进行聚焦的任何摄像机。例如，为了捕获静止的人的手的运动，关注的体积可以被限定为一侧约一米的立方体。Although there is no particular frame rate requirement, the cameras 102, 104 are preferably capable of capturing video images (ie, a constant rate of successive image frames of at least 15 frames per second). The functionality of the cameras 102, 104 is not critical to the disclosed technology, and the cameras may have different frame rates, image resolutions (e.g., pixels per image), color or intensity resolution (e.g., intensity data per pixel) The number of bits), the focal length of the lens, the depth of field, etc. In general, any camera capable of focusing on objects within the spatial volume of interest can be used for a particular application. For example, to capture the motion of a stationary human hand, the volume of interest may be defined as a cube approximately one meter on a side.

在一些实施方式中，示出的系统100A包括一对源108、110，所述源可被设置在摄像机102、104任一侧，并由图像分析系统106控制。在一种实施方式中，源108、110是光源。例如，光源可以是红外光源，例如红外发光二极管(LED)，摄像机102、104可以对红外光敏感。使用红外光能允许姿势识别系统100A在广范围的光线条件下工作，并且能避免可能会与引导可见光进入人运动的区域相关的各种不便或干扰。然而，要求电磁光谱的特定波长或区域。在一种实施方式中，过滤器120、122被放置在摄像机102、104前方，以便过滤掉可见光，使得仅红外光存在于由摄像机102、104捕获的图像中。在另一实施方式中，源108、110是声波源。声波源向用户发射声波；用户遮挡(或“声波遮掩”)或者改变(或“声波偏移”)向用户冲击的声波。该声波遮掩和/或声波偏移也可以用于检测用户的手势。在一些实施方式中，声波是例如超声波，即人类听不见的声波。In some embodiments, the illustrated system 100A includes a pair of sources 108 , 110 that may be positioned either side of the cameras 102 , 104 and controlled by the image analysis system 106 . In one embodiment, the sources 108, 110 are light sources. For example, the light source may be an infrared light source, such as an infrared light emitting diode (LED), and the cameras 102, 104 may be sensitive to infrared light. The use of infrared light energy allows gesture recognition system 100A to operate in a wide range of lighting conditions and avoids various inconveniences or disturbances that may be associated with directing visible light into areas of human motion. However, specific wavelengths or regions of the electromagnetic spectrum are required. In one embodiment, filters 120 , 122 are placed in front of cameras 102 , 104 to filter out visible light so that only infrared light is present in images captured by cameras 102 , 104 . In another embodiment, the sources 108, 110 are acoustic wave sources. The sound wave source emits sound waves toward the user; the user blocks (or "sonic masking") or alters (or "sonic offsets") the sound waves impinging on the user. This acoustic masking and/or acoustic offset may also be used to detect user gestures. In some embodiments, the sound waves are, for example, ultrasound, sound waves that are inaudible to humans.

应当强调，图1A所示的布置是代表性的，而非限制性的。例如，可用激光或其它光源代替LED。在包括激光的实施方式中，附加的光学元件(例如，镜头或漫射器(diffuser))可用于加宽激光束(使其视野类似于摄像机的视野)。有用的布置还可以包括用于不同范围的小角度和广角照明器。光源通常漫射而不是镜面点光源；例如，具有光传播包封的封装的LED是合适的。It should be emphasized that the arrangement shown in Figure 1A is representative and not limiting. For example, lasers or other light sources could be used instead of LEDs. In embodiments that include a laser, additional optical elements (eg, lenses or diffusers) may be used to widen the laser beam (making its field of view similar to that of a camera). Useful arrangements can also include small angle and wide angle illuminators for different ranges. The light source is usually diffuse rather than a specular point source; for example, a packaged LED with a light spreading envelope is suitable.

在操作中，光源108、110被布置为照亮关注的区域112，所述区域包含可选地持有工具的人体114的一部分(在这个示例中，一只手)或其它关注的对象，摄像机102、104定向为朝向区域112，以捕获手114的视频图像。在一些实施方式中，光源108、110和摄像机102、104的操作由图像分析系统106控制，所述图像分析系统可以是例如计算机系统。基于所捕获的图像，图像分析系统106确定对象114的位置和/或运动。In operation, the light sources 108, 110 are arranged to illuminate an area of interest 112 comprising a part of the human body 114 optionally holding a tool (in this example, a hand) or other object of interest, the camera 102 , 104 are oriented toward area 112 to capture a video image of hand 114 . In some embodiments, the operation of light sources 108, 110 and cameras 102, 104 is controlled by image analysis system 106, which may be, for example, a computer system. Based on the captured images, image analysis system 106 determines the position and/or motion of object 114 .

图1B是计算机系统100B的简化框图，实施根据所公开的技术的一种实施方式的图像分析系统106(也被称为图像分析仪)。图像分析系统106可以包括能够捕获和处理图像数据的任何装置或装置部件，或由能够捕获和处理图像数据的任何装置或装置部件组成。在一些实施方式中，计算机系统100B包括处理器132、存储器134、摄像机界面136、显示器138、扬声器139、键盘140和鼠标141。存储器134可以用于存储由处理器132执行的指令以及与执行指令相关联的输入和/或输出数据。具体地，存储器134包含这样的指令，所述指令在概念上示出为一组模块，所述模块控制处理器132的工作以及处理器132与其它硬件部件的交互，所述模块在后文更详细地说明。操作系统指导执行低层次的、基本系统功能，如文件管理和海量存储装置的运行。操作系统可以是/或包括多种操作系统，如MicrosoftWINDOWS操作系统、Unix操作系统、Linux操作系、Xenix的操作系统、IBMAIX操作系统、惠普UX操作系统、NovellNETWARE操作系统、SunMicrosystems的SOLARIS操作系统、OS/2操作系统、BeOS操作系统、MACINTOSH操作系统、APACHE操作系统、OPENACTION操作系统、iOS、Android或其它移动操作系统，或其它平台操作系统。FIG. 1B is a simplified block diagram of a computer system 100B implementing an image analysis system 106 (also referred to as an image analyzer) in accordance with one embodiment of the disclosed technology. Image analysis system 106 may include or consist of any device or device component capable of capturing and processing image data. In some implementations, computer system 100B includes processor 132 , memory 134 , camera interface 136 , display 138 , speaker 139 , keyboard 140 and mouse 141 . Memory 134 may be used to store instructions executed by processor 132 and input and/or output data associated with executing the instructions. In particular, memory 134 contains instructions, shown conceptually as a set of modules that control the operation of processor 132 and the interaction of processor 132 with other hardware components, described more hereinafter. Explain in detail. The operating system directs the execution of low-level, basic system functions, such as file management and operation of mass storage devices. The operating system can be/or include multiple operating systems, such as Microsoft WINDOWS operating system, Unix operating system, Linux operating system, Xenix operating system, IBM AIX operating system, HP UX operating system, NovellNETWARE operating system, SunMicrosystems SOLARIS operating system, OS /2 operating system, BeOS operating system, MACINTOSH operating system, APACHE operating system, OPENACTION operating system, iOS, Android or other mobile operating systems, or other platform operating systems.

计算环境还可以包括其它可移动/不可移动、易失性/非易失性计算机存储介质。例如，硬盘驱动器可读取或写入至不可移动、非易失性磁介质。磁盘驱动器可读取或写入至可移动、非易失性磁盘，光盘驱动器可读取或写入至可移动、非易失性光盘，例如CD-ROM或其它光学介质。可以在示例性操作环境中使用的其它可移动/不可移动、易失性/非易失性计算机存储介质包括但不限于：磁带盒、闪存卡、数字多功能盘、数字录像带、固态RAM、固态ROM等。存储介质通常通过可移动或不可移动存储器界面连接到系统总线。The computing environment may also include other removable/non-removable, volatile/nonvolatile computer storage media. For example, a hard drive can read from or write to non-removable, non-volatile magnetic media. A magnetic disk drive can read from or write to a removable, nonvolatile magnetic disk, and an optical disk drive can read from or write to a removable, nonvolatile optical disk, such as a CD-ROM or other optical media. Other removable/non-removable, volatile/nonvolatile computer storage media that may be used in the exemplary operating environment include, but are not limited to: magnetic tape cartridges, flash memory cards, digital versatile disks, digital video tapes, solid-state RAM, solid-state ROM, etc. Storage media are usually connected to the system bus through removable or non-removable memory interfaces.

处理器132可以是通用微处理器，但是，根据实施方式，可以可选地是微控制器、外围集成电路元件、CSIC(客户专用集成电路)、ASIC(应用程序专用集成电路)、逻辑电路、数字信号处理器，诸如FPGA(现场可编程门阵列)的可编程逻辑装置、PLD(可编程逻辑装置)、PLA(可编程逻辑阵列)、RFID处理器、智能芯片、或能够实施所公开的技术的过程的动作的任何其它装置或装置的布置。Processor 132 may be a general purpose microprocessor, but, depending on the implementation, may alternatively be a microcontroller, peripheral integrated circuit components, CSIC (customer specific integrated circuit), ASIC (application specific integrated circuit), logic circuits, Digital signal processors, programmable logic devices such as FPGAs (Field Programmable Gate Arrays), PLDs (Programmable Logic Devices), PLAs (Programmable Logic Arrays), RFID processors, smart chips, or capable of implementing the disclosed technology Any other device or arrangement of devices for the action of a process.

摄像机界面136可包括这样的硬件和/或软件，所述硬件和/或软件使计算机系统100B和摄像机(如在图1A中所示的摄像机102、104)以及相关联的光源(例如，图1A中的光源108、110)之间能够进行通信。因此，例如，摄像机界面136可包括一个或多个数据端口146、148(摄像机可连接到该数据端口)以及硬件和/或软件的信号处理器，以便在将信号作为输入提供到在处理器132上执行的运动捕获(“运动捕获”)程序144之前，修改来自摄像机的数据信号(例如，为减少噪声或重新格式化数据)。在一些实施方式中，摄像机界面136还可以传送信号给摄像机，以便例如激活或停用所述摄像机，以控制摄像机设置(帧频、图像质量、敏感度等)等等。例如，可以响应来自处理器132的控制信号传送前述信号，所述控制信号可以是响应用户输入或其它检测到的事件而生成的。Camera interface 136 may include hardware and/or software that enable computer system 100B and cameras (such as cameras 102, 104 shown in FIG. 1A ) and associated light sources (e.g., FIG. 1A ) Communication is enabled between the light sources 108, 110) in the light source. Thus, for example, the camera interface 136 may include one or more data ports 146, 148 to which a camera may be connected and a signal processor in hardware and/or software to provide a signal as input to the processor 132. The data signal from the camera is modified (eg, to reduce noise or to reformat the data) before a motion capture ("motion capture") program 144 executes on it. In some embodiments, the camera interface 136 may also transmit signals to the cameras, for example, to activate or deactivate the cameras, to control camera settings (frame rate, image quality, sensitivity, etc.), and the like. For example, the foregoing signals may be communicated in response to control signals from processor 132, which may be generated in response to user input or other detected events.

摄像机界面136还可以包括控制器147、149，光源(例如，光源108、110)可以连接到所述控制器。在一些实施方式中，例如，控制器147、149响应来自执行运动捕获程序144的处理器132的指令向光源供给工作电流。在其它实施方式中，光源可以从外部电源(未示出)汲取工作电流，控制器147、149能生成用于光源的控制信号，所述控制信号例如指示光源开启或关闭或改变亮度。在一些实施方式中，单个控制器可用于控制多个光源。Camera interface 136 may also include controllers 147, 149 to which light sources (eg, light sources 108, 110) may be connected. In some embodiments, for example, the controllers 147 , 149 supply operating current to the light sources in response to instructions from the processor 132 executing the motion capture program 144 . In other embodiments, the light sources may draw operating current from an external power source (not shown), and the controllers 147, 149 can generate control signals for the light sources, such as instructing the light sources to turn on or off or to change brightness. In some embodiments, a single controller can be used to control multiple light sources.

限定运动捕获程序144的指令存储在存储器134中，当被执行时，这些指令对由连接至摄像机界面136的摄像机供给的图像进行运动捕获分析。在一种实施方式中，运动捕获程序144包括各种模块，例如对象检测模块152、对象分析模块154和姿势识别模块156。对象检测模块152可以分析图像(例如，通过摄像机界面136捕获的图像)来检测其中的对象的边缘和/或关于对象的位置的其它信息。对象分析模块154可分析由对象检测模块152提供的对象的信息，以确定对象(例如，用户的手)的3D位置和/或运动。可以在运动捕获程序144的代码模块中实施的动作的示例说明如下。存储器134还可以包括运动捕获程序144使用的其它信息和/或代码模块。Stored in memory 134 are instructions defining a motion capture program 144 which, when executed, perform motion capture analysis on images supplied by cameras connected to camera interface 136 . In one embodiment, the motion capture program 144 includes various modules such as an object detection module 152 , an object analysis module 154 , and a gesture recognition module 156 . Object detection module 152 may analyze images (eg, images captured by camera interface 136 ) to detect edges of objects therein and/or other information about the location of objects. The object analysis module 154 may analyze the information of the object provided by the object detection module 152 to determine the 3D position and/or motion of the object (eg, a user's hand). Examples of actions that may be implemented in code modules of motion capture program 144 are described below. Memory 134 may also include other information and/or code modules used by motion capture program 144 .

显示器138、扬声器139、键盘140和鼠标141可以用来协助用户与计算机系统100B的交互。在一些实施方式中，使用摄像机界面136和运动捕获程序144进行姿势捕获的结果可以被解释为用户输入。例如，用户可以做出手势，所述手势使用运动捕获程序144进行分析，该分析的结果可以解释为给在处理器132上执行某些其它程序(例如，web浏览器、文字处理器或其它应用程序)的指令。因此，通过举例说明的方式，用户可以使用向上或向下的刷动姿势“滚动”当前在显示器138上显示的网页，使用旋转姿势来增加或减少扬声器139的音频输出的音量，依此类推。Display 138, speakers 139, keyboard 140, and mouse 141 may be used to assist a user in interacting with computer system 100B. In some implementations, the results of gesture capture using camera interface 136 and motion capture program 144 may be interpreted as user input. For example, a user may make a gesture that is analyzed using the motion capture program 144 and the results of that analysis may be interpreted for some other program executing on the processor 132 (e.g., a web browser, word processor, or other application). program) instructions. Thus, by way of example, a user may use an up or down swipe gesture to "scroll" a web page currently displayed on display 138, a rotate gesture to increase or decrease the volume of audio output from speaker 139, and so on.

应当理解，计算机系统100B是说明性的，各种变化和修改都是可能的。计算机系统可以以各种形式的因素来实施，所述因素包括服务器系统、桌面系统、笔记本电脑系统、平板电脑、智能电话或个人数字助理等等。具体实施方式可包括本文没有说明的其它功能，例如，有线和/或无线网络界面、媒体播放和/或记录能力等。在一些实施方式中，一个或多个摄像机可以内置在计算机中，而不是被提供为单独的部件。另外，图像分析仪可以仅利用计算机系统部件的一个子集(例如，作为具有合适的I/O界面以接收图像数据和输出分析结果的处理器执行程序代码、ASIC或固定功能的数字信号处理器)来实施。It should be understood that the computer system 100B is illustrative and that various changes and modifications are possible. A computer system may be implemented in various form factors including server systems, desktop systems, laptop systems, tablet computers, smartphones, or personal digital assistants, among others. Particular embodiments may include other functionality not described herein, such as wired and/or wireless network interfaces, media playback and/or recording capabilities, and the like. In some implementations, one or more cameras may be built into the computer rather than being provided as a separate component. In addition, image analyzers may utilize only a subset of computer system components (e.g., as processors executing program code, ASICs, or fixed-function digital signal processors with appropriate I/O interfaces to receive image data and output analysis results ) to implement.

虽然本文参考特定的模块说明了计算机系统100B，但是，应该理解，限定模块是为了说明的方便，并不旨在暗示部件的特定的物理布置。此外，模块不必对应于物理上独特的部件。就使用物理上独特的部件而言，根据需要，部件之间的连接(例如，用于数据通信)可以是有线和/或无线的。Although computer system 100B is described herein with reference to particular modules, it should be understood that defining modules is for ease of illustration and is not intended to imply a particular physical arrangement of components. Furthermore, modules do not necessarily correspond to physically distinct components. To the extent that physically distinct components are used, connections between components (eg, for data communications) may be wired and/or wireless as desired.

参考图1A、1B和2A，用户做出姿势，所述姿势由摄像机102、104捕获为一系列时间上连续的图像。这些图像由姿势识别模块156分析，所述姿势识别模块156可以被实施为运动捕获144的另一模块。姿势识别系统在计算机视觉领域是公知的，并且可以利用基于3D模型的算法(即，容积模型或骨架模型)、使用人体或与姿势相关的身体部位的简化表示的骨架模型、或基于例如与姿势相关的身体部位的变形模板的基于图像的模型或其它技术。参见例如，Wu等人，“基于视觉的姿势识别：回顾与展望”(Vison-BasedGestureRecognition:AReview)，载于《人机交互中基于姿势的通信》(Springer，1999)；Pavlovis等人，“人机交互中的手势的视觉解读：回顾与展望”(VisualInterpretationofHandGesturesforHuman-computerInteraction:areview)，IEEETrans.PatternAnalysisandMachineIntelligence(19(7)：677-695，1997年7月)。Referring to Figures 1A, 1B and 2A, a user makes a gesture that is captured by cameras 102, 104 as a series of temporally consecutive images. These images are analyzed by gesture recognition module 156 , which may be implemented as another module of motion capture 144 . Gesture recognition systems are well known in the field of computer vision and may utilize algorithms based on 3D models (i.e., volume models or skeletal models), skeletal models using simplified representations of the human body or body parts associated with the pose, or based on, for example, Image-based models or other techniques of deformable templates of relevant body parts. See, eg, Wu et al., "Vision-Based Gesture Recognition: A Review," in Gesture-Based Communication in Human-Computer Interaction (Springer, 1999); Pavlovis et al., "Human Visual Interpretation of Hand Gestures for Human-computer Interaction: a review, IEEE Trans. Pattern Analysis and Machine Intelligence (19(7): 677-695, July 1997).

姿势识别模块156向电子装置214提供输入，允许用户远程控制电子装置214和/或在显示在屏幕218上的虚拟环境中操纵虚拟对象216，所述虚拟对象例如是原型/模型、块、球，或者其它形状、按钮、杠杆或其它控制件。用户可以使用其身体的任何部位做出姿势，所述部位例如是手指、手或手臂。作为姿势识别的一部分或独立的装置，图像分析仪106可实时确定在3D空间中的用户的手的形状和位置，参见，例如，美国申请系列号61/587554，13/446585和61/724091，其分别提交于2012年1月17日、2012年3月7日和2012年11月8日，其全部公开内容通过引用并入于此。结果，图像分析仪106不仅可以识别姿势以便向电子装置214提供输入，还可以捕获连续视频图像中的用户的手的位置和图像，以便确定3D空间中的姿势的特征，并在显示屏幕218上复制该图像。Gesture recognition module 156 provides input to electronic device 214, allowing the user to remotely control electronic device 214 and/or manipulate virtual objects 216, such as prototypes/models, blocks, balls, in a virtual environment displayed on screen 218, Or other shapes, buttons, levers or other controls. Users can make gestures using any part of their body, such as fingers, hands or arms. As part of gesture recognition or as a stand-alone device, the image analyzer 106 can determine the shape and position of the user's hand in 3D space in real time, see, e.g., U.S. Application Serial Nos. 61/587554, 13/446585 and 61/724091, They were filed on January 17, 2012, March 7, 2012, and November 8, 2012, respectively, the entire disclosures of which are hereby incorporated by reference. As a result, the image analyzer 106 can not only recognize gestures to provide input to the electronic device 214, but can also capture the position and image of the user's hand in continuous video images in order to characterize the gesture in 3D space and display it on the display screen 218. Copy that image.

在一种实施方式中，姿势识别模块156将检测到的姿势与作为记录电子存储在数据库220中的姿势的资料库进行对比，所述数据库实施在图象分析系统106、电子装置214或外部存储系统222中。(如本文中所使用的，术语“电子存储”包括易失性或非易失性存储器中的存储，后者包括磁盘，闪存存储器等，并延伸至任何计算可寻址存储介质(包括例如光学存储))。例如，姿势可以被存储为矢量，即，数学上指定的空间轨迹，所述姿势记录可具有指定做出姿势的用户身体的相关部分的域；因此，用户的手和头部做出的类似的轨迹可以作为不同的姿势存储在数据库中，以使应用程序可以对其有不同的解释。通常，将感测到的姿势的轨迹与所存储的轨迹进行数学对比，以找到最佳匹配，并且，仅当匹配度超过阈值时，该姿势被识别为对应于的所识别出的数据库条目。矢量可以被按比例缩放，以便例如将跟踪的用户的手的大弧度的和小弧度识别为相同的姿势(即，对应于相同的数据库记录)，但是，姿势识别模块返回姿势的身份标识和反映缩放比例(scale)的值。缩放比例可对应于做出姿势时横过的实际姿势距离，或可标准化为一些典型距离。In one embodiment, gesture recognition module 156 compares the detected gestures to a library of gestures stored electronically as records in database 220, implemented in image analysis system 106, electronic device 214, or externally stored System 222. (As used herein, the term "electronic storage" includes storage in volatile or non-volatile memory, the latter including magnetic disks, flash memory, etc., and extends to any computer-addressable storage medium (including, for example, optical storage)). For example, gestures may be stored as vectors, i.e., mathematically specified trajectories in space, and the gesture record may have fields specifying the relevant part of the user's body making the gesture; thus, the user's hands and head make similar Trajectories can be stored in the database as different poses so that applications can interpret them differently. Typically, the trajectory of the sensed gesture is mathematically compared to the stored trajectory to find the best match, and only if the degree of match exceeds a threshold, the gesture is identified as corresponding to the identified database entry. The vector can be scaled so that, for example, the tracked large and small arcs of the user's hand are recognized as the same gesture (i.e., corresponding to the same database record), but the gesture recognition module returns an identity and reflection of the gesture The value of scale. The scaling may correspond to the actual gesture distance traversed when the gesture is made, or may be normalized to some typical distance.

在一些实施方式中，姿势识别模块156检测多个姿势。参照图2B，例如，用户可以做出挥动手臂的姿势，同时手指弯曲。姿势识别模块156检测挥动和弯曲姿势200B，并记录挥动轨迹330和五个手指的五个弯曲轨迹332、334、336、338、340。每条轨迹可以沿着例如欧拉空间中的六个欧拉自由度(x，y，z，翻滚角，俯仰角和偏航角)转换成矢量。具有最大幅度的矢量例如代表该运动的主导成分(例如，在这种情况下，挥动)，其余的矢量可以被忽略。当然，手指的细微运动可以是被隔离解释的主导姿势，而手的较大幅度的挥动运动被忽略。在一种实施方式中，矢量过滤器(可以使用过滤技术实施)应用于所述多个矢量，以过滤掉小矢量并识别主导矢量。可以重复、迭代该过程，直到识别出一个矢量—运动的主导成分。在一些实施方式中，每次检测到新姿势，就生成新的过滤器。In some implementations, gesture recognition module 156 detects multiple gestures. Referring to FIG. 2B , for example, a user may make a gesture of waving an arm while flexing fingers. The gesture recognition module 156 detects the swipe and bend gesture 200B and records the swipe trajectory 330 and the five bend trajectories 332 , 334 , 336 , 338 , 340 for the five fingers. Each trajectory can be transformed into a vector along, for example, the six Euler degrees of freedom (x, y, z, roll, pitch and yaw) in Euler space. The vector with the largest magnitude eg represents the dominant component of the motion (eg in this case the swipe), the rest of the vectors can be ignored. Of course, subtle movements of the fingers can be the dominant gestures interpreted in isolation, while larger waving movements of the hand are ignored. In one embodiment, a vector filter (which may be implemented using filtering techniques) is applied to the plurality of vectors to filter out small vectors and identify dominant vectors. This process can be repeated, iteratively, until one vector - the dominant component of the motion - is identified. In some implementations, each time a new gesture is detected, a new filter is generated.

如果该姿势识别系统156被实施为特定应用程序的一部分(如游戏或电视的控制器逻辑)，则数据库姿势记录还可以包含对应于该姿势(可以对其利用缩放比例值进行按比例缩放)的输入参数；在将姿势识别系统156实施为可用于多种应用程序的实用程序的通用系统中，该应用程序的特定参数被省略：当应用程序调用姿势识别系统156时，其根据其自己的程序解释所识别的姿势。If the gesture recognition system 156 is implemented as part of a particular application (such as a game or a TV's controller logic), the database gesture record may also contain the Input parameters; in a general-purpose system that implements the gesture recognition system 156 as a utility that can be used for a variety of applications, the application-specific parameters are omitted: when an application invokes the gesture recognition system 156, it Interpret the recognized gestures.

因此，参照图2A，姿势识别系统156通过参考数据库220识别手势，将表示所识别的姿势的信号向电子装置214传送。装置214继而把所识别的姿势和缩放比例值处理为输入信号，并对其分配输入参数值；然后，输入参数由在电子装置214上执行的应用程序使用，协助基于姿势的用户交互。例如，用户可以首先使其手以重复或独特的方式(例如，做出挥手势)运动，以启动与电子装置214的通信。在检测并识别该手势时，姿势识别系统156将信号传送给指示用户检测的电子装置214，作为响应，装置214呈现适当的显示(例如，控制面板224)。然后，用户做出另一姿势(例如，沿“向上”或“向下”方向使其手运动)，这又由姿势识别系统156检测到。姿势识别系统156识别姿势和与之相关的缩放比例值，并将数据传送给电子装置214；装置214继而将此信息解释为表示所希望的动作的输入参数(就好像用户按压遥控装置上的按钮)，使得用户能够操纵控制面板224上显示的数据(例如，选择感兴趣的频道、调整音频声音或改变屏幕的亮度)。在不同实施方式中，装置214连接到视频游戏源(例如，视频游戏控制台或CD或基于网络的视频游戏)；用户可以做出各种姿势来与虚拟环境(视频游戏)中的虚拟对象216进行远程交互。检测到的姿势和缩放比例作为输入参数提供到当前运行的游戏，所述游戏解释它们，并执行与背景相适应的动作，即，响应该姿势生成屏幕显示。该系统的各种部件—姿势识别系统156和装置214的解释姿势并基于其生成显示内容的操作元件—可以是单独的(如图所示)，或者可以被组织起来，或在概念上看成是在图像分析系统106内。Therefore, referring to FIG. 2A , gesture recognition system 156 recognizes a gesture by referring to database 220 and transmits a signal representing the recognized gesture to electronic device 214 . The device 214 in turn processes the recognized gesture and scale values as input signals and assigns them input parameter values; the input parameters are then used by applications executing on the electronic device 214 to facilitate gesture-based user interactions. For example, a user may first move his hand in a repetitive or unique pattern (eg, making a waving gesture) to initiate communication with the electronic device 214 . Upon detecting and recognizing the gesture, gesture recognition system 156 transmits a signal to electronic device 214 indicating the user's detection, and in response device 214 presents an appropriate display (eg, control panel 224). The user then makes another gesture (eg, moves his hand in an “up” or “down” direction), which in turn is detected by gesture recognition system 156 . Gesture recognition system 156 recognizes a gesture and its associated scaling value, and transmits the data to electronic device 214; device 214 in turn interprets this information as input parameters representing a desired action (as if the user pressed a button on a remote control) ), enabling the user to manipulate the data displayed on the control panel 224 (eg, select a channel of interest, adjust the audio sound, or change the brightness of the screen). In various implementations, device 214 is connected to a video game source (e.g., a video game console or CD or network-based video game); the user can make various gestures to interact with virtual objects 216 in the virtual environment (video game) interact remotely. The detected gestures and scales are provided as input parameters to the currently running game, which interprets them and performs actions appropriate to the context, ie, generating screen displays in response to the gestures. The various components of the system—the gesture recognition system 156 and the operating elements of the device 214 that interpret gestures and generate displays based on them—may be separate (as shown), or may be organized, or conceptually viewed as is within the image analysis system 106 .

在不同实施方式中，在用户成功启动与姿势识别系统156和电子装置214的通信后，姿势识别系统156生成代表检测到的身体的一部分(例如，手)的光标226或图228(以下称为“光标”)，并将其显示在装置的屏幕218上。在一种实施方式中，姿势识别系统156协调地锁定屏幕218上的光标226的运动，以便跟踪用户的姿势的实际运动。例如，当用户使其手沿向上方向运动时，作为响应，显示的光标226也在显示屏幕上向上运动。结果，光标226的运动直接将用户的姿势映射到显示的内容，使得例如用户的手和光标226的行为分别类似于PC的鼠标和监视器上的光标。这允许用户评估实际物理姿势的运动和在屏幕218上发生的所导致的动作(例如，显示在屏幕上的虚拟对象216的运动)之间的关系。因而，手的绝对位置通常对显示控制不重要；相反，用户的身体的相对位置和/或运动的方向控制屏幕虚拟动作，例如，光标226的运动。In various implementations, after a user successfully initiates communications with gesture recognition system 156 and electronic device 214, gesture recognition system 156 generates cursor 226 or figure 228 (hereinafter referred to as "cursor") and display it on the screen 218 of the device. In one embodiment, the gesture recognition system 156 coordinately locks the movement of the cursor 226 on the screen 218 in order to track the actual movement of the user's gesture. For example, when the user moves his hand in an upward direction, the displayed cursor 226 also moves upward on the display screen in response. As a result, the movement of the cursor 226 directly maps the user's gestures to the displayed content such that, for example, the user's hand and the cursor 226 behave similarly to a PC's mouse and a cursor on a monitor, respectively. This allows the user to evaluate the relationship between the motion of the actual physical gesture and the resulting action that occurs on the screen 218 (eg, the motion of the virtual object 216 displayed on the screen). Thus, the absolute position of the hand is generally not important for display control; rather, the relative position of the user's body and/or direction of motion controls on-screen virtual actions, such as the movement of the cursor 226 .

用户交互的一个示例300A示于图3A。如图所示，用户做出姿势以使显示的光标310运动，以便与所显示的关注的虚拟对象312至少部分地重叠。然后，用户做出另一姿势(例如，“手指点击”)，以选择所期望的对象312。为了将对象312标记为用户选择的对象，用户的运动(即，身体部位的运动)可能需要满足完成姿势的预定的阈值(例如，95％)；这个值存储在数据库220中或由当前在电子装置316中运行的应用程序来实施。One example 300A of user interaction is shown in Figure 3A. As shown, the user gestures to move the displayed cursor 310 to at least partially overlap the displayed virtual object of interest 312 . The user then makes another gesture (eg, “finger tap”) to select the desired object 312 . In order to mark object 312 as a user-selected object, the user's motion (i.e., motion of a body part) may need to meet a predetermined threshold (e.g., 95%) of completing the gesture; Implemented by an application program running on device 316.

例如，完成“点击”姿势可以要求用户的手指运动5厘米的距离，所述“点击”姿势激活类似于按钮的虚拟控制件；在检测到手指运动1厘米时，姿势识别系统314通过将其与数据库记录进行匹配来识别该姿势，并且确定所识别的姿势的完成程度(在这种情况下，20％)。在一种实施方式中，数据库中的每个姿势包括多个图像或矢量，所述图像或矢量的每一个与做出的姿势的完成程度(例如，从1％到100％)相关联；在其它实施方式中，完成程度由内插或将观察的矢量与所存储的矢量进行简单的对比来计算。所做出的姿势的完成程度(例如，用户使其手运动的幅度)可以被呈现在屏幕上，事实上，对姿势的完成程度的评估可以由装置316上运行的呈现应用程序而不是由姿势识别系统314进行处理。For example, performing a "click" gesture that activates a button-like virtual control may require the user's finger to move a distance of 5 centimeters; The database records are matched to identify the gesture and determine how complete the identified gesture is (in this case, 20%). In one embodiment, each gesture in the database includes a plurality of images or vectors, each of which is associated with a degree of completion (e.g., from 1% to 100%) of the performed gesture; In other embodiments, the degree of completion is calculated by interpolation or simple comparison of the observed vector with the stored vector. The degree of completion of the gesture made (e.g., how much the user moved his or her hand) can be presented on the screen, and in fact, the assessment of the degree of completion of the gesture can be presented by a rendering application running on device 316 rather than by the gesture. The identification system 314 performs the processing.

例如，电子装置316可以显示空心圆形图标318，当用户使得其手指运动靠近装置316(用户做出点击或“触摸”姿势)时该装置从姿势识别系统314接收简单运动(位置改变)信号，此时呈现应用程序用一种颜色或多种颜色填充所述空心圆形图标。圆被填充的程度指示用户的运动还有多久完成该姿势(或用户的手指已经离开其原始位置多远)。当用户完全做出点击或触摸姿势时，圆形被完全填充，这可能导致例如将虚拟对象312标记为被选择的对象。For example, the electronic device 316 may display a hollow circular icon 318 that receives a simple motion (position change) signal from the gesture recognition system 314 when the user moves his or her finger close to the device 316 (the user makes a click or "touch" gesture), At this time, the application program is presented to fill the hollow circular icon with one color or multiple colors. The extent to which the circle is filled indicates how far the user's motion has left to complete the gesture (or how far the user's finger has moved from its original position). When the user fully makes a click or touch gesture, the circle is fully filled, which may result in, for example, marking virtual object 312 as the selected object.

在一些实施方式中，该装置暂时显示第二指示(例如，改变指示器的形状，颜色或亮度)，以便确认该对象的选择。因而，姿势完成程度和/或对象选择的确认指示使用户能够很容易地预测选择虚拟对象的确切时刻；相应地，用户可以随后以直观的方式操纵所选择的屏幕虚拟(on-screen)对象。虽然这里的讨论聚焦于填充空心圆318，但是所公开的技术不限于显示在屏幕上的可以指示所做出的姿势的完成程度的任何特定类型的图像。例如，也可以使用逐渐由颜色填充的空心杆320、颜色梯度322、颜色的亮度或用于示出由用户做出的姿势的完成程度的任何合适的指示器，其都在所公开的当前的技术的范围内。In some embodiments, the device temporarily displays the second indication (eg, changes the shape, color or brightness of the indicator) to confirm selection of the object. Thus, the confirmation indication of the degree of gesture completion and/or object selection enables the user to easily predict the exact moment of selection of the virtual object; accordingly, the user can then manipulate the selected on-screen virtual object in an intuitive manner. Although the discussion here focuses on filling the hollow circle 318, the disclosed technology is not limited to any particular type of image displayed on the screen that may indicate the degree of completion of the gesture being made. For example, a hollow bar 320 gradually filled with color, a color gradient 322, the brightness of a color, or any suitable indicator for showing the degree of completion of a gesture made by a user may also be used, as disclosed in the current disclosure. within the range of technology.

姿势识别系统314基于在所捕获的2D图像中的用户的身体的做出姿势的部位的形状和位置连续检测并识别用户姿势。姿势的3D图像可以通过分析在连续获取的图像中用户做出姿势的身体部位的所识别的形状和位置的时间相关性进行重构。因为重构的3D图像可以实时地准确检测和识别小幅度姿势(例如，使手指运动小于1厘米的距离)，所以姿势识别系统314提供高检测敏感度。在不同实施方式中，一旦该姿势被识别且与其相关联的指令被识别，姿势识别系统314就将信号传送给装置316以激活显示用户的姿势的完成程度的屏幕虚拟指示器。屏幕虚拟指示器提供反馈，所述反馈允许用户使用不同程度的运动来控制电子装置316和/或操纵所显示的虚拟对象312。例如，用户的姿势可以是例如和身高尺寸一样大的跳跃或如手指点击的小幅度姿势。The gesture recognition system 314 continuously detects and recognizes user gestures based on the shape and location of the gesturing part of the user's body in the captured 2D images. The 3D image of the gesture can be reconstructed by analyzing the temporal correlation of the recognized shape and position of the user's gestured body part in successively acquired images. Gesture recognition system 314 provides high detection sensitivity because the reconstructed 3D image can accurately detect and recognize small gestures (eg, moving a finger over a distance of less than 1 centimeter) in real time. In various implementations, once the gesture is recognized and the instructions associated therewith are recognized, gesture recognition system 314 transmits a signal to device 316 to activate an on-screen virtual indicator showing the degree of completion of the user's gesture. The on-screen virtual pointer provides feedback that allows the user to control the electronic device 316 and/or manipulate the displayed virtual object 312 using varying degrees of motion. For example, the user's gesture may be a jump, eg, as large as a height dimension, or a gesture of a small magnitude, such as a finger tap.

在一种实施方式中，一旦对象312被标记为选择对象，则所述姿势识别系统314将对象312与屏幕虚拟光标310锁定在一起，以反映用户随后做出的运动。例如，当用户使其手沿向下方向运动时，作为响应，显示的光标310和选择的虚拟对象312也在显示屏幕上一起向下运动。再次，这允许用户在虚拟环境中精确地操纵虚拟对象312。In one embodiment, once the object 312 is marked for selection, the gesture recognition system 314 locks the object 312 together with the on-screen virtual cursor 310 to reflect subsequent movements made by the user. For example, when the user moves his hand in a downward direction, in response, the displayed cursor 310 and the selected virtual object 312 also move downward together on the display screen. Again, this allows the user to precisely manipulate virtual objects 312 in the virtual environment.

在另一种实施方式300B中，当虚拟对象被标记为选择的条目时，用户的后续运动被计算转换为施加到所选择的对象的模拟物理作用力。参照图3B，用户例如使其食指向前运动一厘米的距离来完成对虚拟对象330的选择；这种选择可以通过完全填充显示在屏幕上的空心圆332来确认。然后，用户可将其食指向前运动另一厘米。当检测这种运动时，姿势识别系统314将其转换为模拟作用力；可基于物理模拟模型、身体运动的自由度、身体部位的质量和运动速度、重力和/或任何其它相关参数来转换作用力。装置316上运行的生成虚拟对象330的应用程序通过以下来响应作用力的数据：基于包括牛顿物理原则的运动模型呈现受作用力影响的虚拟对象330的行为。In another embodiment 300B, when a virtual object is marked as the selected item, the user's subsequent motion is calculated and converted into a simulated physical force applied to the selected object. Referring to FIG. 3B , the user, for example, moves his index finger forward a distance of one centimeter to complete the selection of the virtual object 330; this selection can be confirmed by completely filling the hollow circle 332 displayed on the screen. The user can then move their index finger forward another centimeter. When such motion is detected, the gesture recognition system 314 converts it into a simulated force; the conversion may be based on a physical simulation model, degrees of freedom of body motion, mass and velocity of motion of body parts, gravity, and/or any other relevant parameters force. An application running on device 316 that generates virtual object 330 responds to the force data by presenting the behavior of virtual object 330 affected by the force based on a motion model that includes Newtonian principles of physics.

例如，如果用户运动是预定范围(例如，小于1厘米)内的幅度相对较小的运动和/或相对较慢的运动，则转换的作用力使所选择的对象330的形状变形；然而，如果用户的运动超过所确定的范围(即，大于10厘米)或阈值速度时，则装置316将所转换的作用力处理为大(即，比模拟的静摩擦作用力大)到足以使所选择的对象330运动。在接收到推力时，装置316的呈现应用程序基于运动模型模拟对象330的运动；然后在屏幕上更新该运动行为。呈现应用程序可以对虚拟对象330采取其它动作，例如，对按钮、操纵杆、铰链、把手等进行拉伸、弯曲、或施加机械控制。结果，模拟的作用力复制现实世界中等效作用力的效果，使交互对用户是可预见的和现实的。For example, if the user motion is a relatively small motion and/or a relatively slow motion within a predetermined range (e.g., less than 1 cm), the transformed force deforms the shape of the selected object 330; however, if When the user's motion exceeds a determined range (i.e., greater than 10 centimeters) or a threshold velocity, then means 316 processes the converted force to be large (i.e., greater than the simulated static friction force) sufficiently to make the selected object 330 movement. Upon receiving a thrust, the rendering application of device 316 simulates the motion of object 330 based on the motion model; this motion behavior is then updated on the screen. The presentation application may take other actions on the virtual object 330, such as stretching, bending, or exerting mechanical control on buttons, joysticks, hinges, handles, and the like. As a result, the simulated forces replicate the effects of equivalent forces in the real world, making the interaction predictable and realistic to the user.

应当强调，姿势识别系统314和装置316上运行的呈现应用程序之间的前述功能划分仅为示例；在一些实施方式中，该两个部分更紧密联接，甚至统一，从而使得不是简单地将一般作用力数据传递给应用程序，而是姿势识别系统314对呈现在装置316上的环境具有世界知识(worldknowledge)。以这种方式，姿势识别系统314可以将对象特定(object-specific)知识(例如，摩擦力和惯性)施加至作用力数据，使得直接计算用户的运动对呈现的对象的物理效果(而不是基于一般作用力数据，所述一般作用力数据由姿势识别系统314生成，并由装置316基于一个一个的对象进行处理)。此外，在不同实施方式中，运动捕获144在装置316上运行，部件314是简单的传感器，其仅仅传送图像(例如，高对比度的图像)至装置316以便由运动捕获144进行分析。在这样的实施方式中，该运动捕获144可以是独立应用程序，所述独立应用程序将姿势信息提供给在装置316上运行的呈现应用程序(例如游戏)，或者如以上所讨论的，也可以集成在呈现应用程序内(例如，游戏应用程序可以设置合适的运动捕获功能)。这种系统314和装置316之间以及硬件和软件之间的计算责任的划分代表了设计选择。It should be emphasized that the foregoing division of functionality between the gesture recognition system 314 and the presentation application running on the device 316 is merely an example; The force data is passed to the application, but the gesture recognition system 314 has world knowledge of the environment presented on the device 316 . In this way, the gesture recognition system 314 can apply object-specific knowledge (e.g., friction and inertia) to the force data so that the physical effect of the user's motion on the rendered object is calculated directly (rather than based on General force data generated by the gesture recognition system 314 and processed by the device 316 on an object-by-object basis). Also, in various embodiments, where motion capture 144 runs on device 316 , component 314 is a simple sensor that merely transmits images (eg, high-contrast images) to device 316 for analysis by motion capture 144 . In such an embodiment, the motion capture 144 may be a stand-alone application that provides pose information to a rendering application (such as a game) running on the device 316, or, as discussed above, may Integrate within the rendering application (for example, a game application can have suitable motion capture functionality set). This division of computational responsibility between system 314 and device 316 and between hardware and software represents a design choice.

图3C示出代表性方法300C，所述方法用于支持用户与电子装置的姿势交互，尤其涉及监视姿势完成程度，以便能推迟屏幕虚拟动作直到姿势完成。在第一动作352中，用户通过做出姿势来启动与电子装置的通信。在第二动作354中，由姿势识别系统检测姿势。在第三动作356中，姿势识别系统将识别的姿势与存储在数据库中的姿势记录进行对比，以识别该姿势并实时评估完成程度。然后，姿势识别系统将信号传送给电子装置(在第四动作358中)。(如前所述，完成程度功能可以在装置上实施，而不是由姿势识别系统实施，后者的系统仅仅提供运动跟踪数据。)基于该信号，该电子装置显示反映用户姿势的完成程度的屏幕虚拟指示器(在第五动作360中)。如果完成程度超过阈值(例如，95％)，则电子装置和/或显示在屏幕上的虚拟对象随后由用户基于当前或随后做出的姿势及时进行操纵(动作362、364)。FIG. 3C illustrates a representative method 300C for supporting user gesture interaction with an electronic device, particularly involving monitoring the degree of gesture completion so that on-screen virtual actions can be postponed until the gesture is complete. In a first action 352, the user initiates communication with the electronic device by making a gesture. In a second act 354, a gesture is detected by the gesture recognition system. In a third act 356, the gesture recognition system compares the recognized gesture to gesture records stored in the database to recognize the gesture and assess the degree of completion in real time. The gesture recognition system then transmits the signal to the electronic device (in a fourth act 358). (As previously mentioned, the level of completion functionality may be implemented on the device rather than by a gesture recognition system that simply provides motion tracking data.) Based on this signal, the electronic device displays a screen that reflects the level of completion of the user's gesture A virtual indicator (in fifth act 360). If the degree of completion exceeds a threshold (eg, 95%), the electronic device and/or the virtual object displayed on the screen is then manipulated by the user in time based on the current or subsequently made gesture (acts 362, 364).

参考图4A，在一个实施400A中，基于用户的实际运动的绝对空间位移确定在屏幕414上所显示的对象412的运动410。例如，用户可以首先如418所示将手416滑动到右方一厘米；当检测到并识别该姿势时，姿势识别系统420将信号传送给指示运动的电子装置422，该装置将所述信号解释为输入参数，并且作为响应，采取行动使光标或虚拟对象412在屏幕414上沿同一方向运动(即，呈现为运动)例如一百个像素。用户的物理运动和所呈现的运动之间的关系可以由用户通过例如改变由姿势识别系统420存储的用于相关联的姿势的缩放比例因子进行设置。如果姿势识别系统420集成了呈现应用程序，则用户可以使用姿势来进行这种改变。Referring to FIG. 4A , in one implementation 400A, the motion 410 of an object 412 displayed on a screen 414 is determined based on the absolute spatial displacement of the user's actual motion. For example, the user may first slide hand 416 one centimeter to the right as shown at 418; when this gesture is detected and recognized, gesture recognition system 420 transmits a signal to electronic device 422 indicating the movement, which interprets the signal Parameters are entered, and in response, an action is taken that causes the cursor or virtual object 412 to move (ie, appear to move) in the same direction on the screen 414, for example, a hundred pixels. The relationship between the user's physical motion and the presented motion may be set by the user by, for example, changing a scaling factor stored by gesture recognition system 420 for the associated gesture. If the gesture recognition system 420 is integrated with a presentation application, the user can use gestures to make this change.

例如，用户可以指定：光标或对象412响应给定的手的运动做出的更大的屏幕虚拟运动(即，横过更多的像素)。用户可以首先通过做出明确的姿势激活显示在屏幕上的比例控制面板424。控制面板424可被呈现为例如滑块、圆形标度盘或任何适当的形式。用户随后做出另一姿势来基于缩放比例控制面板424的样式来调整比例。如果缩放比例控制面板是滑块，则用户滑动其手指以改变该比例。在另一实施方式中，没有缩放比例控制面板显示在屏幕上；比例是基于该用户的后续姿势调整。作为另一示例，用户可以通过打开其拳头或使其拇指和食指分开来增加缩放比例，通过握拳或使其食指向拇指运动来减少缩放比例。虽然为了说明的目的这里的讨论集中于手或手指姿势，但是所公开的技术不限于由人体的任何特定部分做出的任何姿势。也可使用任何合适的姿势用于用户与电子装置之间的通信，其在公开的当前技术的范围之内。For example, the user may specify that the cursor or object 412 make larger screen virtual movements (ie, traverse more pixels) in response to a given hand movement. The user may first activate the proportional control panel 424 displayed on the screen by making an explicit gesture. Control panel 424 may be presented, for example, as a slider, a circular dial, or any suitable form. The user then makes another gesture to adjust the scale based on the style of the scaling control panel 424 . If the scale control panel is a slider, the user slides his finger to change the scale. In another embodiment, no scale control panel is displayed on the screen; the scale is adjusted based on subsequent gestures by the user. As another example, a user may increase zoom by opening his fist or spreading his thumb and index finger apart, and decrease zoom by closing his fist or moving his index toward his thumb. Although the discussion here focuses on hand or finger gestures for purposes of illustration, the disclosed technology is not limited to any gestures made by any particular part of the human body. Any suitable gestures may also be used for communication between the user and the electronic device and are within the scope of the disclosed current technology.

在另一些实施方式中，比例调整是用遥控装置(用户通过按压按钮来控制)或使用无线装置(诸如平板电脑或智能电话)来实现。不同的缩放比例可以与各姿势相关联(即，缩放比例是局部的，对各姿势可以不同)并存储在数据库中的特定姿势记录中。可替代地，缩放比例可以适用于存储在姿势数据库中的几个或所有的姿势(即，缩放比例是全局的，对至少几个姿势相同)。In other embodiments, scaling is accomplished with a remote control (user controls by pressing a button) or using a wireless device such as a tablet or smartphone. A different scale can be associated with each gesture (ie, the scale is local and can be different for each gesture) and stored in a particular gesture record in the database. Alternatively, the scaling may apply to several or all of the gestures stored in the gesture database (ie the scaling is global and the same for at least a few gestures).

可替代地，物理和屏幕虚拟运动之间的关系至少部分地基于显示器和/或呈现的环境的特性来确定。例如，参考图4B，在一个实施方式400B中，所获取的用户的(摄像机)图像430具有M×N个像素的矩阵形式的亮度值，电子装置422显示屏幕的(呈现的)帧具有X×Y像素。当用户在摄像机图像中做出导致m像素的水平位移(或m象素距离)和n象素竖直位移(或n象素距离)的挥手姿势420时，相对水平和竖直运动分别被设定为m/M，n/N，用于按比例缩放。响应该手势，可以使显示屏幕414上的光标或对象412运动(x，y)像素，其中x和y分别以最简单的形式被确定为x＝m/M×X，y＝n/N×Y。但是，即使为了显示基本上是单位(1:1)的缩放比例(其被针对用户的环境和显示屏幕的相对尺寸调整)，通常也要考虑摄像机的位置和用户的距离、焦距、图像传感器的分辨率、视角等，结果，x和y的量被乘以恒定的量，导致从“用户空间”到呈现的图像的实质仿射影射(affinemapping)。再次说明，所述常数可以被调整来放大或减小屏幕虚拟运动的响应。用户与显示在屏幕上的虚拟对象412进行的这种交互可以在使对象在虚拟环境中运动的同时向用户提供现实的感觉。Alternatively, the relationship between physical and screen virtual motion is determined based at least in part on characteristics of the display and/or presented environment. For example, referring to FIG. 4B , in one embodiment 400B, the captured user's (camera) image 430 has luminance values in the form of a matrix of M×N pixels, and the (presented) frame of the display screen of the electronic device 422 has X×N pixels Y pixels. When the user makes a waving gesture 420 in the camera image that results in a horizontal displacement (or m pixel distance) of m pixels and a vertical displacement (or n pixel distance) of n pixels, the relative horizontal and vertical motions are set respectively Defined as m/M, n/N for scaling. In response to this gesture, the cursor or object 412 on the display screen 414 can be moved by (x, y) pixels, where x and y are respectively determined in the simplest form as x=m/M×X, y=n/N× Y. However, even for displaying essentially unitary (1:1) scaling (which is adjusted for the user's environment and the relative size of the display screen), the position of the camera and the distance to the user, focal length, image sensor's Resolution, viewing angle, etc. As a result, the x and y quantities are multiplied by a constant amount, resulting in a substantial affine mapping from "user space" to the rendered image. Again, the constants can be adjusted to amplify or reduce the response to virtual movement of the screen. This user interaction with the virtual object 412 displayed on the screen can provide the user with a sense of reality while moving the object in the virtual environment.

用户的实际运动和所导致的在屏幕上发生的动作之间的缩放比例关系可能会导致性能上的挑战，特别是当用户可用的空间有限时。例如，当两个家庭成员一起坐在沙发上播放显示在电视上的视频游戏时，每一个用户的运动的有效范围由于其它用户的存在而受到限制。因此，可以改变缩放比例因子来反映受限制的运动范围，以使小幅度的物理运动对应于较大的屏幕虚拟运动。这可以在由姿势识别系统检测到多个相邻的用户时自动发生。在不同实施方式中，缩放比例也可取决于屏幕上的呈现的内容。例如，在具有多个对象的忙碌呈现环境中，可能期望小的缩放比例，以运行用户精密导航；而对于更简单的或更开放的环境(如以下情形：用户假装扔球或挥动高尔夫球杆，且检测到的动作被呈现在屏幕上)，优选大的缩放比例。The scaling relationship between the user's actual motion and the resulting action on the screen can cause performance challenges, especially when the space available to the user is limited. For example, when two family members are sitting together on the couch playing a video game displayed on a television, each user's effective range of motion is limited by the presence of the other users. Therefore, the scaling factor can be changed to reflect the restricted range of motion, so that small physical motions correspond to larger screen virtual motions. This can happen automatically when multiple adjacent users are detected by the gesture recognition system. In various implementations, the zoom ratio may also depend on the content presented on the screen. For example, in a busy rendering environment with many objects, a small zoom factor may be desired to enable user precision navigation; , and the detected motion is presented on the screen), preferably with a large zoom ratio.

如上所述，用户的运动和屏幕上显示的运动之间的合适关系取决于用户相对于记录摄像机的位置。例如，用户的实际运动m与捕获的图像的像素尺寸M的比例取决于在姿势识别系统420实施的摄像机的视角以及摄像机和用户之间的距离。如果视角宽或者用户在远离摄像机的距离处，则用户的姿势的检测到的相对运动(即，m/M)小于如果视角不那么宽或用户更接近摄像机的情况下的相对运动。因此，在前者的情况下，虚拟对象响应姿势在屏幕上运动得太少，而在后一种情况下，虚拟对象运动得太多。在不同实施方式中，用户的实际运动与显示在屏幕上的相应的运动的比例基于例如在用户和姿势识别系统(其可以通过测距来跟踪)之间的距离自动粗略地调整；这使用户可以朝向或远离姿势识别系统运动，而不破坏用户已经获得对实际和呈现运动之间的关系的直观的感受。As noted above, the proper relationship between the user's motion and the motion displayed on the screen depends on the user's position relative to the recording camera. For example, the ratio of the user's actual motion m to the pixel size M of the captured image depends on the viewing angle of the camera implemented in the gesture recognition system 420 and the distance between the camera and the user. If the viewing angle is wide or the user is at a distance away from the camera, the detected relative motion (ie, m/M) of the user's gesture is smaller than if the viewing angle is not as wide or the user is closer to the camera. Thus, in the former case, the virtual object moves too little on the screen in response to the gesture, while in the latter case, the virtual object moves too much. In various implementations, the ratio of the user's actual motion to the corresponding motion displayed on the screen is automatically roughly adjusted based on, for example, the distance between the user and the gesture recognition system (which can be tracked by odometry); this allows the user to Movements can be made toward or away from the gesture recognition system without disrupting the intuitive sense the user has acquired of the relationship between actual and presented movement.

在不同实施方式中，当识别出姿势但所检测到的用户运动很微小(即，低于预定阈值)时，姿势识别系统420从低敏感度检测模式切换为高敏感度模式，在所述高敏感度模式中，基于所获取的2D图像和/或3D模型准确地重构手势的3D图像。由于高敏感度姿势识别系统可以精确地检测由身体的一小部分(例如，一个手指)做出的小幅度运动(例如，小于几毫米)，所以用户的实际运动与屏幕上显示的所导致的运动的比例可以在大范围内进行调整，例如，在1000：1和1:1000之间。In various implementations, when a gesture is recognized but the detected user motion is minimal (ie, below a predetermined threshold), gesture recognition system 420 switches from a low-sensitivity detection mode to a high-sensitivity mode at which In sensitivity mode, the 3D image of the gesture is accurately reconstructed based on the acquired 2D image and/or 3D model. Since a high-sensitivity gesture recognition system can accurately detect small movements (e.g., less than a few millimeters) made by a small part of the body (e.g., a finger), the actual movement of the user differs from that displayed on the screen. The ratio of the movement can be adjusted within a wide range, for example, between 1000:1 and 1:1000.

图4C中示出根据所公开的当前技术的实施方式的用户动态地调整其实际运动和显示在电子装置的屏幕上的所导致的对象运动之间的关系的代表性方法400C。在第一动作452中，用户通过做出姿势启动与电子装置的通信。在第二动作454中，该姿势被检测，并且通过姿势识别系统识别。在第三动作456中，姿势识别系统通过将检测到的姿势与存储在数据库中的姿势进行对比识别与所述姿势相关的指令。然后，姿势识别系统基于所述指令确定用户的实际运动与显示在装置的屏幕上的所导致的虚拟动作的比例(在第四动作458)。姿势识别系统传送指示指令的信号到所述电子装置(在第五动作460)。在第六动作462，在接收到信号时，电子装置基于所确定的比例和用户的随后的运动在屏幕上显示虚拟动作。A representative method 400C of a user dynamically adjusting the relationship between his actual motion and the resulting motion of an object displayed on the screen of an electronic device in accordance with an embodiment of the disclosed current technology is shown in FIG. 4C . In a first action 452, the user initiates communication with the electronic device by making a gesture. In a second action 454, the gesture is detected and recognized by the gesture recognition system. In a third act 456, the gesture recognition system identifies instructions associated with the detected gesture by comparing the gesture to gestures stored in the database. The gesture recognition system then determines the ratio of the user's actual motion to the resulting virtual motion displayed on the screen of the device based on the instructions (at fourth act 458). The gesture recognition system transmits a signal indicative of the instruction to the electronic device (at fifth act 460). In a sixth action 462, upon receiving the signal, the electronic device displays a virtual motion on the screen based on the determined scale and the user's subsequent motion.

系统100B可以经由显示器138向用户呈现各种用户界面元件以方便与其交互。用户界面元件可以是响应来自用户的某些姿势(或其它形式的输入)而创建，或是通过处理器132上运行的软件程序(例如，运动捕获程序144或其它应用程序或游戏程序)创建。在一种实施方式中，显示器138在显示器138上设有盘状“圆盘”(puck)用户界面元件502，如图5A所示。如上所述的姿势识别系统314识别来自用户的姿势，并根据所公开的技术的实施方式使得圆盘502相应地运动。在一种实施方式中，用户的手的代表符504也出现在显示器138上；当手的代表符504触摸圆盘502的一侧506并使其沿第一方向508运动时，圆盘在对应于代表符504的运动的相应的方向510运动。用户可以经由代表符504类似地在其侧面在任何位置触摸圆盘502，做出“推”圆盘502的姿势，从而使得圆盘502在相应的方向上运动。System 100B may present various user interface elements to a user via display 138 to facilitate interaction therewith. User interface elements may be created in response to certain gestures (or other forms of input) from the user, or by software programs running on processor 132 (eg, motion capture program 144 or other application or game programs). In one embodiment, the display 138 is provided with a disc-shaped "puck" user interface element 502 on the display 138, as shown in FIG. 5A. Gesture recognition system 314 as described above recognizes gestures from the user and causes puck 502 to move accordingly in accordance with an embodiment of the disclosed technology. In one embodiment, a representation 504 of the user's hand also appears on the display 138; Movement in direction 510 corresponds to movement of representative symbol 504 . The user can similarly touch the puck 502 anywhere on its side via the representation 504, making a gesture of "push" the puck 502, causing the puck 502 to move in the corresponding direction.

示出在图5A中的所公开的技术的实施方式是说明性的示例；所公开的技术不限于仅此实施方式。屏幕138上可以没有用户的手的代表符504；姿势识别系统314可将来自用户的姿势识别为意在推圆盘502的姿势，而不显示代表符504，用户可以使用其手的其它部分(例如，手掌)或使用其它身体部位或对象创建姿势。在其它实施方式中，如果被显示，代表符504可以包括其它对象，如触控笔或画笔，或用户的其它身体部位。圆盘502可以是任何尺寸或形状，例如圆形、方形、椭圆形或三角形。The implementation of the disclosed technology shown in FIG. 5A is an illustrative example; the disclosed technology is not limited to only this implementation. There may be no representation 504 of the user's hand on the screen 138; the gesture recognition system 314 may recognize a gesture from the user as one intended to push the puck 502, and without displaying the representation 504, the user may use other parts of his hand ( For example, the palm of your hand) or use other body parts or objects to create poses. In other implementations, if displayed, the representation 504 may include other objects, such as a stylus or paintbrush, or other body parts of the user. Disk 502 may be of any size or shape, such as circular, square, oval, or triangular.

圆盘502的位置可被用作输入至计算机程序、显示设置、游戏或任何其它此类软件应用程序的输入或其它变量。在一种实施方式中，圆盘502的x位置控制第一变量，圆盘502的y位置控制第二(有关或无关)变量。图5B示出了一个这样的应用程序；灰度选择小部件512包括圆盘502。通过经由一个或多个姿势推动圆盘502，用户可以在选择小部件512上选择灰度值。例如对应于该圆盘502的中心的灰度值可由此被选择，以便供例如电脑绘画程序使用。选择小部件512可包括多个任何其它此类值(例如，颜色)，用于通过圆盘502从其中选择。The position of puck 502 may be used as an input or other variable to a computer program, display setting, game, or any other such software application. In one embodiment, the x-position of the disk 502 controls a first variable and the y-position of the disk 502 controls a second (related or unrelated) variable. One such application is shown in FIG. 5B ; grayscale selection widget 512 includes puck 502 . By pushing puck 502 via one or more gestures, a user can select a grayscale value on selection widget 512 . For example, a gray value corresponding to the center of the disk 502 can thus be selected for use by, for example, a computer drawing program. Selection widget 512 may include a number of any other such values (eg, colors) for selection from via puck 502 .

圆盘502可响应用户姿势以任何数目的不同方式运动。例如，圆盘502可在用户停止推动它之后继续运动一段时间，并且可以根据虚拟质量和与小部件512的虚拟摩擦系数(或其它类似的值)减速直至停止。圆盘502可以仅在用户的姿势已与其一侧接触并且用户的进一步的运动超过最小阈值距离(例如，圆盘是“粘性”的，需要姿势覆盖初始最小量的距离，才能“摆脱粘性”)才启动运动。在一种实施方式中，当用户的姿势停止与圆盘502接触时，圆盘502通过虚拟“弹簧”栓系到小部件512上的一个点。按压如传统的按钮的圆盘的顶部表面可以导致发生进一步的动作。在一种实施方式中，在按压圆盘502的顶表面之后，用户可以做出旋转姿势，姿势识别系统314可相应地旋转圆盘(并相应地改变应用程序的参数)。The puck 502 can move in any number of different ways in response to user gestures. For example, puck 502 may continue to move for a period of time after the user stops pushing it, and may decelerate according to the virtual mass and virtual coefficient of friction with widget 512 (or other similar values) until it stops. The puck 502 may only "get out of stickiness" if the user's gesture has made contact with one side of it and the user's further motion exceeds a minimum threshold distance (e.g., the puck is "sticky" requiring the gesture to cover an initial minimum amount of distance) Just start the movement. In one embodiment, when the user's gesture stops contacting the puck 502, the puck 502 is tethered to a point on the widget 512 by a virtual "spring". Pressing the top surface of the puck, like a conventional button, can cause further action to take place. In one embodiment, after pressing the top surface of puck 502, the user may perform a rotation gesture, and gesture recognition system 314 may rotate the puck accordingly (and change the parameters of the application accordingly).

在所公开的技术的其它实施方式中，用户可以使用姿势创建附加的用户界面元件，随后用这些元件进行交互。例如，姿势识别系统314可以检测用户已用手指(或其它对象)做出圆形运动并将该圆形运动解释为期望在显示器138上创建按钮。一旦被创建，用户可与用户界面元件进行交互(通过，例如，按压按钮)，并由此使得执行相关联的功能。该功能可以通过以下来确定：显示器138的背景、显示器138上创建用户界面元件的位置或者其它用户输入。In other implementations of the disclosed technology, a user may use gestures to create additional user interface elements and then interact with these elements. For example, gesture recognition system 314 may detect that a user has made a circular motion with a finger (or other object) and interpret the circular motion as a desire to create a button on display 138 . Once created, a user may interact with a user interface element (by, for example, pressing a button) and thereby cause an associated function to be performed. This functionality may be determined by the background of the display 138, the location on the display 138 where the user interface elements are created, or other user input.

在另一种实施方式中，姿势识别系统314响应用户姿势，创建滑块，所述用户姿势例如是伸展两个手指(例如，其食指和中指)并用手指做出姿势(例如，平行于显示器138的平面的运动)。一旦被创建，滑块就可用于控制适当的应用程序(例如，滚动文件、菜单或列表的页面或部分)。In another embodiment, the gesture recognition system 314 creates sliders in response to a user gesture, such as extending two fingers (e.g., their index and middle fingers) and making a gesture with the fingers (e.g., parallel to the display 138 plane movement). Once created, the slider can be used to control the appropriate application (eg, scrolling pages or sections of a file, menu, or list).

在另一种实施方式中，姿势识别系统314将用户的向前或反向的手指指向姿势解释为“鼠标点击”(或其它类似的选择或确认命令)。用户可以使其手指着显示器138或指向显示器138，并使手指沿其长轴的方向朝向显示器138运动；如果手指运动的距离超过一个阈值(例如，1、5或10厘米)，则姿势识别系统314将此姿势解释为鼠标点击。在一种实施方式中，仅当至少有一定比例的运动(例如，50％)是在手指指向的方向时，姿势才被解释为鼠标点击。朝向远离显示器138的方向的运动的类似的姿势可以被解释为另一个或不同的用户输入。在一种实施方式中，向前的姿势是左击鼠标，反向姿势是右鼠标点击。In another embodiment, the gesture recognition system 314 interprets the user's forward or reverse finger pointing gesture as a "mouse click" (or other similar selection or confirmation command). The user can point or point the finger at the display 138 and move the finger in the direction of its long axis toward the display 138; if the distance the finger moves exceeds a threshold (e.g., 1, 5, or 10 centimeters), the gesture recognition system 314 interprets this gesture as a mouse click. In one embodiment, a gesture is interpreted as a mouse click only if at least a certain percentage of the movement (eg, 50%) is in the direction the finger is pointing. A similar gesture toward motion in a direction away from display 138 may be interpreted as another or different user input. In one embodiment, the forward gesture is a left mouse click and the reverse gesture is a right mouse click.

其它用户的姿势、其它对象的运动或其组合可以被集体捕获并用于确定旋转因子。姿势识别系统314可以分析存在于一个序列的捕获的图像中的所有或大部分运动，并基于其生成单个旋转因子(表达为例如若干程度的旋转)。在一种实施方式中，姿势识别系统314在所捕获的运动的中心处或附近选择焦点，计算每个运动对象相对于该焦点的旋转量，并基于其计算平均旋转量。不同对象的运动可以基于其加速度、速度、尺寸、靠近显示器138的程度或其它类似的因子加权在平均值中。然后，单个旋转因子可以用作输入至在系统100B中运行的程序的输入。Gestures of other users, motions of other objects, or a combination thereof may be collectively captured and used to determine the rotation factor. Gesture recognition system 314 may analyze all or most of the motion present in a sequence of captured images and generate a single rotation factor (expressed, for example, as a number of degrees of rotation) based thereon. In one embodiment, gesture recognition system 314 selects a focal point at or near the center of the captured motion, calculates the amount of rotation of each moving object relative to the focal point, and calculates an average amount of rotation based thereon. The motion of different objects may be weighted in the average based on their acceleration, velocity, size, proximity to the display 138, or other similar factors. A single twiddle factor can then be used as input to a program running in system 100B.

如上所述，姿势识别系统(例如，示于图1A的系统100)使用一个或多个摄像机102、104捕获诸如手114的对象的图像；该对象可以使用一个或多个光源108、110照明。对象检测模块152检测对象，姿势的识别模块156检测使用对象做出的姿势。一旦检测到，姿势就被输入到电子装置，所述电子装置可以以不同的方式(例如，操纵虚拟对象)使用姿势。然而，可能会检测许多不同种类的姿势，在电子装置上运行的应用程序可能不使用或不需要每个检测到的姿势。传送不被使用的姿势给应用程序可能会造成不必要的复杂性和/或消耗不必要的应用程序和姿势识别模块156之间的链接的带宽。As noted above, a gesture recognition system (eg, system 100 shown in FIG. 1A ) uses one or more cameras 102 , 104 to capture images of an object, such as a hand 114 ; the object may be illuminated using one or more light sources 108 , 110 . The object detection module 152 detects objects, and the gesture recognition module 156 detects gestures made using the objects. Once detected, the gesture is input to the electronic device, which can use the gesture in different ways (eg, to manipulate a virtual object). However, many different kinds of gestures may be detected, and applications running on the electronic device may not use or require every detected gesture. Transmitting unused gestures to the application may create unnecessary complexity and/or consume unnecessary bandwidth of the link between the application and gesture recognition module 156 .

在一种实施方式中，仅姿势识别模块156所捕获姿势的一个子集被传送给在电子装置上运行的应用程序。如图1A中所示，所识别的姿势可以从姿势识别模块156传送给姿势过滤器158，并基于所述姿势的一个或多个特性进行过滤。通过过滤器158的标准姿势被传送给应用程序，没有通过过滤器的姿势不被传送和/或被删除。姿势过滤器158被示为存储器134中的独立模块，但是公开的技术并不局限于此实施方式；过滤器158的功能可以全部或部分地结合到姿势识别模块156中。在不同实施方式中，姿势识别模块156不考虑过滤器158的设置识别所有检测到的姿势，或者根据过滤器158的设置识别检测姿势的一个子集。In one embodiment, only a subset of the gestures captured by the gesture recognition module 156 are communicated to the application running on the electronic device. As shown in FIG. 1A , a recognized gesture may be passed from gesture recognition module 156 to gesture filter 158 and filtered based on one or more characteristics of the gesture. Standard gestures that pass the filter 158 are passed to the application, gestures that do not pass the filter are not passed and/or deleted. Gesture filter 158 is shown as a separate module in memory 134 , but the disclosed technology is not limited to this implementation; the functionality of filter 158 may be incorporated, in whole or in part, into gesture recognition module 156 . In various implementations, gesture recognition module 156 recognizes all detected gestures regardless of filter 158 settings, or recognizes a subset of detected gestures based on filter 158 settings.

图6是示出根据所公开的技术的实施方式过滤姿势的方法的流程图600。在一种实施方式中，说明了在三维(3D)感觉空间中将关注的姿势与非关注的姿势区别开的方法。所述方法包括在动作652中接收限定一个或多个基准姿势的基准特性的输入，在动作654中，使用电子传感器检测三维(3D)感觉空间中的一个或多个实际姿势并使用来自电子传感器的数据确定实际特性，在动作656中，将实际姿势和基准姿势进行对比来确定一组关注的姿势，并在动作658将所述一组关注的姿势和相应的姿势参数提供给进一步的处理。FIG. 6 is a flowchart 600 illustrating a method of filtering gestures in accordance with an implementation of the disclosed technology. In one implementation, a method for distinguishing gestures of interest from gestures of non-interest in a three-dimensional (3D) sensory space is described. The method includes, in act 652, receiving an input defining a reference characteristic of one or more reference poses, and in act 654, detecting the one or more actual poses in a three-dimensional (3D) sensory space using electronic sensors and using information from the electronic sensors In act 656, the actual pose is compared with the reference pose to determine a set of poses of interest, and in act 658 the set of poses of interest and corresponding pose parameters are provided for further processing.

在一种实施方式中，当基准特性是姿势路径时，诸如横向挥动的直线路径的实际姿势被解释为一组关注的姿势。根据一种实施方式，当基准特性是姿势速度时，具有高速度的实际姿势被解释为一组关注的姿势。根据一种实施方式，当基准特性是姿势形态时，使用以特定的手指指向的手做出的实际姿势被解释为一组关注的姿势。根据一种实施方式，当基准特性是姿势形态时，握拳的手的实际姿势被解释为一组关注的姿势。In one embodiment, when the reference characteristic is a gesture path, an actual gesture such as a straight path of a lateral swipe is interpreted as a set of gestures of interest. According to one embodiment, when the reference characteristic is gesture velocity, actual gestures with high velocities are interpreted as a set of gestures of interest. According to one embodiment, when the reference characteristic is a gesture modality, the actual gesture made with the hand pointed at a particular finger is interpreted as a set of gestures of interest. According to one embodiment, when the reference characteristic is gesture morphology, the actual gesture of the fisted hand is interpreted as a set of gestures of interest.

在另一种实施方式中，当基准特性是姿势的形状时，手竖起大拇指的实际姿势被解释为一组关注的姿势。根据一种实施方式，当基准特性是姿势长度时，挥动姿势被解释为一组关注的姿势。在又一种实施方式中，当基准特性是姿势的位置时，距所述电子传感器的距离小于阈值的实际姿势被解释为一组关注的姿势。当基准特性是姿势的持续时间时，在3D感觉空间中持续阈值时间周期的实际姿势,而非在3D感觉空间中持续的时间小于阈值时间周期的实际姿势被解释为一组关注的姿势。当然，可以在同一时间使用不止一个特性。In another implementation, when the reference characteristic is the shape of the gesture, the actual thumbs up gesture of the hand is interpreted as a set of gestures of interest. According to one embodiment, the waving gesture is interpreted as a set of gestures of interest when the reference characteristic is gesture length. In yet another embodiment, when the reference characteristic is the position of a gesture, actual gestures whose distance from said electronic sensor is less than a threshold are interpreted as a set of gestures of interest. When the reference characteristic is the duration of a pose, actual poses that persist in the 3D sensory space for a threshold time period, but not actual poses that persist in the 3D sensory space for less than the threshold time period, are interpreted as a set of poses of interest. Of course, more than one feature can be used at the same time.

过滤器158的特性可被限定为适应特定的应用程序或一组应用程序。在不同实施方式中，特征可以从菜单界面接收、从命令文件或配置文件读取、经由API或任何其它类似的方法通信。过滤器158可以包括多个组的预配置特性，并允许用户或应用程序选择该多个组中的一个。过滤器特性的示例包括：姿势做出的路径(例如，过滤器158可以仅通过例如具有相对直的路径的姿势，而阻止具有曲线路径的姿势)；姿势的速度(例如，过滤器158可以通过具有高速度的姿势，而阻止具有低速度的姿势)；和/或姿势的方向(例如，过滤器可以通过具有左右运动的姿势，而阻止具有向前向后运动的姿势)。进一步的过滤器特性可基于形态、形状或做出姿势的对象的倾向；例如，过滤器158可以仅通过使用以特定的手指(例如，无名指)指向的手做出的姿势、握拳的手或张开的手。该过滤器158可以进一步仅通过用拇指向上或向下的姿势做出的姿势，例如用于投票应用程序。The characteristics of filter 158 may be defined to suit a particular application or group of applications. In various implementations, features may be received from a menu interface, read from a command file or configuration file, communicated via an API, or any other similar method. Filter 158 may include multiple groups of pre-configured properties and allow a user or application to select one of the multiple groups. Examples of filter properties include: the path taken by the gesture (e.g., filter 158 may pass only gestures such as those with relatively straight paths, while blocking gestures with curved paths); the velocity of the gesture (e.g., filter 158 may pass gestures with high velocities while blocking those with low velocities); and/or the orientation of the gestures (eg, a filter may pass gestures with side-to-side motion while blocking gestures with forward-backward motion). Further filter characteristics may be based on the morphology, shape, or propensity of the gesturing object; for example, the filter 158 may only pass gestures using a hand pointing with a particular finger (e.g., a ring finger), a fisted hand, or an open hand. Open hands. This filter 158 may further only pass gestures made with thumbs up or down gestures, such as for a voting application.

由过滤器158进行的过滤可如下所述地实施。在一种实施方式中，由姿势识别模块156检测到的姿势被分配一组特性，每组特性包括一个或多个特性(例如，速度或路径)，姿势和特性保持在数据结构中。过滤器158检测哪些被分配的特性满足其过滤特性，并通过与这些特性相关联的姿势。通过过滤器158的姿势可以经由API或经由类似的方法返回到一个或多个应用程序。可替代地或附加地，该姿势可以显示在显示器138上和/或在菜单(为，例如，实地教学IF应用程序)中。Filtration by filter 158 can be performed as follows. In one embodiment, gestures detected by gesture recognition module 156 are assigned a set of properties, each set of properties including one or more properties (eg, speed or path), the gestures and properties maintained in a data structure. Filter 158 detects which of the assigned properties satisfy its filter properties and passes gestures associated with those properties. Gestures that pass filter 158 may be returned to one or more applications via an API or via a similar method. Alternatively or additionally, the gesture may be displayed on the display 138 and/or in a menu (for, for example, a Field Education IF application).

如上所述，姿势识别模块156将所检测的对象的运动与已知的姿势的库进行对比，如果存在匹配，则返回匹配的姿势。在一种实施方式中，用户、程序员、应用程序开发者或其它用户限定的姿势补充、修改或替换已知姿势。如果姿势识别模块156识别出用户限定的姿势，则其通过API(或类似方法)将该姿势返回给一个或多个程序。在一种实施方式中，再次参考图1A，姿势设置模块160基于限定姿势的特性的输入屏蔽姿势的运动，并返回具有匹配特性的姿势的组。As described above, the gesture recognition module 156 compares the detected motion of the object to a library of known gestures, and returns the matching gesture if there is a match. In one embodiment, user, programmer, application developer, or other user-defined gestures supplement, modify, or replace known gestures. If gesture recognition module 156 recognizes a user-defined gesture, it returns the gesture to one or more programs via an API (or similar method). In one implementation, referring again to FIG. 1A , the gesture setup module 160 masks the motion of the gesture based on the input defining the characteristics of the gesture and returns a set of gestures with matching characteristics.

用户限定的特性可以包括任何数量的各种不同的姿势的属性。例如，所述特性可以包括姿势的路径(例如，相对较直的、曲线的；圆形与刷动)；姿势的参数(例如，最小或最大长度)；姿势的空间特性(例如，所述姿势发生的空间区域)；姿势的时间特性(例如，姿势的最小或最大持续时间)；和/或姿势的速度(例如，最小或最大速度)。所公开的技术不局限于这些属性，姿势的任何其它属性都在所公开的技术的范围之内。User-defined properties may include any number of attributes for various different gestures. For example, the properties may include the path of the gesture (e.g., relatively straight, curved; circular vs. swiping); parameters of the gesture (e.g., minimum or maximum length); spatial properties of the gesture (e.g., the gesture the spatial region in which it occurs); the temporal characteristics of the gesture (eg, the minimum or maximum duration of the gesture); and/or the velocity of the gesture (eg, the minimum or maximum velocity). The disclosed technology is not limited to these attributes, any other attributes of gestures are within the scope of the disclosed technology.

用户限定的姿势和预定的姿势之间的冲突可以以任何数量的方式解决。程序员可以例如指定应该忽略预定的姿势。在另一实施方式中，用户限定的姿势被确定优先于预定的姿势，使得如果姿势同时匹配二者，则返回用户限定的姿势。Conflicts between user-defined gestures and predetermined gestures can be resolved in any number of ways. A programmer may, for example, specify that predetermined gestures should be ignored. In another embodiment, a user-defined gesture is determined to take precedence over a predetermined gesture such that if the gesture matches both, the user-defined gesture is returned.

在不同实施方式中，姿势训练系统帮助应用程序开发者和/或最终用户来限定自己的姿势和/或使得姿势适应自己的需求和喜好—换句话说，超出预先编程或者“封装”的姿势之外。姿势训练系统可以通过正常的语言(例如，一系列的问题)与用户进行交互，以更好地限定用户希望系统能够识别的动作。通过在预说明的安装过程回答这些问题，用户可以限定用于相应的姿势的参数和/或参数范围，从而解决模糊性。有利的是，这种方法提供了可靠的姿势识别，而不需要通常与需计算机猜测答案相关联的算法复杂性；因此，其有助于减少软件复杂性和成本。在一种实施方式中，一旦系统已被训练来识别特定姿势或动作，则其可以为这个姿势或动作创建一个对象(例如，文件、数据结构等)，之后协助识别姿势或动作。该对象可以由应用编程界面(API)使用，并且可以由开发人员和非开发用户使用。在一些实施方式中，数据是由开发者和非开发者用户共享，或能够由其共享，从而有助于协作等。In various embodiments, the gesture training system helps application developers and/or end users to define their own gestures and/or adapt gestures to their own needs and preferences—in other words, beyond pre-programmed or "packaged" gestures. outside. The gesture training system can interact with the user through normal language (eg, a series of questions) to better define the actions the user wants the system to recognize. By answering these questions during a pre-specified setup, the user can define parameters and/or parameter ranges for corresponding gestures, thereby resolving ambiguities. Advantageously, this approach provides reliable gesture recognition without the algorithmic complexity typically associated with requiring a computer to guess the answer; thus, it helps reduce software complexity and cost. In one embodiment, once the system has been trained to recognize a particular gesture or action, it can create an object (eg, file, data structure, etc.) for that gesture or action, which then assists in recognizing the gesture or action. This object can be used by an application programming interface (API), and can be used by developers and non-development users. In some embodiments, data is shared, or can be shared, by developers and non-developer users, thereby facilitating collaboration and the like.

在一些实施方式中，姿势训练是对话式的、互动和动态的；根据用户给出的响应，可以选择接下来的问题或者待指定的下一个参数。这些问题可以以视觉或音频形式(例如，作为显示在计算机屏幕上的文本，或者通过扬声器输出)呈现给用户。用户的响应同样可以以各种模式给出，例如通过键盘的文字输入、图形用户界面元件的选择(例如，使用鼠标)、语音命令，或者，在某些情况下通过该系统已经熟练识别的基本姿势。(例如，“拇指向上”或“拇指向下”姿势可以用于回答任何是和否的问题。)另外，如由以下示例示出的，某些问题引起动作(具体而言，进行示范性的姿势(例如，典型的姿势或姿势的范围的边界点))而不是口头答复。在这种情况下，该系统可以利用例如机器学习方法来精选来自摄像机的图像或捕获动作的视频流的相关信息。In some embodiments, the gesture training is conversational, interactive and dynamic; based on the responses given by the user, the next question or next parameter to be specified can be selected. These questions may be presented to the user in visual or audio form (eg, as text displayed on a computer screen, or output through a speaker). User responses can likewise be given in various modes, such as text entry via the keyboard, selection of graphical user interface elements (e.g., using a mouse), voice commands, or, in some cases, basic posture. (For example, a "thumbs up" or "thumbs down" gesture can be used to answer any yes and no question.) Also, as shown by the examples below, certain questions elicit actions (specifically, performing the exemplary gestures (eg, typical gestures or boundary points of a range of gestures)) rather than verbal responses. In this case, the system can leverage, for example, machine learning methods to curate relevant information from images from cameras or video streams that capture motion.

图7是为特定用户定制姿势解释的方法的流程图700。在一种实施方式中，说明了为特定用户定制姿势解释的方法。所述方法包括：在动作752提示用户选择用于自由空间中的姿势的特性值并接收选择的特性值，在动作754提示用户在三维(3D)感觉空间中执行姿势的特性边界集中示范(characteristicfocuseddemonstrationofboundaries)，在动作756，从由电子传感器捕获的边界集中示范确定姿势的一组参数，在动作758存储用于姿势识别的该组参数和相应的值。FIG. 7 is a flowchart 700 of a method of customizing gesture interpretation for a particular user. In one implementation, a method of customizing gesture interpretation for a particular user is described. The method includes, at act 752, prompting the user to select a characteristic value for a gesture in free space and receiving the selected characteristic value, and at act 754, prompting the user to perform a characteristic focused demonstration of boundaries of the gesture in a three-dimensional (3D) sensory space. ), at act 756, demonstrating a set of parameters determining the gesture from the bounding set captured by the electronic sensor, storing the set of parameters and corresponding values for gesture recognition at act 758.

所述方法还包括通过以下测试对特定姿势的解释：提示用户在3D感觉空间中做出特定姿势的完整姿势示范，从由电子传感器捕获的完整姿势示范确定特定姿势的一组参数，将特定姿势的该组参数与从边界集中示范确定的相应的一组参数和选择的特性值进行对比，并且在行动760向用户报告对比的结果，并接收对特定姿势的解释是否正确的确认信息。The method also includes testing the interpretation of the particular gesture by prompting the user to perform a complete gesture demonstration of the particular gesture in a 3D sensory space, determining a set of parameters for the particular gesture from the complete gesture demonstration captured by the electronic sensor, and converting the particular gesture The set of parameters is compared with a corresponding set of parameters and selected characteristic values exemplarily determined from the bounding set, and the results of the comparison are reported to the user at act 760 and confirmation is received whether the interpretation of the particular gesture is correct.

所述方法还包括使用用于提示用户选择姿势的特性值的问卷。在一种实施方式中，使用问卷提示用户选择特性值包括：从用户接收姿势在3D感觉空间内的最小阈值时间段，在此之前不解释姿势。在另一实施方式中，执行特性边界集中示范包括用户使用特定手指做出手指向姿势作为姿势形态。执行特性边界集中示范还包括用户使用手做出握拳姿势作为姿势的形态。执行特性边界集中示范还包括用户使用手拇指向上或拇指向下的姿势作为姿势的形状。The method also includes using a questionnaire for prompting the user to select a characteristic value of the gesture. In one embodiment, using the questionnaire to prompt the user to select a property value includes receiving from the user a minimum threshold period of time for the gesture to be within the 3D sensory space before interpreting the gesture. In another embodiment, performing the characteristic boundary-focused demonstration includes the user making a finger-pointing gesture using a specific finger as a gesture modality. Executing characteristic boundary-focused demonstration also includes a form in which the user makes a fist gesture with a hand as a gesture. Executing characteristic boundary-focused demonstrations also include the user using a thumb-up or thumb-down gesture as the shape of the gesture.

在一种实施方式中，执行特性边界集中示范包括用户用手做出拇指向上或拇指向下的姿势作为姿势的形状。根据一种实施方式，执行特性边界集中示范包括用户做出捏合姿势以便设定最小姿势距离作为一姿势尺寸。在另一实施方式中，执行特性边界集中示范还包括用户做出挥动姿势以便设置最大姿势距离为一姿势尺寸。In one embodiment, performing characteristic boundary-focused demonstration includes the user making a thumbs up or thumbs down gesture with a hand as a shape of the gesture. According to one embodiment, performing characteristic boundary-focused demonstration includes the user making a pinch gesture in order to set a minimum gesture distance as a gesture size. In another embodiment, performing characteristic boundary-focused demonstration further includes the user making a swipe gesture to set the maximum gesture distance as a gesture size.

在又一种实施方式中，执行特性边界集中示范包括用户做出手指轻弹的姿势以便设置最快的姿势运动。在一种实施方式中，执行特性边界集中示范包括用户做出挥动姿势以便设置最慢的姿势运动。执行特性边界集中示范包括用户做出横向扫动姿势以便设置直线姿势路径。根据一种实施方式，执行特性边界集中示范包括用户做出圆形扫动以便设置圆形姿势路径。In yet another embodiment, performing a characteristic boundary-focused demonstration includes the user making a finger flick gesture to set the fastest gesture motion. In one embodiment, performing a characteristic boundary-focused demonstration includes the user making a waving gesture to set the slowest gesture motion. Executing characteristic boundary-focused demonstrations involves the user making a lateral swipe gesture in order to set a straight gesture path. According to one embodiment, performing a characteristic boundary-focused demonstration includes the user making a circular sweep to set a circular gesture path.

图8A、8B和8C示出了根据一种实施方式的用于示例性训练指导流程的一系列的问题和提示800A，800B，和800C。如图所示，在动作852和854，用户首先被询问在姿势中涉及了多少手和手指。然后，在动作856，系统通过询问姿势能占用的最大和最小时间量确定姿势的总的时间段。在动作858，对于最大时间量，设定较低的截止，如一秒。8A, 8B, and 8C illustrate a series of questions and prompts 800A, 800B, and 800C for an exemplary training coaching flow, according to one embodiment. As shown, at acts 852 and 854, the user is first asked how many hands and fingers were involved in the gesture. Then, in act 856, the system determines the total time period of the gesture by asking for the maximum and minimum amount of time the gesture can take. At act 858, for the maximum amount of time, a lower cutoff is set, such as one second.

在接下来的几个交互中，系统询问用户姿势的尺寸、速度和方向是否重要。在动作860，如果尺寸重要，要求用户示范的最小和最大合理动作。作为示范的结果，自动生成的识别器(即在训练期间基于用户输入创建的对象)可随后量化姿势的尺寸并计算标准化尺寸的姿势输出。相关的训练参数包括指示运动、路径、开始和停止点、弧长等参数和/或其组合，和/或从前述计算的参数。如果尺寸并不重要，则姿势总是被标准化并且不考虑尺寸。在这种情况下，相关的训练参数包括标准化运动参数(包括：例如，运动、路径、开始和停止点、弧长等和/或其组合，和/或从上述计算的参数)。Over the next few interactions, the system asks the user if the size, velocity, and direction of the user's pose matter. At act 860, if size is important, the user is asked to demonstrate a minimum and maximum reasonable motion. As a result of the demonstration, automatically generated recognizers (ie objects created based on user input during training) can then quantize the size of gestures and compute normalized sized gesture outputs. Relevant training parameters include indicating movement, path, start and stop points, arc length, etc. parameters and/or combinations thereof, and/or parameters calculated from the foregoing. If size is not important, pose is always normalized and does not take size into account. In this case, relevant training parameters include standardized motion parameters (including, for example, motion, path, start and stop points, arc length, etc. and/or combinations thereof, and/or parameters calculated from the foregoing).

在动作862，如果速度重要，则请求用户示范最快和最慢的运动。从所观察的运动，系统可以安静地检查加速度范围。速度示范使得自动生成的识别器能输出速度(例如，基于时间变量速度沿着姿势的傅立叶变换，这允许在频率领域中识别数据的特性速度)。相关训练参数包括平移距离(如欧氏距离，即，(dx²+dy²+dz²)^1/2)和持续时间窗口(即，姿势持续多久指示用于分析的相关时间跨度)。如果速度不重要，则姿势是速度标准化的。为了表征姿势的时间方面，时间转换为空间，也就是，使用统一的采样(例如，手的一个位置随着时间在一个方向上运动)。姿势然后被伸展、收缩，并匹配至模板来提取关于速度随着时间的推移信息。训练参数包括所得到的曲线的曲率和扭转。In act 862, if speed is important, the user is requested to demonstrate the fastest and slowest movements. From the observed movement, the system can silently check the acceleration range. Velocity modeling enables automatically generated recognizers to output velocities (eg, based on the Fourier transform of time-variant velocities along poses, which allows identification of characteristic velocities of data in the frequency domain). Relevant training parameters include translational distance (eg, Euclidean distance, ie (dx² +dy² +dz² )^1/2 ) and duration window (ie, how long a gesture lasts indicates the relevant time span for analysis). If speed is not important, pose is speed normalized. To characterize the temporal aspects of poses, time is transformed into space, that is, uniform sampling is used (e.g. one position of the hand moves in one direction over time). Poses are then stretched, contracted, and matched to templates to extract information about velocity over time. The training parameters include the curvature and twist of the resulting curve.

在动作864，如果姿势的方向重要，则用户被要求示范各种合理和各种不合理的方向。结果，自动生成的识别器被启用，以输出以下信息：姿势是否正在被发出、确定性水平和/或错误、和/或运动参数(例如，运动、路径、开始和停止点、弧长，平移范围等和/或其组合，和/或从其组合计算的参数)。如果方向不重要，则训练参数是简单的曲率和扭转。In act 864, if the orientation of the gesture is important, the user is asked to demonstrate various reasonable and various unreasonable orientations. As a result, automatically generated recognizers are enabled to output the following information: whether a gesture is being issued, the level of certainty and/or errors, and/or motion parameters (e.g., motion, path, start and stop points, arc length, translation range, etc. and/or combinations thereof, and/or parameters calculated from combinations thereof). If orientation is not important, the training parameters are simple curvature and twist.

此外，在动作466，用户被要求决定是否应接收草率姿势应。如果接受，则系统要求用户示范非常草率但仍是可接受的姿势。否则，系统将尝试通过要求用户示范可以勉强接受的姿势和不能接受的姿势来确定什么是可以接受的界限。Additionally, at act 466, the user is asked to decide whether a hasty gesture response should be received. If accepted, the system asks the user to demonstrate a very sloppy but still acceptable gesture. Otherwise, the system will attempt to determine what is an acceptable boundary by asking the user to demonstrate acceptable and unacceptable gestures.

最后，在动作868，在所有相关参数都在训练期间被设置后，测试系统的姿势识别能力。用户可能会被要求做出姿势(系统刚刚被训练识别的姿势，或其它姿势)。为指示姿势的开始和结束，用户可以按压例如键盘上的空格键。用户做出姿势之后，系统指示其是否将这个姿势识别为以前被训练过的，并请求用户确认或校正。该测试可以重复多次。可以组合(例如，取平均值等)多个成功的测试的结果或用户选择一个最好的结果。上述交互当然只是一个示例。其它实施方式可以以不同的次序询问问题或提示，和/或提示以不同的顺序，或询问附加的或不同的问题。Finally, at act 868, after all relevant parameters have been set during training, the gesture recognition capabilities of the system are tested. The user may be asked to perform a gesture (a gesture the system has just been trained to recognize, or another gesture). To indicate the start and end of a gesture, the user may press, for example, the space bar on the keyboard. After the user makes a gesture, the system indicates whether it recognizes the gesture as previously trained and asks the user for confirmation or correction. This test can be repeated multiple times. The results of multiple successful tests may be combined (eg, averaged, etc.) or the user selects a best result. The above interaction is of course just an example. Other embodiments may ask questions or prompts in a different order, and/or prompts in a different order, or ask additional or different questions.

因而，本文说明的上述3D用户交互技术使用户能够通过简单地做出身体姿势来直观地控制和操纵电子装置和虚拟对象。因为姿势识别系统协助以高检测敏感度呈现姿势的重构3D图像，所以用于显示控制的动态的用户交互是实时获得，没有过度的计算复杂性。例如，用户可以动态地控制其实际运动与显示在屏幕上的相应的动作之间的关系。此外，该装置可以显示屏幕虚拟指示器，以实时反映用户的姿势的完成程度。因此，所公开的技术使得用户能够动态地与显示在屏幕上的虚拟对象进行交互，并且有利地增强了虚拟环境的真实感。Thus, the above-described 3D user interaction techniques described herein enable users to intuitively control and manipulate electronic devices and virtual objects by simply making physical gestures. Because the gesture recognition system assists in rendering reconstructed 3D images of gestures with high detection sensitivity, dynamic user interactions for display control are obtained in real-time without undue computational complexity. For example, the user can dynamically control the relationship between his actual movement and the corresponding action displayed on the screen. In addition, the device can display on-screen virtual indicators to reflect in real time the degree of completion of the user's gestures. Thus, the disclosed techniques enable users to dynamically interact with virtual objects displayed on the screen and advantageously enhance the realism of the virtual environment.

本文所使用的术语和表达用作说明性的术语和的表达，并非限制性的，并且在使用这些术语和表达时，不意在排除所示出和说明的特征的任何等同物或其一部分。此外，已经说明所公开的技术的某些实施方式，显而易见的是，本领域的普通技术人员认为，可以在不脱离所公开的技术的精神和范围的情况下使用结合本文所公开的概念的其它实施方式。因此，所说明的实施方式应在所有方面被认为是仅是说明性的而不是限制性。The terms and expressions used herein are used as terms and expressions of description, not of limitation, and in their use, there is no intention to exclude any equivalent or part of the features shown and described. Furthermore, having described certain embodiments of the disclosed technology, it should be apparent that those of ordinary skill in the art would recognize that other embodiments incorporating the concepts disclosed herein may be used without departing from the spirit and scope of the disclosed technology. implementation. Accordingly, the illustrated embodiments should be considered in all respects as illustrative only and not restrictive.

具体实施方式detailed description

在一种实施方式中，说明了将三维(3D)感觉空间中的有意义的姿势与相近的无意义的姿势区别开的方法。所述方法包括：通过使用电子传感器检测3D感觉空间中的手臂和附着的手腕和手指的位置，当手臂在运动时，将手腕和手指的弯曲与手臂姿势的总轨迹区别开，从一系列检测的位置计算手臂做出的挥动姿势的空间轨迹，从检测的位置计算手腕和/或手指的弯曲姿势的空间轨迹，以及基于相应的空间轨迹的幅度决定挥动姿势是否主导弯曲姿势。手腕和手指弯曲指手指朝向和/或远离手腕的向内和/或向外运动。在另一实施方式中，手臂做出的挥动姿势是指手臂从一侧到另一侧的向内和/或向外的伸展。所述方法还包括触发对主导姿势的响应，而不触发对非主导姿势的响应。In one implementation, a method is described for distinguishing meaningful gestures in a three-dimensional (3D) sensory space from nearby nonsensical gestures. The method includes detecting the position of the arm and attached wrist and fingers in a 3D sensory space by using electronic sensors, distinguishing the bending of the wrist and fingers from the general trajectory of the arm pose when the arm is in motion, from a series of detected Compute the spatial trajectory of the swaying gesture made by the arm, calculate the spatial trajectory of the bending gesture of the wrist and/or fingers from the detected position, and decide whether the waving gesture dominates the bending gesture based on the magnitude of the corresponding spatial trajectory. Wrist and finger flex refers to the inward and/or outward movement of the fingers toward and/or away from the wrist. In another embodiment, a waving gesture by an arm refers to an inward and/or outward extension of the arm from side to side. The method also includes triggering a response to a dominant gesture without triggering a response to a non-dominant gesture.

所公开的技术的方法和其它实施方式可以包括一个或多个下述特征和/或与所公开的其它方法结合说明的一个或多个特征。为了简明起见，在本申请中公开的特征的组合不单独列举，并且不重复各基本组的特征。读者将理解，在本部分所述的特征可以如何容易地与识别为实施方式的多组基本特征相结合。Methods and other implementations of the disclosed technology may include one or more of the features described below and/or one or more of the features described in conjunction with other methods disclosed. For the sake of brevity, combinations of features disclosed in this application are not listed individually, and features of each basic group are not repeated. The reader will appreciate how easily the features described in this section can be combined with the sets identified as essential to the implementation.

在一种实施方式中，挥动姿势的空间轨迹的幅度至少部分由做出挥动姿势时所横过的距离确定。在另一种实施方式中，弯曲姿势的空间轨迹的幅度至少部分地由手指的卷曲的程度(scaleofcurling)确定。In one embodiment, the magnitude of the spatial trajectory of the swipe gesture is determined at least in part by the distance traversed while making the swipe gesture. In another embodiment, the magnitude of the spatial trajectory of the bending gesture is determined at least in part by the scale of curling of the fingers.

其它实施方式可以包括非临时性计算机可读存储介质，所述存储介质存储指令，所述指令能够由处理器执行以便执行上述任何方法。又一种实施方式可包括包含存储器和一个或多个处理器的系统，所述处理器可操作来执行存储在存储器中的指令以便执行上述任何方法的指令。Other embodiments may include a non-transitory computer-readable storage medium storing instructions executable by a processor to perform any of the methods described above. Yet another embodiment may include a system including a memory and one or more processors operable to execute instructions stored in the memory for performing the instructions of any of the methods described above.

在另一种实施方式中，说明了区分源自于3D感觉空间中的单个对象的两个同时做出的姿势的方法。所述方法包括当手臂运动时通过以下将手腕和手指弯曲与手臂姿势的总轨迹区别开：使用电子传感器检测3D感觉空间中的手臂和附着的手腕和手指的位置，从一系列检测的位置计算手臂做出的挥动姿势的空间轨迹，其中，空间轨迹的幅度至少部分由做出挥动姿势时所横过的距离确定，计算手腕和/或手指的弯曲姿势的空间轨迹，其中，弯曲姿势的空间轨迹的幅度至少部分地由手指的卷曲的程度以及手指间的自由度确定，并且评估各空间轨迹的幅度并基于空间轨迹的幅度决定主导姿势。手腕和手指弯曲指手指朝向和/或远离手腕的向内和/或向外运动。在另一实施方式中，手臂做出的挥动姿势是指手臂从一侧到另一侧的向内和/或向外的伸展。所述方法还包括根据主导姿势触发对总轨迹的响应。In another embodiment, a method of distinguishing between two simultaneous gestures originating from a single object in a 3D sensory space is described. The method includes differentiating wrist and finger flexion from the general trajectory of the arm pose as the arm moves by using electronic sensors to detect the position of the arm and attached wrist and fingers in 3D sensory space, calculated from a series of detected positions A spatial trajectory of a swinging gesture made by the arm, wherein the magnitude of the spatial trajectory is determined at least in part by the distance traversed while making the swinging gesture, computing a spatial trajectory of a flexed gesture of the wrist and/or fingers, wherein the spatial trajectory of the flexed gesture The magnitude of the trajectories is determined at least in part by the degree of curling of the fingers and degrees of freedom between the fingers, and the magnitude of each spatial trajectory is evaluated and a dominant pose is determined based on the magnitude of the spatial trajectories. Wrist and finger flex refers to the inward and/or outward movement of the fingers toward and/or away from the wrist. In another embodiment, a waving gesture by an arm refers to an inward and/or outward extension of the arm from side to side. The method also includes triggering a response to the overall trajectory based on the dominant gesture.

其它实施方式可以包括非临时性计算机可读存储介质，所述存储介质存储能够由处理器执行以执行上述任何方法的指令。又一种实施方式可包括包含存储器和一个或多个处理器的系统，所述处理器可操作来执行存储在存储器中的指令以便执行上述任何方法的指令。Other embodiments may include a non-transitory computer-readable storage medium storing instructions executable by a processor to perform any of the methods described above. Yet another embodiment may include a system including a memory and one or more processors operable to execute instructions stored in the memory for performing the instructions of any of the methods described above.

在另一种实施方式中，说明了不考虑用户在3D感觉空间中的位置一致地响应用户的姿势输入的方法。所述方法包括通过以下来自动调整在物理空间中的姿势和姿势界面中所导致的响应的缩放比例：计算控制对象距电子联接至姿势界面的摄像机的距离，基于控制对象距摄像机的距离来将摄像机视野中的运动所横过的视角(apparentangle)按比例缩放至按比例缩放的运动距离，并自动调整响应与反映物理空间中的姿势的按比例缩放的运动距离而不是横过的视角的比例。In another implementation, a method is described that responds consistently to a user's gestural input regardless of the user's position in a 3D sensory space. The method includes automatically adjusting the scaling of gestures in physical space and resulting responses in the gesture interface by calculating a distance of the control object from a camera electronically coupled to the gesture interface, scaling the control object based on the distance of the control object from the camera. The apparent angle traversed by motion in the camera's field of view is scaled to the scaled motion distance, and the response is automatically scaled to reflect the scaled motion distance of the pose in physical space rather than the traversed angle .

所公开的技术的方法和其它实施方式可以包括一个或多个下述特征和/或与所公开的其它方法结合说明的一个或多个特征。Methods and other implementations of the disclosed technology may include one or more of the features described below and/or one or more of the features described in conjunction with other methods disclosed.

所述方法还包括：当横过的视角低于阈值时，降低姿势界面的屏幕虚拟响应。所述方法还包括：当横过的视角高于阈值时，增大姿势界面的屏幕虚拟响应。The method also includes reducing the on-screen virtual response of the gesture interface when the traversed viewing angle is below a threshold. The method also includes increasing the screen virtual response of the gesture interface when the traversed viewing angle is higher than a threshold.

其它实施方式可包括非临时性计算机可读存储介质，所述存储介质存储能够由处理器执行以执行上述任何方法的指令。又一种实施方式可包括包含存储器和一个或多个处理器的系统，所述处理器可操作来执行存储在存储器中的指令以便执行上述任何方法的指令。Other embodiments may include a non-transitory computer-readable storage medium storing instructions executable by a processor to perform any of the methods described above. Yet another embodiment may include a system including a memory and one or more processors operable to execute instructions stored in the memory for performing the instructions of any of the methods described above.

在另一种实施方式中，说明了调整3D感觉空间中的姿势界面的虚拟对象的响应的方法。所述方法包括通过以下调整物理空间中的姿势和姿势界面中的所导致的虚拟对象的响应之间的响应比例：基于所述虚拟对象的数量计算姿势界面的虚拟对象的密度，并且，响应姿势界面中的虚拟对象的密度，自动调整虚拟对象的屏幕虚拟响应与姿势的比例。In another implementation, a method of adjusting the response of a virtual object of a gesture interface in a 3D sensory space is described. The method includes adjusting a response ratio between a gesture in physical space and a resulting response of a virtual object in the gesture interface by calculating a density of virtual objects of the gesture interface based on the number of virtual objects, and, responding to the gesture The density of virtual objects in the interface automatically adjusts the ratio of screen virtual response and posture of virtual objects.

所述方法还包括：当内容密度高于阈值时，响应特定姿势，自动指定虚拟对象的低屏幕虚拟响应。所述方法还包括：当内容密度低于阈值时，响应特定姿势，自动指定虚拟对象的高屏幕虚拟响应。The method also includes automatically assigning a low-screen virtual response of the virtual object in response to a particular gesture when the content density is above a threshold. The method also includes automatically assigning a high-screen virtual response of the virtual object in response to a specific gesture when the content density is below a threshold.

在另一种实施方式中，说明了在3D感觉空间中一致地响应来自多个用户的姿势输入的方法。所述方法包括通过以下来自动调整来自多个用户的物理空间中的姿势和在分享的姿势界面中所导致的响应之间的响应比例：基于在3D感觉空间检测的用户的间隔计算3D感觉空间中的用户间隔，并且当解释物理空间中的姿势的运动距离时，响应用户间隔，自动调整分享的姿势界面的屏幕虚拟响应的比例。In another implementation, a method of responding consistently to gestural input from multiple users in a 3D sensory space is described. The method includes automatically adjusting a response ratio between gestures in physical space from multiple users and resulting responses in a shared gesture interface by computing a 3D sensory space based on intervals of users detected in the 3D sensory space The user interval in , and automatically adjust the scale of the screen virtual response of the shared gesture interface in response to the user interval when interpreting the motion distance of the gesture in the physical space.

在又一种实施方式中，说明了检测用户是否打算与在3D感觉空间中的虚拟对象进行交互的方法。所述方法包括：使用电子传感器检测3D感觉空间中的手指的点击姿势，并根据点击姿势的完成程度确定是否将点击姿势解释为与3D感觉空间中的虚拟对象进行交互。手指的点击姿势是指手指的向下或向上伸展而不同的手指保持伸展或卷曲。所述确定包括：计算手指做出点击姿势时横过的距离，访问姿势数据库以确定所计算的对应于点击姿势的距离的姿势完成值，并将点击识别为响应超过阈值的姿势完成值操纵虚拟对象。In yet another embodiment, a method of detecting whether a user intends to interact with a virtual object in a 3D sensory space is described. The method includes: detecting a click gesture of a finger in a 3D sensory space using an electronic sensor, and determining whether to interpret the click gesture as interacting with a virtual object in the 3D sensory space according to a degree of completion of the click gesture. The finger click gesture refers to the downward or upward extension of the fingers while the different fingers remain extended or curled. The determining includes calculating a distance traversed by the finger while making the click gesture, accessing a gesture database to determine a calculated gesture completion value corresponding to the distance of the click gesture, and identifying the click as manipulating the virtual device in response to the gesture completion value exceeding a threshold value. object.

在一种实施方式中，姿势数据库包括不同的姿势的轨迹和相应的姿势完成值。在另一实施方式中，所述方法还包括通过将点击姿势的空间轨迹与存储在姿势数据库中的至少一个空间轨迹进行对比来计算点击姿势的完成程度。它也包括通过以下来测量点击姿势的完成程度：将做出点击姿势关联至代表虚拟控制的界面元件，并在做出点击姿势时实时修改界面元件。在又一种实施方式中，所述方法还包括空心圆形图标作为界面元件，并通过响应点击姿势逐渐填充圆形图标来实时修改图标。In one embodiment, the gesture database includes trajectories of different gestures and corresponding gesture completion values. In another embodiment, the method further comprises calculating the degree of completion of the click gesture by comparing the spatial trajectory of the click gesture with at least one spatial trajectory stored in the gesture database. It also includes measuring the completion of tap gestures by associating making tap gestures with interface elements representing virtual controls and modifying interface elements in real-time as tap gestures are made. In yet another embodiment, the method further includes a hollow circular icon as an interface element, and the icon is modified in real time by gradually filling the circular icon in response to a tap gesture.

在一种实施方式中，说明了检测用户是否打算与在3D感觉空间中的虚拟对象进行交互的方法。所述方法包括：使用电子传感器检测3D感觉空间中的手指的点击姿势，响应检测点击姿势，激活显示点击姿势的完成程度的屏幕虚拟指示器，并响应超过阈值的点击姿势的完成程度，修改虚拟对象。手指的点击姿势是指手指的向下或向上伸展而不同的手指保持伸展或卷曲。In one implementation, a method of detecting whether a user intends to interact with a virtual object in a 3D sensory space is described. The method includes detecting a tap gesture of a finger in a 3D sensory space using an electronic sensor, activating an on-screen virtual indicator showing a degree of completion of the tap gesture in response to detecting the click gesture, and modifying the virtual indicator in response to the completion degree of the tap gesture exceeding a threshold. object. The finger click gesture refers to the downward or upward extension of the fingers while the different fingers remain extended or curled.

其它实施方式可包括非临时性计算机可读存储介质，所述存储介质存储指令，所述指令能够由处理器执行以便执行上述任何方法。又一种实施方式可包括包含存储器和一个或多个处理器的系统，所述处理器可操作来执行存储在存储器中的指令以便执行上述任何方法的指令。Other embodiments may include a non-transitory computer-readable storage medium storing instructions executable by a processor to perform any of the methods described above. Yet another embodiment may include a system including a memory and one or more processors operable to execute instructions stored in the memory for performing the instructions of any of the methods described above.

在另一种实施方式中，说明了操纵在3D感觉空间中的虚拟对象的方法。所述方法包括：响应手指在3D感觉空间中的点击姿势，选择姿势界面的虚拟对象，并在保持选择所述虚拟对象的同时检测手指在3D感觉空间中随后的指向姿势，并计算指向姿势的作用力矢量。手指的点击姿势是指手指的向下或向上伸展而不同的手指保持伸展或卷曲。在另一实施方式中，作用力矢量的幅度是基于手指做出指向姿势时横过的距离以及在做出指向姿势期间手指的速度。所述方法还包括：当作用力矢量的幅度超过阈值时，将作用力矢量施加在虚拟对象上，并修改虚拟对象。In another embodiment, a method of manipulating a virtual object in a 3D sensory space is described. The method includes: responding to a click gesture of a finger in a 3D sensory space, selecting a virtual object of the gesture interface, and detecting a subsequent pointing gesture of the finger in the 3D sensory space while keeping selecting the virtual object, and calculating a pointing gesture of the finger Force vector. The finger click gesture refers to the downward or upward extension of the fingers while the different fingers remain extended or curled. In another embodiment, the magnitude of the force vector is based on the distance traversed by the finger while making the pointing gesture and the velocity of the finger during the pointing gesture. The method also includes applying the force vector to the virtual object and modifying the virtual object when the magnitude of the force vector exceeds a threshold.

在一种实施方式中，修改虚拟对象包括改变虚拟对象的形状。在另一种实施方式中，修改虚拟对象包括改变虚拟对象的位置。In one embodiment, modifying the virtual object includes changing the shape of the virtual object. In another embodiment, modifying the virtual object includes changing the position of the virtual object.

在另一种实施方式中，说明了3D感觉空间中创建界面元件的方法。所述方法包括：使用电子传感器检测3D感觉空间中的手指的圆形扫动，检测3D感觉空间中的手指的随后的横向扫动，并响应随后的横向扫动，指示(register)按压屏幕虚拟按钮，并执行至少一个相关联的功能。手指的圆形扫动指手指在自由空间中顺时针或逆时针运动。在另一实施方式中，手指的横向扫描指手指的指尖指向屏幕控制时手指的向前或向后运动In another embodiment, a method for creating interface elements in a 3D sensory space is described. The method includes detecting, using an electronic sensor, a circular sweep of a finger in a 3D sensory space, detecting a subsequent lateral sweep of the finger in the 3D sensory space, and in response to the subsequent lateral sweep, registering a virtual press on the screen. button and perform at least one associated function. A circular sweep of the finger refers to a clockwise or counterclockwise movement of the finger in free space. In another embodiment, the lateral sweep of the finger refers to the forward or backward movement of the finger when the tip of the finger is pointed at the screen control

在一种实施方式中，相关联的功能是基于所述姿势界面的上下文选择。在另一实施方式中，相关联的功能是基于姿势界面上的屏幕虚拟按钮的位置选择。所述方法还包括：如果不低于阈值比例的横向扫动运动是在手指指向的方向上，则将横向扫动解释为左击鼠标。所述方法还包括：如果不低于阈值比例的横向扫动运动是在与手指指向相反的方向上，则将横向扫动解释为右击鼠标。In one embodiment, the associated function is based on a contextual selection of the gesture interface. In another embodiment, the associated function is based on the location selection of an on-screen virtual button on the gesture interface. The method also includes interpreting the lateral swipe as a left mouse click if no less than a threshold proportion of the lateral swipe motion is in the direction the finger is pointing. The method also includes interpreting the lateral swipe as a right mouse click if no less than a threshold proportion of the lateral swipe motion is in a direction opposite to the finger pointing.

在另一种实施方式中，说明了在3D感觉空间中创建界面元件的方法。所述方法包括检测3D感觉空间中的双手指竖直扫动，响应双手指竖直扫动在姿势界面构建竖直滑块，检测所述3D感觉空间中靠近所述竖直滑块的随后的手指竖直扫动，和响应手指竖直扫动滚动竖直滑块，并执行至少一个相关联的功能。双手指竖直扫动指两个手指手在自由空间中向上或向下运动而手的其它手指卷曲。在另一实施方式中，一个手指竖直扫动指手的手指在自由空间中向上或向下运动而手的其它手指卷曲。In another embodiment, a method of creating interface elements in a 3D sensory space is described. The method includes detecting a vertical swipe of both fingers in a 3D sensory space, constructing a vertical slider on a gesture interface in response to the vertical swipe of both fingers, and detecting subsequent gestures close to the vertical slider in the 3D sensory space. Swipe the finger vertically, and scroll the vertical slider in response to the vertical swipe of the finger, and perform at least one associated function. A two-finger vertical sweep refers to the movement of a two-fingered hand up or down in free space while the other fingers of the hand curl. In another embodiment, a vertical sweep of one finger means that the fingers of the hand move up or down in free space while the other fingers of the hand curl.

在一种实施方式中，相关联的功能是基于所述姿势界面的上下文选择。在另一实施方式中，相关联的功能是基于所述姿势界面上的竖直滑块的位置选择。In one embodiment, the associated function is based on a contextual selection of the gesture interface. In another embodiment, the associated function is selected based on the position of a vertical slider on said gesture interface.

在又一种实施方式中，说明使用在3D感觉空间中的自由姿势操纵灰度选择小部件的方法。所述方法包括通过以下将灰度选择小部件关联至屏幕虚拟圆盘：响应屏幕虚拟圆盘运动，修改灰度选择小部件上的灰度值。其包括：响应使用电子传感器检测的3D感觉空间中的手指点击，改变所述屏幕虚拟圆盘的位置，并在对应于屏幕上的圆盘的x或y位置的灰度选择小部件上选择特定的灰度值。手指姿势指第一指在相对于第二手指的限制位置中，随后第二手指快速运动远离食指。In yet another embodiment, a method of manipulating a grayscale selection widget using free gestures in a 3D sensory space is described. The method includes associating a grayscale selection widget to an on-screen virtual puck by modifying a grayscale value on the grayscale selection widget in response to movement of the on-screen virtual puck. It includes changing the position of said on-screen virtual puck in response to a finger tap in a 3D sensory space detected using an electronic sensor, and selecting a specific puck on a grayscale selection widget corresponding to the x or y position of the puck on the screen. gray value of . A finger pose refers to a first finger in a restrained position relative to a second finger, followed by a rapid movement of the second finger away from the index finger.

其它实施方式可包括非临时性计算机可读存储介质，所述存储介质存储指令，所述指令能够由处理器执行以便执行上述任何方法的指令。又一种实施方式可包括包含存储器和一个或多个处理器的系统，所述处理器可操作来执行存储在存储器中的指令以便执行上述任何方法的指令。Other embodiments may include a non-transitory computer-readable storage medium storing instructions executable by a processor to perform instructions for any of the methods described above. Yet another embodiment may include a system including a memory and one or more processors operable to execute instructions stored in the memory for performing the instructions of any of the methods described above.

在又一种实施方式中，说明使用在3D感觉空间中的自由姿势操纵姿势界面的多个控制的方法。所述方法包括通过以下将显示器设置和灰度选择小部件关联至屏幕虚拟圆盘：响应屏幕虚拟圆盘运动，修改显示器设定小部件上的亮度值和灰度选择小部件上的灰度值。其包括：响应使用电子传感器检测的3D感觉空间中的手指点击，改变所述屏幕虚拟圆盘的位置，并选择对应于屏幕上的圆盘的x或y位置的特定的亮度值和灰度值。In yet another embodiment, a method of manipulating multiple controls of a gesture interface using free gestures in a 3D sensory space is described. The method includes associating the display settings and grayscale selection widgets to the screen puck by modifying the brightness value on the display settings widget and the grayscale value on the grayscale selection widget in response to movement of the screen puck . It involves changing the position of the virtual puck on the screen in response to a finger click in a 3D sensory space detected using an electronic sensor, and selecting specific brightness and grayscale values corresponding to the x or y position of the puck on the screen .

在又一种实施方式中，说明了在3D感觉空间中创建界面元件的方法。所述方法包括：使用电子传感器检测3D感觉空间中的手指的圆形扫动，响应圆形扫动，在姿势界面中构建屏幕虚拟圆盘，检测3D感觉空间中的手指的随后的漩涡扫动，并响应随后的涡旋扫动，旋转圆盘，并执行至少一个相关联的功能。手指的圆形扫动指手指在自由空间中顺时针或逆时针运动。在另一实施方式中，手指的漩涡运动指手指在自由空间中重复做出顺时针或逆时针运动并结合有手指的向上或向下的运动。In yet another embodiment, a method of creating interface elements in a 3D sensory space is described. The method comprises: using an electronic sensor to detect a circular sweep of a finger in a 3D sensory space, in response to the circular sweep, constructing an on-screen virtual disk in a gesture interface, detecting a subsequent swirl sweep of the finger in a 3D sensory space , and in response to a subsequent vortex sweep, the disk is rotated and at least one associated function is performed. A circular sweep of the finger refers to a clockwise or counterclockwise movement of the finger in free space. In another embodiment, the swirling motion of the finger refers to repeated clockwise or counterclockwise motion of the finger in free space combined with upward or downward motion of the finger.

在一种实施方式中，说明了在三维(3D)感觉空间中将关注的姿势与非关注的姿势区别开的方法。所述方法包括接收限定一个或多个基准姿势的基准特性的输入，使用电子传感器检测三维(3D)感觉空间中的一个或多个实际姿势并使用来自电子传感器的数据确定实际特性，将实际姿势和基准姿势进行对比来确定一组关注的姿势，并将一组关注的姿势和相应的姿势参数提供给进一步的处理。In one implementation, a method for distinguishing gestures of interest from gestures of non-interest in a three-dimensional (3D) sensory space is described. The method includes receiving input defining reference characteristics of one or more reference poses, detecting one or more actual poses in a three-dimensional (3D) sensory space using electronic sensors and determining the actual characteristics using data from the electronic sensors, comparing the actual poses to A set of poses of interest is determined by comparison with the reference pose, and the set of poses of interest and corresponding pose parameters are provided for further processing.

在一种实施方式中，当基准特性是姿势路径时，直线路径的实际姿势(如横向挥动)被解释为一组关注的姿势。根据一种实施方式，当基准特性是姿势速度时，具有高速度的实际姿势被解释为一组关注的姿势。根据一种实施方式，当基准特性是姿势形态时，使用以特定的手指指向的手做出的实际姿势被解释为一组关注的姿势。当基准特性是姿势形态时，根据一种实施方式握拳的手的实际姿势被解释为一组关注的姿势。In one embodiment, when the reference characteristic is a gesture path, an actual gesture (eg, a lateral swipe) of a straight path is interpreted as a set of gestures of interest. According to one embodiment, when the reference characteristic is gesture velocity, actual gestures with high velocities are interpreted as a set of gestures of interest. According to one embodiment, when the reference characteristic is a gesture modality, the actual gesture made with the hand pointed at a particular finger is interpreted as a set of gestures of interest. When the reference characteristic is gesture morphology, according to one embodiment the actual gesture of the fisting hand is interpreted as a set of gestures of interest.

在另一种实施方式中，当基准特性是姿势的形状时，手竖起大拇指的实际姿势被解释为一组关注的姿势。根据一种实施方式，当基准特性是姿势长度时，挥动姿势被解释为一组关注的姿势。在又一种实施方式中，当基准特性是姿势的位置时，距所述电子传感器的距离小于阈值的实际姿势被解释为一组关注的姿势。当基准特性是姿势的持续时间时，在3D感觉空间中持续阈值时间周期的实际姿势,而非在3D感觉空间中持续的时间小于阈值时间周期的实际姿势被解释为一组关注的姿势。In another implementation, when the reference characteristic is the shape of the gesture, the actual thumbs up gesture of the hand is interpreted as a set of gestures of interest. According to one embodiment, the waving gesture is interpreted as a set of gestures of interest when the reference characteristic is gesture length. In yet another embodiment, when the reference characteristic is the position of a gesture, actual gestures whose distance from said electronic sensor is less than a threshold are interpreted as a set of gestures of interest. When the reference characteristic is the duration of a pose, actual poses that persist in the 3D sensory space for a threshold time period, but not actual poses that persist in the 3D sensory space for less than the threshold time period, are interpreted as a set of poses of interest.

在另一种实施方式中，说明为特定用户定制姿势解释的方法。所述方法包括：提示用户选择用于自由空间中的姿势的特性值并接收选择的特性值，提示用户在三维(3D)感觉空间中执行姿势的特性边界集中示范，从由电子传感器捕获的边界集中示范确定姿势的一组参数，并存储用于姿势识别的该组参数和相应的值。In another embodiment, a method of customizing gesture interpretation for a particular user is described. The method includes prompting a user to select a property value for a gesture in free space and receiving the selected property value, prompting the user to perform a focused demonstration of a property boundary of the gesture in a three-dimensional (3D) sensory space, from a boundary captured by an electronic sensor Centrally demonstrate a set of parameters for determining gestures, and store the set of parameters and corresponding values for gesture recognition.

所述方法还包括使用用于提示用户选择姿势的特性值的问卷。在一种实施方式中，使用问卷提示用户选择特性值包括从用户接收姿势在3D感觉空间内的最小阈值时间段，在此之前不解释姿势。在另一实施方式中，执行特性边界集中示范包括用户使用特定手指做出手指向姿势作为姿势形态。执行特性边界集中示范还包括用户使用手做出握拳姿势作为姿势的形态。执行特性边界集中示范还包括用户使用手拇指向上或拇指向下的姿势作为姿势的形状。The method also includes using a questionnaire for prompting the user to select a characteristic value of the gesture. In one embodiment, using the questionnaire to prompt the user to select a property value includes receiving from the user a minimum threshold period of time for the gesture to be within the 3D sensory space, before interpreting the gesture. In another embodiment, performing the characteristic boundary-focused demonstration includes the user making a finger-pointing gesture using a specific finger as a gesture modality. Executing characteristic boundary-focused demonstration also includes a form in which the user makes a fist gesture with a hand as a gesture. Executing characteristic boundary-focused demonstrations also include the user using a thumb-up or thumb-down gesture as the shape of the gesture.

所述方法还包括通过以下测试对特定姿势的解释：提示用户在3D感觉空间执行特定姿势的完整姿势示范，从由电子传感器捕获的一组特定姿势确定特定姿势的一组参数，将特定姿势的该组参数与从边界集中示范确定的响应的一组参数和选择的特性值进行对比，并且向用户报告对比的结果并接收对特定姿势的解释是否正确的确认信息。The method also includes testing the interpretation of the particular gesture by prompting the user to perform a complete gesture demonstration of the particular gesture in a 3D sensory space, determining a set of parameters for the particular gesture from the set of particular gestures captured by the electronic sensors, combining the The set of parameters is compared to a set of parameters and selected characteristic values from the bounding set that exemplify the determined response, and the results of the comparison are reported to the user and confirmation is received that the interpretation of the particular gesture is correct.

其它实施方式可包括非临时性计算机可读存储介质，所述存储介质存储指令，所述指令可以由处理器执行，以便执行上述任何方法。又一种实施方式可包括包含存储器和一个或多个处理器的系统，所述处理器可操作来执行存储在存储器中的指令以便执行上述任何方法的指令。Other embodiments may include a non-transitory computer-readable storage medium storing instructions executable by a processor to perform any of the methods described above. Yet another embodiment may include a system including a memory and one or more processors operable to execute instructions stored in the memory for performing the instructions of any of the methods described above.

在另一个方面，一种识别姿势的机器实施的方法包括：提示输入在自由空间中广泛地限定姿势的一个或多个特性，将信息传送给机器(无论是否有表面接触)，接收一个或多个输入特性，从所接收的输入确定限定姿势的一组训练参数，提示输入所述姿势的至少一个示例，从姿势的至少一个示例确定对应于该组的训练参数的一组值，并将该组值提供给存储器用于识别姿势。所述方法可包括：存储一组对象参数，所述一组对象参数限定至少一个对象并与姿势相关联，所述至少一个对象可显示在无接触的显示器上。In another aspect, a machine-implemented method of recognizing a gesture includes prompting for input broadly defining one or more characteristics of the gesture in free space, transmitting the information to the machine (with or without surface contact), receiving one or more input characteristics, determining from the received input a set of training parameters defining a pose, prompting for input of at least one instance of the pose, determining from the at least one instance of the pose a set of values corresponding to the set of training parameters, and applying the The set of values is provided to memory for recognizing gestures. The method may include storing a set of object parameters defining at least one object and associated with a gesture, the at least one object displayable on a contactless display.

从姿势的至少一个示例确定对应于该组的训练参数的一组值可以包括：至少部分基于一个或多个特性确定是否标准化所述一组训练值中的至少一个，并可选地，至少部分基于一个或多个特性确定是否忽视所述一组训练值中的至少一个(其可以包括指示所述姿势的尺寸是否重要信息)。限定姿势的该组训练参数还可以包括限定姿势的至少一个运动的至少一个参数。提示输入所述姿势的至少一个示例可以包括提示输入最小的合理的运动或提示输入最大合理运动。Determining from at least one instance of a pose a set of values corresponding to the set of training parameters may include determining whether to normalize at least one of the set of training values based at least in part on one or more characteristics, and optionally, at least in part Whether to disregard at least one of the set of training values (which may include information indicating whether the size of the gesture is important) is determined based on one or more characteristics. The set of training parameters defining a gesture may also include at least one parameter defining at least one movement of the gesture. At least one example of prompting for the gesture may include prompting for a minimum reasonable motion or prompting for a maximum reasonable motion.

在另一个方面，所公开的技术涉及非临时性计算机可读介质，所述介质存储一个或多个指令，当由一个或多个处理器执行时，所述一个或多个指令使得所述一个或多个处理器执行以下步骤：提示输入在自由空间中广泛地限定姿势的一个或多个特性，将信息传送给机器(无论是否有表面接触)，接收一个或多个输入特性，从所接收的输入确定限定姿势的一组训练参数，提示输入所述姿势的至少一个示例，从姿势的至少一个示例确定对应于该组的训练参数的一组值，并将该组值提供给存储器用于识别姿势。In another aspect, the disclosed technology relates to a non-transitory computer-readable medium storing one or more instructions that, when executed by one or more processors, cause the one or more The processor or processors perform the steps of: prompting for input one or more characteristics broadly defining the gesture in free space, transmitting the information to the machine (with or without surface contact), receiving the one or more input characteristics, receiving from the received The input determines a set of training parameters defining a gesture, prompts for input of at least one instance of the gesture, determines from the at least one instance of the gesture a set of values corresponding to the set of training parameters, and provides the set of values to a memory for use in Recognize gestures.

在另一个方面，所公开的技术涉及控制用户与装置的动态交互的方法。在代表性的实施方式中，所述方法包括捕获用户的多个时间连续的图像；计算分析所述用户的图像来识别用户的姿势，并识别与其相关的缩放比例，所述缩放比例指示在做出姿势时横过的实际姿势距离；计算确定缩放比例和显示的运动之间的比例，所述显示的运动对应于待显示在装置上的动作；基于比例将动作显示在装置上，并基于外部参数调整比例。外部参数可以是实际姿势距离，或者是对应于做出的动作的所捕获的图像中的像素距离和以像素为单位的显示屏尺寸的比例。In another aspect, the disclosed technology relates to a method of controlling a user's dynamic interaction with a device. In an exemplary embodiment, the method includes capturing a plurality of time-sequential images of a user; computationally analyzing the images of the user to identify gestures of the user, and identifying a scale associated therewith, the scale indicating the Actual gesture distance traversed when gesturing; calculation determines the ratio between the zoom scale and the displayed motion corresponding to the action to be displayed on the device; displays the motion on the device based on the scale, and based on the external Parameter adjustment scale. The extrinsic parameter may be the actual gesture distance, or the ratio of the pixel distance in the captured image corresponding to the performed action to the display screen size in pixels.

在不同实施方式中，分析用户的图像包括：(i)识别形状和图像中的一个或多个人体部位的位置，和(ii)基于图像中识别的身体部位的形状和位置之间的关系重构人体部位在3D空间中的位置和形状。在一种实施方式中，分析用户的图像还包括：在3D空间中，依时间次序组合所述身体部位的重构位置和形状。另外，所述方法可以包括限定身体部位的3D模型和基于所述3D模型在3D空间中重构身体部位的位置和形状。In various implementations, analyzing the image of the user includes: (i) identifying the shape and location of one or more body parts in the image, and (ii) re- Construct the position and shape of human body parts in 3D space. In one embodiment, analyzing the image of the user further includes: combining the reconstructed positions and shapes of the body parts in time order in 3D space. Additionally, the method may comprise defining a 3D model of the body part and reconstructing the position and shape of the body part in 3D space based on the 3D model.

可以通过将姿势与姿势数据库中的记录进行对比来识别缩放比例，姿势数据库可以包括各自将姿势与输入参数相关联的一系列电子存储记录。此外，该姿势可以作为矢量存储在记录中。Scale can be identified by comparing the gesture to records in a gesture database, which can include a series of electronically stored records that each associate a gesture with an input parameter. Additionally, the pose can be stored in the record as a vector.

在另一个方面，所公开的技术涉及一种使用户与具有显示屏的装置进行动态交互的系统。在不同实施方式中，该系统包括朝向视场的一个或多个摄像机；一个或多个源，所述源将照明引入视场中的用户；姿势数据库，所述姿势数据库包括一系列电子存储记录，每个记录将一个姿势与输入参数相关联；以及联接到摄像机和数据库中的图像分析仪。在一种实施方式中，图像分析仪被配置成操作所述摄像机捕获用户的多个时间连续的图像，分析用户的图像以识别由用户做出的姿势，并且将所识别的姿势与姿势数据库记录进行对比来识别与其相关联的输入参数；输入参数对应于一个动作，所述动作根据在做出姿势时横过的实际姿势距离与对应于该动作的显示的运动之间的比例显示在显示器上，图像分析仪基于外部参数调整该比例。In another aspect, the disclosed technology relates to a system for dynamically interacting a user with a device having a display screen. In various embodiments, the system includes one or more cameras facing the field of view; one or more sources that direct illumination to a user in the field of view; a gesture database comprising a series of electronically stored records , each record associating a pose with input parameters; and an image analyzer coupled to the camera and database. In one embodiment, the image analyzer is configured to operate the camera to capture a plurality of time-sequential images of the user, analyze the images of the user to identify gestures made by the user, and record the identified gestures with the gesture database A comparison is made to identify the input parameter associated therewith; the input parameter corresponds to an action that is displayed on the display according to the ratio between the actual gesture distance traversed while making the gesture and the displayed motion corresponding to the action , the image analyzer adjusts this scale based on the extrinsic parameters.

图像分析仪可以进一步被配置为(i)识别用户图像中的人体的一个或多个部位的形状和位置和(ii)基于图像中的身体部位的识别的形状和位置之间的关系重构人体部位在3D空间中的位置和形状。另外，图像分析仪可以被配置为限定3D模型，并基于3D模型重构人体部位在3D空间中的位置和形状。在一种实施方式中，图像分析仪被配置为：估计3D空间中的身体部位的轨迹。The image analyzer may be further configured to (i) identify the shape and position of one or more parts of the human body in the user image and (ii) reconstruct the human body based on the relationship between the identified shape and position of the body part in the image The position and shape of the part in 3D space. Additionally, the image analyzer can be configured to define a 3D model and reconstruct the position and shape of the body part in 3D space based on the 3D model. In one embodiment, the image analyzer is configured to estimate a trajectory of the body part in 3D space.

外部参数可以是实际姿势距离，或者是对应于所做出的姿势在所捕获的图像中的像素距离与以像素为单位的显示屏尺寸的比例。每个姿势可以具有不同的比例，这些不同的比例存储在数据库的各姿势记录中；或者，姿势数据库中的所有姿势可以具有相同的比例。The extrinsic parameter may be the actual gesture distance, or the ratio of the pixel distance in the captured image corresponding to the made gesture to the display screen size in pixels. Each gesture may have a different scale, which is stored in each gesture record in the database; alternatively, all gestures in the gesture database may have the same scale.

所公开的技术的另一个方面涉及动态显示用户与装置的交互的方法。在代表性的实施方式中，所述方法包括(i)捕获用户的多个时间连续的图像，(ⅱ)计算分析所述用户的图像来识别用户的姿势，(ⅲ)将所识别的姿势与姿势数据库中的记录进行对比以识别的姿势，(ⅳ)计算确定所识别的姿势的完成程度，以及(v)根据确定的完成程度修改装置的显示内容。内容可以包括图标、条、颜色渐变或色彩亮度。Another aspect of the disclosed technology relates to a method of dynamically displaying user interactions with a device. In an exemplary embodiment, the method includes (i) capturing a plurality of time-sequential images of a user, (ii) computationally analyzing the images of the user to identify gestures of the user, (iii) combining the identified gestures with Records in the gesture database are compared to the identified gesture, (iv) a computational determination of the degree of completion of the recognized gesture is made, and (v) a display of the device is modified based on the determined degree of completion. Content can include icons, bars, color gradients, or color intensities.

在不同实施方式中，所述方法包括重复动作(ⅰ)-(ⅴ)，直到完成程度超过预定阈值，然后使所述装置进行完成触发动作。在一种实施方式中，分析用户的图像包括：识别用户图像中人体的一个或多个部位的形状和位置。在一些实施方式中，所述方法还包括：根据物理模拟模型并基于姿势完成程度显示响应姿势的动作。内容可以包括图标、条、颜色渐变或色彩亮度。In various embodiments, the method includes repeating acts (i)-(v) until the degree of completion exceeds a predetermined threshold, and then causing the device to perform a completion triggering action. In one embodiment, analyzing the image of the user includes: recognizing the shape and position of one or more parts of the human body in the image of the user. In some implementations, the method further includes displaying an action in response to the gesture based on the physical simulation model and based on the degree of completion of the gesture. Content can include icons, bars, color gradients, or color intensities.

在另一个方面，所公开的技术涉及一种用户与具有显示屏的装置的动态交互系统。在某些实施方式中，该系统包括朝向视场的一个或多个摄像机；一个或多个源，所述源(例如光源和/或声源)将照明引入视场中的用户；姿势数据库，所述姿势数据库包括一系列电子存储记录，每个记录指定一个姿势；以及联接到摄像机的图像分析仪。在一种实施方式中，图像分析仪被配置成操作所述摄像机捕获用户的多个时间连续的图像，分析用户的图像以识别用户的姿势，并且将所识别的姿势与姿势数据库的记录进行对比来识别姿势，确定所识别的姿势的完成程度，并在装置的屏幕上显示反应所确定的完成程度。指示器可以包括图标、条、颜色渐变或色彩亮度。In another aspect, the disclosed technology relates to a system for dynamic interaction of a user with a device having a display screen. In some embodiments, the system includes one or more cameras facing the field of view; one or more sources (e.g., light sources and/or sound sources) that direct illumination to the user in the field of view; a gesture database, The gesture database includes a series of electronically stored records, each record specifying a gesture; and an image analyzer coupled to a camera. In one embodiment, the image analyzer is configured to operate the camera to capture a plurality of time-sequential images of the user, analyze the images of the user to identify gestures of the user, and compare the identified gestures to records of a gesture database to recognize a gesture, determine the degree of completion of the recognized gesture, and display the response on the screen of the device to the determined degree of completion. Indicators can include icons, bars, color gradients, or color brightness.

在不同实施方式中，图像分析仪被配置成确定完成程度是否超过预定阈值，如果完成程度超过预定阈值，则使该装置进行完成触发(completion-triggered)的动作。图像分析仪可以进一步被配置为根据物理模拟模型并基于姿势完成程度显示响应姿势的动作。所显示的动作可进一步基于运动模型。In various embodiments, the image analyzer is configured to determine whether the degree of completion exceeds a predetermined threshold, and if the degree of completion exceeds the predetermined threshold, cause the device to perform a completion-triggered action. The image analyzer may be further configured to display motion in response to the gesture based on the physical simulation model and based on the degree of completion of the gesture. The displayed motion may further be based on a motion model.

公开的技术的又一个方面涉及一种控制用户与装置的动态交互的方法。在代表性的实施方式中，所述方法包括捕获用户的多个时间连续的图像，计算分析用户的图像，以识别多个用户的姿势，计算确定主导姿势，并基于主导姿势将动作显示在装置上。Yet another aspect of the disclosed technology relates to a method of controlling a user's dynamic interaction with a device. In an exemplary embodiment, the method includes capturing a plurality of time-sequential images of a user, computationally analyzing the images of the user to identify gestures of the plurality of users, computationally determining a dominant gesture, and displaying the action on a device based on the dominant gesture superior.

主导姿势可通过过滤所述多个姿势来确定。在一种实施方式中，过滤被迭代地执行。此外，每个所述姿势可以被代表为一个轨迹。在一些实施方式中，每个轨迹可以代表为沿着欧拉空间中的六个欧拉自由度的矢量，并且具有最大幅度的示例被确定为主导姿势。A dominant gesture may be determined by filtering the plurality of gestures. In one embodiment, filtering is performed iteratively. Furthermore, each of the gestures can be represented as a trajectory. In some implementations, each trajectory may be represented as a vector along the six Euler degrees of freedom in Euler space, and the instance with the largest magnitude is determined to be the dominant pose.

在不同实施方式中，分析所述用户的图像包括：(i)识别用户的图像中的人体的一个或多个部位的形状和位置和(ii)基于图像中的识别的身体部位的形状和位置之间的关系重构人体部位在3D空间中的位置和形状。在一种实施方式中，方法还包括限定身体部位的3D模型，并基于3D模型重构人体部位在3D空间中的位置和形状。分析用户的图像包括在3D空间中依时间次序组合所述身体部位的重构位置和形状。In various embodiments, analyzing the image of the user includes: (i) identifying the shape and location of one or more parts of the human body in the image of the user and (ii) identifying the shape and location of the identified body part based on the image The relationship between reconstructs the position and shape of human body parts in 3D space. In one embodiment, the method further includes defining a 3D model of the body part, and reconstructing the position and shape of the body part in 3D space based on the 3D model. Analyzing the image of the user includes combining the reconstructed positions and shapes of the body parts in time order in 3D space.

在另一个方面，公开的技术涉及一种控制用户与装置的动态交互的系统。在不同实施方式中，该系统包括朝向视场的一个或多个摄像机；一个或多个源，所述源(例如光源和/或声源)将照明引导至视场中的用户；姿势数据库，所述姿势数据库包括一系列电子存储记录，每个记录指定一个姿势；以及联接到摄像机和数据库的图像分析仪。在一种实施方式中，图像分析仪被配置成操作所述摄像机捕获用户的多个时间连续的图像，分析用户的图像以识别多个用户姿势，确定主导姿势，并基于主导姿势将动作显示在装置上。In another aspect, the disclosed technology relates to a system for controlling a user's dynamic interaction with a device. In various embodiments, the system includes one or more cameras facing the field of view; one or more sources, such as light sources and/or sound sources, that direct illumination to the user in the field of view; a gesture database, The gesture database includes a series of electronically stored records, each record specifying a gesture; and an image analyzer coupled to the camera and the database. In one embodiment, the image analyzer is configured to operate the camera to capture a plurality of time-sequential images of the user, analyze the images of the user to identify a plurality of user gestures, determine a dominant gesture, and display actions based on the dominant gesture in on the device.

图像分析仪可被进一步配置成通过过滤所述多个姿势来确定主导姿势。在一种实施方式中，过滤被迭代地执行。此外，图像分析仪可以配置成将每个姿势代表为一个轨迹。每个轨迹可以被代表为沿着欧拉空间中的六个欧拉自由度的矢量，并且具有最大的幅度的示例被确定为主导姿势。The image analyzer may be further configured to determine a dominant gesture by filtering the plurality of gestures. In one embodiment, filtering is performed iteratively. Additionally, the image analyzer can be configured to represent each pose as a trajectory. Each trajectory can be represented as a vector along the six Euler degrees of freedom in Euler space, and the instance with the largest magnitude is determined to be the dominant pose.

在一个方面，控制用户与装置的动态交互的方法包括：捕获用户的多个时间连续的图像，计算分析所述用户的图像的子集，以便识别用户的与屏幕虚拟圆盘的位置相接触的姿势，计算分析所述用户的图像的又一个子集，以便识别用户的使屏幕虚拟圆盘运动至新位置的姿势，根据圆盘的新位置修改软件应用程序的参数。In one aspect, a method of controlling a user's dynamic interaction with a device includes capturing a plurality of time-sequential images of a user, and computationally analyzing a subset of the user's images to identify the user's contact with the location of a virtual puck on the screen. Gestures, yet another subset of the images of the user are computationally analyzed to identify gestures of the user that move the virtual puck on the screen to a new position, modifying the parameters of the software application according to the new position of the puck.

圆盘可以是圆形、方形或三角形；参数可以是色彩，滑动圆盘可改变颜色。识别用户的滑动屏幕虚拟圆盘的姿势可以包括：姿势在修改参数之前超过阈值距离。圆盘可在用户姿势停止接触圆盘之后继续运动一段时间。圆盘可在用户姿势停止接触圆盘之后弹回固定位置。用户的图像的子集可以被计算分析，以便识别用户的创建用户界面元件的命令的姿势。姿势可以是圆周运动，两个手指横向运动，或用户手指的向前或反向运动，用户界面元件的运动分别可以是按钮，滑块，或鼠标点击。The disc can be a circle, square or triangle; the parameter can be a color, and sliding the disc changes the color. Identifying the user's gesture of swiping the screen virtual puck may include the gesture exceeding a threshold distance before modifying a parameter. The puck may continue to move for a period of time after the user gesture ceases to contact the puck. The puck can spring back to a fixed position after the user gesture stops contacting the puck. A subset of the user's images may be computationally analyzed to identify gestures of the user's commands to create user interface elements. Gestures can be circular motions, two-finger lateral motions, or forward or reverse motions of the user's fingers, and motions of user interface elements can be buttons, sliders, or mouse clicks, respectively.

在另一个方面，使得用户能够与具有显示屏的装置进行动态交互的系统包括：朝向视场的摄像机；源，所述源将照明引导至视场中的用户；姿势数据库，所述姿势数据库包括一系列电子存储记录，每个记录将姿势与输入参数相关联。图像分析仪联接到摄像机，被配置为捕获用户的多个时间连续的图像，计算分析所述用户的图像的子集，以便识别用户的与屏幕虚拟圆盘的位置相接触的姿势，计算分析所述用户的图像的又一个子集，以便识别用户的使屏幕虚拟圆盘运动至新位置的姿势，根据圆盘的新位置修改软件应用程序的参数。In another aspect, a system that enables a user to dynamically interact with a device having a display screen includes: a camera facing a field of view; a source directing illumination to a user in the field of view; a gesture database comprising A series of electronically stored records, each of which associates a pose with an input parameter. an image analyzer coupled to the camera and configured to capture a plurality of time-sequential images of the user, computationally analyze a subset of the images of the user to identify gestures of the user in contact with the location of the virtual puck on the screen, computationally analyze the Yet another subset of the image of the user is described in order to recognize the user's gestures that move the virtual puck on the screen to a new position, modifying the parameters of the software application according to the new position of the puck.

圆盘可以是圆形、方形或三角形；参数可以是色彩，滑动圆盘可改变颜色。识别用户的滑动屏幕虚拟圆盘的姿势可以包括：姿势在修改参数之前超过阈值距离。圆盘可在用户姿势停止接触圆盘之后继续运动一段时间。圆盘可在用户姿势停止接触圆盘之后弹回固定位置。用户的图像的子集可以被计算分析，以识别用户的创建用户界面元件的命令的姿势。姿势可以是圆周运动，两个手指横向运动，或用户手指的向前或反向运动，用户界面元件的运动分别可以是按钮，滑块，或鼠标点击。The disc can be a circle, square or triangle; the parameter can be a color, and sliding the disc changes the color. Identifying the user's gesture of swiping the screen virtual puck may include the gesture exceeding a threshold distance before modifying a parameter. The puck may continue to move for a period of time after the user gesture ceases to contact the puck. The puck can spring back to a fixed position after the user gesture stops contacting the puck. A subset of the user's images may be computationally analyzed to identify gestures of the user's commands to create user interface elements. Gestures can be circular motions, two-finger lateral motions, or forward or reverse motions of the user's fingers, and motions of user interface elements can be buttons, sliders, or mouse clicks, respectively.

在整个本说明书中对“实施例”、“示例”、“一个实施方式”或者“实施方式”的引用意味着与示例结合说明的特定特征、结构或特性被包括在本技术的至少一个示例中。因此，在整个说明书的不同的地方中出现的短语“在一个示例中”、“在示例中”、“一个实施方式”、或“一个实施”不一定都指代同一示例。此外，特定的特征、结构、程序、动作或特性可以以任何方式结合在本技术的一个或多个示例中。这里提供的标题仅用于方便，不旨在限制或解释权利技术的范围或含义。Reference throughout this specification to "an embodiment," "an example," "one implementation," or "an implementation" means that a particular feature, structure, or characteristic described in connection with the example is included in at least one example of the present technology . Thus, appearances of the phrases "in one example," "in an example," "one implementation," or "one implementation" in various places throughout this specification are not necessarily all referring to the same example. Furthermore, the particular features, structures, procedures, acts or characteristics may be combined in any manner in one or more examples of the present technology. The headings are provided here for convenience only and are not intended to limit or interpret the scope or meaning of the claimed technology.