Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present invention with unnecessary detail.
In order to illustrate the technical solutions of the present invention, specific examples are described below.
A first aspect of an embodiment of the present invention provides a gesture recognition method. As shown in Fig. 1, the method specifically includes the following steps:
S101: acquiring point cloud data of the target gesture.
Optionally, as a specific implementation manner of the gesture recognition method provided by the embodiment of the present invention, after the point cloud data of the target gesture is obtained, the method further includes:
preprocessing the point cloud data of the target gesture, where the preprocessing includes removing erroneous point cloud data and performing data enhancement on the point cloud data of the target gesture.
In the embodiment of the present invention, because different people understand the standard gesture differently, and because of errors caused by human factors, the point cloud data of the target gesture may contain erroneous data, which can be removed through preprocessing. In addition, if the amount of point cloud data is small, data enhancement can be performed on the point cloud data, for example, by adding random perturbations to the original data to form new data.
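For concreteness, a minimal sketch of such preprocessing follows, assuming each gesture frame is an (N, 4) NumPy array of (x, y, z, radial velocity) points; the array layout, the range threshold, and the noise scale are illustrative assumptions rather than values taken from this description.

import numpy as np

def preprocess_point_cloud(frames, max_range=2.0, sigma=0.01):
    # frames: list of (N, 4) arrays of (x, y, z, radial velocity) per frame.
    cleaned = []
    for pts in frames:
        # Remove erroneous point cloud data: here, points whose range lies
        # outside the plausible hand region are treated as errors and dropped.
        r = np.linalg.norm(pts[:, :3], axis=1)
        cleaned.append(pts[r <= max_range])
    # Data enhancement: form new data by adding small random perturbations
    # to the original data, as suggested above.
    augmented = [pts + np.random.normal(0.0, sigma, pts.shape) for pts in cleaned]
    return cleaned, augmented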
S102: determining a plurality of feature extraction parameters, and extracting feature values corresponding to the feature extraction parameters from the point cloud data to obtain the gesture features of the target gesture.
Optionally, as a specific implementation manner of the gesture recognition method provided by the embodiment of the present invention, the point cloud data of the target gesture is the point cloud data corresponding to each gesture frame of the target gesture, where each gesture frame of the target gesture is obtained by detecting the target gesture with a target detection device; the plurality of feature extraction parameters include distance, speed, azimuth angle, and pitch angle.
Extracting the feature values corresponding to the feature extraction parameters from the point cloud data to obtain the gesture features of the target gesture includes:
extracting the distance value, speed value, azimuth angle value, and pitch angle value corresponding to each gesture frame from the point cloud data corresponding to that gesture frame, to obtain the gesture features of the target gesture.
In the embodiment of the present invention, the target gesture is detected by the target detection device, which may be a millimeter-wave radar, to obtain a gesture frame sequence of the target gesture. Each gesture frame in the sequence is then analyzed to obtain the point cloud data corresponding to that frame, so that the point cloud data corresponding to each gesture frame of the target gesture is obtained.
By determining the four parameters of distance, speed, azimuth angle, and pitch angle as the feature extraction parameters, and extracting these four values for each gesture frame from the point cloud data corresponding to that frame, each frame of data of the target gesture is no longer a large range-Doppler image but only four simple numerical values, which greatly reduces the number of parameters involved in gesture recognition.
In the embodiment of the present invention, the distance refers to the distance between the target gesture and the target detection device, the speed refers to the speed of the target gesture, the azimuth angle refers to the azimuth of the target gesture relative to the target detection device, and the pitch angle refers to the pitch of the target gesture relative to the target detection device.
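By way of illustration only, one possible realization of this per-frame reduction is sketched below; representing each frame as an (N, 4) array of (x, y, z, radial velocity) and averaging over a frame's detected points are assumptions, since the description does not fix how the points of one frame are reduced to a single value per parameter.

import numpy as np

def extract_frame_features(frames):
    # Reduce each gesture frame to four scalars:
    # distance, speed, azimuth angle, and pitch angle.
    feats = {"distance": [], "speed": [], "azimuth": [], "pitch": []}
    for pts in frames:
        x, y, z, v = pts[:, 0], pts[:, 1], pts[:, 2], pts[:, 3]
        r = np.sqrt(x**2 + y**2 + z**2)
        feats["distance"].append(r.mean())                # range to the radar
        feats["speed"].append(v.mean())                   # mean radial velocity
        feats["azimuth"].append(np.arctan2(x, y).mean())  # horizontal angle (axis convention assumed)
        feats["pitch"].append(np.arcsin(z / r).mean())    # elevation angle
    return {k: np.asarray(vals) for k, vals in feats.items()}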
Optionally, as a specific implementation manner of the gesture recognition method provided by the embodiment of the present invention, after obtaining the gesture feature of the target gesture, the method further includes:
determining the magnitude relationship between the number of feature values corresponding to each feature extraction parameter and a preset threshold, where the preset threshold is the preset input size of the gesture recognition model;
and performing feature alignment on the gesture features of the target gesture based on the magnitude relationship.
Optionally, as a specific implementation manner of the gesture recognition method provided in the embodiment of the present invention, performing feature alignment on the gesture features of the target gesture based on the magnitude relationship includes:
calculating the difference between the number of feature values corresponding to each feature extraction parameter and the preset threshold, and taking the absolute value of the difference;
for each feature extraction parameter whose number of feature values is smaller than the preset threshold, supplementing its feature values based on the absolute value of the difference;
and for each feature extraction parameter whose number of feature values is greater than the preset threshold, deleting feature values based on the absolute value of the difference.
In the embodiment of the present invention, because each person's habits when making gestures are different, the number of gesture frames detected by the target detection device is not a fixed value, and consequently the number of feature values corresponding to each feature extraction parameter is not fixed either. The gesture recognition model, however, requires the input gesture features to conform to the preset input size. Therefore, for a feature extraction parameter whose number of feature values is smaller than the preset threshold, feature values can be supplemented by linear interpolation or mean interpolation; for a feature extraction parameter whose number of feature values is greater than the preset threshold, feature values can be deleted by simple deletion or mean merging. The gesture features of the target gesture then conform to the input conditions of the gesture recognition model.
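A minimal sketch of this feature alignment follows. The description names linear interpolation or mean interpolation for supplementing, and simple deletion or mean merging for trimming; the resampling below realizes supplementing by linear interpolation and trimming by uniform deletion in a single step, which is one possible simplification rather than the only implementation.

import numpy as np

def align_features(values, target_len):
    # Pad or shrink one feature-value sequence to the model's preset
    # input size (the preset threshold in the steps above).
    values = np.asarray(values, dtype=float)
    n = len(values)
    if n == target_len:
        return values
    # np.interp interpolates new points when n < target_len and
    # uniformly drops points when n > target_len.
    old_idx = np.linspace(0.0, 1.0, n)
    new_idx = np.linspace(0.0, 1.0, target_len)
    return np.interp(new_idx, old_idx, values)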
S103: recognizing the target gesture based on the gesture features and a preset gesture recognition model.
In the embodiment of the present invention, the preset gesture recognition model is a lightweight multilayer neural network model. Compared with a convolutional neural network model, the multilayer neural network model has a simpler structure, faster computation, and high recognition accuracy.
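The description does not specify the network's layer sizes or training hyperparameters; the following is a minimal sketch of such a lightweight fully connected (multilayer) network using scikit-learn's MLPClassifier, with all sizes chosen purely for illustration.

from sklearn.neural_network import MLPClassifier

# Input: the aligned feature vector (4 feature extraction parameters,
# each with preset-input-size values, concatenated).
# Output: one of the preset gesture classes.
model = MLPClassifier(hidden_layer_sizes=(64, 32),  # two small hidden layers (assumed)
                      activation="relu",
                      max_iter=500)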
Optionally, as a specific implementation manner of the gesture recognition method provided in the embodiment of the present invention, the method for establishing the gesture recognition model includes:
acquiring point cloud data of a plurality of preset gestures;
extracting, from the point cloud data of each preset gesture, the feature values corresponding to the feature extraction parameters to obtain a training set for that preset gesture;
and performing model training based on the training sets of the preset gestures to obtain the gesture recognition model.
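Reusing the extract_frame_features and align_features helpers sketched earlier, these establishment steps might be assembled as follows; the concatenation order of the four parameters, the label encoding, and the use of MLPClassifier are illustrative assumptions.

import numpy as np

def build_training_set(gesture_samples, labels, target_len):
    # gesture_samples: one point cloud (list of per-frame arrays) per
    # collected preset-gesture sample; labels: the gesture class of each.
    X = []
    for frames in gesture_samples:
        feats = extract_frame_features(frames)
        vec = np.concatenate([align_features(feats[k], target_len)
                              for k in ("distance", "speed", "azimuth", "pitch")])
        X.append(vec)
    return np.vstack(X), np.asarray(labels)

# X, y = build_training_set(samples, sample_labels, target_len)
# model.fit(X, y)                  # train the lightweight network
# predicted = model.predict(X_new) # recognize a target gesture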
Optionally, as a specific implementation manner of the gesture recognition method provided by the embodiment of the present invention, the point cloud data of a given preset gesture is the point cloud data corresponding to each gesture frame of that preset gesture, where each gesture frame of the preset gesture is obtained by detecting the preset gesture with the target detection device;
after point cloud data of a plurality of preset gestures are acquired, the method further comprises the following steps:
calculating the total number of gesture frames for the plurality of preset gestures, and taking the average;
and determining the preset input size of the gesture recognition model based on the average value.
In the embodiment of the present invention, the preset gestures include sliding left-to-right, sliding right-to-left, sliding top-to-bottom, sliding bottom-to-top, rotating clockwise, rotating counterclockwise, and calling. Data for these 7 preset gestures can be collected with a millimeter-wave radar and then used for training to obtain the gesture recognition model. It should be noted that, in practical applications, because each person's gesture habits differ and even the same gesture made by the same person is never exactly the same, gesture data for the same gesture can be collected from multiple people, and from the same person multiple times, which can significantly improve the recognition accuracy of the gesture recognition model.
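The determination of the preset input size can be sketched as follows; rounding the average frame count to the nearest integer is an assumption, since the description only states that the input size is determined based on the average value.

import numpy as np

def preset_input_size(frame_counts):
    # frame_counts: total number of gesture frames in each collected
    # preset-gesture sample.
    return int(round(float(np.mean(frame_counts))))

# Example: preset_input_size([18, 22, 25, 19, 30, 21, 24]) -> 23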
In addition, in the embodiment of the present invention, the process of extracting the gesture features of each preset gesture is similar to the process of extracting the gesture features of the target gesture, and is not repeated here.
According to the present invention, a plurality of feature extraction parameters are determined, and the feature values corresponding to the feature extraction parameters are extracted from the point cloud data of the target gesture to obtain the gesture features of the target gesture, which greatly reduces the number of feature values; the target gesture is then recognized based on the gesture features and the preset gesture recognition model. Compared with extracting gesture features from a range-Doppler image as in the prior art, this can effectively reduce the amount of computation in the gesture recognition process and improve gesture recognition efficiency.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.
A second aspect of the embodiments of the present invention provides a gesture recognition apparatus. As shown in Fig. 2, the gesture recognition apparatus 2 includes:
and theacquisition module 21 is configured to acquire point cloud data of the target gesture.
And thefeature extraction module 22 is configured to determine a plurality of feature extraction parameters, and extract feature values corresponding to the feature extraction parameters from the point cloud data to obtain gesture features of the target gesture.
And thegesture recognition module 23 is configured to recognize the target gesture based on the gesture feature and a preset gesture recognition model.
Optionally, as a specific implementation manner of the gesture recognition apparatus according to the second aspect of the embodiment of the present invention, the acquisition module 21 is further configured to:
preprocess the point cloud data of the target gesture, where the preprocessing includes removing erroneous point cloud data and performing data enhancement on the point cloud data of the target gesture.
Optionally, as a specific implementation manner of the gesture recognition apparatus provided in the second aspect of the embodiment of the present invention, the point cloud data of the target gesture is the point cloud data corresponding to each gesture frame of the target gesture, where each gesture frame of the target gesture is obtained by detecting the target gesture with the target detection device, and the plurality of feature extraction parameters include distance, speed, azimuth angle, and pitch angle. The feature extraction module 22 is specifically configured to:
extract the distance value, speed value, azimuth angle value, and pitch angle value corresponding to each gesture frame from the point cloud data corresponding to that gesture frame, to obtain the gesture features of the target gesture.
Optionally, as a specific implementation manner of the gesture recognition apparatus provided in the second aspect of the embodiment of the present invention, the gesture recognition module 23 is further configured to:
determine the magnitude relationship between the number of feature values corresponding to each feature extraction parameter and a preset threshold, where the preset threshold is the preset input size of the gesture recognition model;
and perform feature alignment on the gesture features of the target gesture based on the magnitude relationship.
Optionally, as a specific implementation manner of the gesture recognition apparatus provided in the second aspect of the embodiment of the present invention, the performing of feature alignment on the gesture features of the target gesture based on the magnitude relationship may be detailed as follows:
calculating the difference between the number of feature values corresponding to each feature extraction parameter and the preset threshold, and taking the absolute value of the difference;
for each feature extraction parameter whose number of feature values is smaller than the preset threshold, supplementing its feature values based on the absolute value of the difference;
and for each feature extraction parameter whose number of feature values is greater than the preset threshold, deleting feature values based on the absolute value of the difference.
Optionally, as a specific implementation manner of the gesture recognition apparatus provided in the second aspect of the embodiment of the present invention, the gesture recognition module 23 is further configured to:
acquire point cloud data of a plurality of preset gestures;
extract, from the point cloud data of each preset gesture, the feature values corresponding to the feature extraction parameters to obtain a training set for that preset gesture;
and perform model training based on the training sets of the preset gestures to obtain the gesture recognition model.
Optionally, as a specific implementation manner of the gesture recognition apparatus provided in the second aspect of the embodiment of the present invention, the point cloud data of a given preset gesture is the point cloud data corresponding to each gesture frame of that preset gesture, where each gesture frame of the preset gesture is obtained by detecting the preset gesture with the target detection device. The gesture recognition module 23 is further configured to:
calculate the total number of gesture frames for the plurality of preset gestures, and take the average;
and determine the preset input size of the gesture recognition model based on the average value.
Fig. 3 is a schematic diagram of a terminal according to an embodiment of the present invention. As shown in Fig. 3, the terminal 3 of this embodiment includes: a processor 30, a memory 31, and a computer program 32 stored in the memory 31 and executable on the processor 30. The processor 30, when executing the computer program 32, implements the steps in the above-described embodiments of the gesture recognition method, such as steps S101 to S103 shown in Fig. 1. Alternatively, the processor 30, when executing the computer program 32, implements the functions of the respective modules in the above-described apparatus embodiments, such as the functions of the modules 21 to 23 shown in Fig. 2.
Illustratively, the computer program 32 may be partitioned into one or more modules, which are stored in the memory 31 and executed by the processor 30 to implement the present invention. The one or more modules may be a series of computer program instruction segments capable of performing specific functions, and the instruction segments are used to describe the execution of the computer program 32 in the terminal 3. For example, the computer program 32 may be divided into an acquisition module 21, a feature extraction module 22, and a gesture recognition module 23 (modules in a virtual device), and the specific functions of the modules are as follows:
and theacquisition module 21 is configured to acquire point cloud data of the target gesture.
And thefeature extraction module 22 is configured to determine a plurality of feature extraction parameters, and extract feature values corresponding to the feature extraction parameters from the point cloud data to obtain gesture features of the target gesture.
And thegesture recognition module 23 is configured to recognize the target gesture based on the gesture feature and a preset gesture recognition model.
The terminal 3 may be a desktop computer, a notebook computer, a palmtop computer, a cloud server, or another computing device. The terminal 3 may include, but is not limited to, the processor 30 and the memory 31. It will be appreciated by those skilled in the art that Fig. 3 is only an example of the terminal 3 and does not constitute a limitation of the terminal 3, which may include more or fewer components than those shown, or a combination of certain components, or different components; for example, the terminal 3 may further include input/output devices, network access devices, buses, and the like.
The processor 30 may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory 31 may be an internal storage unit of the terminal 3, such as a hard disk or memory of the terminal 3. The memory 31 may also be an external storage device of the terminal 3, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card provided on the terminal 3. Further, the memory 31 may include both an internal storage unit of the terminal 3 and an external storage device. The memory 31 is used to store the computer program and other programs and data required by the terminal 3. The memory 31 may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules, so as to perform all or part of the functions described above. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus/terminal and method may be implemented in other ways. For example, the above-described apparatus/terminal embodiments are merely illustrative; the division into modules or units is only a division by logical function, and there may be other divisions in actual implementation, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be in electrical, mechanical, or other form.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated modules/units, if implemented in the form of software functional units and sold or used as independent products, may be stored in a computer readable storage medium. Based on such understanding, all or part of the flow in the methods of the above embodiments of the present invention may also be implemented by a computer program instructing related hardware. The computer program may be stored in a computer readable storage medium, and when executed by a processor, the computer program may implement the steps of the above method embodiments. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content contained in the computer readable medium may be increased or decreased as appropriate according to the requirements of legislation and patent practice in the jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, the computer readable medium does not include electrical carrier signals and telecommunications signals.
The above-mentioned embodiments are only used to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that the technical solutions described in the foregoing embodiments may still be modified, or some technical features thereof may be equivalently replaced; such modifications and substitutions do not cause the corresponding technical solutions to depart substantially from the spirit and scope of the embodiments of the present invention, and are intended to be included within the protection scope of the present invention.