Disclosure of Invention
The invention aims to overcome the defects of low gesture recognition efficiency and high hardware cost of application devices in the prior art, and provides an application device and an air gesture recognition method thereof.
The invention solves the technical problems through the following technical scheme:
An air gesture recognition method of an application device comprises the following steps:
acquiring a target image;
identifying a target gesture from the target image; and
judging whether the recognized target gesture is a preset gesture, and if so, generating and outputting an operation response signal predefined by the preset gesture, so as to execute the operation corresponding to the target gesture.
Optionally, before the step of acquiring the target image, the air gesture recognition method further includes:
activating the application device in response to receiving an activation trigger signal.
Optionally, before the step of activating the application device, the air gesture recognition method further includes:
generating and outputting an activation trigger signal in response to detection of a preset sensing condition by a sensor of the application device; or
generating and outputting an activation trigger signal in response to detection of a preset activation action by a low-power-consumption device of the application device.
Optionally, when the sensor is used, the sensor comprises a proximity sensor and/or an infrared sensor;
when the low-power-consumption device is used, the low-power-consumption device comprises a gravity accelerometer and/or a gyroscope.
Optionally, after the step of activating the application device, the air gesture recognition method further includes:
in response to no target gesture being recognized from the target image, transitioning the application device from the active state to a standby state or an off state.
Optionally, the step of acquiring the target image includes:
acquiring a target image through a two-dimensional image acquisition module of the application device.
Optionally, the step of recognizing a target gesture from the target image includes:
recognizing a target gesture from the target image through an AI target detection model.
Optionally, the AI target detection model includes an EfficientDet model.
Optionally, the method further comprises:
quantizing the weights and bias values of the AI target detection model from full-precision floating point to reduced-precision floating point or 8-bit integer, and recognizing a target gesture from the target image through the quantized AI target detection model.
Optionally, the method further comprises:
in response to no target gesture being recognized from the target image, returning to the step of acquiring the target image after a preset time period.
Optionally, the preset gesture comprises a leftward gesture, and the operation response signal predefined by the leftward gesture comprises a leftward operation response signal; and/or
the preset gesture comprises a rightward gesture, and the operation response signal predefined by the rightward gesture comprises a rightward operation response signal; and/or
the preset gesture comprises an upward gesture, and the operation response signal predefined by the upward gesture comprises an upward operation response signal; and/or
the preset gesture comprises a downward gesture, and the operation response signal predefined by the downward gesture comprises a downward operation response signal; and/or
the preset gesture comprises an upward-left gesture, and the operation response signal predefined by the upward-left gesture comprises an upward-left operation response signal; and/or
the preset gesture comprises a downward-left gesture, and the operation response signal predefined by the downward-left gesture comprises a downward-left operation response signal; and/or
the preset gesture comprises an upward-right gesture, and the operation response signal predefined by the upward-right gesture comprises an upward-right operation response signal; and/or
the preset gesture comprises a downward-right gesture, and the operation response signal predefined by the downward-right gesture comprises a downward-right operation response signal.
Optionally, the step of acquiring the target image includes:
acquiring at least two target images at a preset time interval;
the step of recognizing a target gesture from the target image comprises:
respectively recognizing a target gesture from each target image;
the step of judging whether the recognized target gesture is a preset gesture, and if so, generating and outputting an operation response signal predefined by the preset gesture so as to execute the corresponding operation, comprises:
judging whether each recognized target gesture is a preset gesture, and if so, recognizing position change information from at least two preset gestures, then generating and outputting an operation response signal predefined by the position change information, so as to execute the operation corresponding to the at least two target gestures.
An application device, comprising:
an image acquisition module configured to acquire a target image;
an image recognition module configured to recognize a target gesture from the target image; and
an operation recognition module configured to judge whether the recognized target gesture is a preset gesture, and if so, generate and output an operation response signal predefined by the preset gesture, so as to execute the operation corresponding to the target gesture.
Optionally, the application device further comprises:
an activation control module configured to activate the application device and invoke the image acquisition module in response to receiving an activation trigger signal.
Optionally, the application device further comprises:
a sensor configured to generate and output an activation trigger signal to the activation control module in response to detecting a preset sensing condition; or
a low-power-consumption device configured to generate and output an activation trigger signal to the activation control module in response to detecting a preset activation action.
Optionally, when the sensor is used, the sensor comprises a proximity sensor and/or an infrared sensor;
when the low-power-consumption device is used, the low-power-consumption device comprises a gravity accelerometer and/or a gyroscope.
Optionally, the image recognition module is further configured to, in response to no target gesture being recognized from the target image, invoke the activation control module to transition the active state of the application device to a standby state or an off state.
Optionally, the application device further comprises:
a two-dimensional image acquisition module configured to acquire a target image and output the target image to the image acquisition module.
Optionally, the image recognition module is further configured to recognize a target gesture from the target image through an AI target detection model.
Optionally, the AI target detection model comprises an EfficientDet model.
Optionally, the image recognition module is further configured to:
a target gesture is recognized from the target image by an AI target detection model whose weights and bias values are quantized from full-precision floating point to reduced-precision floating point or 8-bit integer.
Optionally, the image recognition module is further configured to, in response to a target gesture not being recognized from the target image, invoke the image acquisition module after a preset time period has elapsed.
Optionally, the preset gesture comprises a leftward gesture, and the operation response signal predefined by the leftward gesture comprises a leftward operation response signal; and/or
the preset gesture comprises a rightward gesture, and the operation response signal predefined by the rightward gesture comprises a rightward operation response signal; and/or
the preset gesture comprises an upward gesture, and the operation response signal predefined by the upward gesture comprises an upward operation response signal; and/or
the preset gesture comprises a downward gesture, and the operation response signal predefined by the downward gesture comprises a downward operation response signal; and/or
the preset gesture comprises an upward-left gesture, and the operation response signal predefined by the upward-left gesture comprises an upward-left operation response signal; and/or
the preset gesture comprises a downward-left gesture, and the operation response signal predefined by the downward-left gesture comprises a downward-left operation response signal; and/or
the preset gesture comprises an upward-right gesture, and the operation response signal predefined by the upward-right gesture comprises an upward-right operation response signal; and/or
the preset gesture comprises a downward-right gesture, and the operation response signal predefined by the downward-right gesture comprises a downward-right operation response signal.
Optionally, the image acquisition module is further configured to acquire at least two target images at preset time intervals;
the image recognition module is further configured to respectively recognize a target gesture from each of the target images;
the operation recognition module is further configured to judge whether each recognized target gesture is a preset gesture, and if so, recognize position change information from at least two preset gestures, then generate and output an operation response signal predefined by the position change information, so as to execute the operation corresponding to the at least two target gestures.
An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the air gesture recognition method of an application device as described above when executing the computer program.
A computer-readable medium having stored thereon computer instructions which, when executed by a processor, implement the steps of the air gesture recognition method of an application device as described above.
On the basis of common knowledge in the art, the above preferred conditions may be combined arbitrarily to obtain preferred embodiments of the invention.
The positive progress effects of the invention are as follows:
According to the application device and the air gesture recognition method thereof, distinctive gesture types are distinguished through preset characteristics, and target gesture detection is performed on the image intelligently and conveniently, so that the user's air gesture operations are recognized effectively. This effectively reduces the false recognition rate and missed recognition rate of gestures, improves recognition efficiency and user experience, lowers the requirement on the computing capacity of the processor, and reduces the hardware cost of the application device.
Detailed Description
The invention is further illustrated by the following examples, which are not intended to limit the scope of the invention.
In order to overcome the above defects, the present embodiment provides an air gesture recognition method of an application device, including: acquiring a target image; recognizing a target gesture from the target image; and judging whether the recognized target gesture is a preset gesture, and if so, generating and outputting an operation response signal predefined by the preset gesture, so as to execute the operation corresponding to the target gesture.
In this embodiment, the application device is an embedded device such as a smart phone, a smart television, or a vehicle-mounted terminal device, but the type of the application device is not particularly limited and may be selected and adjusted according to actual requirements.
In this embodiment, distinctive gesture types are distinguished through preset characteristics, and target gesture detection is performed on the image intelligently and conveniently, so that the user's air gesture operations are recognized effectively. This effectively reduces the false recognition rate and missed recognition rate of gestures, improves recognition efficiency and user experience, lowers the requirement on the computing capacity of the processor, and reduces the hardware cost of the application device.
Specifically, as an embodiment, as shown in fig. 1, the air gesture recognition method of an application device provided in this embodiment mainly includes the following steps:
step 101, detecting an activation condition in an inactive state.
In this step, when the application device is in an inactive state such as a standby state or an off state, the activation condition is detected in real time.
In one embodiment, the sensor of the application device performs detection in real time, and an activation trigger signal is generated and output in response to the detection of a preset sensing condition by the sensor.
Preferably, in this embodiment, the sensor may include a proximity sensor and/or an infrared sensor, but the type of the sensor is not particularly limited, and as long as the corresponding function can be achieved, the sensor may be selected and adjusted according to actual needs.
Specifically, the preset sensing condition is detected in real time by the proximity sensor (light sensing) or the infrared sensor. The target object may be a hand of an operator, another body part of the operator, or another object; for example, when the operator's hand sweeps past the sensing position of the sensor, or another specified object passes through the sensing position, the activation trigger signal is generated. The preset sensing condition is not specifically limited in this embodiment and may be selected and adjusted according to actual needs. Note that activation detection does not need to determine whether a hand or gesture target is present, which reduces resource consumption and latency in the activation stage.
As another embodiment, a preset activation action is detected in real time by a low power consumption device of the application device, and an activation trigger signal is generated and output in response to the detection of the preset activation action by the low power consumption device.
Preferably, in this embodiment, the low-power-consumption device may include a gravity accelerometer and/or a gyroscope, but the type of the low-power-consumption device is not particularly limited; as long as the corresponding function can be achieved, it may be selected and adjusted according to actual requirements.
Specifically, the preset activation action may be the operator swinging or rotating the application device, that is, the movement of the application device is detected through the gravity accelerometer or the gyroscope. The preset activation action is not specifically limited in this embodiment and may be selected and adjusted according to actual requirements.
Of course, the activation step of the present embodiment may also be omitted for application devices that do not require consideration of power consumption or that do not have sensors or low power consumption devices.
Step 102, converting to an activated state and starting a gesture recognition function.
In this step, in response to receiving an activation trigger signal, the application device is activated, i.e. the state of the application device is converted into an activated state, and a gesture recognition procedure is started in the background.
In this embodiment, for an application device that needs to consider power consumption, enabling the activation step as described above can effectively reduce the power consumption of the application device.
Step 103, acquiring a target image.
In this step, an image capture module of the application device is turned on, and a target image of the operator is captured by the image capture module.
Preferably, in this embodiment, the image capture module is a two-dimensional image capture module, that is, a camera or the like. The type of the image capture module is not particularly limited; as long as the corresponding function can be realized, it may be selected and adjusted according to actual requirements.
Step 104, judging whether a target gesture can be recognized from the target image; if yes, executing step 105; otherwise, returning to step 101, or optionally returning to step 103.
In this step, whether a target gesture can be recognized from the acquired target image is judged through an AI target detection model. If yes, step 105 is executed; or, as another embodiment, the position of the target gesture in the target image is determined (e.g., a specific coordinate set of the target gesture outline), and then step 105 is executed. If not, the method returns to step 101 after a preset time period, that is, the application device is transitioned from the active state to the standby state or the off state and the method returns to step 101, so as to effectively reduce the power consumption of the application device. The preset time period may be set according to actual requirements; alternatively, the method may return to step 103 after the preset time period elapses.
In this embodiment, a two-dimensional image is acquired, and a target detection algorithm is used to detect whether a target gesture and its position (coordinate) information exist.
Preferably, in this embodiment, the AI target detection model includes an EfficientDet model, but the type of the AI target detection model is not particularly limited; as long as the corresponding function can be implemented, it may be selected and adjusted according to actual requirements.
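The judgment in step 104 can be sketched as follows. This is a minimal illustration, not the claimed implementation: the gesture class names reuse the label mapping shown later in this embodiment, and the 0.5 confidence threshold is an assumed example value.

```python
# Sketch of the step-104 decision: given raw detector outputs
# (class id, confidence score, bounding box), decide whether a target
# gesture is present. Class names and the 0.5 threshold are
# illustrative assumptions, not values mandated by the method.

GESTURE_CLASSES = {"palm_up", "palm_left", "palm_right", "hand_back_down"}

def find_target_gesture(detections, labels, score_threshold=0.5):
    """Return (label, box) of the highest-scoring gesture, or None.

    detections: iterable of (class_id, score, box) tuples, as produced
    by a generic detection model's post-processing step.
    """
    best = None
    for class_id, score, box in detections:
        label = labels.get(class_id)
        if label in GESTURE_CLASSES and score >= score_threshold:
            if best is None or score > best[1]:
                best = (label, score, box)
    if best is None:
        return None  # caller falls back to step 101 or step 103
    return best[0], best[2]

labels = {1: "face", 2: "palm_up", 3: "palm_left",
          4: "palm_right", 5: "hand_back_down"}
dets = [(1, 0.92, (10, 10, 50, 50)),    # face: not a gesture class
        (2, 0.81, (60, 40, 120, 130)),  # palm_up: valid gesture
        (3, 0.30, (0, 0, 20, 20))]      # palm_left: below threshold
print(find_target_gesture(dets, labels))  # ('palm_up', (60, 40, 120, 130))
```

A None result corresponds to the "return to step 101 after a preset time period" branch described above.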
In this embodiment, the AI target detection model may be retrained and optimized in the following manner to improve accuracy and processing efficiency:
1. optimization and tailoring of predefined models
1) Simplify the classification definitions of the model as much as possible: add the necessary gesture class definitions, and add or retain common class definitions that are easily confused with gesture classes or commonly appear in the application scene, so as to reduce the false recognition rate of the target gesture.
For example, the following YAML configuration file may be used for retraining and prediction:
num_classes: 6
var_freeze_expr: '(efficientnet|fpn_cells|resample_p6)'
label_id_mapping: {1: "face", 2: "palm_up", 3: "palm_left", 4: "palm_right", 5: "hand_back_down"}
2) Remove the position detection for anchor sizes that are too small, such as 4 × 4, 8 × 8, and 16 × 16, in the original model; that is, the box_net outputs of too-small sizes (such as box_net/box-prediction_4, box_net/box-prediction_3, and box_net/box-prediction_2) may be removed. Because the distance between the operator and the device camera is limited in an actual application scene, an undersized gesture target will not be detected; optimizing away this part effectively reduces the size and computation of the model.
3) In combination with 2) above, the class_net prediction outputs corresponding to the undersized target sizes such as 4 × 4, 8 × 8, and 16 × 16 (e.g., class_net/class_prediction_4, class_net/class_prediction_3, and class_net/class_prediction_2) may be removed accordingly.
4) In combination with 3) above, in the post-processing step after the original model outputs boxes and classes, the processing of the class_net and box_net outputs corresponding to the undersized target sizes such as 4 × 4, 8 × 8, and 16 × 16 (e.g., box_net/box_prediction_4, box_net/box_prediction_3, box_net/box_prediction_2, class_net/class_prediction_3, and class_net/class_prediction_2) may also be removed accordingly.
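The pruning idea in items 2) to 4) can be illustrated with a simple name filter over the model's output heads. This is a hedged sketch only: the name patterns follow the examples given above, and a real model graph would be edited with the framework's own graph-surgery tools rather than by name filtering alone.

```python
# Sketch of the head-pruning idea: drop the box_net / class_net
# prediction outputs tied to undersized anchor scales (the _2, _3, _4
# suffixes named in the text), keeping the larger-scale heads.

REMOVED_SUFFIXES = ("_2", "_3", "_4")  # too-small anchor scales per the text

def keep_head(output_name: str) -> bool:
    """Return False for prediction heads tied to too-small anchor sizes."""
    if output_name.startswith(("box_net/", "class_net/")):
        return not output_name.endswith(REMOVED_SUFFIXES)
    return True  # keep everything that is not a prediction head

outputs = [
    "box_net/box-prediction_2", "box_net/box-prediction_3",
    "box_net/box-prediction_4", "box_net/box-prediction_5",
    "class_net/class_prediction_2", "class_net/class_prediction_5",
]
kept = [name for name in outputs if keep_head(name)]
print(kept)  # ['box_net/box-prediction_5', 'class_net/class_prediction_5']
```

Only the larger-scale heads survive, which is what reduces the model's size and computation in this optimization.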
2. Acquisition of training data
1) The collected and newly acquired data sets need to cover all the class types redefined in item 1.
2) Collect and newly acquire pictures containing gestures with distinct features as the new training and validation/test data sets, and remove materials whose features are indistinct or easily confused.
3) The collected training and validation/test data sets should conform to the features in 1) as much as possible and contain data from various scenes/backgrounds, so that the trained model has good detection coverage across application scenes.
4) An actual application scene may be simulated by using the front camera of a mobile phone to acquire pictures as a data set. The size proportion of the gesture in the picture may be screened by combining the characteristics of the actual camera device and the distance between the target gesture and the camera; picture data that does not match the actual application scene well (for example, the gesture target area occupies a proportion of the whole picture that is too large or too small) is removed, so as to improve training efficiency and reduce the false recognition rate.
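The screening rule in item 4) amounts to keeping only samples whose gesture bounding box occupies a plausible fraction of the image. The sketch below is illustrative: the 2%-50% area-ratio bounds are assumed example values, not thresholds stated in this embodiment.

```python
# Sketch of the data-set screening in item 4): drop pictures whose
# gesture box area ratio is implausibly small or large. The min/max
# ratio defaults are assumptions for illustration.

def box_area(box):
    x1, y1, x2, y2 = box
    return max(0, x2 - x1) * max(0, y2 - y1)

def keep_sample(gesture_box, image_w, image_h,
                min_ratio=0.02, max_ratio=0.5):
    """Keep a picture only if the gesture box occupies a plausible
    share of the whole image."""
    ratio = box_area(gesture_box) / float(image_w * image_h)
    return min_ratio <= ratio <= max_ratio

# 640x480 picture: a 200x200 gesture box is plausible; a 10x10 box
# or a box covering the whole frame is filtered out.
print(keep_sample((100, 100, 300, 300), 640, 480))  # True
print(keep_sample((0, 0, 10, 10), 640, 480))        # False (too small)
print(keep_sample((0, 0, 640, 480), 640, 480))      # False (too large)
```

In practice the bounds would be derived from the camera's field of view and the expected operating distance, as the text describes.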
3. Training of models
1) During model training, the related training and validation metrics (such as AP, AP50, AP75, APl, APm, APs, ARl, ARm, ARmax1, ARmax10, ARmax100, ARs, box_loss, cls_loss, global_step, and loss) are checked through log information, and the trend of these metrics in the log file, as well as whether over-fitting or under-fitting exists, is checked using visualization tools such as TensorBoard.
2) Adjust the training num_epochs value so that the model reaches good training metrics (such as Precision, Recall, IoU, and loss) when training finishes.
4. Optimization (quantization) of the retrained model
1) Perform int8 quantization on the pb model generated after retraining, and generate the corresponding tflite model.
2) During tflite quantization, use a data set that covers as many application scenes as possible as the reference data set for int8 quantization in the representative_dataset_gen() function (the validation data set, or even the training data set or a subset thereof, may be used directly as the data generator).
Preferably, in this embodiment, the weights and bias values of the AI target detection model are quantized from full-precision floating point to reduced-precision floating point or 8-bit integer, and a target gesture is recognized from the target image through the quantized AI target detection model.
By further quantizing the AI target detection model, for example quantizing the weights and bias values of the model from full-precision floating point (float32) to reduced-precision floating point (float16) or 8-bit integer (int8), the size of the model can be reduced without significantly reducing inference precision, thereby effectively reducing the memory consumption of model operation and accelerating model inference.
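The float32-to-int8 step can be illustrated with the standard affine (scale/zero-point) quantization scheme. This is a minimal numeric sketch of the idea only; a real conversion would be done by the framework's converter (e.g., the tflite tooling mentioned above) over the whole model, not per-tensor by hand.

```python
# Sketch of affine int8 quantization of a weight tensor: map the float
# range [min, max] onto the integer range [-128, 127], then recover
# approximate floats by dequantizing. Reconstruction error per weight
# is bounded by the quantization step (scale).

def quantize_int8(weights):
    """Quantize a list of float weights to int8 with a scale/zero-point."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255.0 or 1.0          # avoid zero step for constants
    zero_point = round(-128 - lo / scale)     # int that maps to float 0
    q = [max(-128, min(127, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float weights from int8 values."""
    return [(v - zero_point) * scale for v in q]

w = [-1.0, -0.5, 0.0, 0.75, 1.0]
q, scale, zp = quantize_int8(w)
r = dequantize(q, scale, zp)
# Each recovered weight differs from the original by at most one step.
assert all(abs(a - b) <= scale for a, b in zip(w, r))
```

Storing int8 values instead of float32 cuts the weight storage to a quarter, which is the memory saving the text refers to.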
Step 105, judging whether the recognized target gesture is a preset gesture; if yes, executing step 106; otherwise, returning to step 101, or optionally returning to step 103.
In this step, it is judged whether the recognized target gesture is a preset gesture. If yes, step 106 is executed; if not, the method returns to step 101 after a preset time period, that is, the application device is transitioned from the active state to the standby state or the off state and the method returns to step 101, so as to effectively reduce the power consumption of the application device. The preset time period may be set according to actual requirements; alternatively, the method may return to step 103 after the preset time period elapses.
Step 106, generating and outputting an operation response signal predefined by the preset gesture to execute the corresponding operation. After step 106 is executed, the method may return to step 103 after a preset time interval elapses to implement a cyclic gesture recognition function, or may return to step 101 after the application device is transitioned into the inactive state.
In this step, an operation response signal predefined by the preset gesture is generated and output to execute the operation corresponding to the target gesture. In this embodiment, the executed operation may be an operation corresponding to a single target gesture.
As another embodiment, after step 106 is executed, step 103 is executed again after a preset time interval, followed by step 104 and step 105. The preset time interval may be the known short interval at which the gesture motion is captured, and may of course be set according to actual requirements.
After the above loop returns to step 106, in step 106, the position change information of the gesture, that is, the position movement information of the gesture, is recognized from at least two preset gestures (namely the preset gesture recognized in the current loop and the preset gesture recognized in at least one previous loop).
In step 106, after the position change information of the gesture is recognized, an operation response signal predefined by the position change information is generated and output to execute the operation corresponding to the at least two target gestures. In this embodiment, the executed operation may also be an operation corresponding to the position change of the target gesture over at least two short time intervals.
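The position-change recognition across two loop iterations can be sketched as comparing the gesture's bounding-box centers from consecutive detections. This is an illustrative sketch under assumed conventions: image coordinates with y increasing downward, and an example minimum-shift threshold of 10 pixels to reject jitter.

```python
# Sketch of the position-change judgment in step 106: classify the hand
# movement between two consecutive gesture detections by comparing
# bounding-box centers. The min_shift jitter threshold and the
# y-axis-down convention are illustrative assumptions.

def box_center(box):
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)

def movement_direction(prev_box, curr_box, min_shift=10):
    """Return 'left'/'right'/'up'/'down', or None if the hand barely moved."""
    (px, py), (cx, cy) = box_center(prev_box), box_center(curr_box)
    dx, dy = cx - px, cy - py
    if max(abs(dx), abs(dy)) < min_shift:
        return None  # treat as no movement; no response signal emitted
    if abs(dx) >= abs(dy):
        return "left" if dx < 0 else "right"
    return "up" if dy < 0 else "down"    # image y axis points downward

# Hand detected at x~120, then at x~60 in the next captured frame:
print(movement_direction((100, 100, 140, 140), (40, 100, 80, 140)))  # left
```

The returned direction would then select the predefined operation response signal (e.g., a leftward operation response signal for "left").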
In this embodiment, referring to fig. 2a, the preset gesture includes a leftward gesture, and the operation response signal predefined by the leftward gesture includes a left direction operation response signal;
referring to fig. 2b, the preset gesture includes a rightward gesture, and the predefined operation response signal of the rightward gesture includes a rightward operation response signal;
referring to fig. 2c, the preset gesture includes an upward gesture, and the operation response signal predefined by the upward gesture includes an upward operation response signal;
referring to fig. 2d, the preset gesture includes a downward gesture, and the operation response signal predefined by the downward gesture includes a downward operation response signal.
Specifically, the definition and recognition manner of the preset gestures are illustrated below with reference to Table 1 and figs. 2a to 2d.
Table 1:
a) gesture discrimination
The palm of the left or right hand is spread with the fingers straightened, and the palm or the back of the hand faces the image acquisition module.
i. Four different gestures are recognized according to fingertip orientation: fingertips upward, fingertips downward, fingertips leftward, and fingertips rightward;
ii. Whether the palm or the back of the hand faces the camera, gestures with the same fingertip orientation may be merged and recognized as the same gesture type.
b) Hand movement judgment (optional)
i. Preferably, the gesture direction is judged first, and the operation type is defined according to the gesture direction;
ii. Alternatively, the gesture direction may be ignored, and the operation type defined based only on the hand movement direction; for example, when the hand moves leftward, a leftward operation is defined.
In this embodiment, the specific operation may be user-defined, such as scrolling up, down, left, or right, turning pages, or acting as up/down/left/right direction keys (for example, in a game interface).
Preferably, in this embodiment, the preset gestures further include an upward-left gesture, and the operation response signal predefined by the upward-left gesture includes an upward-left operation response signal;
the preset gestures further include a downward-left gesture, and the operation response signal predefined by the downward-left gesture includes a downward-left operation response signal;
the preset gestures further include an upward-right gesture, and the operation response signal predefined by the upward-right gesture includes an upward-right operation response signal;
the preset gestures further include a downward-right gesture, and the operation response signal predefined by the downward-right gesture includes a downward-right operation response signal.
Of course, the preset gesture of the embodiment may be any form of custom gesture, and is not limited to the above gesture, and may be set according to actual requirements.
The air gesture recognition method of an application device provided by this embodiment mainly has the following beneficial effects:
1) Distinctive gesture types and their position changes are distinguished through designed characteristics for judgment, which effectively reduces the false recognition rate and missed recognition rate of gestures, improves recognition efficiency and practicability, and improves user experience.
2) AI target detection algorithms such as the EfficientDet model are used for gesture target detection, which lowers the requirement on the computing capacity of the processor without reducing the target recognition rate. In addition, further quantization of the AI target detection model achieves good operating efficiency without a hardware accelerator on the application device, saving hardware cost.
3) For battery-powered application devices that are sensitive to power consumption, a two-stage activation mode effectively reduces power consumption.
In order to overcome the above existing drawbacks, the present embodiment further provides an application device, including: an image acquisition module configured to acquire a target image; an image recognition module configured to recognize a target gesture from the target image; and the operation recognition module is configured to judge whether the recognized target gesture is a preset gesture, and if so, generate and output an operation response signal predefined by the preset gesture so as to execute an operation corresponding to the target gesture.
In this embodiment, the application device is an embedded device such as a smart phone, a smart television, or a vehicle-mounted terminal device; however, the type of application device is not particularly limited and may be selected and adjusted according to actual requirements.
In this embodiment, clearly distinguishable gesture types are discriminated through preset features, and target gesture detection is performed on the image intelligently and conveniently, so that the user's spaced-gesture operation is effectively recognized. The gesture false-recognition and missed-recognition rates are effectively reduced, recognition efficiency is improved, the user experience is improved, the demand on processor computing power is reduced, and the hardware cost of the application device is reduced.
Specifically, as another embodiment, as shown in fig. 3, the application device provided by this embodiment uses the above-mentioned spaced gesture recognition method and mainly includes a sensor 211, a low power consumption device 212, an activation control module 22, an image capture module 23, an image acquisition module 24, an image recognition module 25, and an operation recognition module 26.
The sensor 211 is configured to generate and output an activation trigger signal to the activation control module 22 in response to sensing a target object.
Preferably, in this embodiment, the sensor 211 may include a proximity sensor and/or an infrared sensor; however, the type of the sensor 211 is not particularly limited and may be selected and adjusted according to actual requirements, as long as the corresponding function is achieved.
The low power consumption device 212 is configured to generate and output an activation trigger signal to the activation control module 22 in response to detecting a preset activation action.
Preferably, in this embodiment, the low power consumption device 212 may include an accelerometer and/or a gyroscope; however, the type of the low power consumption device 212 is not particularly limited and may be selected and adjusted according to actual requirements, as long as the corresponding function is achieved.
The activation control module 22 is configured to activate the application device and invoke the image acquisition module 24 in response to receiving an activation trigger signal.
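For illustration only, the two-stage activation flow described above may be sketched as follows. The class, state names, and trigger-source strings are assumptions introduced for this sketch and are not part of the disclosure.

```python
class ActivationController:
    """Sketch of the activation control module (22): the device stays in a
    low-power standby state until a sensor or low-power device raises an
    activation trigger, and only then is image acquisition invoked."""

    VALID_SOURCES = ("sensor", "low_power_device")

    def __init__(self):
        self.state = "standby"  # two-stage activation: start in low power

    def on_trigger(self, source):
        # Accept triggers only from the proximity/IR sensor or the
        # accelerometer/gyroscope; anything else leaves the device asleep.
        if source in self.VALID_SOURCES:
            self.state = "active"
            return True  # caller would now invoke the image acquisition module
        return False


ctrl = ActivationController()
activated = ctrl.on_trigger("sensor")  # e.g. the proximity sensor fires
```

Keeping the camera and recognition pipeline off until a cheap sensor fires is what makes this scheme suitable for battery-powered devices.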
The image capture module 23 is configured to capture a target image and output it to the image acquisition module 24; the image acquisition module 24 is configured to acquire the target image from the image capture module 23 and send it to the image recognition module 25.
Preferably, in this embodiment, the image capture module 23 is a two-dimensional image capture module, i.e., a camera or the like; of course, the type of the image capture module 23 is not particularly limited and may be selected and adjusted according to actual requirements, as long as the corresponding function can be realized.
The image recognition module 25 is configured to recognize a target gesture from the target image through an AI target detection model, and to call the operation recognition module 26 after recognizing the target gesture.
Preferably, in this embodiment, the AI target detection model includes an EfficientDet model; however, the type of AI target detection model is not particularly limited and may be selected and adjusted according to actual requirements, as long as the corresponding function can be implemented.
Specifically, the image recognition module 25 is further configured to recognize the target gesture from the target image through an AI target detection model whose weights and bias values are quantized from full-precision floating point to reduced-precision floating point or 8-bit integers.
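As a minimal sketch of the quantization idea (symmetric per-tensor int8 quantization of a weight list; real toolchains additionally quantize activations and use per-channel scales, so this is illustrative only, not the disclosed implementation):

```python
def quantize_int8(weights):
    """Symmetric post-training quantization of a list of float weights
    to int8 codes plus a single scale factor."""
    # Scale so that the largest magnitude maps to 127; guard against all-zero.
    scale = (max(abs(w) for w in weights) / 127.0) or 1.0
    codes = [max(-128, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float weights from int8 codes."""
    return [c * scale for c in codes]


q, s = quantize_int8([0.5, -1.27, 0.0, 1.0])
recovered = dequantize(q, s)
```

Storing 8-bit codes instead of 32-bit floats shrinks the model roughly fourfold and lets integer arithmetic replace floating-point multiplies, which is why such a model can run efficiently without a hardware accelerator.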
The image recognition module 25 is further configured to invoke the activation control module 22 to transition the application device from the active state to a standby state or an off state in response to no target gesture being recognized from the target image.
The image recognition module 25 is further configured to invoke the image acquisition module 24 again after a preset time period in response to no target gesture being recognized from the target image.
The operation recognition module 26 is configured to determine whether the recognized target gesture is a preset gesture; if so, it generates and outputs the operation response signal predefined for the preset gesture so as to execute the operation corresponding to the target gesture, and if not, it invokes the image acquisition module 24 again after a preset time period.
Specifically, the preset gestures include a leftward gesture, and the operation response signal predefined for the leftward gesture includes a leftward operation response signal;
the preset gestures include a rightward gesture, and the operation response signal predefined for the rightward gesture includes a rightward operation response signal;
the preset gestures include an upward gesture, and the operation response signal predefined for the upward gesture includes an upward operation response signal;
the preset gestures include a downward gesture, and the operation response signal predefined for the downward gesture includes a downward operation response signal.
Preferably, in this embodiment, the preset gestures further include an upward-left gesture, and the operation response signal predefined for the upward-left gesture includes an upward-left operation response signal;
the preset gestures further include a downward-left gesture, and the operation response signal predefined for the downward-left gesture includes a downward-left operation response signal;
the preset gestures further include an upward-right gesture, and the operation response signal predefined for the upward-right gesture includes an upward-right operation response signal;
the preset gestures further include a downward-right gesture, and the operation response signal predefined for the downward-right gesture includes a downward-right operation response signal.
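The gesture-to-signal lookup described above can be sketched as a simple mapping. The gesture keys and signal names below are illustrative placeholders, not identifiers from the disclosure.

```python
# Map each preset gesture to its predefined operation response signal.
PRESET_GESTURES = {
    "left": "LEFT_RESPONSE",
    "right": "RIGHT_RESPONSE",
    "up": "UP_RESPONSE",
    "down": "DOWN_RESPONSE",
    "up_left": "UP_LEFT_RESPONSE",
    "down_left": "DOWN_LEFT_RESPONSE",
    "up_right": "UP_RIGHT_RESPONSE",
    "down_right": "DOWN_RIGHT_RESPONSE",
}


def recognize_operation(target_gesture):
    """Return the predefined response signal for a preset gesture,
    or None if the gesture is not preset (the caller would then
    re-invoke image acquisition after a preset time period)."""
    return PRESET_GESTURES.get(target_gesture)


signal = recognize_operation("left")       # a preset gesture
unknown = recognize_operation("thumbs_up")  # not preset
```

A table-driven mapping like this keeps the operation recognition module trivially extensible: adding a custom gesture only adds one entry.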
Preferably, as another embodiment, the image acquisition module 24 is further configured to acquire at least two target images through the image capture module 23 at preset time intervals.
In this embodiment, the preset time interval may be a short interval known to be suitable for capturing the gesture motion, and may be set according to actual requirements.
The image recognition module 25 is further configured to recognize a target gesture from each of the target images, respectively.
The operation recognition module 26 is further configured to determine whether each recognized target gesture is a preset gesture; if so, it recognizes position change information from the at least two preset gestures and generates and outputs the operation response signal predefined for that position change information, so as to execute the operation corresponding to the at least two target gestures; if not, it invokes the image acquisition module 24 again after a preset time period.

The application device provided by this embodiment mainly has the following beneficial effects:
1) Clearly distinguishable gesture types and their position changes are chosen as the features to be judged, which effectively reduces the gesture false-recognition and missed-recognition rates, improves recognition efficiency, yields better practicability, and improves the user experience.
2) Target detection is performed on the gesture using an AI target detection algorithm such as the EfficientDet model, which reduces the demand on processor computing power without lowering the target recognition rate; in addition, the AI target detection model is further quantized, so that good runtime efficiency can be achieved even when no hardware accelerator is used on the application device, saving hardware cost.
3) For battery-powered application devices that are sensitive to power consumption, a secondary (two-stage) activation mode is adopted, which effectively reduces power consumption.
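The position-change recognition over at least two target images can be sketched as follows. The sign-based eight-direction rule, the pixel threshold, and the direction names are assumptions made for this sketch; the disclosure does not specify how position change information is computed.

```python
def swipe_direction(p1, p2, threshold=10):
    """Classify the movement between two detected gesture centers
    (pixel coordinates, origin at top-left) into one of eight
    directions, or None if the movement is below the threshold."""
    dx = p2[0] - p1[0]
    dy = p1[1] - p2[1]  # flip the y-axis so positive dy means "up" on screen
    horiz = "right" if dx > threshold else "left" if dx < -threshold else ""
    vert = "up" if dy > threshold else "down" if dy < -threshold else ""
    if not horiz and not vert:
        return None  # movement too small: no swipe detected
    if horiz and vert:
        return f"{vert}_{horiz}"  # diagonal, e.g. "up_left"
    return horiz or vert


move = swipe_direction((100, 200), (220, 200))  # purely horizontal motion
```

Comparing two detections taken a short, fixed interval apart is what lets the same static gesture detector double as a motion recognizer without any extra model.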
Fig. 4 is a schematic structural diagram of an electronic device according to another embodiment of the present invention. The electronic device includes a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the program, the spaced gesture recognition method of the application device in the above embodiment is implemented. The electronic device 30 shown in fig. 4 is only an example and should not limit the functions or scope of use of the embodiments of the present invention.
As shown in fig. 4, the electronic device 30 may be embodied in the form of a general-purpose computing device, for example, a server device. The components of the electronic device 30 may include, but are not limited to: at least one processor 31, at least one memory 32, and a bus 33 connecting the various system components (including the memory 32 and the processor 31).
The bus 33 includes a data bus, an address bus, and a control bus.
The memory 32 may include volatile memory, such as random access memory (RAM) 321 and/or cache memory 322, and may further include read-only memory (ROM) 323.
The memory 32 may also include a program/utility 325 having a set (at least one) of program modules 324, such program modules 324 including, but not limited to: an operating system, one or more application programs, other program modules, and program data; each of these, or some combination thereof, may include an implementation of a network environment.
The processor 31 executes various functional applications and data processing, such as the spaced gesture recognition method of the application device in the above embodiment of the present invention, by running the computer program stored in the memory 32.
The electronic device 30 may also communicate with one or more external devices 34 (e.g., a keyboard, a pointing device, etc.). Such communication may occur via input/output (I/O) interfaces 35. In addition, the electronic device 30 may communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) via a network adapter 36. As shown in fig. 4, the network adapter 36 communicates with the other modules of the electronic device 30 via the bus 33. It should be understood that, although not shown in the figures, other hardware and/or software modules may be used in conjunction with the electronic device 30, including but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID (disk array) systems, tape drives, data backup storage systems, and the like.
It should be noted that, although several units/modules or sub-units/modules of the electronic device are mentioned in the above detailed description, such a division is merely exemplary and not mandatory. Indeed, according to embodiments of the invention, the features and functions of two or more of the units/modules described above may be embodied in one unit/module; conversely, the features and functions of one unit/module described above may be further divided and embodied by a plurality of units/modules.
This embodiment also provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the program implements the steps of the spaced gesture recognition method of the application device in the above embodiments.
More specific examples of the readable storage medium may include, but are not limited to: a portable disk, a hard disk, a random access memory, a read-only memory, an erasable programmable read-only memory, an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
In a possible embodiment, the invention may also be implemented in the form of a program product comprising program code; when the program product runs on a terminal device, the program code causes the terminal device to perform the steps of the spaced gesture recognition method of the application device in the above embodiments.
The program code for carrying out the invention may be written in any combination of one or more programming languages; the program code may execute entirely on the user device, partly on the user device as a stand-alone software package, partly on the user device and partly on a remote device, or entirely on the remote device.
While specific embodiments of the invention have been described above, it will be appreciated by those skilled in the art that this is by way of example only, and that the scope of the invention is defined by the appended claims. Various changes and modifications to these embodiments may be made by those skilled in the art without departing from the spirit and scope of the invention, and these changes and modifications are within the scope of the invention.