Summary of the invention
For this reason, it is necessary to provide a gesture judging method and a storage medium, to solve the problem that gesture recognition methods in the prior art have low robustness in complex environments.
To achieve the above object, the inventor provides a gesture judging method, comprising the following steps:
Receive the image acquired by the camera device, and identify all human skeletons in the image; each human skeleton includes a hand;
Judge whether the identified human skeleton information matches the human skeleton information stored in the database;
If so, track the hand of the matching human skeleton, and determine the corresponding gesture information according to the change in position of the hand;
According to the preset correspondence between gesture information and operation instructions, execute the operation instruction corresponding to the gesture information.
Further, before judging whether a human skeleton stored in the database is present in front of the device, the method further includes:
Receive the image acquired by the camera device, and identify the face images of all people in the image;
Judge whether a face image matching the acquired face image is stored in the database;
If so, determine the human skeleton information corresponding to the face image according to the correspondence between face images and human skeleton information stored in the database.
Further, judging whether a face image matching the acquired face image is stored in the database specifically includes the following steps:
Calculate the similarity value between the acquired face image and the face image stored in the database;
Judge whether the similarity value between the acquired face image and the face image stored in the database is greater than 50%.
Further, determining the corresponding gesture information according to the change in position of the hand specifically includes the following steps:
Judge whether the hand, as its position changes, passes through the start point;
If so, mark the key points that the hand passes through as its position changes;
Judge whether the hand, as its position changes, passes through the end point;
If so, proceed to the next step.
Further, executing the operation instruction corresponding to the gesture information according to the preset correspondence between gesture information and operation instructions specifically includes the following steps:
Parse the figure formed by the key points that were passed through;
According to the preset correspondence between figure information and operation instructions, execute the operation instruction corresponding to the gesture information.
The inventor also provides a storage medium storing a computer program which, when executed by a processor, implements the following steps:
Receive the image acquired by the camera device, and identify all human skeletons in the image; each human skeleton includes a hand;
Judge whether the identified human skeleton information matches the human skeleton information stored in the database;
If so, track the hand of the matching human skeleton, and determine the corresponding gesture information according to the change in position of the hand;
According to the preset correspondence between gesture information and operation instructions, execute the operation instruction corresponding to the gesture information.
Further, before judging whether a human skeleton stored in the database is present in front of the device, the computer program, when executed by the processor, implements the following steps:
Receive the image acquired by the camera device, and identify the face images of all people in the image;
Judge whether a face image matching the acquired face image is stored in the database;
If so, determine the human skeleton information corresponding to the face image according to the correspondence between face images and human skeleton information stored in the database.
Further, to judge whether a face image matching the acquired face image is stored in the database, the computer program, when executed by the processor, implements the following steps:
Calculate the similarity value between the acquired face image and the face image stored in the database;
Judge whether the similarity value between the acquired face image and the face image stored in the database is greater than 50%.
Further, to determine the corresponding gesture information according to the change in position of the hand, the computer program, when executed by the processor, implements the following steps:
Judge whether the hand, as its position changes, passes through the start point;
If so, mark the key points that the hand passes through as its position changes;
Judge whether the hand, as its position changes, passes through the end point;
If so, proceed to the next step.
Further, to execute the operation instruction corresponding to the gesture information according to the preset correspondence between gesture information and operation instructions, the computer program, when executed by the processor, implements the following steps:
Parse the figure formed by the key points that were passed through;
According to the preset correspondence between figure information and operation instructions, execute the operation instruction corresponding to the gesture information.
Different from the prior art, the above technical solution provides a gesture judging method and a storage medium storing a computer program that performs the method. The method comprises the following steps: receive the image acquired by the camera device, and identify all human skeletons in the image, each human skeleton including a hand; judge whether the identified human skeleton information matches the human skeleton information stored in the database; if so, track the hand of the matching human skeleton, and determine the corresponding gesture information according to the change in position of the hand; according to the preset correspondence between gesture information and operation instructions, execute the operation instruction corresponding to the gesture information. With this method and storage medium, human skeleton recognition is performed before gesture recognition, which filters out information about the surroundings other than human skeletons and greatly reduces the interference of the surrounding environment with gesture recognition. Matching the acquired human skeleton information against the human skeleton information in the database also filters out, from the many acquired human skeletons, those that are not stored in the database and retains those that are. The method can thus identify the human skeletons stored in the database, accurately track the hand of a human skeleton that has operating rights, and finally determine the corresponding gesture information from the change in position of that hand. Combining identity or permission recognition with gesture recognition improves the stability and robustness of the method and storage medium.
Specific embodiment
To explain in detail the technical content, structural features, objects, and effects of the technical solution, a detailed description is given below with reference to specific embodiments and the accompanying drawings.
Referring to Fig. 1 to Fig. 13, the present invention provides a gesture judging method and a storage medium. Referring to Fig. 1, in a specific embodiment, the method comprises the following steps:
Step S104: receive the image acquired by the camera device, and identify all human skeletons in the image; each human skeleton includes a hand;
Step S105: judge whether the identified human skeleton information matches the human skeleton information stored in the database;
If so, step S106: track the hand of the matching human skeleton, and determine the corresponding gesture information according to the change in position of the hand;
Finally, step S107: according to the preset correspondence between gesture information and operation instructions, execute the operation instruction corresponding to the gesture information.
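The flow of steps S104 to S107 can be sketched as follows. This is a minimal illustration only: the `Skeleton` container, the `run_pipeline` function, and the callables passed to it are hypothetical names introduced here, and skeleton detection and hand tracking are assumed to have been done already by the camera software.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Skeleton:
    """Hypothetical container for one skeleton found in the image (S104)."""
    skeleton_id: str                            # key used to look it up in the database
    hand_track: Tuple[Tuple[float, float], ...]  # hand positions over time


def run_pipeline(
    skeletons: Iterable[Skeleton],
    registered_ids: Set[str],
    classify_gesture: Callable[[tuple], Optional[str]],
    commands: Dict[str, Callable[[], str]],
) -> List[str]:
    results = []
    for skeleton in skeletons:
        # S105: skip skeletons that do not match anything stored in the database.
        if skeleton.skeleton_id not in registered_ids:
            continue
        # S106: derive gesture information from the change in hand position.
        gesture = classify_gesture(skeleton.hand_track)
        # S107: execute the instruction that the preset table maps to the gesture.
        if gesture in commands:
            results.append(commands[gesture]())
    return results
```

Keeping the database check ahead of gesture classification mirrors the order of the steps: the hands of unregistered people are never tracked or classified.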
In the above method, the camera device may be an RGB-D camera somatosensory device, which contains an infrared depth camera, an RGB camera, and a microphone array, and offers functions such as real-time image transmission, voice transmission, and multi-person interaction. The device achieves somatosensory recognition and device control through the human body, dispensing with the handle and mouse control used previously, so that the operator can achieve "touchless control" without touching a PC or other equipment. The RGB-D camera obtains real-world color data through the central color lens and emits infrared signals through a pair of infrared lenses. Since infrared light is reflected by the objects it strikes, the reflection is received by the infrared receiving lens, and the infrared signal is converted into depth data by internal computation.
The camera device can be used to analyze the data read by the RGB-D camera, segment the scene, and output human skeleton information. The image acquisition process may therefore be as follows: first, acquire image data with an Asus-Xtion depth camera (RGB-D); next, further synthesize the acquired images using the open-source computer vision library OpenCV to obtain three-dimensional image information; then extract the operator's human skeleton information. This filters out the surroundings and extracts only human skeleton information, so non-human information can be rejected even when the surroundings are complex.
The position of the RGB-D camera is taken as the origin of a Cartesian coordinate system, and the device coordinate system is defined as follows: the plane parallel to the plane in which the device is placed is the x-z plane, the horizontal direction is the x-axis, the depth direction is the z-axis, and the vertical direction is the y-axis. The camera device can therefore acquire the coordinates P = (x, y, z) of a point, calculated by the following formula:
Where Dx, Dy, Rx, and Ry are constants, i is the x-axis value, j is the y-axis value, and d is the depth value. The constants are Dx = 321, Dy = 241, and Rx = Ry = 0.00173667, which are the values for a resolution of 640 × 480. The camera device can thus record the coordinates of a human skeleton, in particular the coordinates of the joints of its hand. By recording the coordinates of an identified human skeleton and matching them against the coordinates of the human skeletons stored in the database, it can be judged whether the identified human skeleton information matches the human skeleton information stored in the database. By obtaining the coordinate changes of the hand joints of the human skeleton, the corresponding gesture information can be determined, which filters out non-hand and non-human interference.
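The formula itself is not reproduced in this text. The sketch below shows one common pinhole back-projection that is consistent with the constants quoted above. The form x = (i − Dx)·Rx·d, y = (j − Dy)·Ry·d, z = d, the sign convention for y, and the function name `pixel_to_camera` are all assumptions made for illustration, not the formula of this embodiment.

```python
D_X, D_Y = 321, 241        # constants quoted in the text for 640 x 480
R_X = R_Y = 0.00173667     # constants quoted in the text


def pixel_to_camera(i, j, d):
    """Map pixel (i, j) with depth d to camera coordinates (x, y, z).

    Assumed form: the offset from the reference pixel (D_X, D_Y) is scaled
    by the depth and by the per-pixel constants R_X and R_Y.
    """
    x = (i - D_X) * R_X * d
    y = (j - D_Y) * R_Y * d
    z = d
    return (x, y, z)
```

Under this assumption, a point imaged at the reference pixel lies on the optical axis, and the lateral offset grows linearly with depth.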
In a specific embodiment, the user tracker interface provided by NiTE is UserTracker, which provides access to most of NiTE's algorithms. The object provides scene segmentation, skeleton tracking, plane detection, and pose detection. The first purpose of the user tracker algorithm is to find all active users in a given scene. It tracks each person found separately and provides a means of separating their outlines from the background. Once the scene has been segmented, the user tracker is also used to start the skeleton tracking and pose detection algorithms. Each user is given an ID when detected. As long as the user remains in the frame, the user ID stays the same. If the user leaves the camera device's field of view, tracking of that user is lost, and the user may have a different ID when detected again. After a UserTracker is created, the UserTracker.readFrame function can be used to quickly obtain the human skeleton information in the image, which includes the unique ID of the skeleton and the coordinates of the important joints of the human body: head, neck, left palm, right palm, left shoulder, right shoulder, left wrist, right wrist, torso, left toe, right toe, left knee, and right knee. Once the user ID has been obtained, the startSkeletonTracking function of UserTracker is used to choose whether to track the skeleton corresponding to that user ID.
Before gesture recognition is performed, the above method identifies the human skeletons in the acquired image, judges whether the image contains the human skeleton of a user (i.e. a person whose information is stored in the database), and tracks the hand of that human skeleton, thereby improving stability and robustness. To filter out non-users (i.e. people whose information is not stored in the database) more accurately, referring to Fig. 1, in a further embodiment, before judging whether a human skeleton stored in the database is present in front of the device, the method further includes:
First, step S101: receive the image acquired by the camera device, and identify the face images of all people in the image;
Then, step S102: judge whether a face image matching the acquired face image is stored in the database;
If so, step S103: determine the human skeleton information corresponding to the face image according to the correspondence between face images and human skeleton information stored in the database.
The face images of all people in the image can be identified by cascading Haar features with AdaBoost to form a strong classifier, which locates the facial key points in the acquired image. Haar features fall into the following classes: linear features, center features, edge features, and diagonal features, which are combined into feature templates. A feature template contains two kinds of rectangles, white and black, and the feature value of the template is defined as the sum of the pixel values in the white rectangles minus the sum of the pixel values in the black rectangles. The Haar feature value reflects the gray-level variation of the image; the main feature structures are shown in Fig. 2.
The number of Haar features is calculated by the following formula:
W is the width of the picture, H is the height of the picture, w is the width of the rectangular feature, h is the height of the rectangular feature, and X = ⌊W/w⌋ and Y = ⌊H/h⌋ denote the maximum factors by which the rectangular feature can be enlarged in the horizontal and vertical directions. A single Haar feature contains very little information, so multiple Haar features are cascaded using the AdaBoost algorithm. The AdaBoost algorithm lets the designer keep adding new "weak classifiers" until a predetermined classifier reaches a sufficiently small error rate. In AdaBoost, every training sample is assigned a weight, which indicates the probability that it will be selected into the training set of a component classifier. If a sample point has been classified correctly, its probability of being selected when the next training set is constructed is lowered; conversely, if a sample point has not been classified correctly, its probability of being selected next time becomes higher than before. The cascading process of the strong classifier is shown in Fig. 3.
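The counting formula is not reproduced in this text. As an illustration, the sketch below counts upright rectangular features by enumerating every integer scale and position, and compares the result with the commonly cited closed form X·Y·(W + 1 − w(X + 1)/2)·(H + 1 − h(Y + 1)/2), where X = ⌊W/w⌋ and Y = ⌊H/h⌋. Treating this closed form as the formula intended here is an assumption, and both function names are hypothetical.

```python
def count_by_enumeration(W, H, w, h):
    """Count placements of a w x h feature over all integer scales."""
    total = 0
    for sx in range(1, W // w + 1):          # horizontal scale factor
        for sy in range(1, H // h + 1):      # vertical scale factor
            fw, fh = sx * w, sy * h          # scaled feature size
            total += (W - fw + 1) * (H - fh + 1)
    return total


def count_closed_form(W, H, w, h):
    """Closed form, evaluated in integers by factoring out 1/4."""
    X, Y = W // w, H // h
    return X * Y * (2 * (W + 1) - w * (X + 1)) * (2 * (H + 1) - h * (Y + 1)) // 4
```

For a 24 × 24 window and a 2 × 1 base feature, both functions give 43200, which shows why a single window already yields far more features than can be evaluated naively.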
The classifier YM is a combination of many weak classifiers. The classification result is decided by a vote of the M weak classifiers, and each weak classifier has a different say α. The AdaBoost algorithm is implemented as follows:
(1) Initialize the weights of all training samples to 1/N, where N is the number of samples.
(2) For m = 1, 2, 3, …, M:
A) Train the m-th weak classifier so that it minimizes the weighted error function of formula (3):
B) Next, calculate the say α of this weak classifier:
C) Update the weights:
where Zm is:
(3) Obtain the final classifier:
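The formulas for these steps are not reproduced in this text. For reference, one common textbook form of AdaBoost for labels t_n ∈ {−1, +1} is given below. It is an assumption that this matches the notation and constant factors of the original formulas; the symbols ε_m, y_m, and t_n are introduced here for illustration.

```latex
\begin{aligned}
w_n^{(1)} &= \frac{1}{N}, \qquad n = 1, \dots, N \\
\epsilon_m &= \sum_{n=1}^{N} w_n^{(m)} \, I\big(y_m(x_n) \neq t_n\big) \\
\alpha_m &= \frac{1}{2} \ln \frac{1 - \epsilon_m}{\epsilon_m} \\
w_n^{(m+1)} &= \frac{w_n^{(m)} \exp\big(-\alpha_m t_n y_m(x_n)\big)}{Z_m} \\
Z_m &= \sum_{n=1}^{N} w_n^{(m)} \exp\big(-\alpha_m t_n y_m(x_n)\big) \\
Y_M(x) &= \operatorname{sign}\Big( \sum_{m=1}^{M} \alpha_m y_m(x) \Big)
\end{aligned}
```

In this form Z_m is the normalization factor that keeps the weights summing to 1, so ε_m can be read directly as a weighted error rate.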
It can be seen that the previous classifier changes the weights w and at the same time contributes to the final classifier. If a training sample is misclassified by the previous classifier, its weight is increased, and the weights of the correctly classified samples are reduced accordingly.
Finally, the weighted weak classifiers are cascaded to form the strong classifier, whose formula is:
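The training loop described above can be sketched as follows, using one-feature threshold classifiers (decision stumps) as the weak classifiers. The choice of stumps, the ½·ln((1 − ε)/ε) form of α, and all function names are assumptions made for illustration; this text does not specify them.

```python
import numpy as np


def stump_predict(X, f, thr, pol):
    """Weak classifier: +1 where pol * (feature f - thr) >= 0, else -1."""
    return np.where(pol * (X[:, f] - thr) >= 0, 1, -1)


def fit_stump(X, t, w):
    """Exhaustively pick the stump with the lowest weighted error."""
    best = (np.inf, 0, 0.0, 1)
    for f in range(X.shape[1]):
        for thr in np.unique(X[:, f]):
            for pol in (1, -1):
                err = w[stump_predict(X, f, thr, pol) != t].sum()
                if err < best[0]:
                    best = (err, f, thr, pol)
    return best


def adaboost_fit(X, t, M):
    n = len(t)
    w = np.full(n, 1.0 / n)                      # (1) uniform initial weights
    model = []
    for _ in range(M):                           # (2) for m = 1..M
        err, f, thr, pol = fit_stump(X, t, w)    # A) minimize the weighted error
        err = min(max(err, 1e-10), 1 - 1e-10)    # guard against log(0)
        alpha = 0.5 * np.log((1 - err) / err)    # B) say of this weak classifier
        pred = stump_predict(X, f, thr, pol)
        w = w * np.exp(-alpha * t * pred)        # C) raise weights of mistakes
        w = w / w.sum()                          #    normalize so weights sum to 1
        model.append((alpha, f, thr, pol))
    return model


def adaboost_predict(model, X):
    """(3) Final classifier: sign of the alpha-weighted vote."""
    score = sum(a * stump_predict(X, f, thr, pol) for a, f, thr, pol in model)
    return np.where(score >= 0, 1, -1)
```

On data that no single stump can separate, such as labels that are positive at both ends of a one-dimensional range and negative in the middle, a few rounds are enough for the weighted vote to classify every training sample correctly.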
To improve the speed and precision of face detection and recognition, the final classifier is obtained by cascading multiple strong classifiers. In the cascade classification system, each input image passes through the strong classifiers in sequence. The earlier strong classifiers are relatively simple and therefore contain relatively few weak classifiers, while the later strong classifiers become progressively more complex. Only an image that the earlier strong classifiers have judged to be positive enters the later strong classifiers, so the earlier classifiers filter out the vast majority of non-conforming images. Only an image region that passes the detection of all strong classifiers counts as a valid face region, as shown in Fig. 4.
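The early-rejection behavior of the cascade can be sketched as follows. Each stage here is a plain threshold on a single score, which is a stand-in for a real strong classifier over Haar features; `cascade_accepts`, `make_stage`, and the logging list are hypothetical names used only to show which stages run.

```python
def cascade_accepts(window_score, stages):
    """Accept only if every stage accepts; stop at the first rejection."""
    for stage in stages:
        if not stage(window_score):
            return False      # later, more expensive stages are never run
    return True


def make_stage(name, threshold, log):
    """Build a stand-in stage that records its name whenever it is evaluated."""
    def stage(score):
        log.append(name)
        return score >= threshold
    return stage
```

Because most image windows do not contain a face, most of them are rejected by the first, cheapest stage, which is where the speed gain of the cascade comes from.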
Faces can be recognized by a convolutional neural network to judge whether a face image matching the acquired face image is stored in the database. This filters out the human skeletons of non-users (i.e. people whose information is not stored in the database) and further ensures the correctness of gesture recognition. A neural network is formed by combining many neurons; each neural unit of the network is shown in Fig. 5.
The corresponding formula is:
x is the input vector, W is the weight vector corresponding to x, and b is a constant. This unit is also called a logistic regression model. When multiple units are combined in a layered arrangement, they form a neural network model.
Fig. 6 shows a neural network structure with a hidden layer. According to formula (9), the formulas corresponding to the neurons in Fig. 6 can be expanded as:
Similarly, the network can be extended to 2, 3, 4, 5, 6, … hidden layers. The training method of a neural network is similar to that of logistic regression, but because the network has multiple layers, the chain rule of differentiation must also be used to differentiate with respect to the nodes of the hidden layers; this is backpropagation.
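The chain-rule step for a network with one hidden layer can be sketched as follows. The squared-error loss, the sigmoid activations in both layers, and the function names are assumptions chosen to keep the example short; this text does not fix them.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def forward(params, x):
    W1, b1, W2, b2 = params
    a1 = sigmoid(W1 @ x + b1)      # hidden layer activations
    a2 = sigmoid(W2 @ a1 + b2)     # output layer activations
    return a1, a2


def loss(params, x, t):
    _, a2 = forward(params, x)
    return 0.5 * float(np.sum((a2 - t) ** 2))


def backward(params, x, t):
    """Gradients of the loss with respect to W1, b1, W2, b2."""
    W1, b1, W2, b2 = params
    a1, a2 = forward(params, x)
    delta2 = (a2 - t) * a2 * (1 - a2)           # error signal at the output layer
    delta1 = (W2.T @ delta2) * a1 * (1 - a1)    # chain rule back into the hidden layer
    return np.outer(delta1, x), delta1, np.outer(delta2, a1), delta2
```

The hidden-layer error `delta1` is obtained from the output-layer error `delta2` by one multiplication with the transposed weights, which is the step that repeats for every additional hidden layer.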
A CNN can reduce the number of parameters through local receptive fields. People come to understand the outside world from local regions first and global regions afterwards, and the spatial relationships within an image are likewise such that pixels in a local region are closely connected, while the correlation between pixels in distant regions is relatively weak. Each neuron therefore does not need to perceive the whole image; it only needs to perceive a local region, and the local information is then integrated in the higher layers of the network to obtain global information. The idea of partially connected networks is also inspired by the structure of the biological visual system. Fig. 7 shows a fully connected neural network, and Fig. 8 shows a locally connected neural network.
In Fig. 7, if each neuron is connected to only 10 × 10 pixels, the number of weights is 1000000 × 100 parameters, which reduces the data to one thousandth of the original. Those 10 × 10 pixel values correspond to 10 × 10 parameters, which is equivalent to a convolution operation. Even so, there are still too many parameters, so a second approach is used: weight sharing.
If there are only 100 parameters as above, there is only one 10 × 10 convolution kernel, and feature extraction is clearly insufficient. Multiple convolution kernels can be added; for example, with 32 convolution kernels, 32 different features can be learned. The case of multiple convolution kernels is shown in Fig. 9 and Fig. 10.
In Fig. 9, a color image is split into three images according to the R, G, and B channels, and the images of the different color channels correspond to different convolution kernels. Each convolution kernel turns the image into another image.
To describe a large image, the features at different locations are aggregated statistically. These summary statistics not only have a much lower dimension (compared with using all the extracted features) but also improve the results, making the model less prone to underfitting or overfitting. This aggregation operation is called pooling; the pooling process is shown in Fig. 11. Finally, forward propagation through the fully connected layer matches the corresponding label.
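Weight sharing and pooling can be sketched as follows: one small kernel is slid over the whole image, and the resulting feature map is then reduced by taking the maximum of each block. Using the maximum as the aggregate statistic, the "valid" border handling, and the function names are assumptions made for illustration.

```python
import numpy as np


def conv2d_valid(image, kernel):
    """Slide one shared kernel over the image (no padding, stride 1).

    As is usual in CNNs, the kernel is applied without flipping.
    """
    kh, kw = kernel.shape
    out_h, out_w = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out


def max_pool(feature_map, size):
    """Aggregate each non-overlapping size x size block into its maximum."""
    h, w = feature_map.shape[0] // size, feature_map.shape[1] // size
    trimmed = feature_map[:h * size, :w * size]
    return trimmed.reshape(h, size, w, size).max(axis=(1, 3))
```

The kernel has the same few parameters no matter how large the image is, and pooling with block size 2 cuts the number of feature values to a quarter.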
In a further embodiment, judging whether a face image matching the acquired face image is stored in the database specifically includes the following steps:
Calculate the similarity value between the acquired face image and the face image stored in the database;
Judge whether the similarity value between the acquired face image and the face image stored in the database is greater than 50%.
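The two steps above can be sketched as follows. This text does not say how the similarity value is computed, so the sketch assumes each face is represented by a feature vector and uses cosine similarity clamped to the range 0 to 1; that measure, the vector representation, and the names `similarity` and `find_match` are all hypothetical.

```python
import math

THRESHOLD = 0.5   # "greater than 50%" from the text


def similarity(a, b):
    """Cosine similarity of two feature vectors, clamped to [0, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return max(0.0, dot / norm)


def find_match(probe, database):
    """Return the name of the most similar stored face above the threshold, else None."""
    best_name, best_score = None, THRESHOLD
    for name, stored in database.items():
        score = similarity(probe, stored)
        if score > best_score:
            best_name, best_score = name, score
    return best_name
```

Returning the best match rather than the first one above the threshold avoids ambiguity when several stored faces exceed 50%.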
In a further embodiment, determining the corresponding gesture information according to the change in position of the hand specifically includes the following steps:
Judge whether the hand, as its position changes, passes through the start point;
If so, mark the key points that the hand passes through as its position changes;
Judge whether the hand, as its position changes, passes through the end point;
If so, proceed to the next step.
In the above method, gestures are recognized using key points. Multiple key points are marked in space; when the hand passes through a key point, that point is marked, and once the motion is complete, the marked key points are parsed and the idea the operator wants to express is judged according to the preset gestures. The benefit of this approach is high precision. Compared with the traditional DTW (dynamic time warping, a dynamic programming algorithm), it is relatively simple and requires no complicated computation; the key points can be combined freely and no training samples are needed. Compared with static gesture recognition, a wide variety of gesture motions can be composed, which lets the operator adapt to the system quickly. The key points are illustrated in Fig. 12: the points in the figure are the preset key points, whose color can be preset to blue. When the hand passes through a key point, the point changes from blue to red. The gesture is judged from the colors of the key points, and the gesture motion is finally obtained.
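Marking the key points that the hand passes through can be sketched as follows. The sketch works in two dimensions and treats a key point as passed when the hand comes within a fixed radius of it; the radius test, the 2D simplification, and the name `mark_passed_points` are assumptions made for illustration.

```python
def mark_passed_points(hand_path, key_points, radius):
    """Return key-point labels in the order the hand first comes within radius."""
    passed = []
    for hx, hy in hand_path:
        for label, (kx, ky) in key_points.items():
            close = (hx - kx) ** 2 + (hy - ky) ** 2 <= radius ** 2
            if close and label not in passed:
                passed.append(label)   # the point turns from "blue" to "red"
    return passed
```

Only the order of first contact is kept, so hovering near a point or moving slowly between points does not change the result.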
In a further embodiment, executing the operation instruction corresponding to the gesture information according to the preset correspondence between gesture information and operation instructions specifically includes the following steps:
Parse the figure formed by the key points that were passed through;
According to the preset correspondence between figure information and operation instructions, execute the operation instruction corresponding to the gesture information.
After a person is determined to be a user (i.e. a person whose information is stored in the database), tracking of this human skeleton begins in order to obtain the coordinate information of the left and right hands, and the coordinates are converted from the NiTE coordinate system to the coordinate system of the RGB-D camera, which prevents coordinate system confusion during coordinate calculation. As shown in Fig. 13, the point numbered 31 serves as the recognition start point and the point numbered 32 serves as the end point; both are controlled by the right hand. The remaining six points are controlled by the left hand, and gestures are recognized according to the order in which the left hand passes through the points. In Fig. 13, when the hand passes through point 11, point 21, point 22, and point 23 in order, the recognition result is 7. Likewise, as shown in Fig. 13, after the hand passes through point 11, point 12, point 13, and point 23 in order and the right hand touches point 32, the recognized information is L.
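The decoding of the two examples above can be sketched as a small state machine. The point numbers and the two gestures come from the description of Fig. 13; representing the input as a single ordered list of touched points, and the names `GESTURES` and `decode`, are assumptions made for illustration.

```python
START, END = 31, 32    # start and end points, touched by the right hand
GESTURES = {
    (11, 21, 22, 23): "7",
    (11, 12, 13, 23): "L",
}


def decode(events):
    """Decode an ordered list of touched point numbers into a gesture, or None."""
    recording, path = False, []
    for point in events:
        if point == START:
            recording, path = True, []          # begin a new recognition
        elif point == END and recording:
            return GESTURES.get(tuple(path))    # parse the recorded order
        elif recording:
            path.append(point)                  # left hand passes a key point
    return None
```

Points touched before the start point are ignored, and an order that is not in the preset table yields no instruction.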
In a specific embodiment, the storage medium stores a computer program which, when executed by a processor, implements the following steps:
Receive the image acquired by the camera device, and identify all human skeletons in the image; each human skeleton includes a hand;
Judge whether the identified human skeleton information matches the human skeleton information stored in the database;
If so, track the hand of the matching human skeleton, and determine the corresponding gesture information according to the change in position of the hand;
According to the preset correspondence between gesture information and operation instructions, execute the operation instruction corresponding to the gesture information.
In a further embodiment, before judging whether a human skeleton stored in the database is present in front of the device, the computer program, when executed by the processor, implements the following steps:
Receive the image acquired by the camera device, and identify the face images of all people in the image;
Judge whether a face image matching the acquired face image is stored in the database;
If so, determine the human skeleton information corresponding to the face image according to the correspondence between face images and human skeleton information stored in the database.
In a further embodiment, to judge whether a face image matching the acquired face image is stored in the database, the computer program, when executed by the processor, implements the following steps:
Calculate the similarity value between the acquired face image and the face image stored in the database;
Judge whether the similarity value between the acquired face image and the face image stored in the database is greater than 50%.
In a further embodiment, to determine the corresponding gesture information according to the change in position of the hand, the computer program, when executed by the processor, implements the following steps:
Judge whether the hand, as its position changes, passes through the start point;
If so, mark the key points that the hand passes through as its position changes;
Judge whether the hand, as its position changes, passes through the end point;
If so, proceed to the next step.
In a further embodiment, to execute the operation instruction corresponding to the gesture information according to the preset correspondence between gesture information and operation instructions, the computer program, when executed by the processor, implements the following steps:
Parse the figure formed by the key points that were passed through;
According to the preset correspondence between figure information and operation instructions, execute the operation instruction corresponding to the gesture information.
It should be noted that although the above embodiments have been described herein, they are not intended to limit the scope of patent protection of the present invention. Therefore, changes and modifications made to the embodiments described herein based on the innovative idea of the present invention, or equivalent structures or equivalent process transformations made using the contents of the description and drawings of the present invention, whether applying the above technical solution directly or indirectly to other related technical fields, all fall within the scope of patent protection of the present invention.