CN109977818A - Action recognition method and system based on spatial features and multi-target detection - Google Patents

Action recognition method and system based on spatial features and multi-target detection

Info

Publication number
CN109977818A
CN109977818A
Authority
CN
China
Prior art keywords
target
direction vector
video
target detection
targets
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910192305.XA
Other languages
Chinese (zh)
Inventor
刘维
张奕
李滇博
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jilian Network Technology Co Ltd
Original Assignee
Shanghai Jilian Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jilian Network Technology Co Ltd
Priority to CN201910192305.XA
Publication of CN109977818A
Legal status: Pending

Abstract

The invention discloses an action recognition method and system based on spatial features and multi-target detection. The method comprises: step S1, decomposing the action to be detected to obtain the individual decomposition targets, collecting a data set for each decomposition target, and training on each target data set with deep learning to obtain a target detection model for the decomposition targets; step S2, continuously acquiring a video stream, detecting targets in the input video stream with the target detection model, obtaining the position of each single target in the video image, computing the direction-vector features between targets, comparing the variation trend of the direction-vector features across the video stream, and merging targets that keep approaching each other into a new target; step S3, extracting the merged new targets and, once all targets of an action decomposition have been merged so that only a primary target and a secondary target remain, judging the occurrence of the action from the features of the direction vector the two generate between video frames and the IOU between the target positions.

Description

Action recognition method and system based on spatial features and multi-target detection
Technical field
The present invention relates to the technical field of image processing, and more particularly to an action recognition method and system based on spatial features and multi-target detection.
Background technique
The rapid development of science and technology has created a growing demand for understanding and analyzing video content. Surveillance cameras, now found everywhere, give us access to massive amounts of video, yet traditional manual processing of video is inefficient and not very accurate. Techniques that combine computers with vision algorithms to replace manual methods have therefore received wide attention: they can greatly reduce labor costs and improve the efficiency of handling events, and they can also improve the accuracy of event discovery and the timeliness of event handling. Within the field of video content analysis, action recognition is a very important branch, and its detection quality and performance are crucial for detecting anomalous events and behaviors; action recognition therefore has strong social impact.
A survey of existing action recognition methods shows that most of them either use convolutional neural networks from deep learning to detect a single target in single frames or to extract features that are then classified with an SVM classifier, or they extract the optical-flow features of a single target across consecutive frames, train a classifier based on deep learning, and obtain a prediction by vote counting. All of these methods, however, extract features from the acting single target itself. When the action is comparatively complex, certain features are easily ignored, which leads to poor detection results.
Summary of the invention
In order to overcome the deficiencies of the above existing technologies, the purpose of the present invention is to provide an action recognition method and system based on spatial features and multi-target detection, which improves the accuracy of action recognition by simplifying compound actions and by exploiting the spatial features between multiple targets.
In order to achieve the above object, the present invention proposes an action recognition method based on spatial features and multi-target detection, comprising the following steps:
Step S1, decomposing the action to be detected to obtain the individual decomposition targets, collecting a data set for each decomposition target, and training on each target data set with deep learning to obtain the target detection model of each decomposition target;
Step S2, continuously acquiring a video stream, detecting targets in the input video stream with the target detection model, obtaining the position of each single target in the video image, computing the direction-vector features between targets, comparing the variation trend of the direction-vector features across the video stream, and merging targets that keep approaching each other into a new target;
Step S3, extracting the merged new targets; once all targets of an action decomposition have been merged so that only a primary target and a secondary target remain, judging the occurrence of the action from the features of the direction vector the two generate between video frames and the IOU between the target positions.
Preferably, step S1 further comprises:
Step S100, decomposing the action to be detected to obtain several decomposition targets;
Step S101, collecting a data set for each decomposition target to obtain multiple target data sets;
Step S102, preprocessing the collected target data sets;
Step S103, training each target data set with a YoloV3 network to obtain the target detection model of each decomposition target.
Preferably, in step S102, the preprocessing includes but is not limited to: translating and mirroring the targets in the images of the target data set, adding Gaussian noise and salt-and-pepper noise at the target positions, randomly cropping parts of the target images, and jittering and padding the images.
Preferably, step S2 further comprises:
Step S200, performing target detection on the current frame of the video stream with the target detection model, obtaining the position of each target, computing the direction vectors between the targets, and extracting the features of the pairwise direction vectors;
Step S201, performing target detection on the next video frame with the target detection model, obtaining the position of each target, computing the direction vectors between the targets, and extracting the features of the direction vectors;
Step S202, comparing the direction-vector features obtained from consecutive frames, comparing the variation trend of the length and direction features across the video stream, and merging targets that keep approaching each other into a new target.
Preferably, in step S202, it is judged whether a pair of targets shows an approaching trend; if so, the next frame is processed and step S201 is repeated, until the two targets nearly coincide and are merged into a new target.
Preferably, in step S202, if the length of the direction vector between a pair of targets keeps decreasing over consecutive video frames while its direction stays consistent between frames, the two targets are judged to be continuously approaching.
Preferably, the direction consistency of two direction vectors between video frames is expressed as:
u_{n-1,n} · u'_{n-1,n} = (x_{n-1} - x_n)(x'_{n-1} - x'_n) + (y_{n-1} - y_n)(y'_{n-1} - y'_n)
where u_{n-1,n} denotes the direction vector between targets in frame t_1, u'_{n-1,n} denotes the direction vector between the targets in frame t_2, and (x_n, y_n) denotes the position coordinates of a target.
As the video stream continues to arrive, if |u_{n-1,n}| > |u'_{n-1,n}| and u_{n-1,n} · u'_{n-1,n} > 0, the two targets have an approaching trend.
Preferably, in step S202, whether two targets should be merged into one new target is determined by the IOU between the two targets.
Preferably, in step S202, the IOU between the two targets is computed, and if it exceeds a threshold the two targets are merged into a new target.
In order to achieve the above objectives, the present invention also provides an action recognition system based on spatial features and multi-target detection, comprising:
a target detection model training and acquisition unit, for decomposing the action to be detected to obtain the individual decomposition targets, collecting a data set for each decomposition target, and training on each target data set with deep learning to obtain the target detection model of each decomposition target;
a target detection unit, for continuously acquiring a video stream, detecting targets in the input video stream with the target detection model, obtaining the position of each single target in the video image, computing the direction-vector features between targets, comparing the variation trend of the direction-vector features across the video stream, and merging targets that keep approaching each other into a new target;
an action recognition unit, for extracting the merged new targets and, once all targets of an action decomposition have been merged so that only a primary target and a secondary target remain, judging the occurrence of the action from the features of the direction vector the two generate between video frames and the IOU between the target positions.
Compared with the prior art, the action recognition method and system based on spatial features and multi-target detection of the present invention decompose an action into multiple simple targets and establish a target detection model for them, making full use of the spatial vector features between the multiple targets in the video. By tracking the inter-frame variation of the vectors, the action is detected from the motion and positional relationships of the multiple targets across consecutive frames, thereby achieving the purpose of improving the accuracy of action recognition.
Detailed description of the invention
Fig. 1 is a flowchart of the steps of the action recognition method based on spatial features and multi-target detection of the present invention;
Fig. 2 shows the network structure of the YoloV3 network in the specific embodiment of the invention;
Fig. 3 is a schematic diagram of the IOU calculation in the specific embodiment of the invention;
Fig. 4 is a system architecture diagram of the action recognition system based on spatial features and multi-target detection of the present invention;
Fig. 5 is a detailed structure diagram of the target detection model training and acquisition unit in the specific embodiment of the invention;
Fig. 6 is a detailed structure diagram of the target detection unit in the specific embodiment of the invention;
Fig. 7 is a flowchart of the action recognition method based on spatial features and multi-target detection in the specific embodiment of the invention.
Specific embodiment
Embodiments of the present invention are described below through specific examples with reference to the drawings; those skilled in the art can easily understand further advantages and effects of the invention from the content disclosed in this specification. The invention may also be implemented or applied through other different specific examples, and the details in this specification may be modified and changed from different perspectives and for different applications without departing from the spirit of the invention.
Fig. 1 is a flowchart of the steps of the action recognition method based on spatial features and multi-target detection of the present invention. As shown in Fig. 1, the method includes the following steps:
Step S1, decomposing the action to be detected to obtain the individual decomposition targets. For example, the action of drinking can be decomposed into three targets: wineglass, hand, and mouth (e.g., the drinking action is manually defined as the three targets wineglass, hand, and mouth). A data set is then collected for each decomposition target: some data can be found in publicly available data sets online, while other data requires collecting pictures and annotating them with software such as labelImg. Each target data set is trained based on deep learning to obtain the target detection model of each decomposition target. In the present invention, a single target detection model is trained on the data sets of all decomposition targets, and this model can detect all targets in the data sets.
Specifically, step S1 further comprises:
Step S100, decomposing the action to be detected to obtain several decomposition targets.
Step S101, collecting a data set for each decomposition target to obtain multiple target data sets. For example, pictures containing each decomposition target are collected from the network, and pictures containing the same decomposition target are gathered together to form the target data set of that decomposition target. Pictures containing each decomposition target may be downloaded from the Internet, or pictures of a single decomposition target may be downloaded; the targets in the pictures are then marked with an annotation tool to form the target data set of the decomposition target, which contains the original images and the annotation files generated by the marking.
Step S102, preprocessing the collected target data sets. Specifically, in order to improve the performance of target detection, before the target data sets are trained with deep learning, the images in the collected target data sets are preprocessed. The preprocessing includes but is not limited to: translating and mirroring the targets in the images of the target data set, adding Gaussian noise and salt-and-pepper noise at the target positions, randomly cropping parts of the target images, and jittering and padding the images.
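The preprocessing operations in step S102 can be sketched as below. This is a minimal illustration under stated assumptions: images are numpy uint8 arrays in H*W*C layout, the noise levels, probabilities, and crop fractions are hypothetical choices, and a real pipeline would also update the bounding-box annotations after the geometric operations.

```python
import numpy as np

def augment(image, rng):
    """Apply a random subset of the augmentations described in step S102.
    Sketch only: annotation files are not updated here."""
    h, w = image.shape[:2]
    out = image.astype(np.float64)
    # horizontal mirror
    if rng.random() < 0.5:
        out = out[:, ::-1]
    # additive Gaussian noise (sigma is an assumed value)
    if rng.random() < 0.5:
        out = out + rng.normal(0.0, 5.0, size=out.shape)
    # salt-and-pepper noise on ~2% of the pixels
    if rng.random() < 0.5:
        mask = rng.random(out.shape[:2])
        out[mask < 0.01] = 0      # pepper
        out[mask > 0.99] = 255    # salt
    # random crop, then pad back to the original size
    if rng.random() < 0.5:
        dy = int(rng.integers(0, h // 10 + 1))
        dx = int(rng.integers(0, w // 10 + 1))
        out = out[dy:, dx:]
        pad = ((0, h - out.shape[0]), (0, w - out.shape[1])) + ((0, 0),) * (out.ndim - 2)
        out = np.pad(out, pad)
    return np.clip(out, 0, 255).astype(np.uint8)
```

Each branch preserves the image shape, so the augmented output can be fed to the same training pipeline as the original.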
Step S103, training each target data set with a YoloV3 network to obtain the target detection model of each decomposition target.
In the present invention, the network structure of YoloV3 uses residual blocks built from 3*3 and 1*1 convolutions as its basic components. It detects targets at three outputs of different sizes and then merges the detected targets by NMS to obtain the final targets; the output scales are 13*13, 26*26, and 52*52. The network structure of the YoloV3 network is shown in Fig. 2. The residual-block part contains 21 convolutional layers in total (convolutional layers with several 3*3 and 1*1 convolutions), and the rest are res layers. The YOLO part is the feature interaction layer of the yolo network and is divided into three scales; within each scale, local feature interaction is realized by means of convolution kernels. Its effect is similar to that of a fully connected layer, but the local feature interaction between feature maps is realized by convolution kernels (3*3 and 1*1). In the specific embodiment of the invention, the leftmost is the smallest-scale yolo layer: its input is a 13*13 feature map, and after a series of convolution operations it outputs a feature map of size 13*13, on which classification and position regression are performed. The middle is the medium-scale yolo layer: the feature map output by the smallest-scale yolo layer undergoes a series of convolution operations and outputs a feature map of size 26*26, on which classification and position regression are then performed. The rightmost is the large-scale yolo layer: the feature map output by the medium-scale yolo layer undergoes a series of convolution operations and outputs a feature map of size 52*52, on which classification and position regression are then performed.
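The NMS step that merges detections from the three output scales can be sketched as standard greedy non-maximum suppression. This is a generic illustration, not necessarily the exact variant used by a given YoloV3 implementation, and the 0.5 overlap threshold is an assumed default.

```python
def iou(a, b):
    """Intersection-over-Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, thresh=0.5):
    """Greedy NMS: keep the highest-scoring box, then drop any remaining
    box that overlaps it by more than `thresh`; repeat."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= thresh]
    return keep
```

In the method above, detections from the 13*13, 26*26, and 52*52 outputs would be pooled into one list of boxes and scores before this merge.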
Step S2, continuously acquiring a video stream, detecting targets in the input video stream with the target detection model, obtaining the position of each single target in the video image, computing the direction-vector features between targets, comparing the variation trend of the direction-vector features across the video stream, and merging targets that keep approaching each other into a new target. In the specific embodiment of the invention, the direction vectors between each pair of targets are computed for consecutive video frames, and the spatial features of the direction vectors, namely the length and direction of the vectors, are tracked across the video stream. If the length of the direction vector between two targets keeps decreasing while its direction stays consistent between frames, the two targets are continuously approaching; the IOU between the two targets is then computed, and if it exceeds a threshold the two are merged into a new target.
Specifically, step S2 further comprises:
Step S200 carries out target detection using the target detection model that training obtains in S1 to video flowing present frame, obtainsTo the location information of each target, and the direction vector between each target is calculated, extracts the feature of direction vector between target two-by-two.
In the specific embodiment of the invention, the direction vectors between pairs of targets are obtained as follows. Suppose the position of each single target in the image is represented by (x_1, y_1, t_1), (x_2, y_2, t_1), ..., (x_n, y_n, t_1), where t_1 denotes the t_1-th video frame and (x_n, y_n) the position coordinates of a target. The direction vectors between targets may be expressed as:
u_{1,2} = (x_1 - x_2, y_1 - y_2, t_1)
u_{1,n} = (x_1 - x_n, y_1 - y_n, t_1)
u_{n-1,n} = (x_{n-1} - x_n, y_{n-1} - y_n, t_1)
where u_{n-1,n} denotes the direction vector between the (n-1)-th target and the n-th target.
In the embodiment of the invention, the direction-vector features generally refer to the length feature and the direction feature of a direction vector; that is, extracting the features of the direction vector between a pair of targets means computing the length and the direction of that vector. The length feature of the direction vector may be expressed as:
|u_{n-1,n}| = √((x_{n-1} - x_n)² + (y_{n-1} - y_n)²)
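The direction vector between two targets and its length feature can be written directly from the definitions above; a minimal sketch, ignoring the frame index t_1 since it does not affect the spatial computation.

```python
import math

def direction_vector(p, q):
    """Direction vector from target q to target p,
    i.e. (x_{n-1} - x_n, y_{n-1} - y_n) for centers p and q."""
    return (p[0] - q[0], p[1] - q[1])

def length(u):
    """Length feature |u| of a direction vector (Euclidean norm)."""
    return math.hypot(u[0], u[1])
```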
Step S201 carries out target detection using target detection model to next frame, obtains the location information of each target, andThe direction vector between each target is calculated, the feature of direction vector is extracted.The calculating of target detection and direction vector hereinIdentical as step S200, it will not be described here.
Step S202, comparing the direction-vector features obtained in step S201 with those obtained from the previous video frame, comparing the variation trend of the length and direction features across the video stream, and merging targets that keep approaching each other into a new target. That is, it is judged whether each pair of targets shows an approaching trend; if so, the next frame is processed and step S201 is repeated, until the two targets are very close (e.g., nearly coincident) and the continuously approaching targets are merged into a new target. If no approaching trend exists between two targets over several consecutive frames, the two targets are unrelated, and the relationship between them is no longer tracked.
In the present invention, the approaching trend between a pair of targets is judged from the continuous decrease of the length of their direction vector across consecutive video frames together with the consistency of its direction. The direction consistency of two direction vectors between video frames may be expressed as:
u_{n-1,n} · u'_{n-1,n} = (x_{n-1} - x_n)(x'_{n-1} - x'_n) + (y_{n-1} - y_n)(y'_{n-1} - y'_n)
where u_{n-1,n} denotes the direction vector between targets in frame t_1 and u'_{n-1,n} the direction vector between the targets in frame t_2. As the video stream continues to arrive, if |u_{n-1,n}| > |u'_{n-1,n}| and u_{n-1,n} · u'_{n-1,n} > 0, the two targets have an approaching trend.
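The approaching-trend test is a direct transcription of the two conditions |u| > |u'| (the vector is shortening) and u · u' > 0 (the direction stays consistent); a sketch in 2D:

```python
def approaching(u_prev, u_cur):
    """True when the direction vector shortens between frames (|u| > |u'|)
    while keeping a consistent direction (u · u' > 0)."""
    dot = u_prev[0] * u_cur[0] + u_prev[1] * u_cur[1]
    shrinking = (u_prev[0] ** 2 + u_prev[1] ** 2) > (u_cur[0] ** 2 + u_cur[1] ** 2)
    return shrinking and dot > 0
```

Comparing squared lengths avoids the square root and gives the same ordering as comparing |u| with |u'|.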
In the specific embodiment of the invention, whether two targets should be merged into one new target is determined by the IOU (Intersection-over-Union) between the two targets. Specifically, when the IOU between two targets exceeds a threshold T, the two targets are merged into one new target. The IOU between two targets may be expressed as:
IOU = area(A ∩ B) / area(A ∪ B)
where A and B denote the two targets; a schematic diagram of the IOU is shown in Fig. 3.
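The IOU-gated merge can be sketched for axis-aligned boxes as follows. Two points are assumptions: the merged new target is represented by the enclosing box (the text only states that the two targets are merged), and the 0.5 default stands in for the threshold T.

```python
def box_iou(a, b):
    """IOU of two boxes (x1, y1, x2, y2): area(A ∩ B) / area(A ∪ B)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def merge_if_close(a, b, thresh=0.5):
    """Merge two target boxes into one new target when their IOU exceeds
    the threshold; return None when they should not be merged."""
    if box_iou(a, b) > thresh:
        return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))
    return None
```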
Step S3, extracting the merged new targets; once all targets of an action decomposition have been merged so that only a primary target and a secondary target remain, the occurrence of the action is judged from the features of the direction vector the two generate between video frames and the IOU between the target positions.
In the specific embodiment of the invention, the distinction between the primary target and the secondary target is as follows: among the multiple targets of the action decomposition, the target whose motion changes little between video frames is the primary target, while the remaining decomposition targets keep moving between frames, and the new target finally synthesized from them is called the secondary target. The secondary target keeps approaching the primary target in the video stream, and whether the action occurs is judged from the spatial features of the direction vector between the two targets (the vector length, its direction, and the variation of the length) together with the IOU between the two targets.
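Under simplifying assumptions, the decision in step S3 can be sketched end to end: the primary and secondary targets are taken as already identified and given as per-frame boxes, and the IOU threshold is a hypothetical value. The action is judged to occur when the direction vector between their centers keeps shrinking with a consistent direction and the final IOU exceeds the threshold.

```python
def center(box):
    return ((box[0] + box[2]) / 2, (box[1] + box[3]) / 2)

def box_iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def action_occurred(frames, iou_thresh=0.3):
    """frames: list of (primary_box, secondary_box) pairs, one per video frame."""
    # direction vector from secondary to primary in each frame
    vecs = []
    for prim, sec in frames:
        cp, cs = center(prim), center(sec)
        vecs.append((cp[0] - cs[0], cp[1] - cs[1]))
    # the vector must shrink with consistent direction between every frame pair
    for u, v in zip(vecs, vecs[1:]):
        dot = u[0] * v[0] + u[1] * v[1]
        if dot <= 0 or u[0] ** 2 + u[1] ** 2 <= v[0] ** 2 + v[1] ** 2:
            return False
    # finally the two targets must overlap enough
    prim, sec = frames[-1]
    return box_iou(prim, sec) > iou_thresh
```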
Fig. 4 is a system architecture diagram of the action recognition system based on spatial features and multi-target detection of the present invention. As shown in Fig. 4, the action recognition system of the present invention comprises:
a target detection model training and acquisition unit 401, for decomposing the action to be detected to obtain the individual decomposition targets, collecting a data set for each decomposition target, and training on each target data set with deep learning to obtain the target detection model of each decomposition target. For example, the action of drinking can be decomposed into the three targets wineglass, hand, and mouth, and a data set is collected for each decomposition target: some data can be found in publicly available data sets online, while other data requires collecting pictures and annotating them with software such as labelImg. Each target data set is then trained based on deep learning to obtain the target detection model of each decomposition target; this target detection model can detect all targets in the data sets.
Specifically, as shown in figure 5, target detection model training acquiring unit 401 further comprises:
an action decomposition unit 4010, for decomposing the action to be detected to obtain several decomposition targets;
a target data set collection unit 4011, for collecting a data set for each decomposition target to obtain multiple target data sets, for example by collecting pictures containing each decomposition target from the network and gathering the pictures containing the same decomposition target to form the target data set of that decomposition target;
a preprocessing unit 4012, for preprocessing the collected target data sets. Specifically, in order to improve the performance of target detection, before the target data sets are trained with deep learning, the images in the collected target data sets are preprocessed by the preprocessing unit 4012. The preprocessing includes but is not limited to: translating and mirroring the targets in the images of the target data set, adding Gaussian noise and salt-and-pepper noise at the target positions, randomly cropping parts of the target images, and jittering and padding the images;
a model training unit 4013, for training each target data set with a YoloV3 network to obtain the target detection model of each decomposition target.
In the present invention, the network structure of YoloV3 uses residual blocks built from 3*3 and 1*1 convolutions as its basic components; it detects targets at three outputs of different sizes and then merges the detected targets by NMS to obtain the final targets, with output scales of 13*13, 26*26, and 52*52.
A target detection unit 402, for continuously acquiring a video stream, detecting targets in the input video stream with the target detection model, obtaining the position of each single target in the video image, computing the direction-vector features between targets, comparing the variation trend of the direction-vector features across the video stream, and merging targets that keep approaching each other into a new target. In the specific embodiment of the invention, the target detection unit 402 computes the direction vectors between each pair of targets for consecutive video frames and extracts the spatial features of the direction vectors, namely the variation trend of the vector length and direction across the video stream. If the length of the direction vector between two targets keeps decreasing while its direction stays consistent between frames, the two targets are continuously approaching; the IOU between the two targets is then computed, and if it exceeds a threshold the two are merged into a new target.
Specifically, as shown in fig. 6, object detection unit 402 further comprises:
a previous-frame target detection module 4021, which performs target detection on the current frame of the video stream with the target detection model, obtains the position of each target, computes the direction vectors between the targets, and extracts the features of the pairwise direction vectors.
In the specific embodiment of the invention, the direction vectors between pairs of targets are obtained as follows. Suppose the position of each single target in the image is represented by (x_1, y_1, t_1), (x_2, y_2, t_1), ..., (x_n, y_n, t_1), where t_1 denotes the t_1-th video frame and (x_n, y_n) the position coordinates of a target. The direction vectors between targets may be expressed as:
u_{1,2} = (x_1 - x_2, y_1 - y_2, t_1)
u_{1,n} = (x_1 - x_n, y_1 - y_n, t_1)
u_{n-1,n} = (x_{n-1} - x_n, y_{n-1} - y_n, t_1)
where u_{n-1,n} denotes the direction vector between the (n-1)-th target and the n-th target.
In the embodiment of the invention, the direction-vector features generally refer to the length feature and the direction feature of a direction vector; that is, extracting the features of the direction vector between a pair of targets means computing the length and the direction of that vector. The length feature of the direction vector may be expressed as:
|u_{n-1,n}| = √((x_{n-1} - x_n)² + (y_{n-1} - y_n)²)
a next-frame target detection unit 4022, for performing target detection on the next frame with the target detection model, obtaining the position of each target, computing the direction vectors between the targets, and extracting the features of the direction vectors. The target detection and direction-vector computation here are identical to step S200 and are not repeated.
a trend judgment processing unit 4023, which compares the direction-vector features obtained by the next-frame target detection unit 4022 with those obtained from the previous video frame and judges whether each pair of targets shows an approaching trend; if so, the next frame is processed and the next-frame target detection unit 4022 is invoked again, until the two targets are very close (e.g., nearly coincident) and are merged. If no approaching trend exists between two targets over several consecutive frames, the two targets are unrelated, and the relationship between them is no longer tracked.
In the present invention, the approaching trend between a pair of targets is judged from the continuous decrease of the length of their direction vector across consecutive video frames together with the consistency of its direction. The direction consistency of two direction vectors between video frames may be expressed as:
u_{n-1,n} · u'_{n-1,n} = (x_{n-1} - x_n)(x'_{n-1} - x'_n) + (y_{n-1} - y_n)(y'_{n-1} - y'_n)
where u_{n-1,n} denotes the direction vector between targets in frame t_1 and u'_{n-1,n} the direction vector between the targets in frame t_2. As the video stream continues to arrive, if |u_{n-1,n}| > |u'_{n-1,n}| and u_{n-1,n} · u'_{n-1,n} > 0, the two targets have an approaching trend.
In the specific embodiment of the invention, whether two targets should be merged into one new target is determined by the IOU (Intersection-over-Union) between the two targets. Specifically, when the IOU between two targets exceeds a threshold T, the two targets are merged into one new target. The IOU between two targets may be expressed as:
IOU = area(A ∩ B) / area(A ∪ B)
where A and B denote the two targets.
an action recognition unit 403, for extracting the merged new targets and, once all targets of an action decomposition have been merged so that only a primary target and a secondary target remain, judging the occurrence of the action from the features of the direction vector the two generate between video frames and the IOU between the target positions.
In the specific embodiment of the invention, the distinction between the primary target and the secondary target is as follows: among the multiple targets of the action decomposition, the target whose motion changes little between video frames is the primary target, while the remaining decomposition targets keep moving between frames, and the new target finally synthesized from them is called the secondary target. The secondary target keeps approaching the primary target in the video stream, and whether the action occurs is judged from the spatial features of the direction vector between the two targets (the vector length, its direction, and the variation of the length) together with the IOU between the two targets.
Fig. 7 is the process of the action identification method based on space characteristics and multi-target detection of the specific embodiment of the inventionFigure.In the present embodiment, it is as follows to be somebody's turn to do the action recognition process based on space characteristics and multi-target detection:
Step 1, will need the movement decomposition that detects is multiple targets, collects the data set of those targets, to data set intoRow pretreatment, is trained it to obtain the model of target detection by the YoloV3 network in deep learning.
In this embodiment, the methods for preprocessing the data sets include: translating and mirroring the targets in the images, adding Gaussian noise and salt-and-pepper noise at the target positions, randomly cropping partial target images, and performing image jitter and filling operations. The network structure of YoloV3 uses residual blocks built from 3*3 and 1*1 convolutions as its basic units and detects targets at outputs of three different sizes, whose scales are 13*13, 26*26 and 52*52 respectively; the detected targets are then merged by NMS to obtain the final targets.
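A few of the preprocessing operations listed above can be sketched as follows; this is illustrative stdlib Python on a list-of-lists grayscale image, and the parameter values (shift amount, noise fraction) are assumptions, not values from the embodiment:

```python
import random

def mirror(img):
    """Horizontal mirror of a row-major grayscale image."""
    return [row[::-1] for row in img]

def translate(img, dx):
    """Shift each row dx pixels to the right, filling exposed pixels with zeros."""
    w = len(img[0])
    return [[0] * dx + row[:w - dx] for row in img]

def salt_pepper(img, p=0.05, seed=0):
    """Set roughly a fraction p of pixels to 0 (pepper) or 255 (salt)."""
    rng = random.Random(seed)
    out = []
    for row in img:
        new_row = []
        for px in row:
            r = rng.random()
            if r < p / 2:
                new_row.append(0)
            elif r < p:
                new_row.append(255)
            else:
                new_row.append(px)
        out.append(new_row)
    return out
```

In practice these augmentations would be applied (with the corresponding label-box adjustments) before training the YoloV3 detector.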
Step 2: input a video frame, perform target detection on the frame, obtain the position information of the targets, calculate the direction vector between each pair of targets, and extract the features of the pairwise direction vectors;
In the embodiment of the invention, the method for obtaining the pairwise direction vectors between targets is as follows: the position information of the single targets in the image may be expressed as (x1, y1, t1), (x2, y2, t1), ..., (xn, yn, t1), where t1 denotes the t1-th video frame and (xn, yn) denotes the position coordinates of a target. The direction vectors may be expressed as:
u1,2 = (x1 - x2, y1 - y2, t1)
...
u1,n = (x1 - xn, y1 - yn, t1)
...
un-1,n = (xn-1 - xn, yn-1 - yn, t1)
where un-1,n denotes the direction vector between the (n-1)-th target and the n-th target.
In the embodiment of the invention, the length feature of a direction vector may be expressed as:

|un-1,n| = √((xn-1 - xn)^2 + (yn-1 - yn)^2)
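The pairwise direction vectors and their length feature can be sketched as follows (targets assumed given as (x, y) center coordinates within one frame; the frame index t1 is omitted since it is constant within a frame):

```python
import math
from itertools import combinations

def direction_vectors(positions):
    """u_{i,j} = (x_i - x_j, y_i - y_j) for every pair of targets i < j."""
    return {(i, j): (positions[i][0] - positions[j][0],
                     positions[i][1] - positions[j][1])
            for i, j in combinations(range(len(positions)), 2)}

def length(u):
    """Length feature |u| of a direction vector."""
    return math.hypot(u[0], u[1])
```

The dictionary keys index target pairs, matching the un-1,n subscript notation used above.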
Step 3: input the video stream, perform target detection in the same way and calculate the pairwise direction vectors between targets again, compare them with the previous direction vectors, and judge whether each pair of targets shows an approaching trend; continue inputting video until the targets are close enough to be merged;
The direction relation between the two direction vectors of consecutive video frames may be judged by their dot product:
un-1,n · u'n-1,n = (xn-1 - xn)·(x'n-1 - x'n) + (yn-1 - yn)·(y'n-1 - y'n)
where un-1,n denotes the direction vector between the targets in frame t1, and u'n-1,n denotes the direction vector between the same targets in frame t2.
As the video stream is continuously input, if |un-1,n| > |u'n-1,n| and un-1,n · u'n-1,n > 0, the two targets show an approaching trend; when the IOU between the two targets exceeds a threshold T, the two targets can be merged into a new target. The IOU between two targets may be expressed as:

IOU(A, B) = |A ∩ B| / |A ∪ B|

where A and B denote the two targets.
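The approaching-trend test above (a shrinking vector, |un-1,n| > |u'n-1,n|, with unchanged orientation, un-1,n · u'n-1,n > 0) can be sketched as:

```python
def dot(u, v):
    """Dot product of two 2D direction vectors."""
    return u[0] * v[0] + u[1] * v[1]

def norm_sq(u):
    """Squared length; comparing squared lengths avoids a sqrt."""
    return u[0] * u[0] + u[1] * u[1]

def approaching(u_prev, u_curr):
    """True if the pair of targets moved closer while keeping direction."""
    return norm_sq(u_prev) > norm_sq(u_curr) and dot(u_prev, u_curr) > 0
```

Comparing squared lengths is equivalent to comparing |u| values, since both are non-negative.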
Step 4: when all the targets of a decomposed action have been merged so that only the primary target and the secondary target remain, judge the occurrence of the action according to the features of the direction vector the two generate across video frames and the IOU between their positions. In this embodiment, among the multiple targets of a decomposed action, the target whose movement changes little is called the primary target; the remaining single targets keep moving between frames and are synthesized into one new target, and this finally obtained new target is called the secondary target. The secondary target continuously approaches the primary target in the video stream, and whether the action occurs is judged from the spatial features formed by the direction vector between the two targets, namely the vector's length, direction and change in length, together with the IOU between the two targets.
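A minimal sketch of this final decision rule, under the assumptions that both remaining targets are tracked as per-frame (x1, y1, x2, y2) boxes, that the approaching trend is checked via the squared distance between box centers, and that the IOU threshold value is illustrative:

```python
def box_center(box):
    """Center (x, y) of a box given as (x1, y1, x2, y2)."""
    return ((box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0)

def box_iou(a, b):
    """Intersection over Union of two axis-aligned boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def action_occurs(primary_boxes, secondary_boxes, iou_threshold=0.3):
    """Declare the action if the direction vector between the primary and
    secondary targets keeps shortening over the frames and the final IOU
    of their boxes exceeds the (assumed) threshold."""
    lengths = []
    for p, s in zip(primary_boxes, secondary_boxes):
        (px, py), (sx, sy) = box_center(p), box_center(s)
        lengths.append((px - sx) ** 2 + (py - sy) ** 2)
    shrinking = all(a >= b for a, b in zip(lengths, lengths[1:]))
    return shrinking and box_iou(primary_boxes[-1],
                                 secondary_boxes[-1]) > iou_threshold
```

The text also mentions the vector's direction as a cue; a fuller version would additionally apply the dot-product test from Step 3 between consecutive frames.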
In conclusion, the action identification method and system based on spatial features and multi-target detection of the present invention decompose an action into multiple simple targets and establish target detection models, make full use of the spatial vector features among the multiple targets in the video, and detect the action through the inter-frame variation features of the vectors, that is, the continuous inter-frame motion and position relations of the multiple targets, thereby improving action recognition accuracy.
The above-described embodiments merely illustrate the principles and effects of the present invention and are not intended to limit it. Anyone skilled in the art may modify or change the above embodiments without departing from the spirit and scope of the invention. Therefore, the scope of the present invention should be as listed in the claims.

Claims (10)

CN201910192305.XA | Priority date: 2019-03-14 | Filing date: 2019-03-14 | A kind of action identification method and system based on space characteristics and multi-target detection | Status: Pending | Publication: CN109977818A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201910192305.XA | 2019-03-14 | 2019-03-14 | A kind of action identification method and system based on space characteristics and multi-target detection


Publications (1)

Publication Number | Publication Date
CN109977818A | 2019-07-05

Family

ID=67078860

Family Applications (1)

Application Number | Status | Publication | Priority Date | Filing Date | Title
CN201910192305.XA | Pending | CN109977818A (en) | 2019-03-14 | 2019-03-14 | A kind of action identification method and system based on space characteristics and multi-target detection

Country Status (1)

Country | Link
CN | CN109977818A (en)


Citations (3)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN102663429A (en)* | 2012-04-11 | 2012-09-12 | 上海交通大学 | Method for motion pattern classification and action recognition of moving target
US20170039431A1 (en)* | 2015-08-03 | 2017-02-09 | Beijing Kuangshi Technology Co., Ltd. | Video monitoring method, video monitoring apparatus and video monitoring system
CN108288032A (en)* | 2018-01-08 | 2018-07-17 | 深圳市腾讯计算机系统有限公司 | Motion characteristic acquisition methods, device and storage medium


Cited By (6)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
WO2021017291A1 (en)* | 2019-07-31 | 2021-02-04 | 平安科技(深圳)有限公司 | Darkflow-deepsort-based multi-target tracking detection method, device, and storage medium
CN110942011A (en)* | 2019-11-18 | 2020-03-31 | 上海极链网络科技有限公司 | Video event identification method, system, electronic equipment and medium
CN111695638A (en)* | 2020-06-16 | 2020-09-22 | 兰州理工大学 | Improved YOLOv3 candidate box weighted fusion selection strategy
CN112418278A (en)* | 2020-11-05 | 2021-02-26 | 中保车服科技服务股份有限公司 | Multi-class object detection method, terminal device and storage medium
CN112967320A (en)* | 2021-04-02 | 2021-06-15 | 浙江华是科技股份有限公司 | Ship target detection tracking method based on bridge collision avoidance
CN112967320B (en)* | 2021-04-02 | 2023-05-30 | 浙江华是科技股份有限公司 | Ship target detection tracking method based on bridge anti-collision


Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
AD01 | Patent right deemed abandoned (effective date of abandoning: 2023-04-04)

