
Multi-target tracking method and corresponding video analysis system

Info

Publication number
CN110751674A
CN110751674A (application CN201810821668.0A)
Authority
CN
China
Prior art keywords: target, information, matching, detected, predicted
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810821668.0A
Other languages
Chinese (zh)
Inventor
刘吉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xilinx Technology Beijing Ltd
Original Assignee
Beijing Shenjian Intelligent Technology Co Ltd
Application filed by Beijing Shenjian Intelligent Technology Co Ltd
Priority to CN201810821668.0A
Publication of CN110751674A
Status: Pending (current)

Abstract

A multi-target tracking method and a corresponding video analysis system implementing the scheme are disclosed. The method comprises the following steps: acquiring detected position information of targets; acquiring predicted position information of targets; matching the acquired predicted position information with the detected position information; in the case where it is determined that the detected position information of a certain target matches its predicted position information, updating the state information of the target based on its detected position; in the case where it is determined that the predicted position of a certain target has no matching detected position, determining that the target is a disappeared target; and in the case where it is determined that the detected position of a certain target has no matching predicted position, determining that the target is a new target or a reappearing target. Therefore, through reasonable allocation of the functional modules, high-speed tracking at low computational cost can be realized while meeting high tracking-accuracy requirements.

Description

Multi-target tracking method and corresponding video analysis system
Technical Field
The invention relates to the field of image processing, in particular to a multi-target tracking method and a corresponding video analysis system.
Background
Target detection and tracking has long been an important research direction in academia and industry. For example, video monitoring systems, as an important component of smart security and smart traffic in Internet-of-Things applications for integrated urban public safety management, face great challenges in deep deployment. Moreover, object detection and tracking has tremendous utility and potential in areas such as vehicle-assisted driving, transportation, and gaming.
With the rapid development of target detection algorithms and target attribute analysis algorithms in recent years, the accuracy of target detection and attribute analysis has kept improving, but so has the required amount of computation. When these algorithms need to be deployed locally on embedded terminals such as cameras and unmanned aerial vehicles for reasons of real-time performance and security, the power consumption limits of the embedded terminals constrain the affordable computation to a much smaller order of magnitude.
Therefore, how to realize accurate multi-target tracking while meeting the power consumption limitation has become a major problem to be solved in the field of target tracking.
Disclosure of Invention
In view of at least one of the above problems, the present invention provides a multi-target tracking scheme and a corresponding video analysis system implementing the same, which can achieve high-speed tracking with low computational power and meet high tracking accuracy requirements through reasonable deployment of each functional module.
According to one aspect of the invention, a multi-target tracking method is provided, which comprises the following steps: acquiring detected position information of targets; acquiring predicted position information of targets; matching the acquired predicted position information with the detected position information; in the case where it is determined that the detected position information of a certain target matches its predicted position information, updating the state information of the target based on its detected position; in the case where it is determined that the predicted position of a certain target has no matching detected position, determining that the target is a disappeared target; and in the case where it is determined that the detected position of a certain target has no matching predicted position, determining that the target is a new target or a reappearing target.
Thus, by matching prediction and detection and subsequent processing steps, various situations (e.g., object loss, reappearance, new object, etc.) encountered in object detection are flexibly coped with, and by reasonable cooperation of the detection, tracking and analysis modules, the amount of computation required to correctly display the object (and its attributes) is reduced while ensuring video analysis accuracy.
Acquiring the detected position information of a target may include: performing object detection on the currently input image frame to obtain detected position information of at least one target; the image frames used for object detection may preferably be selected at intervals from the consecutive video image frames according to a predetermined rule. Thereby, the amount of computation required for video analysis is further reduced by relaxing the detection requirements.
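As an illustration of such interval-based frame selection, a minimal Python sketch follows; the generator interface and the fixed interval of 10 are illustrative assumptions, not details given by the patent.

```python
from typing import Iterable, Iterator, Tuple
import numpy as np

def frames_for_detection(frames: Iterable[np.ndarray],
                         interval: int = 10) -> Iterator[Tuple[int, np.ndarray, bool]]:
    """Yield (index, frame, run_detection) under a fixed-interval rule.

    Every frame is still forwarded to the tracker; only every
    `interval`-th frame is flagged for the (expensive) detector.
    """
    for i, frame in enumerate(frames):
        yield i, frame, (i % interval == 0)
```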
Obtaining the predicted position information of a target may include: deriving the predicted position of a certain target from the previous state information of the target detected in a previous frame; preferably, the target may be modeled based on its previous state information so that its position in the current frame is predicted from the state model. The prediction of each target's position in subsequent frames can thus be realized through relatively cheap model computation. Further, when the detected position information of a certain target is determined to match its predicted position information, updating the position state information of the target using the detected position includes correcting the target's state model with the detected position. A correction mechanism is thereby introduced to improve the accuracy of the prediction model.
The previous state information of a certain target may include: the position, velocity, acceleration, deformation velocity, and/or template information of the target. Modeling based on the previous state information of the target includes modeling the target using at least one of: a Kalman filter; a linear filter; a kernel correlation filter (KCF) tracker; a mean shift (MeanShift) tracker; and a continuously adaptive mean shift (CamShift) tracker, with the type of state information determined by the particular model used. Therefore, the required prediction mechanism can be flexibly selected for each application scenario.
The multi-target tracking method of the invention further comprises: storing all active targets as a target state list in which target numbers correspond to state information. Acquiring the predicted position information of the targets then includes deriving the predicted position of each active target in the target state list from its previous state information, and updating the state information of a target based on its detected position includes updating the corresponding entry of that target. Therefore, changes of target states are managed comprehensively by introducing the state list, further improving the operating efficiency of the multi-target tracking framework.
Matching the acquired predicted position information and detected position information may include: taking, as the matching criterion, the pair of predicted and detected target rectangular frames that has the highest degree of overlap, provided it exceeds a predetermined threshold. In addition, acquiring the detected position information of a target includes acquiring the position information of a detected rectangular frame surrounding the target, while acquiring the predicted position information of a target includes acquiring both the position information of a predicted rectangular frame surrounding the target and the number information of the target. When it is determined that the detected position information of a certain target matches its predicted position information, updating the position state information of the target using its detected position includes: assigning the number of the matched predicted target to the matched detected target, and updating the state information of the numbered target using the matched detection.
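As an illustration of the overlap criterion described above, the following minimal sketch computes the degree of overlap (IoU, intersection over union) of two axis-aligned rectangular frames; the (x1, y1, x2, y2) corner format is an assumption made for illustration.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```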
The multi-target tracking method of the invention may further comprise: determining, based on the matching result for a certain target, the marking state of that target in the displayed image frame. Specifically, this may include: marking the target in the currently displayed image frame when the number of consecutive matches, or the matching frequency, of the predicted and detected positions of the target exceeds a first threshold; and canceling the marking of a previously marked target in the currently displayed image frame when the number of consecutive times that the predicted position of the target has no matching detected position exceeds a second threshold. This improves the robustness of correctly displaying appearing targets and hiding disappeared targets.
In the case where it is determined that the predicted position of a certain target has no matching detected position, determining the target as a disappeared target includes: determining the target to be a disappeared target when the number of consecutive times that its predicted position has no matching detected position exceeds a third threshold. In the case where the predicted position of a certain target has no matching detected position but the target has not yet been determined to be a disappeared target, the position of the target acquired by a tracker is used as the output position of the target. The tracker may be any one of the following: a kernel correlation filter (KCF) tracker; a mean shift (MeanShift) tracker; and a continuously adaptive mean shift (CamShift) tracker.
In the case where it is determined that the predicted position of a certain target has no matching detected position, determining the target as a disappeared target includes: storing the target number and target features of the disappeared target into a disappeared-target list; and deleting the corresponding entry of the disappeared target from the target state list, where the target state list holds all active targets stored with target numbers corresponding to state information. Therefore, the scheduling efficiency of the multi-target tracking framework is further improved by combining the disappeared-target list with the state list.
In the case where it is determined that the detected position of a certain target has no matching predicted position, determining that the target is a new target or a reappearing target includes: extracting the target features of the target; comparing the extracted target features with the target features stored in the disappeared-target list; if matching target features exist in the disappeared-target list, judging the target to be a reappearing target and re-assigning it the number of the matched target features; and if no matching target features exist in the disappeared-target list, judging the target to be a new target and assigning it a new number. Further, this unmatched-detection step may also comprise: deleting the entry corresponding to the reappearing target from the disappeared-target list; and/or storing the new target as a new entry in the target state list, where the target state list holds all active targets stored with target numbers corresponding to state information.
According to another aspect of the present invention, there is provided a multi-target tracking apparatus including: a plurality of single-target trackers, each predicting the position of a certain target in the current frame based at least on the previous state information of that target; a matching unit for comparing the target predicted positions of the current frame given by the single-target trackers with the target detected positions of the current frame obtained from the outside; and a multi-target tracking unit for operating the single-target trackers based on the matching results of the matching unit, which: updates the parameters of the single-target tracker corresponding to a target based on the detected position of the target when the matching unit determines that the detected position information of the target matches its predicted position information; deletes the single-target tracker corresponding to a target when it is determined that the predicted position of the target has no matching detected position; and creates a new single-target tracker for a target when it is determined that the detected position of the target has no matching predicted position.
Preferably, the apparatus may further comprise a target re-recognizer for determining, in the case where the matching unit determines that the detected position of a certain target has no matching predicted position, whether the target has appeared before; the multi-target tracking unit reconstructs the previous single-target tracker for the target when the target re-recognizer determines that the target has appeared before. The target re-recognizer also maintains a disappeared-target list storing the numbers and features of targets determined to have disappeared, and determines whether a target has appeared before by comparing its extracted features with the target features stored in the disappeared-target list.
The multi-target tracking unit may also maintain a target state list holding all active targets stored with target numbers corresponding to state information.
The multi-target tracking apparatus of the present invention may further include a display indicating unit for determining, based on the matching result of the matching unit for a specific target, whether to output an instruction to change the marking state of that target. The display indicating unit may be further configured to: issue an instruction to mark a target in the currently displayed image frame when the matching unit determines that the number of consecutive matches, or the matching frequency, of the predicted and detected positions of the target exceeds a first threshold; and issue an instruction to cancel the marking of a previously marked target in the currently displayed image frame when the number of consecutive times that the matching unit finds no detected position matching the predicted position of the target exceeds a second threshold.
Each single-target tracker may record the number of consecutive times that the predicted position of its target has no matching detected position, and the multi-target tracking unit determines the target to be a disappeared target, and deletes the corresponding single-target tracker, when the number of unmatched predictions recorded by that tracker exceeds a third threshold.
The single-target tracker may acquire the target predicted position used for matching against the detected position with a filter model, and acquire the target predicted position used for output with a tracker model while the recorded number of unmatched predictions has not yet reached the third threshold. The filter model may be established using at least one of: a Kalman filter; and a linear filter. The tracker model may be established using at least one of: a kernel correlation filter (KCF) tracker; a mean shift (MeanShift) tracker; and a continuously adaptive mean shift (CamShift) tracker. The type of state information stored in the target state list is determined by the model used and includes at least one of: the position, velocity, acceleration, deformation velocity, and/or template information of the target.
According to another aspect of the present invention, there is also provided a video analysis system, including: a frame buffer queue for storing continuously input video image frames; a target detection module for processing consecutive image frames from the frame buffer queue to determine the detected position information of targets contained in the current frame; a target tracking module, implemented by the multi-target tracking apparatus described above, for tracking the positions of moving targets, matching the obtained tracking position information against the detected position information produced by the target detection module, and adjusting the tracking operation based on the matching results; and a target analysis module for performing feature extraction on a target when the target is determined to be a new target on the basis of a detected position that has no matching predicted position.
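As one possible reading of this system layout, a minimal processing-loop sketch is given below; the module interfaces (detect(), tracker.step(), tracker.predict_only(), analyze()) and the return of newly created targets from the tracker are illustrative assumptions, not the patent's API.

```python
from collections import deque

def run_video_analysis(frames, detect, tracker, analyze, interval: int = 10):
    """Frame buffer queue -> detection (at intervals) -> tracking -> analysis."""
    frame_queue = deque(frames)           # the frame buffer queue
    idx = 0
    while frame_queue:
        frame = frame_queue.popleft()
        if idx % interval == 0:           # detection only on selected frames
            detections = detect(frame)    # detected positions in this frame
            new_targets = tracker.step(detections)   # match / update / create
        else:
            new_targets = []
            tracker.predict_only(frame)   # keep tracking between detections
        for target in new_targets:        # feature extraction only for new targets
            analyze(frame, target)
        idx += 1
```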
The object detection module may select a currently input image frame for object detection at intervals from consecutive video image frames according to a predetermined rule.
The target detection module and the target analysis module may be implemented at least in part by a GPU, FPGA, or ASIC circuit capable of high-parallelism computation. Preferably, the two modules may share at least part of a GPU, FPGA or ASIC circuit capable of performing convolutional neural network computations.
Therefore, the multi-target tracking framework provided by the invention integrates the work of the modules of the video analysis system by introducing matching between predicted and detected positions, and by distinguishing matches, unmatched detections and unmatched predictions together with their subsequent handling, thereby reducing the computational requirements while preserving tracking accuracy. Furthermore, by introducing the target state list in combination with the disappeared-target list, the scheme also improves the resource scheduling efficiency of the system as a whole and eliminates unnecessary computation.
Drawings
The above and other objects, features and advantages of the present disclosure will become more apparent by describing in greater detail exemplary embodiments thereof with reference to the attached drawings, in which like reference numerals generally represent like parts throughout.
Fig. 1 shows the overall framework of a common real-time video structured intelligent analysis system.
FIG. 2 shows a schematic flow diagram of a multi-target tracking method according to one embodiment of the invention.
FIG. 3 shows a schematic flow diagram of determining that a predicted location matches a detected location, according to one embodiment of the invention.
FIG. 4 shows a schematic flow diagram of determining that a predicted location has no matching detected location, according to one embodiment of the invention.
FIG. 5 illustrates a schematic flow diagram of determining that a detected location has no matching predicted location, according to one embodiment of the present invention.
FIG. 6 illustrates a schematic diagram of a multi-target tracking device, according to one embodiment of the invention.
Fig. 7 shows one example of a SoC that may be used to implement the video analysis system of the present invention.
Detailed Description
Preferred embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While the preferred embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
At present, the application bottleneck of target detection and tracking lies in how to efficiently extract video information and how to perform standardized data exchange, interconnection and semantic interoperation with other information systems. To solve this problem, video structured description techniques have been proposed. Video structured description technology transforms the traditional video monitoring system into a new generation of intelligent, semantic, information-rich video monitoring systems.
Video structured description is a technology for extracting video content information: it adopts processing means such as spatio-temporal segmentation, feature extraction and object identification to organize, according to semantic relations, textual information that can be understood by both computers and people. Fig. 1 shows the overall framework of a common real-time video structured intelligent analysis system.
As shown in fig. 1, the real-time video structured intelligent analysis system 20 collects a data stream from a data source 10, which may be a real-time input to a camera or a stored video file. The local system 20 performs structured analysis of the collected data stream and stores the corresponding analysis results in a local or remote database 30.
The real-time video structured intelligent analysis system 20 may include a video codec module 21, a frame buffer 22, and a video analysis module 23. Video codec module 21 encodes or decodes the data stream from data source 10 into specified format frame data. Frame buffer 22 buffers the video frame data for use by video analysis module 23.
The video analysis module 23 may be broadly divided into a target detection module, a target tracking module, and a target recognition and analysis module. The target detection module performs target detection on the input video stream by using a deep learning algorithm, and extracts information such as the position and the category of a target to be analyzed from the frame image. The target tracking module tracks and deduplicates the target output by the target detection module by utilizing a deep learning or traditional algorithm, so that the repeated operation of the target analysis module is avoided, the analysis quality is improved, and the analysis calculation amount is reduced. And the target recognition and analysis module extracts a target sub-image from the frame image according to the output result of the target detection module and analyzes each target by utilizing a deep learning algorithm. The specific analysis content may vary according to different application scenarios, and the common analysis content includes target identification comparison, target attribute analysis, and the like.
In recent years, with the rapid development of target detection and target attribute analysis algorithms, the accuracy of target detection and attribute analysis has kept improving, but so has the required amount of computation. When these algorithms need to be deployed locally on embedded terminals such as cameras and unmanned aerial vehicles for reasons of real-time performance and security, the power consumption limits of the embedded terminals constrain the affordable computation to a much smaller order of magnitude. Therefore, a multi-target tracking algorithm needs to be introduced so that, after a detection pass gives the target positions in a certain frame, the tracking algorithm gives the position of each target over the following frames; meanwhile, target attribute analysis analyzes a given target only once over several consecutive frames to obtain its attributes. This reduces the number of target detection and attribute analysis computations by roughly an order of magnitude, allowing a sufficiently accurate algorithm to be deployed under the limited computing power of an embedded terminal.
In addition, because current detection algorithms are mainly trained on still pictures, the detected target frame tends to jitter across adjacent frames in video scenes and the stability of the detection algorithm is poor; a multi-target tracking algorithm therefore also needs to be introduced to reduce the jitter of the target frame and improve stability.
However, in existing video analysis systems the multi-target tracking algorithm needs a target detection result for every frame and needs to extract target features relatively frequently. These requirements are computationally expensive and hard to sustain on embedded systems. To solve these problems, the invention provides a high-speed multi-target tracking framework which, by reasonably allocating the functional modules, achieves high-speed tracking at low computational cost while meeting higher tracking accuracy requirements.
FIG. 2 shows a schematic flow diagram of a multi-target tracking method according to one embodiment of the invention. It should be noted that the sequence numbers of the respective steps in the following methods are merely used as a representation of the steps for the convenience of description, and should not be construed as representing the execution order of the respective steps. The method need not be performed in the exact order shown, unless explicitly stated; similarly, blocks may be performed in parallel, rather than sequentially. It should also be understood that the method may be implemented on a variety of devices as well.
As shown in FIG. 2, a multi-target tracking method 200 according to one embodiment of the invention may include the following steps. The method 200 may be performed, for example, by the video analysis module 23 in the video analysis system 20 shown in fig. 1, and more specifically by a target tracking module within the video analysis module 23. It should be understood that an existing video analysis system 20 in which the multi-target tracking framework of the present invention is deployed thereby also constitutes a new video analysis system.
In step S210, detected position information of the target is acquired. In step S220, predicted position information of the target is acquired. It should be appreciated that steps S210 and S220 described above may be performed in any relative order, or may be performed simultaneously. Subsequently, in step S230, the acquired predicted position information and the detected position information are matched.
The method 200 takes different processing actions for different matching results. In the case where it is determined that the detected position information of a certain target matches its predicted position information, the position state information of the target is updated using the detected position in step S240. If it is determined that the predicted position of a certain target has no matching detected position, the target is determined to be a disappeared target in step S250. In the case where it is determined that the detected position of a certain target has no matching predicted position, it is determined in step S260 that the target is a new target or a reappearing target.
Thus, by matching prediction and detection and subsequent processing steps, various situations (e.g., object loss, reappearance, new object, etc.) encountered in object detection are flexibly coped with, and by reasonable cooperation of the detection, tracking and analysis modules, the amount of computation required to correctly display the object (and its attributes) is reduced while ensuring video analysis accuracy.
Step S210 may include performing object detection on the currently input image frame to acquire detected position information of at least one target. Here, object detection may be performed on every continuously input image frame, or frames may be extracted from the continuous video input at, for example, a predetermined or variable interval. For example, the object detection module in the system of fig. 1 may read image frames at predetermined intervals from the frame buffer 22 and process them to determine the targets contained in the current image frame and their detected position information. Compared with the prior art, which requires detection on every frame, such interval detection greatly reduces the amount of computation required for video analysis. In one embodiment, the object detection module may extract the detected position information of a plurality of targets from the processed image frame; the extracted detected position information may preferably be a set of rectangular frames surrounding the respective targets.
The target prediction in step S220 may be performed for each active target. Here, an "active target" refers to a target that was detected in a previous image frame and has not yet disappeared (disappearing meaning being determined to no longer be present in the image frames). In one embodiment, step S220 may include deriving the predicted position information of a certain target based on the previous state information of the target detected in a previous frame. Here, "previous frame" may be read flexibly depending on the specific implementation. When the prediction is made based only on the latest state, the "previous frame" may be the last frame before the current frame in which the target was detected. When the prediction needs to be made based on the historical states of the target, "previous frame" may generally refer to several frames before the current frame in which the target was detected. When a target first appears (as will be described in detail below in connection with step S260), feature extraction may be performed on the target. For example, the target analysis module shown in fig. 1 may analyze the sub-image in the new target's rectangular box using a deep learning algorithm to extract the required features and assign a number to the target. The initial features and number of the new target may then be provided to the target prediction module for predicting the position of the target based on its initial state information.
Preferably, the target can be modeled based on its previous state information so that its position in the current frame is predicted from its state model. After new detection information is obtained (e.g., after the predicted position and the detected position are determined to match in step S240), the state model may be corrected using the new detection information to give a smoother motion trajectory.
In practice, one or more models (and their required state information) may be selected from various existing models to model the change in position of a target in a tracking manner, depending on the particular application.
In one embodiment, a Kalman filter may be used to predict and update the position of a target in subsequent frames using its motion state information. When a Kalman filter is used for prediction, the position of a target in the current frame is predicted by modeling the historical motion trajectory of the target; after new detection information is obtained, the motion model of the target is corrected to give a smoother trajectory. In one embodiment, in order to obtain input meeting the requirements of the Kalman filter, the detected rectangular frame is processed into four parameters: the horizontal and vertical coordinates of its center point, its area, and its aspect ratio. The center coordinates and the area are assumed to move at uniform velocity and the aspect ratio to follow uniform linear motion, which yields a 7 x 4-dimensional driving matrix; the other basic parameters are then determined to obtain a filter usable in actual prediction. In another embodiment, the prediction may also be performed using a linear filter. Its input may likewise be the center coordinates, area and aspect ratio of the rectangular frame, but the filter parameters are given by least squares over the historical trajectory. When a filter is used for target motion trajectory prediction, the required previous state information may include target position information (for example, the above four parameters may be obtained from the coordinates of two opposite corners of a rectangular frame), target velocity, acceleration, deformation velocity, and so on. This state information may be acquired from the detected position information of the target and, optionally, from the extracted target feature information.
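The following minimal numpy sketch illustrates a Kalman filter over such a state; the 7-dimensional state layout [cx, cy, s, r, vcx, vcy, vs] (center coordinates, area, aspect ratio, and velocities of the first three) and all noise values are illustrative assumptions rather than parameters given by the patent.

```python
import numpy as np

class BoxKalmanFilter:
    """Constant-velocity Kalman filter over [cx, cy, s, r, vcx, vcy, vs]."""

    def __init__(self, box):
        cx, cy, s, r = box
        self.x = np.array([cx, cy, s, r, 0.0, 0.0, 0.0])  # state vector
        self.P = np.eye(7) * 10.0                          # state covariance
        self.F = np.eye(7)                                 # transition: pos += vel
        self.F[0, 4] = self.F[1, 5] = self.F[2, 6] = 1.0
        self.H = np.zeros((4, 7))                          # measure (cx, cy, s, r)
        self.H[:4, :4] = np.eye(4)
        self.Q = np.eye(7) * 0.01                          # process noise (assumed)
        self.R = np.eye(4)                                 # measurement noise (assumed)

    def predict(self):
        """Predict the target position in the current frame."""
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:4]

    def update(self, box):
        """Correct the motion model with a matched detected position."""
        y = np.asarray(box, dtype=float) - self.H @ self.x  # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)            # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(7) - K @ self.H) @ self.P
```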
In another embodiment, the position of the target may be predicted and updated using appearance information of the target via KCF (kernel correlation filtering). KCF is a template-matching type of tracking method. After an initial position is given (e.g., the position of a target within the frame in which it was determined to be newly detected), the KCF tracker extracts the HOG (histogram of oriented gradients) features of the target as its template information. In subsequent frames, features are extracted by a sliding-window method near the position where the target appeared in the previous frame, their similarity to the template information is computed, and the position with the greatest similarity is taken as the target's new position. After the position in the current frame is determined, the template information is updated using the new position to acquire new template information for the target. Therefore, when the KCF tracker is used to predict the target position, the required previous state information may include the previous template information of the target; as before, this state information may be acquired from the detected position information of the target and optionally from the extracted target feature information. In other embodiments, the target position may also be predicted using the mean shift (MeanShift) and continuously adaptive mean shift (CamShift) trackers, which likewise use previous template information for prediction.
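As a highly simplified illustration of the sliding-window template matching described above (using raw-pixel normalized cross-correlation rather than HOG features or the actual kernelized KCF formulation), a sketch might look as follows; the grayscale frame, search radius, and window handling are all illustrative assumptions.

```python
import numpy as np

def match_template(frame: np.ndarray, template: np.ndarray,
                   prev_xy: tuple, radius: int = 16) -> tuple:
    """Slide `template` around `prev_xy` in a grayscale frame; return best (x, y)."""
    th, tw = template.shape
    t = (template - template.mean()) / (template.std() + 1e-8)
    best_score, best_xy = -np.inf, prev_xy
    x0, y0 = prev_xy
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            x, y = x0 + dx, y0 + dy
            if x < 0 or y < 0:
                continue                      # window outside the frame
            patch = frame[y:y + th, x:x + tw]
            if patch.shape != template.shape:
                continue                      # window outside the frame
            p = (patch - patch.mean()) / (patch.std() + 1e-8)
            score = float((p * t).mean())     # normalized cross-correlation
            if score > best_score:
                best_score, best_xy = score, (x, y)
    return best_xy
```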
It should be understood that the target detected position information in the present invention is the target position information obtained by running the convolutional-neural-network detection computation on the current frame, for example a rectangular frame surrounding each target in the current frame, whereas target position prediction in the present invention derives the target's position in the current frame from the target state information of previous frames. Depending on the model the prediction relies on, such prediction may or may not require computational processing of the predicted frame itself in addition to the state information of the previous frame. For example, when a filter (e.g., a Kalman filter) is used, the predicted position of the target is obtained directly from the predicted motion trajectory, and the motion model is continuously corrected using the matched detected positions; using a Kalman filter therefore requires no additional processing of the frame in which the target is predicted. By contrast, when a tracker is used (e.g., a KCF, MeanShift, or CamShift tracker), besides the target template based on the previous frame's state information, additional processing (e.g., sliding-window processing) of the predicted frame itself is required to determine the position of the target in it. In embodiments where the computation budget must be considered, it is therefore preferable to acquire the predicted position of the target using a filter (e.g., a Kalman filter) that predicts purely from the model without computing over the current frame. In a preferred embodiment, different prediction models may also be employed for different prediction scenarios: as described below, a cheaper filter model is used for the prediction that is matched against detected positions and requires no processing of the current frame, while a costlier tracker model is used for the prediction that must still be output when no match is found and the current frame has to be analyzed.
In general, the current frame has a plurality of predicted targets, predicted from previous information, and a plurality of detected targets, detected from the current frame itself. Thus, the matching of step S230 may involve matching between multiple predicted targets and multiple detected targets. In a specific application, the matching determination in step S230 may be made based on the IoU (intersection over union). In one embodiment, step S230 may take as the matching criterion the pair of predicted and detected target rectangular frames that has the highest degree of overlap and exceeds a predetermined threshold. For example, the rectangular frames detected in the current frame may be collected into a sequence (empty if the current frame has no detection result); meanwhile, a group of single-target trackers is maintained for the active targets, each giving a predicted frame for the current frame; the two groups of target frames are then IoU-matched. Specifically, the IoU may be computed between every pair of rectangular frames across the two sequences, the pair with the largest IoU exceeding a certain threshold is selected as a match, the two frames are deleted from their sequences, and the process repeats.
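A minimal sketch of this greedy matching loop, reusing the iou() helper sketched earlier; the 0.3 threshold and the dictionary/list interfaces are illustrative assumptions.

```python
def greedy_iou_match(pred_boxes: dict, det_boxes: list, thresh: float = 0.3):
    """Greedily pair predicted boxes (keyed by target number) with detections.

    Returns (matches, unmatched_pred_ids, unmatched_det_indices), mirroring
    the three cases handled in steps S240, S250 and S260.
    """
    preds = dict(pred_boxes)              # target number -> predicted box
    dets = dict(enumerate(det_boxes))     # detection index -> detected box
    matches = []
    while preds and dets:
        score, tid, di = max(((iou(p, d), t, i)
                              for t, p in preds.items()
                              for i, d in dets.items()),
                             key=lambda item: item[0])
        if score < thresh:
            break                         # no remaining pair overlaps enough
        matches.append((tid, di))
        del preds[tid], dets[di]
    return matches, list(preds), list(dets)
```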
Since detection is made only on the current frame, the acquired detected position information of a target may include only the position information of the detected rectangular frame surrounding the target. Prediction, by contrast, is made at least from the previous state information of an active target, so acquiring the predicted position information of a target includes acquiring not only the position information of the predicted rectangular frame surrounding the target but also the number information of the target. In other words, a predicted target is a target that has been previously detected and numbered. Thus, when it is determined that the detected position information of a certain target matches its predicted position information, the number of the matching predicted target is assigned to the matching detected target, and the state information of the numbered target, for example the parameters of the Kalman filter or the target template information of the KCF tracker used to predict the target, can be updated using the matched detection.
In an extreme case, the number of detected target positions acquired in step S210 may be zero: when there is no target within the current frame, or no target is detected using the SSD or YOLO model, step S210 cannot acquire any detected target (or its position information) from the current frame. Likewise, the number of predicted target positions acquired in step S220 may be zero: when there was no target in the previous frame (i.e., the number of active targets is zero), step S220 cannot acquire the position information of any predicted target. When no target position is acquired in either step S210 or step S220, no subsequent operation is performed and the method jumps to processing the next frame. When no detected target is acquired in step S210 but one or more predicted target positions are acquired in step S220, the unmatched-prediction step S250 may be performed for each predicted target position. When no predicted target is obtained in step S220 but one or more detected target positions are obtained in step S210, the unmatched-detection step S260 may be performed for each detected target.
Targets are usually in motion, so poor shooting angles, occlusion and similar problems can cause false detections and missed detections. Here, a detector "missed detection" means that the detector fails to detect a target actually present in the current frame, while a "false detection" means that the detector detects something else as a target. To reduce the influence of false and missed detections on the tracking scheme, the multi-target tracking scheme of the invention can improve its robustness by jointly considering the matching results of several consecutive frames.
In one embodiment, the multi-target tracking method of the invention further includes determining, based on the matching result for a certain target, the marking state of that target in the displayed image frame. In one embodiment, rectangular boxes surrounding matched targets may be displayed on the output monitor, for example in real time, or the rectangular boxes of unmatched targets may be removed from the display. As described above, to reduce the influence of detector false detections on correct marking, a target may be marked in the currently displayed image frame only when the number of consecutive matches, or the matching frequency, of its predicted and detected positions exceeds a first threshold. For example, the marker for a target, such as a rectangular box surrounding it, may not be displayed on the monitor until the detected and predicted positions of the target have matched three consecutive times; alternatively, the rectangular frame may be displayed after the detected and predicted positions have matched more than four times within six consecutive matching determinations. Likewise, to reduce the effect of detector missed detections on correct marking, a previously marked target may be unmarked in the currently displayed image frame when the number of consecutive times that its predicted position has no matching detected position exceeds a second threshold. For a target rectangular frame already displayed on the monitor, its display is cancelled only when, for example, no detected position has matched the predicted position three times in a row.
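A minimal sketch of the threshold logic above; the counter fields and the thresholds of 3 are illustrative assumptions (the patent only requires configurable first and second thresholds).

```python
from dataclasses import dataclass

@dataclass
class DisplayState:
    hits: int = 0       # consecutive prediction/detection matches
    misses: int = 0     # consecutive unmatched predictions
    shown: bool = False

    def on_match(self, first_threshold: int = 3) -> None:
        self.hits += 1
        self.misses = 0
        if self.hits >= first_threshold:
            self.shown = True       # issue "mark target" instruction

    def on_miss(self, second_threshold: int = 3) -> None:
        self.misses += 1
        self.hits = 0
        if self.misses >= second_threshold:
            self.shown = False      # issue "cancel marking" instruction
```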
Also to mitigate detector missed detections, in step S250 a target may be determined to be a disappeared target when the number of consecutive times that its predicted position has no matching detected position exceeds a third threshold, which may preferably be greater than the second threshold. For example, for a target rectangular frame already displayed in the monitor, if no detected position matches the predicted position three consecutive times (i.e., the second threshold is 3), the display of that frame is cancelled; if the next two predictions still have no matching detected position (i.e., the third threshold is 5), the target behind that prediction is determined to be a disappeared target, i.e., the numbered target is no longer considered an active target.
Here, an unmatched detection does not directly equal a determination that the target has disappeared; disappearance is only determined when the number of unmatched predictions reaches a certain threshold. This makes it possible to distinguish the routine prediction that is matched against detected positions from the prediction that must still be output while a target is unmatched. For example, the Kalman filter and the linear filter require no additional processing of the frame in which the target is predicted and have a small prediction cost, so they can be used for the routine prediction matched against detections. By contrast, a tracker (e.g., a KCF, MeanShift, or CamShift tracker) requires, besides the target template based on the previous frame's state information, additional processing (e.g., sliding-window processing) of the predicted frame itself to determine the target's position in it. Prediction with a tracker (in a sense itself a detection on the current frame) clearly costs more computation, but it is also more accurate. Thus, when the predicted position lacks correction from a detected position but still needs to be output, the more accurate tracker model that actually processes the current frame can be used.
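A minimal sketch of this two-model policy; the class layout and method names are illustrative assumptions, with the BoxKalmanFilter and match_template helpers sketched earlier standing in for the filter and tracker models.

```python
class SingleTargetPredictor:
    """Cheap filter prediction for matching; costlier tracker prediction for output."""

    def __init__(self, kalman, template, third_threshold: int = 5):
        self.kalman = kalman              # e.g. a BoxKalmanFilter instance
        self.template = template          # appearance template for the tracker model
        self.unmatched = 0                # consecutive unmatched predictions
        self.third_threshold = third_threshold

    def predict_for_matching(self):
        return self.kalman.predict()      # no processing of the frame itself

    def output_position(self, frame, prev_xy):
        if self.unmatched == 0:
            return self.kalman.x[:2]      # matched: the filter state suffices
        # unmatched but not yet disappeared: fall back to the tracker model,
        # which does process the current frame
        return match_template(frame, self.template, prev_xy)

    def is_disappeared(self) -> bool:
        return self.unmatched >= self.third_threshold
```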
To improve the efficiency of the multi-target tracking scheme, the active targets can be managed in a unified way. Thus, in one embodiment, the multi-target tracking method of the invention further comprises storing all active targets as a target state list in which target numbers correspond to state information, which greatly facilitates the management of multiple targets.
Thus, in step S220, obtaining the predicted position information of the targets may include reading the previous state information of each active target in the target state list to obtain its predicted position information. In step S240, when the predicted position of a certain target matches a detected position in the current frame, the detected position may be used to update the state information under that target's numbered entry. In step S250, when the predicted position of a certain target has no matching detected position in the current frame, the target may be determined to be a disappeared target and the corresponding entry removed from the target state list. Under the more robust disappearance criterion described above, when the predicted position of a target has no matching detected position in the current frame, the number of unmatched predictions may be recorded in the target state list, and only when it reaches the third threshold is the target determined to be disappeared and its entry removed. Similarly, in step S260, if a detected target is determined to be a new target, a new number may be assigned to it and stored as a new entry in the target state list.
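A minimal sketch of such a target state list as a dictionary keyed by target number; the entry fields are illustrative assumptions.

```python
target_state_list = {}      # target number -> state entry

def add_target(tid, box, features):
    target_state_list[tid] = {
        "box": box,             # last known position
        "features": features,   # appearance features for re-identification
        "unmatched": 0,         # consecutive unmatched predictions
        "shown": False,         # current marking state on the monitor
    }

def update_on_match(tid, detected_box):
    entry = target_state_list[tid]
    entry["box"] = detected_box
    entry["unmatched"] = 0

def update_on_miss(tid, third_threshold=5):
    """Return True when the target should be declared disappeared."""
    entry = target_state_list[tid]
    entry["unmatched"] += 1
    if entry["unmatched"] >= third_threshold:
        del target_state_list[tid]      # entry moves to the disappeared-target list
        return True
    return False
```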
The target state list of the invention may also be used together with a disappeared-target list, further enhancing the overall efficiency of the scheme. The disappeared-target list stores targets determined to have disappeared, thereby optimizing the handling of reappearing targets. Step S250 may then include storing the target number and target features of the disappeared target into the disappeared-target list, and deleting the corresponding entry from the target state list. In step S260, the disappeared-target list may be used to determine whether an unmatched detection is a reappearing target or a new target. Thus, step S260 may include: extracting the target features of a target (for example, using the target analysis module shown in fig. 1) when its detected position is determined to have no matching predicted position; comparing the extracted features with the target features stored in the disappeared-target list; if matching target features exist in the disappeared-target list, judging the target to be a reappearing target and re-assigning it the number of the matched features; and if no matching features exist, judging the target to be a new target and assigning it a new number. Correspondingly, the entry corresponding to a reappearing target is deleted from the disappeared-target list, and/or an entry for a new target is created in the target state list.
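A minimal sketch of this re-identification step, using cosine similarity between feature vectors with a 0.8 threshold; both are illustrative assumptions (the patent itself mentions a metric function constructed by deep learning).

```python
import numpy as np

disappeared_list = {}       # target number -> stored feature vector

def reidentify(features: np.ndarray, thresh: float = 0.8):
    """Return the number of a matching disappeared target, or None for a new target."""
    best_tid, best_score = None, thresh
    f = features / (np.linalg.norm(features) + 1e-8)
    for tid, stored in disappeared_list.items():
        s = stored / (np.linalg.norm(stored) + 1e-8)
        score = float(f @ s)            # cosine similarity
        if score > best_score:
            best_tid, best_score = tid, score
    if best_tid is not None:
        del disappeared_list[best_tid]  # the target is active again
    return best_tid
```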
The multi-target tracking method of the present invention and the preferred embodiment thereof have been described above in connection with fig. 2. To further facilitate understanding, a specific example of operation under different match determinations is given below in conjunction with fig. 3-5.
FIG. 3 shows a schematic flow diagram of determining that a predicted location matches a detected location, according to one embodiment of the invention. As shown in fig. 3, step S240 of the multi-target tracking method of the invention may include the following sub-steps. In sub-step S341, the state information in the target's corresponding entry of the target state list is updated using the position given by the detector. In sub-step S342, it is determined whether the target meets the display condition, e.g., whether the number of consecutive matches has reached the first threshold. In sub-step S343, if the display condition is satisfied, a display instruction is output to the outside (e.g., a monitor). The display instruction may include the position information of the rectangular box surrounding the target and the target number; where the number needs to be displayed, the instruction may cause it to be shown, for example, at the upper-left corner of the rectangular frame. In sub-step S344, if the display condition is not satisfied, no display instruction is output; if the target is not yet displayed, it remains hidden. In another embodiment, sub-step S342 may also determine whether the target is already displayed and keep it displayed if so. Subsequently, in sub-step S345, if the target state list includes a display-status indication for the target, the target entry may be updated according to whether the target is displayed. If the update is only made when the display status changes, it is located directly after the display branch, i.e., after sub-step S343, as shown at sub-step S345 in fig. 3; if hiding also involves updating the list, sub-step S345 can instead be located on the trunk after the two branches converge.
FIG. 4 shows a schematic flow diagram of determining that a predicted location has no matching detected location, according to one embodiment of the invention. As shown in fig. 4, step S250 of the multi-target tracking method of the invention may include the following sub-steps. Since the current-frame position predicted from the previous state information of a certain target cannot be matched with any target position given by the detector, the target can be considered to have no real position given by the detector in the current frame. Thus, in sub-step S451, the target's corresponding entry in the target state list is updated, e.g., the number of unmatched predictions is incremented. Subsequently, in sub-step S452, it is determined whether the target satisfies the hiding condition, e.g., whether the consecutive non-matches reach the second threshold. In sub-step S453, if the hiding condition is satisfied, a hiding instruction is output to the outside (e.g., a monitor). The hiding instruction may include the target's number so that, for example, the monitor hides the corresponding displayed target. In sub-step S454, if the hiding condition is not satisfied, no hiding instruction is output; if the target has not been hidden, it remains displayed. In another embodiment, sub-step S452 may also determine whether the target is already hidden and keep it hidden if so. Subsequently, in sub-step S455, it is determined whether the target meets the disappearance condition, e.g., whether the consecutive non-matches reach the third threshold. In sub-step S456, if the disappearance condition is satisfied, the disappeared-target list and the target state list are updated: the target's number and information are added to the disappeared-target list, and the target entry is deleted from the target state list. In one embodiment, after the disappearance condition is determined to be satisfied, the target's features may additionally be computed and stored as those of the newly disappeared target. In sub-step S457, if the disappearance condition is not satisfied, the target entry may be updated with the number of non-matches, provided the target state list includes a to-be-disappeared status indication.
FIG. 5 illustrates a schematic flow diagram of determining that a detected location has no matching predicted location, according to one embodiment of the present invention. As shown in fig. 5, step S260 of the multi-target tracking method of the invention may include the following sub-steps. Since a certain target position detected by the detector cannot be matched with any position predicted from the stored state information, the target given by the detector can be considered new relative to the previous frame; it then needs to be further determined whether the target has appeared before. In sub-step S561, the features of the target are computed; for example, the target sub-image is subjected to feature extraction using the target analysis module shown in fig. 1. Subsequently, a feature determination is made in sub-step S562: the features are compared with each of the target features stored in the disappeared-target list to determine whether they are the same. In sub-step S563, if the features match target features in the disappeared-target list, the detected target is assigned the number of the matching features from that list. Subsequently, in sub-step S565, the corresponding entry in the disappeared-target list is deleted. If the features match none of the target features in the disappeared-target list, the detected target is identified as a new target and assigned a new number in sub-step S564. Subsequently, whether it is a reappearing or a new target, the target state list is updated in sub-step S566, i.e., the target is added as a new entry to the target state list. In a specific application, a deep convolutional network can be used to extract the features of the target sub-image, and a metric function constructed by deep learning can judge whether two target features are close: when the score exceeds a certain threshold, the target is considered the reappearance of a previously seen target.
It should be appreciated that the multi-target tracking method described above in connection with FIGS. 2-5 may be performed by a target tracking module in a video analysis system, with its state updated for every frame. While the target detection module detects targets in the continuously input video image frames, the target tracking module tracks each target frame by frame and performs the matching determination described above to determine the state of each target.
More preferably, the multi-target tracking method of the invention is particularly suitable for situations where computing power is limited. For example, when the target detection module cannot perform target detection on every input image frame due to limitations on computing power and power consumption, it may instead detect, for example, one frame out of every ten input frames (a fixed interval), or detect only key frames meeting certain requirements (a variable interval), while the target tracking module of the invention keeps tracking each target continuously and performs the matching and subsequent operations described above on each frame that has a detection update. Thus, while a target is in the display state, for example while a corresponding rectangular frame is displayed in the monitor, the continuous display of the rectangular frame is maintained using the prediction result on frames without a detection result, and is updated using the detection result on frames with a detection result (provided the display condition is satisfied).
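A minimal sketch of such an interleaved loop follows; the tracker, detector, and display interfaces and the fixed ten-frame interval are illustrative assumptions:

```python
DETECT_INTERVAL = 10  # illustrative fixed interval: detect on every tenth frame

def run(frames, tracker, detector, display):
    """Track on every frame; detect and re-match only on sampled frames."""
    for frame_index, frame in enumerate(frames):
        boxes = tracker.predict(frame)                    # prediction on every frame
        if frame_index % DETECT_INTERVAL == 0:
            detections = detector(frame)                  # detection on sampled frames only
            boxes = tracker.match_and_update(detections)  # matching + state updates
        display(frame, boxes)  # predictions keep the rectangular frames on screen
```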
In one embodiment, the multi-target tracking scheme of the present invention can also be implemented as a multi-target tracking device, which may be a new target tracking module in a video analysis platform that executes the multi-target tracking method, or a relatively independent target tracking device. The multi-target tracking device may be realized as a software functional module, realized on dedicated hardware circuitry, or realized as part of a dedicated SoC for intelligent video analysis.
FIG. 6 illustrates a schematic diagram of a multi-target tracking device, according to one embodiment of the invention. As shown in fig. 6, the multi-target tracking apparatus 600 may include a plurality of single target trackers 610, a matching unit 620, and a multi-target tracking unit 630.
Each single target tracker 610 is configured to perform position tracking on a current active target, which predicts a position of the target in a current frame based on at least previous state information of the target.
The matching unit 620 is configured to compare the predicted position of each target in the current frame, given by the corresponding single-target tracker 610, with the detected target positions of the current frame obtained from the outside (e.g., from the target detection module). The multi-target tracking unit 630 may then operate all of the single-target trackers 610 based on the matching results of the matching unit to achieve efficient and accurate tracking of multiple active targets. Specifically, in the case where the matching unit 620 determines that the detected position information of a certain target matches its predicted position information, the multi-target tracking unit 630 may update the parameters of the single-target tracker 610 corresponding to that target based on the detected position. In the case where the matching unit 620 determines that the predicted position of a certain target has no matching detected position, the multi-target tracking unit 630 may delete the single-target tracker 610 corresponding to that target (e.g., once the disappearance condition described above is satisfied). In the case where the matching unit 620 determines that a detected position has no matching predicted position, the multi-target tracking unit 630 may create a new single-target tracker 610 for that target.
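Although the disclosure does not fix a particular matching algorithm, one common realization, shown below purely as an assumed sketch, scores every predicted/detected box pair by intersection-over-union (IoU) and solves the assignment with the Hungarian method; the threshold value and function names are illustrative:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

IOU_THRESHOLD = 0.3  # illustrative minimum overlap for a valid match

def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-12)

def match(predicted, detected):
    """Return (matched pairs, unmatched prediction idxs, unmatched detection idxs)."""
    if not predicted or not detected:
        return [], list(range(len(predicted))), list(range(len(detected)))
    cost = np.array([[1.0 - iou(p, d) for d in detected] for p in predicted])
    rows, cols = linear_sum_assignment(cost)  # Hungarian assignment
    pairs = [(r, c) for r, c in zip(rows, cols) if 1.0 - cost[r, c] >= IOU_THRESHOLD]
    matched_p = {r for r, _ in pairs}
    matched_d = {c for _, c in pairs}
    return (pairs,
            [i for i in range(len(predicted)) if i not in matched_p],
            [j for j in range(len(detected)) if j not in matched_d])
```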
Similarly, the multi-target tracking unit 630 may maintain a target state list in which all active targets are stored as correspondences between target numbers and state information. Based on the matching results, the multi-target tracking unit 630 may update corresponding entries of the target state list, create new entries, or delete the entries of disappeared targets, thereby facilitating management of all the single-target trackers 610.
The single-target tracker may be built on the state information of its corresponding numbered target using at least one of the following models: a Kalman filter; a kernelized correlation filter (KCF) tracker; a mean shift (MeanShift) tracker; and a continuously adaptive mean shift (CamShift) tracker. The type of state information stored in the target state list is determined by the model used and includes at least one of: position, velocity, acceleration, deformation velocity, and/or template information of the target. As described above, when modeling with a Kalman filter, prediction can be achieved from the horizontal and vertical coordinates of the center point of the rectangular frame, the area and aspect ratio of the rectangular frame, and their corresponding velocities, without processing the content of subsequent frames. When MeanShift tracking is used, each subsequent frame needs to be processed to update the template information of the corresponding target.
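As a non-limiting sketch of the Kalman case just described, the state may be laid out as follows; the seven-dimensional layout mirrors the quantities named above (center coordinates, area, aspect ratio, and their velocities), while the noise parameter q and the choice to hold the aspect ratio constant are illustrative assumptions:

```python
import numpy as np

def make_kalman_matrices():
    """Matrices for a constant-velocity model over x = [cx, cy, s, r, vx, vy, vs]:
    box center (cx, cy), box area s, aspect ratio r, and their velocities
    (the aspect ratio is assumed constant between frames)."""
    F = np.eye(7)            # state transition: each position integrates its velocity
    for i in range(3):
        F[i, i + 4] = 1.0
    H = np.zeros((4, 7))     # measurement model: the detector observes [cx, cy, s, r]
    H[:4, :4] = np.eye(4)
    return F, H

def kalman_predict(x, P, F, q=1e-2):
    """One prediction step; note that no image data of the new frame is touched."""
    x = F @ x
    P = F @ P @ F.T + q * np.eye(7)
    return x, P
```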
In one embodiment, the multi-target tracking device 600 may further include a target re-identifier (not shown) to assist the multi-target tracking unit 630 in further handling unmatched detected targets. Specifically, when the matching unit 620 determines that the detected position of a certain target has no matching predicted position, the target re-identifier may determine whether that target has appeared before, and if the target re-identifier determines that it has, the multi-target tracking unit 630 may reconstruct the previous single-target tracker for the target. Specifically, when the matching unit 620 determines that a detected target is unmatched, the detected target may be sent to an external target analysis module for feature extraction. Accordingly, the target re-identifier may maintain a disappeared-target list storing the numbers and features of targets determined to have disappeared. The target re-identifier may thus compare the extracted feature of the unmatched detected target with the target features in the disappeared-target list to determine whether the target is a reappearing target.
As described above, one single-target tracker is maintained for each active target. Besides performing position prediction using, for example, a Kalman filter or a KCF tracker, this single-target tracker also needs to maintain the correspondence between the target number and the target box. In order to improve the robustness of the system, mitigate false alarms and missed detections of the detection algorithm on certain frames, and improve the tracking effect, the single-target tracker can adopt a Schmitt-trigger-like hysteresis: a target is considered lost only after it has been missing for several consecutive frames, and considered present only after it has appeared in several consecutive frames. The single-target tracker thus organically combines the filter, the tracker, and the target number, while exposing corresponding interfaces to the upper multi-target tracking framework, so that the whole multi-target tracking model can be conveniently realized.
For this reason, the multi-target tracking unit 630 does not establish or delete a single-target tracker on the basis of a single unmatched result, but performs the corresponding action only when a predetermined count is reached. In one embodiment, the single-target tracker 610 may record the number of consecutive frames in which the predicted position of its corresponding target fails to match any detected position, and the multi-target tracking unit 630 may determine that the target has disappeared, and delete the corresponding single-target tracker, once that count exceeds a third threshold. In this case, the single-target tracker 610 may include a filter for producing the predicted position that is matched against the detected positions, and a tracker for outputting the predicted position on frames without detection results. In one embodiment, the filter may be a computationally light Kalman filter or linear filter that does not require processing of the current frame itself. The tracker may then be any one of the following: a kernelized correlation filter (KCF) tracker; a mean shift (MeanShift) tracker; and a continuously adaptive mean shift (CamShift) tracker. Such a tracker does process the current frame (though without the neural network computation of target detection), and therefore enables a more accurate position prediction, meeting the accuracy required, for example, for outputting a rectangular frame.
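Purely as an illustrative sketch of such a composition (the class layout, the method names, and the injected kalman/kcf objects with their predict/correct/update interfaces are assumptions, not the disclosed interface), a single-target tracker might look like:

```python
class SingleTargetTracker:
    """Binds a target number to a cheap filter plus a frame-based tracker."""

    def __init__(self, target_id, kalman, kcf, vanish_threshold=10):
        self.target_id = target_id
        self.kalman = kalman                      # e.g. Kalman filter: no frame access
        self.kcf = kcf                            # e.g. KCF tracker: processes frames
        self.vanish_threshold = vanish_threshold  # the "third threshold"
        self.consecutive_misses = 0

    def predict_for_matching(self):
        """Cheap predicted position, compared against detections when available."""
        return self.kalman.predict()

    def predict_for_display(self, frame):
        """Frame-based predicted position, output on frames without detections."""
        return self.kcf.update(frame)

    def on_matched(self, detected_box):
        """Reset the miss counter and correct the filter with the detected box."""
        self.consecutive_misses = 0
        self.kalman.correct(detected_box)

    def on_unmatched(self):
        """Count a miss; returns True once the tracker should be deleted."""
        self.consecutive_misses += 1
        return self.consecutive_misses >= self.vanish_threshold
```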
In addition, in one embodiment, the multi-target tracking device of the present invention may further include a display indication unit (not shown) configured to determine, based on the matching results of the matching unit for a specific target, whether to output to the outside an instruction to change the indication state of that target. Similarly, in order to eliminate the influence of the detector's false and missed detections on the correctness of the target display, a target may be displayed only after it has appeared in several consecutive frames, and considered lost only after it has been missing for several consecutive frames. Thus, the display indication unit may be further configured to: issue an instruction to mark the target in the currently displayed image frame when the matching unit 620 determines that the number of consecutive matches (or the matching frequency) between the predicted and detected positions of a certain target exceeds a first threshold; and issue an instruction to cancel the marking of a previously marked target in the currently displayed image frame when the matching unit 620 determines that the number of consecutive frames in which the predicted position of that target fails to match any detected position exceeds a second threshold.
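A minimal sketch of such a display indication unit is given below; the per-target counters realize the hysteresis described above, and the threshold values, names, and string-valued instructions are illustrative assumptions:

```python
SHOW_AFTER = 3  # "first threshold": consecutive matches before marking
HIDE_AFTER = 3  # "second threshold": consecutive misses before unmarking

class DisplayIndicator:
    """Schmitt-trigger-like hysteresis around per-target match results."""

    def __init__(self):
        self.hits = {}    # target_id -> consecutive matched frames
        self.misses = {}  # target_id -> consecutive unmatched frames
        self.shown = set()

    def update(self, target_id, matched):
        """Return an indication-state instruction, or None if nothing changes."""
        if matched:
            self.hits[target_id] = self.hits.get(target_id, 0) + 1
            self.misses[target_id] = 0
            if target_id not in self.shown and self.hits[target_id] >= SHOW_AFTER:
                self.shown.add(target_id)
                return f"mark target {target_id}"    # instruction to the monitor
        else:
            self.misses[target_id] = self.misses.get(target_id, 0) + 1
            self.hits[target_id] = 0
            if target_id in self.shown and self.misses[target_id] >= HIDE_AFTER:
                self.shown.discard(target_id)
                return f"unmark target {target_id}"  # cancel the previous marking
        return None
```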
The multi-target tracking framework described in connection with FIG. 6 may be implemented as the target tracking module within the video analysis system shown in FIG. 1, thereby yielding a new video analysis system. The system may include: a frame buffer queue for storing continuously input video image frames; a target detection module for processing successive image frames from the frame buffer queue to determine the detected position information of targets contained in the current frame; a target tracking module, implemented by the multi-target tracking device described above with reference to FIG. 6, for tracking the positions of moving targets, matching the acquired tracking position information of the targets with the detected position information acquired by the target detection module, and adjusting the tracking operation based on the matching results; and a target analysis module for performing a feature extraction operation on a target when the target is determined to be a new target on the basis of a detected position that matches no predicted position.
A video analysis system incorporating the multi-target tracking framework of the present invention allows the target detection module to select image frames at intervals from the consecutive video image frames for target detection according to predetermined rules, thereby reducing the computational requirements of target detection. By matching target detection results with target prediction results and performing the corresponding update, delete, and create operations, the accuracy requirements of the video analysis system can still be met.
In practical use, part or all of the functions of the video analysis system can be realized by digital circuits. The target detection operation performed by the target detection module and/or the target analysis operation performed by the target analysis module may be implemented at least in part using neural network computation. For example, the target detection module and the target analysis module may be implemented at least in part by GPU, FPGA, or ASIC circuitry capable of highly parallel computation. In a preferred embodiment, the target detection module and the target analysis module share at least part of a GPU, FPGA, or ASIC circuit capable of performing convolutional neural network computation.
In one embodiment, the video analysis system of the present invention may be implemented in a system on a chip (SoC) that includes a general-purpose processor, memory, and digital circuitry. FIG. 7 shows one example of an SoC that may be used to implement the video analysis system of the present invention.
In one embodiment, the deep learning networks required by the present system, such as convolutional neural networks, may be implemented by the digital circuit portion (e.g., an FPGA or ASIC chip) of the SoC. For example, the FPGA circuit or ASIC chip may implement part or all of the target detection module and the target analysis module in the video analysis system of the present invention. Because CNN computation is highly parallel, implementing the target detection and attribute analysis functions in logic hardware or custom circuitry offers natural computational advantages and can achieve lower power consumption than a software implementation.
In one embodiment, all the CNN parameters obtained in prior training may be stored in a memory (e.g., the main memory) of the system on chip. When target detection is subsequently performed, the parameters of the CNN layers are first read from the main memory so as to perform the neural network computation on the input image, thereby obtaining nonlinear features. Subsequently, a large block of contiguous features (e.g., the features of all channels of a particular region) is read from the main memory into the cache module of the logic hardware at once. This reduces the latency of data reads when computing the next region and increases the utilization of each main-memory access, thereby improving overall computational efficiency. The cache module of the logic hardware may include the frame buffer queue and the to-be-analyzed target library; by setting the image frame parameters (such as lifetime marker values) appropriately, image frames can be stored with maximal efficiency and key frames retained in time, improving video analysis efficiency without increasing the cache requirements.
It is understood that the various components included in the video analysis system of the present invention, such as the frame buffer module, the target detection module, and the target analysis module, may be implemented wholly or partially in hardware, or wholly or partially in software. In one embodiment, the frame buffer module may be a cache module in logic hardware, the deep learning parts of the target detection module and the target analysis module may be implemented by logic hardware, and the specific operations of the multi-target tracking framework of the present invention may be implemented as threads running under the control of a processor.
Furthermore, the method according to the invention may also be implemented as a computer program or computer program product comprising computer program code instructions for carrying out the steps defined in the above-described method of the invention.
Alternatively, the invention may also be embodied as a non-transitory machine-readable storage medium (or computer-readable storage medium, or machine-readable storage medium) having stored thereon executable code (or a computer program, or computer instruction code) which, when executed by a processor of an electronic device (or computing device, server, etc.), causes the processor to perform the steps of the above-described method according to the invention.
Those of skill would further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the disclosure herein may be implemented as electronic hardware, computer software, or combinations of both.
In addition, references to "first," "second," and "third" in this disclosure are intended only to distinguish between the objects they modify. For example, the terms "first threshold," "second threshold," and "third threshold" merely denote thresholds used for different purposes, and their respective values may be the same or different.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems and methods according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Having described embodiments of the present invention, the foregoing description is intended to be exemplary, not exhaustive, and is not limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments and their practical application or technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (33)
