CN106952288B - Long-term occlusion robust tracking method based on convolutional features and global search detection - Google Patents

Long-term occlusion robust tracking method based on convolutional features and global search detection

Info

Publication number
CN106952288B
Authority
CN
China
Prior art keywords
target
scale
model
init
extract
Prior art date
2017-03-31
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201710204379.1A
Other languages
Chinese (zh)
Other versions
CN106952288A (en)
Inventor
李映
林彬
杭涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northwestern Polytechnical University
Original Assignee
Northwestern Polytechnical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2017-03-31
Filing date
2017-03-31
Publication date
2019-09-24
Application filed by Northwestern Polytechnical University
Priority to CN201710204379.1A
Publication of CN106952288A
Application granted
Publication of CN106952288B
Expired - Fee Related
Anticipated expiration


Abstract

Translated from Chinese

The invention relates to a long-term occlusion robust tracking method based on convolutional features and global search detection. By using convolutional features and a multi-scale correlation filtering method in the tracking module, the feature expression ability of the target appearance model is enhanced, so that the tracking results are highly robust to factors such as illumination changes, target scale changes and target rotation. Through the introduced global search detection mechanism, when the target is occluded for a long time and tracking fails, the detection module can detect the target again and allow the tracking module to recover from the error, so that the target can be tracked continuously over a long period even when its appearance changes.

Description

Translated from Chinese
Robust tracking method for long-term occlusion based on convolutional features and global search detection

Technical field

The invention belongs to the field of computer vision and relates to a target tracking method, in particular to a long-term occlusion robust tracking method based on convolutional features and global search detection.

Background

The main task of target tracking is to obtain the position and motion information of a specific target in a video sequence; it has wide applications in video surveillance, human-computer interaction and other fields. During tracking, factors such as illumination changes, complex backgrounds, and target rotation or scaling all increase the complexity of the tracking problem; in particular, long-term occlusion of the target easily leads to tracking failure.

The tracking method (TLD for short) proposed in "Tracking-learning-detection, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012, 34(7): 1409-1422" was the first to combine a traditional tracking algorithm with a detection algorithm, using the detection results to refine the tracking results and improve the reliability and robustness of the system. Its tracking algorithm is based on optical flow, and its detection algorithm generates a large number of detection windows, each of which must be accepted by three detectors to become a final detection. For the occlusion problem, TLD provides a practical and effective solution capable of long-term tracking. However, TLD uses shallow hand-crafted features with limited ability to represent the target, and the design of its detection algorithm is relatively complex, leaving room for improvement.

Summary of the invention

Technical problem to be solved

To overcome the shortcomings of the prior art, the present invention proposes a long-term occlusion robust tracking method based on convolutional features and global search detection, which addresses the problem that, during tracking of a moving target in video, long-term occlusion or the target moving out of the field of view causes the appearance model to drift and thus easily leads to tracking failure.

Technical solution

A long-term occlusion robust tracking method based on convolutional features and global search detection, characterized by the following steps:

Step 1: Read the first frame of the video and the initial target position information [x, y, w, h], where x and y are the horizontal and vertical coordinates of the target centre and w and h are the target width and height. Denote the point with coordinates (x, y) as P, denote the initial target region of size w×h centred on P as Rinit, and denote the target scale as scale, initialised to 1.

Step 2: Centred on P, determine a region Rbkg containing the target and background information; the size of Rbkg is M×N with M = 2w and N = 2h. Using VGGNet-19 as the CNN model, extract the convolutional feature map ztarget_init for Rbkg from the fifth convolutional block (layer conv5-4). Then build the target model of the tracking module (one filter per channel t ∈ {1, 2, ..., T}, where T is the number of CNN feature channels) from ztarget_init as follows:

Here, upper-case variables denote the frequency-domain representations of the corresponding lower-case variables, the Gaussian filter template has independent variables m and n with m ∈ {1, 2, ..., M} and n ∈ {1, 2, ..., N}, σtarget is the bandwidth of the Gaussian kernel, ⊙ denotes element-wise multiplication, an overline denotes the complex conjugate, and λ1 is a regularisation parameter (introduced to avoid a zero denominator), set to 0.0001.
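
The model formulas above are rendered as images in the original publication and are not reproduced in this extraction. The following Python sketch shows a standard multi-channel correlation-filter construction matching the symbols just described (Gaussian label, element-wise products, complex conjugates, regulariser λ1); the exact numerator/denominator arrangement and all function and variable names are assumptions rather than the patent's literal equations.

```python
import numpy as np

def gaussian_label(M, N, sigma):
    """2-D Gaussian label g(m, n) centred on the M x N region (the 'Gaussian filter template')."""
    m = np.arange(M) - (M - 1) / 2.0
    n = np.arange(N) - (N - 1) / 2.0
    mm, nn = np.meshgrid(m, n, indexing="ij")
    return np.exp(-(mm ** 2 + nn ** 2) / (2.0 * sigma ** 2))

def build_target_model(z_target_init, sigma_target, lam1=1e-4):
    """z_target_init: (M, N, T) conv5-4 feature map of R_bkg; returns per-channel filters in the frequency domain."""
    M, N, T = z_target_init.shape
    G = np.fft.fft2(gaussian_label(M, N, sigma_target))        # desired response in the frequency domain
    Z = np.fft.fft2(z_target_init, axes=(0, 1))                # features in the frequency domain
    denom = np.sum(Z * np.conj(Z), axis=2).real + lam1         # shared denominator; lam1 avoids division by zero
    return (G[:, :, None] * np.conj(Z)) / denom[:, :, None]    # one filter per channel t = 1..T
```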

Step 3: Centred on P, extract S image sub-blocks at different scales, with S set to 33. Each sub-block has size w×h×s, where the variable s is the scale factor of the sub-block, s ∈ [0.7, 1.4]. Extract the HOG features of each sub-block and concatenate them into an S-dimensional HOG feature vector, here named the scale feature vector and denoted zscale_init. Then build the scale model Wscale of the tracking module from zscale_init; the calculation is analogous to that in step 2 (with the scale feature vector in place of the convolutional feature map):

Here, s' is the independent variable of the Gaussian function, s' ∈ {1, 2, ..., S}, σscale is the bandwidth of the Gaussian kernel, and λ2 is a regularisation parameter, set to 0.0001.
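
As a companion to step 3, the sketch below illustrates one plausible way to form the scale feature matrix from S scaled patches and to build a one-dimensional scale filter in the DSST style the text suggests. The use of skimage's hog, the HOG cell sizes, and the resampling of every patch to a common size before computing HOG are assumptions not specified in the patent.

```python
import numpy as np
import cv2
from skimage.feature import hog

def scale_feature_matrix(frame_gray, center, w, h, S=33, s_min=0.7, s_max=1.4):
    """Extract S patches of size (w*s) x (h*s) around `center`, compute a HOG descriptor per patch
    and stack them column-wise; this matrix plays the role of the scale feature vector z_scale."""
    cy, cx = center
    feats = []
    for s in np.linspace(s_min, s_max, S):
        pw, ph = max(int(round(w * s)), 8), max(int(round(h * s)), 8)
        x0, y0 = int(round(cx - pw / 2)), int(round(cy - ph / 2))
        patch = frame_gray[max(y0, 0):y0 + ph, max(x0, 0):x0 + pw]
        patch = cv2.resize(patch, (int(w), int(h)))            # common size before computing HOG
        feats.append(hog(patch, pixels_per_cell=(4, 4), cells_per_block=(2, 2)))
    return np.stack(feats, axis=1)                             # one column per candidate scale

def build_scale_model(z_scale_init, sigma_scale, lam2=1e-4):
    """1-D correlation filter over the scale dimension (a DSST-style construction)."""
    d, S = z_scale_init.shape
    s = np.arange(S) - (S - 1) / 2.0
    Gs = np.fft.fft(np.exp(-(s ** 2) / (2.0 * sigma_scale ** 2)))   # desired 1-D Gaussian response
    Zs = np.fft.fft(z_scale_init, axis=1)
    denom = np.sum(Zs * np.conj(Zs), axis=0).real + lam2
    return (Gs[None, :] * np.conj(Zs)) / denom[None, :]
```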

Step 4: Extract grey-level features from the initial target region Rinit; the resulting grey-level representation is a two-dimensional matrix, here named the target appearance representation matrix and denoted Ak, where the subscript k is the current frame number, initially k = 1. Initialise the filter model D of the detection module to A1, i.e. D = A1, and initialise the historical target representation matrix set Ahis. Ahis stores the target appearance representation matrices of the current and all previous frames, i.e. Ahis = {A1, A2, ..., Ak}; initially Ahis = {A1}.

Step 5: Read the next frame. Still centred on P, extract a scale-adapted target search region of size Rbkg×scale. Extract the convolutional features of the search region with the CNN of step 2 and resample them to the size of Rbkg by bilinear interpolation to obtain the convolutional feature map ztarget_cur of the current frame; then use the target model to compute the target confidence map ftarget as follows:

Here, F⁻¹ denotes the inverse Fourier transform. Finally, update the coordinates of P, correcting (x, y) to the coordinates of the maximum response value in ftarget.
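
A minimal sketch of the step 5 computation, assuming the target model consists of per-channel frequency-domain filters as in the step 2 sketch: the per-channel responses are summed, transformed back with the inverse Fourier transform, and the location of the maximum becomes the new centre P.

```python
import numpy as np

def track_position(W_target, z_target_cur):
    """W_target: per-channel filters from step 2, shape (M, N, T); z_target_cur: conv feature map of the
    current search region resampled to (M, N, T).  Returns the confidence map f_target and the
    (row, col) of its maximum, which becomes the new centre P."""
    Z = np.fft.fft2(z_target_cur, axes=(0, 1))
    response_f = np.sum(W_target * Z, axis=2)          # accumulate over the T channels
    f_target = np.real(np.fft.ifft2(response_f))       # inverse Fourier transform
    row, col = np.unravel_index(np.argmax(f_target), f_target.shape)
    return f_target, (row, col)
```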

Step 6: Centred on P, extract S image sub-blocks at different scales, extract the HOG features of each sub-block, and concatenate them to obtain the scale feature vector zscale_cur of the current frame (computed in the same way as zscale_init in step 3). Then use the scale model Wscale to compute the scale confidence map:

Finally, update the target scale scale as follows:

At this point the output of the tracking module in the current frame (frame k) is obtained: an image sub-block TPatchk of size Rinit×scale centred on P with coordinates (x, y). In addition, the maximum response value of the already computed ftarget is abbreviated TPeakk, i.e. TPeakk = ftarget(x, y).
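
The scale search of step 6 can be sketched in the same way, again assuming a one-dimensional correlation filter over the scale dimension; the candidate scale factors and the exact response formula are assumptions consistent with the step 3 sketch.

```python
import numpy as np

def estimate_scale(W_scale, z_scale_cur, S=33, s_min=0.7, s_max=1.4):
    """W_scale: 1-D scale filter from step 3; z_scale_cur: scale feature matrix of the current frame
    (one column per candidate scale).  Returns the scale factor with the highest response."""
    Zs = np.fft.fft(z_scale_cur, axis=1)
    f_scale = np.real(np.fft.ifft(np.sum(W_scale * Zs, axis=0)))   # 1-D scale confidence map
    scales = np.linspace(s_min, s_max, S)
    return float(scales[int(np.argmax(f_scale))]), f_scale
```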

Step 7: The detection module convolves the filter model D with the entire image of the current frame in a global search, computing the similarity between D and every position of the current frame. Take the j values with the highest responses (j is set to 10) and, centred on the position of each of these j values, extract j image sub-blocks of size Rinit×scale. With these j sub-blocks as elements, form an image sub-block set DPatchesk, which is the output of the detection module in frame k.
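
A hedged sketch of the step 7 global search: cv2.matchTemplate stands in here for the convolution of the grey-level filter model D with the whole frame, and the j strongest response locations are returned without the non-maximum suppression a full implementation would likely apply. Resizing D to Rinit×scale beforehand and the choice of the normalised cross-correlation mode are assumptions.

```python
import numpy as np
import cv2

def detect_candidates(frame_gray, D, j=10):
    """Correlates the grey-level filter model D with the whole frame (global search) and returns
    the centres (row, col) of the j highest-response locations."""
    resp = cv2.matchTemplate(frame_gray.astype(np.float32), D.astype(np.float32), cv2.TM_CCORR_NORMED)
    top = np.argsort(resp.ravel())[::-1][:j]           # indices of the j strongest responses
    rows, cols = np.unravel_index(top, resp.shape)
    dh, dw = D.shape
    return [(int(r) + dh // 2, int(c) + dw // 2) for r, c in zip(rows, cols)]
```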

Step 8: For each image sub-block in the set DPatchesk output by the detection module, compute the pixel overlap rate with the TPatchk output by the tracking module, giving j values, and record the highest of them. If it is below the threshold (set to 0.05), the target is judged to be completely occluded, the learning rate β used by the tracking module for model updates must be suppressed, and the method proceeds to step 9; otherwise the models are updated with the initial learning rate βinit (βinit is set to 0.02) and the method proceeds to step 10. β is computed as follows:
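
The overlap-rate test of step 8 can be sketched as below. The patent's exact overlap and learning-rate formulas are images not reproduced here, so the intersection-over-union definition and the suppression of β to 0 under complete occlusion are assumptions; only the threshold 0.05 and the initial rate 0.02 come from the text.

```python
def overlap_rate(box_a, box_b):
    """Pixel overlap (intersection over union) of two boxes given as (x_center, y_center, w, h)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    ix = max(0.0, min(ax + aw / 2, bx + bw / 2) - max(ax - aw / 2, bx - bw / 2))
    iy = max(0.0, min(ay + ah / 2, by + bh / 2) - max(ay - ah / 2, by - bh / 2))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def choose_learning_rate(tpatch_box, dpatch_boxes, beta_init=0.02, tau=0.05):
    """If the best overlap between the tracker output and the detector candidates falls below the
    threshold tau, the target is treated as fully occluded and the update rate is suppressed."""
    best = max((overlap_rate(tpatch_box, b) for b in dpatch_boxes), default=0.0)
    occluded = best < tau
    beta = 0.0 if occluded else beta_init              # suppressing beta to 0 is an assumption
    return beta, occluded
```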

Step 9: From the centre of each image sub-block in DPatchesk, extract j target search regions of size Rbkg×scale. For each search region, extract a convolutional feature map and compute a target confidence map as in step 5, giving the maximum response value over each of the j search regions. Compare these j response values and record the largest as DPeakk. If DPeakk is greater than TPeakk, update the coordinates of P again, correcting (x, y) to the coordinates corresponding to DPeakk, and recompute the target scale feature vector and the target scale scale (as in step 6).

Step 10: The optimal position centre of the target in the current frame is determined as P and the optimal scale as scale. Mark the new target region Rnew in the image, i.e. a rectangle centred on P with width w×scale and height h×scale. In addition, abbreviate the already computed convolutional feature map that yields the optimal target position centre P as ztarget; likewise, abbreviate the scale feature vector that yields the optimal target scale scale as zscale.

Step 11: Using ztarget and zscale together with the target model and the scale model Wscale of the tracking module established in the previous frame, update each model by weighted summation, computed as follows:

Here, β is the learning rate determined in step 8.
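
A sketch of the step 11 update, assuming the weighted sum is a plain linear interpolation between the previous models and the models freshly computed from ztarget and zscale with learning rate β; the exact weighting in the patent's formula is not reproduced in this extraction.

```python
def update_tracking_models(W_target_old, W_target_new, W_scale_old, W_scale_new, beta):
    """Weighted-sum (linear interpolation) update of the target and scale models with learning
    rate beta; beta = 0 freezes both models when complete occlusion was detected in step 8."""
    W_target = (1.0 - beta) * W_target_old + beta * W_target_new
    W_scale = (1.0 - beta) * W_scale_old + beta * W_scale_new
    return W_target, W_scale
```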

Step 12: Extract grey-level features from the new target region Rnew to obtain the target appearance representation matrix Ak of the current frame, and add Ak to the historical target representation matrix set Ahis. If the number of elements in Ahis is greater than c (c is set to 20), randomly select c elements from Ahis to form a three-dimensional matrix Ck, where Ck(:, i) corresponds to one element of Ahis (i.e. a two-dimensional matrix Ak); otherwise form Ck from all the elements of Ahis. Then average Ck to obtain a two-dimensional matrix, which becomes the new filter model D of the detection module, computed as follows:
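
The step 12 update of the detection filter model follows directly from the text and can be sketched as below; random.sample is used for the random selection of c historical matrices, and stacking along a third axis plays the role of the three-dimensional matrix Ck.

```python
import random
import numpy as np

def update_detector_model(A_his, c=20):
    """A_his: list of per-frame grey-level appearance matrices A_k (all the same shape).
    At most c randomly chosen matrices are stacked into the three-dimensional matrix C_k and
    averaged to give the new filter model D of the detection module."""
    chosen = random.sample(A_his, c) if len(A_his) > c else list(A_his)
    C_k = np.stack(chosen, axis=2)
    return C_k.mean(axis=2)
```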

Step 13: Check whether all image frames of the video have been processed. If so, the algorithm ends; otherwise return to step 5 and continue.

Beneficial effects

The long-term occlusion robust tracking method based on convolutional features and global search detection proposed by the invention comprises a tracking module and a detection module that cooperate during tracking. The tracking module mainly uses a convolutional neural network (CNN) to extract convolutional features of the target for building a robust target model, builds a scale model from Histogram of Oriented Gradients (HOG) features, and uses correlation filtering to determine the position centre and the scale of the target. The detection module extracts grey-level features to build a filter model of the target and uses a global search over the whole image to quickly detect the target and judge whether occlusion has occurred. Once the target is completely occluded (or its appearance changes drastically for other reasons), the detection module uses the detection result to correct the position of the tracked target and suppresses the model update of the tracking module, preventing the introduction of unnecessary noise that would cause model drift and tracking failure.

Advantages: by using convolutional features and a multi-scale correlation filtering method in the tracking module, the feature expression ability of the target appearance model is enhanced, so that the tracking results are highly robust to factors such as illumination changes, target scale changes and target rotation. Through the introduced global search detection mechanism, when the target is occluded for a long time and tracking fails, the detection module can detect the target again and allow the tracking module to recover from the error, so that the target can be tracked continuously over a long period even when its appearance changes.

Description of drawings

Figure 1: Flowchart of the long-term occlusion robust tracking method based on convolutional features and global search detection

Detailed description of embodiments

The invention is now further described with reference to the embodiment and the drawing; the specific implementation follows steps 1 to 13 exactly as set out in the technical solution above.


Claims (6)

Translated from Chinese
1. A long-term occlusion robust tracking method based on convolutional features and global search detection, characterized in that the steps are as follows:
Step 1: read the first frame of the video and the initial target position information [x, y, w, h], where x and y are the horizontal and vertical coordinates of the target centre and w and h are the target width and height; denote the point with coordinates (x, y) as P, denote the initial target region of size w×h centred on P as Rinit, and denote the target scale as scale, initialised to 1;
Step 2: centred on P, determine a region Rbkg containing the target and background information, of size M×N with M = 2w and N = 2h; using VGGNet-19 as the CNN model, extract the convolutional feature map ztarget_init for Rbkg at the fifth convolutional block, i.e. layer conv5-4; then build the target model of the tracking module from ztarget_init, with t ∈ {1, 2, ..., T}, T being the number of CNN feature channels, where upper-case variables denote the frequency-domain representations of the corresponding lower-case variables, the Gaussian filter template has independent variables m and n, m ∈ {1, 2, ..., M}, n ∈ {1, 2, ..., N}, σtarget is the bandwidth of the Gaussian kernel, ⊙ denotes element-wise multiplication, an overline denotes the complex conjugate, and λ1 is a regularisation parameter;
Step 3: centred on P, extract S image sub-blocks at different scales, with S set to 33; each sub-block has size w×h×s, where s is the scale factor of the sub-block, s ∈ [0.7, 1.4]; extract the HOG features of each sub-block and concatenate them into an S-dimensional HOG feature vector, named the scale feature vector and denoted zscale_init; then build the scale model Wscale of the tracking module from zscale_init, where s' is the independent variable of the Gaussian function, s' ∈ {1, 2, ..., S}, σscale is the bandwidth of the Gaussian kernel, and λ2 is a regularisation parameter;
Step 4: extract grey-level features from the initial target region Rinit to obtain a two-dimensional matrix named the target appearance representation matrix, denoted Ak, the subscript k being the current frame number, initially k = 1; initialise the filter model D of the detection module to A1, i.e. D = A1, and initialise the historical target representation matrix set Ahis, which stores the target appearance representation matrices of the current and all previous frames, i.e. Ahis = {A1, A2, ..., Ak}, initially Ahis = {A1};
Step 5: read the next frame; still centred on P, extract a scale-adapted target search region of size Rbkg×scale; extract the convolutional features of the search region with the CNN of step 2 and resample them to the size of Rbkg by bilinear interpolation to obtain the convolutional feature map ztarget_cur of the current frame; then use the target model to compute the target confidence map ftarget, where F⁻¹ denotes the inverse Fourier transform; finally update the coordinates of P, correcting (x, y) to the coordinates of the maximum response value in ftarget;
Step 6: centred on P, extract S image sub-blocks at different scales, extract the HOG features of each sub-block, and concatenate them to obtain the scale feature vector zscale_cur of the current frame, computed in the same way as zscale_init in step 3; then use the scale model Wscale to compute the scale confidence map and update the target scale scale; the output of the tracking module in the current frame k is thereby obtained: an image sub-block TPatchk of size Rinit×scale centred on P with coordinates (x, y); in addition, the maximum response value of the computed ftarget is abbreviated TPeakk, i.e. TPeakk = ftarget(x, y);
Step 7: the detection module convolves the filter model D with the entire image of the current frame in a global search, computing the similarity between D and every position of the current frame; take the j values with the highest responses and, centred on the positions of these j values, extract j image sub-blocks of size Rinit×scale; with these j sub-blocks as elements, form an image sub-block set DPatchesk, the output of the detection module in frame k;
Step 8: compute the pixel overlap rate between each image sub-block in the set DPatchesk output by the detection module and the TPatchk output by the tracking module, giving j values, and record the highest; if it is below the threshold, the target is judged to be completely occluded, the learning rate β used by the tracking module for model updates is suppressed, and the method proceeds to step 9; otherwise the models are updated with the initial learning rate βinit and the method proceeds to step 10;
Step 9: from the centre of each image sub-block in DPatchesk, extract j target search regions of size Rbkg×scale; for each search region, extract a convolutional feature map and compute a target confidence map as in step 5, giving the maximum response value over each of the j search regions; record the largest of these j response values as DPeakk; if DPeakk is greater than TPeakk, update the coordinates of P again, correcting (x, y) to the coordinates corresponding to DPeakk, and recompute the target scale feature vector and the target scale scale, as in step 6;
Step 10: the optimal position centre of the target in the current frame is determined as P and the optimal scale as scale; mark the new target region Rnew in the image, a rectangle centred on P with width w×scale and height h×scale; abbreviate the already computed convolutional feature map that yields the optimal target position centre P as ztarget, and the scale feature vector that yields the optimal target scale scale as zscale;
Step 11: using ztarget and zscale together with the target model and the scale model Wscale of the tracking module established in the previous frame, update each model by weighted summation, with Wscale = Wscale_new;
Step 12: extract grey-level features from the new target region Rnew to obtain the target appearance representation matrix Ak of the current frame, and add Ak to the historical target representation matrix set Ahis; if the number of elements in Ahis is greater than c, randomly select c elements from Ahis to form a three-dimensional matrix Ck, where Ck(:, i) corresponds to one element of Ahis, i.e. a two-dimensional matrix Ak; otherwise form Ck from all the elements of Ahis; then average Ck to obtain a two-dimensional matrix, which becomes the new filter model D of the detection module;
Step 13: if all image frames of the video have been processed, the algorithm ends; otherwise return to step 5 and continue.

2. The long-term occlusion robust tracking method based on convolutional features and global search detection according to claim 1, characterized in that: the regularisation parameters λ1 and λ2 are set to 0.0001.

3. The long-term occlusion robust tracking method based on convolutional features and global search detection according to claim 1, characterized in that: the value of j is set to 10.

4. The long-term occlusion robust tracking method based on convolutional features and global search detection according to claim 1, characterized in that: the threshold is set to 0.05.

5. The long-term occlusion robust tracking method based on convolutional features and global search detection according to claim 1, characterized in that: the initial learning rate βinit is set to 0.02.

6. The long-term occlusion robust tracking method based on convolutional features and global search detection according to claim 1, characterized in that: c is set to 20.
CN201710204379.1A | 2017-03-31 | 2017-03-31 | Long-term occlusion robust tracking method based on convolutional features and global search detection | Expired - Fee Related | CN106952288B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201710204379.1A | 2017-03-31 | 2017-03-31 | CN106952288B (en): Long-term occlusion robust tracking method based on convolutional features and global search detection

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201710204379.1A | 2017-03-31 | 2017-03-31 | CN106952288B (en): Long-term occlusion robust tracking method based on convolutional features and global search detection

Publications (2)

Publication Number | Publication Date
CN106952288A (en) | 2017-07-14
CN106952288B | 2019-09-24

Family

ID=59475259

Family Applications (1)

Application Number | Title | Status
CN201710204379.1A | CN106952288B (en): Long-term occlusion robust tracking method based on convolutional features and global search detection | Expired - Fee Related

Country Status (1)

Country | Link
CN (1) | CN106952288B (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN107452022A (en)* | 2017-07-20 | 2017-12-08 | 西安电子科技大学 | A kind of video target tracking method
CN107644430A (en)* | 2017-07-27 | 2018-01-30 | 孙战里 | Target following based on self-adaptive features fusion
CN107491742B (en)* | 2017-07-28 | 2020-10-23 | 西安因诺航空科技有限公司 | Long-term stable target tracking method for unmanned aerial vehicle
CN108734151B (en)* | 2018-06-14 | 2020-04-14 | 厦门大学 | Robust long-range target tracking method based on correlation filtering and deep Siamese network
CN110276782B (en)* | 2018-07-09 | 2022-03-11 | 西北工业大学 | Hyperspectral target tracking method combining spatial spectral features and related filtering
CN109271865B (en)* | 2018-08-17 | 2021-11-09 | 西安电子科技大学 | Moving target tracking method based on scattering transformation multilayer correlation filtering
CN109308469B (en)* | 2018-09-21 | 2019-12-10 | 北京字节跳动网络技术有限公司 | Method and apparatus for generating information
CN109410249B (en)* | 2018-11-13 | 2021-09-28 | 深圳龙岗智能视听研究院 | Self-adaptive target tracking method combining depth characteristic and hand-drawn characteristic
CN109596649A (en)* | 2018-11-29 | 2019-04-09 | 昆明理工大学 | A kind of method and device that host element concentration is influenced based on convolutional network coupling microalloy element
CN109754424B (en)* | 2018-12-17 | 2022-11-04 | 西北工业大学 | Correlation Filter Tracking Algorithm Based on Fusion Features and Adaptive Update Strategy
CN109740448B (en)* | 2018-12-17 | 2022-05-10 | 西北工业大学 | Aerial video target robust tracking method based on relevant filtering and image segmentation
CN111260687B (en)* | 2020-01-10 | 2022-09-27 | 西北工业大学 | An Aerial Video Object Tracking Method Based on Semantic Awareness Network and Correlation Filtering
CN111652910B (en)* | 2020-05-22 | 2023-04-11 | 重庆理工大学 | Target tracking algorithm based on object space relationship
CN112762841A (en)* | 2020-12-30 | 2021-05-07 | 天津大学 | Bridge dynamic displacement monitoring system and method based on multi-resolution depth features
CN114926497A (en)* | 2022-04-24 | 2022-08-19 | 北京机械设备研究所 | Single-target anti-occlusion tracking method and device based on ECO

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN105631895A (en)* | 2015-12-18 | 2016-06-01 | 重庆大学 | Temporal-spatial context video target tracking method combining particle filtering
CN105741316A (en)* | 2016-01-20 | 2016-07-06 | 西北工业大学 | Robust target tracking method based on deep learning and multi-scale correlation filtering
CN106326924A (en)* | 2016-08-23 | 2017-01-11 | 武汉大学 | Object tracking method and object tracking system based on local classification

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Yan Chen et al., "CNNTracker: Online discriminative object tracking via deep convolutional neural network", Applied Soft Computing, vol. 38, January 2016, pp. 1088-1098.*
Chao Ma et al., "Hierarchical convolutional features for visual tracking", 2015 IEEE International Conference on Computer Vision, December 2015, pp. 3074-3082.*
Youngbin Park et al., "Tracking Human-like Natural Motion Using Deep Recurrent Neural Networks", arXiv: Computer Vision and Pattern Recognition, April 2016, pp. 1-8.*

Also Published As

Publication number | Publication date
CN106952288A (en) | 2017-07-14

Similar Documents

Publication | Title
CN106952288B (en) | Long-term occlusion robust tracking method based on convolutional features and global search detection
CN108509859B (en) | Non-overlapping area pedestrian tracking method based on deep neural network
Li et al. | Weighted low-rank decomposition for robust grayscale-thermal foreground detection
CN103295242B (en) | A kind of method for tracking target of multiple features combining rarefaction representation
Li et al. | Learning motion-robust remote photoplethysmography through arbitrary resolution videos
CN105335986B (en) | Method for tracking target based on characteristic matching and MeanShift algorithm
CN108647694B (en) | A Correlation Filtering Target Tracking Method Based on Context Awareness and Adaptive Response
CN104992453B (en) | Target in complex environment tracking based on extreme learning machine
CN107481264A (en) | A kind of video target tracking method of adaptive scale
CN107316316A (en) | The method for tracking target that filtering technique is closed with nuclear phase is adaptively merged based on multiple features
CN108182388A (en) | A kind of motion target tracking method based on image
CN109461172A (en) | Manually with the united correlation filtering video adaptive tracking method of depth characteristic
CN101789124B (en) | Segmentation method for space-time consistency of video sequence of parameter and depth information of known video camera
CN111080675A (en) | A Target Tracking Method Based on Spatio-temporal Constraint Correlation Filtering
CN106204638A (en) | A kind of based on dimension self-adaption with the method for tracking target of taking photo by plane blocking process
CN103617636B (en) | The automatic detecting and tracking method of video object based on movable information and sparse projection
CN110211157A (en) | A kind of target long time-tracking method based on correlation filtering
CN109977971A (en) | Dimension self-adaption Target Tracking System based on mean shift Yu core correlation filtering
CN111402303A (en) | A Target Tracking Architecture Based on KFSTRCF
CN111914756A (en) | Video data processing method and device
CN110084830A (en) | A kind of detection of video frequency motion target and tracking
CN106548194B (en) | Construction method and positioning method of two-dimensional image human joint point positioning model
CN106887012A (en) | A kind of quick self-adapted multiscale target tracking based on circular matrix
CN111027586A (en) | A Target Tracking Method Based on Novel Response Graph Fusion
CN111539396A (en) | Pedestrian detection and gait recognition method based on yolov3

Legal Events

Code | Title | Description
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20190924
