CN112101122A - Weak supervision object number estimation method based on sequencing network - Google Patents

Weak supervision object number estimation method based on sequencing network
Info

Publication number
CN112101122A
Authority
CN
China
Prior art keywords
network
objects
layer
sequencing
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010845336.3A
Other languages
Chinese (zh)
Other versions
CN112101122B (en)
Inventor
李国荣
杨一帆
黄庆明
苏荔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Chinese Academy of Sciences
Original Assignee
University of Chinese Academy of Sciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Chinese Academy of Sciences
Priority to CN202010845336.3A
Publication of CN112101122A
Application granted
Publication of CN112101122B
Expired - Fee Related
Anticipated expiration

Abstract

The invention relates to the technical field of computer vision, and in particular to a weakly supervised object number estimation method based on a ranking network. The method trains the model without relying on object position annotations, which saves human resources and improves the generality of the model. The method comprises the following steps: extracting image features with a deep neural network and obtaining pyramid feature vectors with an adaptive pooling layer; regressing the number of objects with a fully connected layer; and training the model with a multi-branch ranking network, converting the ranking result into a ranking matrix with a Sinkhorn layer, and computing the loss with a soft-label transport matrix as the ground truth.

Description

Translated from Chinese
A Weakly Supervised Object Number Estimation Method Based on a Ranking Network

Technical field

The invention relates to the technical field of computer vision, and in particular to a weakly supervised object number estimation method based on a ranking network.

Background

Counting key objects such as people and vehicles with cameras in public places has significant research value. For example, crowd counts in a waiting hall and estimates of the number of vehicles at a traffic intersection can be used to optimize public transport scheduling; a sharp change in the number of people in an area may either cause an incident or be the result of one. Estimating the number of objects in images and video is therefore of great value for intelligent security and is an important research topic in computer vision and intelligent video surveillance.

At present, methods for estimating the number of objects fall roughly into three categories. 1) Object detection: this is the most direct approach. In scenes where objects are sparse, the objects in the image are detected and then counted; it does not work well when objects are crowded. 2) Visual feature trajectory clustering: for video surveillance, a KLT tracker and a clustering method are typically used, and the number of trajectory clusters is taken as the estimate of the number of objects. 3) Feature-based regression: a regression model between image features and the number of objects in the image is built, and the number of objects in a scene is estimated by measuring image features. In crowded scenes, direct methods such as detection are easily affected by difficulties such as occlusion, whereas indirect methods such as feature regression start from the overall characteristics of the group of objects and can count objects at large scale.

Existing feature-regression algorithms have the following shortcomings. First, annotating object positions is usually expensive. Existing object number estimation datasets provide the position of every object for training the number regression network, yet in the evaluation stage these position labels are ignored and only the accuracy of the estimated number of objects is assessed. In fact, when positions are not needed, one can annotate only the number of objects in each image and train the object number estimation model with a more efficient weakly supervised method.

Summary of the invention

To solve the above technical problem, the invention provides a weakly supervised object number estimation method based on a ranking network that requires no object position annotations, saves human resources, and improves the accuracy of object number estimation.

The weakly supervised object number estimation method based on a ranking network of the invention comprises the following steps:

S1. Extract image features with a pre-trained deep neural network such as VGG-16, then regress a density map with convolution operations. Use adaptive pooling layers to extract multi-scale features from the density map, capturing both global and local information in the image, and feed them into a fully connected layer to regress the number of objects. The adaptive pooling layers are of two types: global sub-cluster layers and local sub-cluster layers.

S2. Train the multi-scale features with an image object-number ranking network so that they become sensitive to the number of objects. The ranking network is a multi-branch network: its input is the multi-scale features of several images, and its output is the ranking of those images by the number of objects they contain.

S3. In the ranking network, a Sinkhorn layer converts the ranking features into an ordinal matrix; a soft-label transport matrix is constructed from the true number of objects in each image; and the ranking network is trained with a cross-entropy loss to obtain features that are sensitive to the number of objects. The regression network is then trained, finally yielding the object number regression model.

In the method of the invention, step S1 specifically comprises: extracting image features with a deep network model pre-trained on an image analysis task and regressing a pseudo probability density map; then constructing the global sub-cluster layers from pooling layers with larger strides to extract global features from the density map, and constructing the local sub-cluster layers from pooling layers with smaller strides to extract local features from the density map.

In the method of the invention, step S2 specifically comprises: fine-tuning the feature extraction model with a multi-branch ranking network to obtain global and local features that are sensitive to the number of objects in the image.

In the method of the invention, step S3 specifically comprises: converting the ranking features into an ordinal matrix with a differentiable Sinkhorn layer; constructing a more effective soft-label transport matrix to train the ranking network; and training the ranking network with a cross-entropy loss and the regression network with a mean squared error loss.
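As an illustration of the Sinkhorn step in S3, the following sketch alternately normalises the rows and columns of a positive score matrix so that it approaches a doubly stochastic matrix, in which row i can be read as the rank probabilities of image i. It is a minimal sketch in plain Python and not the patent's implementation: the function name `sinkhorn`, the K x K layout of the input scores, the exponential used to make the scores positive, and the iteration count are all assumptions.

```python
import math
from typing import List

Matrix = List[List[float]]


def sinkhorn(scores: Matrix, n_iters: int = 50) -> Matrix:
    """Approximate a doubly stochastic matrix from a square score matrix.

    Rows index images and columns index rank positions (an assumption).
    """
    k = len(scores)
    # Exponentiate to obtain strictly positive entries; subtracting the
    # largest score first keeps the exponentials numerically stable.
    largest = max(max(row) for row in scores)
    p = [[math.exp(v - largest) for v in row] for row in scores]
    for _ in range(n_iters):
        # Row normalisation: every row sums to 1.
        p = [[v / sum(row) for v in row] for row in p]
        # Column normalisation: every column sums to 1.
        col_sums = [sum(p[i][j] for i in range(k)) for j in range(k)]
        p = [[p[i][j] / col_sums[j] for j in range(k)] for i in range(k)]
    return p


# Hypothetical scores for K = 3 images.
example_scores = [[2.0, 0.5, 0.1],
                  [0.3, 1.5, 0.2],
                  [0.1, 0.4, 1.8]]
ordinal_matrix = sinkhorn(example_scores)
for row in ordinal_matrix:
    print([round(v, 3) for v in row])
```

Every operation here is a division or an exponential, so the same procedure remains differentiable when written in an automatic differentiation framework, which is what allows end-to-end training.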

The beneficial effects of the invention are as follows. The ranking network learns multi-scale features that are sensitive to the number of objects from the relative relationship between the object numbers of different images. These features serve as the input of the regression network, which avoids using object position information and removes the need for extensive manual annotation of object positions. The differentiable Sinkhorn layer allows the network to be trained end to end. The soft-label transport matrix, built from the relative relationship between the object numbers of the images, effectively reflects the complexity of the ranking task and improves the accuracy of object number estimation.

Description of the drawings

Figure 1 is a schematic diagram of the invention.

Detailed description

Specific embodiments of the invention are described in further detail below with reference to an example. The following example illustrates the invention but does not limit its scope.

Embodiment

S1. Extract image features with a pre-trained deep neural network such as VGG-16, then regress a density map with convolution operations. Use several pooling layers to extract multi-scale features from the density map, capturing both global and local information in the image, and feed them into a fully connected layer to regress the number of objects. The adaptive pooling layers are of two types: global sub-cluster layers and local sub-cluster layers. The global sub-cluster layers use three max pooling layers with strides 8, 16 and 32; the local sub-cluster layers use two average pooling layers with strides 1 and 2.
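The pooling arrangement described in S1 can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the source gives the pooling strides but not the window sizes, so the window equal to the stride for the max pooling layers and the 2 x 2 window for the average pooling layers are assumptions, as are the helper names `pool2d` and `multi_scale_features` and the 32 x 32 density map.

```python
from typing import Callable, List, Sequence

Matrix = List[List[float]]


def pool2d(density: Matrix, size: int, stride: int,
           reduce: Callable[[Sequence[float]], float]) -> Matrix:
    """Reduce each size x size window, moving the window by `stride`."""
    height, width = len(density), len(density[0])
    pooled = []
    for top in range(0, height - size + 1, stride):
        row = []
        for left in range(0, width - size + 1, stride):
            window = [density[r][c]
                      for r in range(top, top + size)
                      for c in range(left, left + size)]
            row.append(reduce(window))
        pooled.append(row)
    return pooled


def mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


def multi_scale_features(density: Matrix) -> List[float]:
    """Flatten and concatenate global (max) and local (average) pooled maps."""
    features: List[float] = []
    # Global sub-cluster layers: max pooling with strides 8, 16 and 32.
    # A window size equal to the stride is an assumption.
    for stride in (8, 16, 32):
        for row in pool2d(density, size=stride, stride=stride, reduce=max):
            features.extend(row)
    # Local sub-cluster layers: average pooling with strides 1 and 2.
    # The 2 x 2 window is an assumption.
    for stride in (1, 2):
        for row in pool2d(density, size=2, stride=stride, reduce=mean):
            features.extend(row)
    return features


# Hypothetical 32 x 32 density map with a single response in one corner.
density_map = [[0.0] * 32 for _ in range(32)]
density_map[0][0] = 1.0
features = multi_scale_features(density_map)
print(len(features))
```

The concatenated vector is what a fully connected layer would take as input to regress the number of objects; the large-stride max pooling summarises the whole map coarsely, while the small-stride average pooling keeps fine local detail.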

S2. Train the multi-scale features with an image object-number ranking network so that they become sensitive to the number of objects. The ranking network is a multi-branch network: its input is the multi-scale features of several images, and its output is the ranking of those images by the number of objects they contain. Specifically, a K-branch network can be used: extract the multi-scale features f_1, f_2, f_3, ..., f_K of K images, compute the pairwise differences f_1 - f_2, f_1 - f_3, ..., f_1 - f_K, f_2 - f_3, ..., f_2 - f_K, ..., f_{K-1} - f_K, and feed them into the ranking network to obtain a K(K-1)-dimensional ranking vector f_d.
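The pairwise-difference input described in S2 can be sketched as follows. The function name `pairwise_differences` and the choice to concatenate the difference vectors into one flat list are assumptions; the source does not state how the differences are arranged before they enter the ranking network.

```python
from itertools import combinations
from typing import List, Sequence


def pairwise_differences(features: Sequence[Sequence[float]]) -> List[float]:
    """Concatenate f_i - f_j for every pair of images with i < j."""
    diffs: List[float] = []
    # combinations() yields (f_1, f_2), (f_1, f_3), ..., (f_{K-1}, f_K).
    for f_i, f_j in combinations(features, 2):
        diffs.extend(a - b for a, b in zip(f_i, f_j))
    return diffs


# Hypothetical two-dimensional features for K = 3 images.
image_features = [[1.0, 2.0], [3.0, 5.0], [6.0, 9.0]]
print(pairwise_differences(image_features))
```

Working on differences rather than on the raw features means the ranking network only sees how the images compare with each other, which matches the goal of learning from relative object numbers.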

S3. In the ranking network, a Sinkhorn layer converts the ranking feature f_d into an ordinal matrix P, whose element P_{i,j} in row i and column j is the probability that the i-th image is ranked j-th. The true number of objects in each image is used to construct the soft-label transport matrix

Figure BDA0002642856760000041

Let σ denote the ground-truth ranking of the images, where the i-th element σ(i) means that the i-th image is ranked at position σ(i). The elements of the soft-label matrix are then computed as follows:

Figure BDA0002642856760000042

where

Figure BDA0002642856760000043

Here thr is a predefined threshold. The following cross-entropy loss is then used to train the ranking network, yielding features that are sensitive to the number of objects:

Figure BDA0002642856760000044
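To make the roles of σ, the label matrix and the cross-entropy loss concrete, here is a minimal sketch in plain Python. It does not reproduce the patent's formulas, which appear only as images in the source: the descending ranking order, the one-hot (hard) label matrix used in place of the soft-label matrix, the loss form -Σ T_ij log P_ij averaged over images, and all function names are assumptions.

```python
import math
from typing import List, Sequence

Matrix = List[List[float]]


def ranking_from_counts(counts: Sequence[float]) -> List[int]:
    """Return sigma, where sigma[i] is the 1-based rank of image i.

    Images are ranked by object number in descending order (an assumption).
    """
    order = sorted(range(len(counts)), key=lambda i: counts[i], reverse=True)
    sigma = [0] * len(counts)
    for position, image_index in enumerate(order, start=1):
        sigma[image_index] = position
    return sigma


def hard_label_matrix(sigma: Sequence[int]) -> Matrix:
    """One-hot target: row i has a 1 in column sigma[i] - 1.

    The soft-label matrix of the source relaxes this target using thr;
    that relaxation is not reproduced here.
    """
    k = len(sigma)
    return [[1.0 if sigma[i] == j + 1 else 0.0 for j in range(k)]
            for i in range(k)]


def ranking_cross_entropy(target: Matrix, predicted: Matrix,
                          eps: float = 1e-12) -> float:
    """Cross-entropy between a label matrix and an ordinal matrix P."""
    total = 0.0
    for t_row, p_row in zip(target, predicted):
        for t, p in zip(t_row, p_row):
            total -= t * math.log(p + eps)  # eps guards against log(0)
    return total / len(target)


# Hypothetical true object numbers for K = 3 images.
sigma = ranking_from_counts([12, 40, 7])
labels = hard_label_matrix(sigma)
uniform = [[1.0 / 3.0] * 3 for _ in range(3)]
print(sigma, ranking_cross_entropy(labels, uniform))
```

Only the object numbers per image are needed to build the target, so no object position annotations enter this loss.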

The regression network is then trained with a mean squared error loss, finally yielding the object number regression model.
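For reference, the mean squared error between predicted and true object numbers can be written as below. The function name and the example values are hypothetical; the source states only that a mean squared error loss is used.

```python
from typing import Sequence


def mean_squared_error(predicted: Sequence[float],
                       actual: Sequence[float]) -> float:
    """Average of the squared differences between predictions and targets."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical predicted and true object numbers for two images.
print(mean_squared_error([10.0, 20.0], [12.0, 17.0]))
```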

The above is only a preferred embodiment of the invention. It should be noted that a person of ordinary skill in the art can make further improvements and variations without departing from the technical principle of the invention, and such improvements and variations shall also be regarded as falling within the scope of protection of the invention.

Claims (4)

1. A weakly supervised object number estimation method based on a ranking network, characterized by comprising the following steps:
S1. extracting image features with a pre-trained deep neural network such as VGG-16, and then regressing a density map with convolution operations; extracting multi-scale features from the density map with adaptive pooling layers to capture global and local information in the image, and feeding them into a fully connected layer to regress the number of objects, wherein the adaptive pooling layers comprise global sub-cluster layers and local sub-cluster layers;
S2. training the multi-scale features with an image object-number ranking network so that the multi-scale features are sensitive to the number of objects, wherein the ranking network is a multi-branch network whose input is the multi-scale features of a plurality of images and whose output is the result of ranking the images by the number of objects in them;
S3. using a Sinkhorn layer in the ranking network to convert the ranking features into an ordinal matrix, constructing a soft-label transport matrix from the true number of objects in the images, and training the ranking network with a cross-entropy loss to obtain features sensitive to the number of objects; and then training a regression network to finally obtain an object number regression model.
2. The weakly supervised object number estimation method based on a ranking network according to claim 1, wherein step S1 specifically comprises: extracting image features with a deep network model pre-trained on an image analysis task, and regressing a pseudo probability density map; then constructing the global sub-cluster layers from pooling layers with larger strides and extracting global features from the density map; and constructing the local sub-cluster layers from pooling layers with smaller strides and extracting local features from the density map.
3. The weakly supervised object number estimation method based on a ranking network according to claim 1, wherein step S2 specifically comprises: fine-tuning the feature extraction model with a multi-branch ranking network to obtain global and local features that are sensitive to the number of objects in the image.
4. The weakly supervised object number estimation method based on a ranking network according to claim 1, wherein step S3 specifically comprises: converting the ranking features into an ordinal matrix with a differentiable Sinkhorn layer; constructing a more effective soft-label transport matrix and training the ranking network with a cross-entropy loss; and training the regression network with a mean squared error loss.
CN202010845336.3A | 2020-08-20 | 2020-08-20 | Weak supervision object number estimation method based on sorting network | Expired - Fee Related | CN112101122B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202010845336.3A, CN112101122B (en) | 2020-08-20 | 2020-08-20 | Weak supervision object number estimation method based on sorting network

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202010845336.3A, CN112101122B (en) | 2020-08-20 | 2020-08-20 | Weak supervision object number estimation method based on sorting network

Publications (2)

Publication Number | Publication Date
CN112101122A (en) | 2020-12-18
CN112101122B (en) | 2024-02-09

Family

ID=73753262

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202010845336.3A (Expired - Fee Related), CN112101122B (en) | Weak supervision object number estimation method based on sorting network | 2020-08-20 | 2020-08-20

Country Status (1)

Country | Link
CN (1) | CN112101122B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN107301387A (en) * | 2017-06-16 | 2017-10-27 | 华南理工大学 | A kind of image Dense crowd method of counting based on deep learning
US20180165554A1 (en) * | 2016-12-09 | 2018-06-14 | The Research Foundation For The State University Of New York | Semisupervised autoencoder for sentiment analysis
US20200226735A1 (en) * | 2017-03-16 | 2020-07-16 | Siemens Aktiengesellschaft | Visual localization in images using weakly supervised neural network
CN111428733A (en) * | 2020-03-12 | 2020-07-17 | 山东大学 | A zero-sample target detection method and system based on semantic feature space transformation

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
郑宝玉; 王雨; 吴锦雯; 周全: "Weakly supervised image semantic segmentation based on deep convolutional neural networks", 南京邮电大学学报(自然科学版), no. 05 *

Also Published As

Publication number | Publication date
CN112101122B (en) | 2024-02-09

Similar Documents

Publication | Title
CN111259850B (en) | A Pedestrian Re-ID Method Fused with Random Batch Masking and Multi-scale Representation Learning
CN110414432B (en) | Training method of object recognition model, object recognition method and corresponding device
CN109271960B (en) | People counting method based on convolutional neural network
Wang et al. | Dairy goat detection based on Faster R-CNN from surveillance video
CN111783576B (en) | Pedestrian re-identification method based on improved YOLOv3 network and feature fusion
CN106096561B (en) | Infrared pedestrian detection method based on image block deep learning features
CN106897670B (en) | Express violence sorting identification method based on computer vision
CN111783590A (en) | A Multi-Class Small Object Detection Method Based on Metric Learning
CN111709311A (en) | A pedestrian re-identification method based on multi-scale convolutional feature fusion
CN111340881B (en) | A Direct Visual Localization Method Based on Semantic Segmentation in Dynamic Scenes
Han et al. | Image crowd counting using convolutional neural network and Markov random field
CN105243154B (en) | Remote sensing image retrieval method based on notable point feature and sparse own coding and system
CN107767416B (en) | Method for identifying pedestrian orientation in low-resolution image
CN109886176B (en) | Lane line detection method in complex driving scene
CN107818307B (en) | A multi-label video event detection method based on LSTM network
CN111783589A (en) | A crowd counting method for complex scenes based on scene classification and multi-scale feature fusion
Xiong et al. | Contrastive learning for automotive mmWave radar detection points based instance segmentation
CN110555420A (en) | fusion model network and method based on pedestrian regional feature extraction and re-identification
CN109034258A (en) | Weakly supervised object detection method based on certain objects pixel gradient figure
Li et al. | An aerial image segmentation approach based on enhanced multi-scale convolutional neural network
CN115063831A (en) | A high-performance pedestrian retrieval and re-identification method and device
CN115147644A (en) | Image description model training and description method, system, device and storage medium
CN116543192A (en) | A small-sample classification method for remote sensing images based on multi-view feature fusion
CN106056609B (en) | Method based on DBNMI model realization remote sensing image automatic markings
Pillai et al. | Fine-tuned EfficientNetB4 transfer learning model for weather classification

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
CF01 | Termination of patent right due to non-payment of annual fee

Granted publication date: 20240209

