CN108647682A


Info

Publication number: CN108647682A
Application number: CN201810479069.5A
Authority: CN
Inventors: 屈鸿; 刘永胜; 张书洲; 季江舟; 贺强; 张亦洲; 郝雪洁
Original assignee: University of Electronic Science and Technology of China
Current assignee: University of Electronic Science and Technology of China
Priority date: 2018-05-17
Filing date: 2018-05-17
Publication date: 2018-10-12

Abstract

The invention discloses a brand logo detection and recognition method based on a region-based convolutional neural network (R-CNN) model. The method comprises the following steps: Step 1, expand the original dataset to obtain an expanded dataset large enough to train the R-CNN model; Step 2, train the R-CNN model on the expanded dataset; Step 3, perform logo detection and recognition on an input image using the R-CNN. By expanding the original dataset, training the deep learning network model, and performing logo detection and recognition with the R-CNN, the invention detects and recognizes multiple kinds of logos against complex backgrounds.

Description

Translated from Chinese

A Brand Logo Detection and Recognition Method Based on a Region-Based Convolutional Neural Network Model

Technical Field

The invention belongs to the technical field of object detection and recognition, and in particular relates to a brand logo detection and recognition method based on a region-based convolutional neural network (R-CNN) model.

Background

Logo detection and recognition technology applies a series of processing steps to an input image in order to locate the regions that contain logos and determine the specific class of the logo in each region. In everyday production and life, it has broad application prospects in urban intelligent transportation, document retrieval and classification, brand traceability, commercial advertisement analysis, and other fields. Although research on object detection is relatively mature, differences in background and deformation of the target across application scenarios make detection harder, so applying object detection and recognition in different scenarios remains difficult and leaves room for research. Most traditional research on logo detection and recognition is based on a single logo in a document: the backgrounds involved are relatively simple and there is only one kind of logo.

Traditional logo detection and recognition algorithms mostly combine a "pyramid" sliding-window mechanism with simple machine learning algorithms. The general idea is to traverse every region of the input image with a variable-size sliding window, extract invariant features from each region, and finally classify the extracted features with a classifier such as AdaBoost or SVM (Support Vector Machine). Building on the ideas of the R-CNN algorithm, the present work develops a logo detection and recognition algorithm based on a region-based convolutional neural network.

Summary of the Invention

The purpose of the invention is to solve the problem that traditional logo detection and recognition technology, being based on a single logo and a simple background, is difficult to apply to logo detection and recognition against complex backgrounds. The invention provides a brand logo detection and recognition method based on a region-based convolutional neural network model that detects and recognizes multiple kinds of logos against complex backgrounds.

The technical solution adopted by the invention is as follows:

A brand logo detection and recognition method based on a region-based convolutional neural network model, comprising the following steps:

Step 1. Expand the original dataset to obtain an expanded dataset large enough to train the region-based convolutional neural network model;

Step 2. Train the region-based convolutional neural network model on the expanded dataset;

Step 3. Perform logo detection and recognition on the input image using the region-based convolutional neural network.

Further, step 1 is specifically:

Step 11. Construct the original brand logo dataset by combining web crawling with manual annotation;

Step 12. Obtain transparent-background images of all logos contained in the original dataset;

Step 13. Obtain images that do not contain any logo from the original dataset, and normalize them to a specified pixel size;

Step 14. Apply affine transformations to each logo obtained in step 12, composite the transformed logos onto the images obtained in step 13, and merge the composited images into the original dataset to obtain the expanded dataset.
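The compositing part of step 14 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: images are represented as nested Python lists of pixel tuples, and the function name `composite_logo` and its parameters are hypothetical. The sketch alpha-blends a transparent-background logo onto a logo-free background and returns the bounding box that would serve as the annotation of the synthesized image.

```python
def composite_logo(background, logo, top, left):
    """Alpha-blend an RGBA logo onto an RGB background (hypothetical helper).

    background: list of rows, each a list of (r, g, b) tuples.
    logo: list of rows, each a list of (r, g, b, a) tuples with a in 0..255.
    Returns (image, bbox), where bbox = (x1, y1, x2, y2) is clipped to the
    background and x2, y2 are exclusive.
    """
    out = [list(row) for row in background]  # copy; the input stays untouched
    height, width = len(background), len(background[0])
    for dy, logo_row in enumerate(logo):
        for dx, (r, g, b, a) in enumerate(logo_row):
            y, x = top + dy, left + dx
            if not (0 <= y < height and 0 <= x < width):
                continue  # this logo pixel falls outside the background
            alpha = a / 255.0
            br, bg, bb = out[y][x]
            out[y][x] = (
                round(alpha * r + (1 - alpha) * br),
                round(alpha * g + (1 - alpha) * bg),
                round(alpha * b + (1 - alpha) * bb),
            )
    bbox = (
        max(left, 0),
        max(top, 0),
        min(left + len(logo[0]), width),
        min(top + len(logo), height),
    )
    return out, bbox
```

Because fully transparent pixels (alpha 0) leave the background unchanged, the area around the logo shape keeps whatever scene it was pasted onto, which is what makes the backgrounds of the synthesized logo regions varied.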

Further, step 2 is specifically:

Step 21. Use the selective search algorithm to obtain candidate regions for each image in the expanded dataset;

Step 22. Compute the IoU (intersection over union) between the coordinates of each candidate region obtained in step 21 and the ground-truth coordinates of the logo region, and classify the candidate regions: regions with IoU > 0.5 are positive samples and all other regions are negative samples;

Step 23. Train the region-based convolutional neural network model with the positive and negative samples from step 22. The output dimension of the Softmax classifier in the model is the number of logo classes plus 1.
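The sample labeling in step 22 can be sketched as follows. The box format `(x1, y1, x2, y2)`, the function names, and the convention that class id 0 stands for background are assumptions made for illustration; the source specifies only the IoU > 0.5 rule.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def label_proposals(proposals, ground_truths, threshold=0.5):
    """Assign a class id to every candidate region.

    ground_truths: list of (box, class_id) pairs with class_id >= 1.
    A proposal is a positive sample of the best-overlapping logo class when
    its IoU is strictly greater than the threshold; otherwise it is labeled
    0 (background, an assumed convention).
    """
    labels = []
    for proposal in proposals:
        best_iou, best_class = 0.0, 0
        for gt_box, class_id in ground_truths:
            overlap = iou(proposal, gt_box)
            if overlap > best_iou:
                best_iou, best_class = overlap, class_id
        labels.append(best_class if best_iou > threshold else 0)
    return labels
```

With this labeling, the background label accounts for the extra "plus 1" output of the Softmax classifier mentioned in step 23.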

Further, in step 23, two network models, CaffeNet and VGG16, are selected for the region-based convolutional neural network model.

Further, step 3 is specifically:

Step 31. Use the selective search algorithm to obtain all candidate regions of the input image, and use the input image together with the coordinates of the candidate regions as the input of the region-based convolutional neural network model;

Step 32. Based on the characteristics of target logo regions, perform a second screening of the candidate regions obtained in step 31, discarding regions whose length-to-width or width-to-length ratio is greater than 4;

Step 33. Feed the entire image into the region-based convolutional neural network to compute and extract the features of the entire image;

Step 34. Using the candidate regions from step 32 and the image features from step 33, use the RoI pooling layer to map each image candidate region to its candidate-region features;

Step 35. Classify the candidate-region features with the softmax classifier to obtain the logo class, and output the probability vector of each candidate region;

Step 36. Finally, use the position regressor to regress the position of the target logo region and extract the target logo region.
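The second screening in step 32 can be sketched as a simple filter. The box format `(x1, y1, x2, y2)` and the function name are assumptions; dropping zero-area boxes is an extra safeguard added here and is not stated in the source.

```python
def filter_by_aspect_ratio(boxes, max_ratio=4.0):
    """Keep candidate regions whose side ratio does not exceed max_ratio.

    A box is discarded when either width/height or height/width is greater
    than max_ratio. Degenerate boxes (non-positive width or height) are also
    dropped, which is an assumption of this sketch.
    """
    kept = []
    for x1, y1, x2, y2 in boxes:
        width, height = x2 - x1, y2 - y1
        if width <= 0 or height <= 0:
            continue
        if max(width / height, height / width) > max_ratio:
            continue
        kept.append((x1, y1, x2, y2))
    return kept
```

A region with a ratio of exactly 4 is kept, since the step discards only ratios greater than 4.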

In summary, by adopting the above technical solution, the invention has the following beneficial effects:

1. By expanding the original dataset, training the region-based convolutional neural network model, and performing logo detection and recognition with the region-based convolutional neural network, the invention detects and recognizes multiple kinds of logos against complex backgrounds;

2. The proposed scheme of expanding the dataset with image compositing is a very effective way to expand a logo detection and recognition dataset;

3. The logo detection and recognition method based on the region-based convolutional neural network significantly improves detection and recognition performance compared with traditional logo detection and recognition methods; and, compared with classification based on manually selected features, the detection and recognition algorithm based on the region-based convolutional neural network model transfers better to other settings;

4. Two network models, CaffeNet and VGG16, are selected for the region-based convolutional neural network model; the CaffeNet model achieves a recognition mAP of 69.6 and the VGG16 model a recognition mAP of 70.6, so recognition accuracy is high;

5. The RoI pooling layer maps image candidate regions to the corresponding regions of the feature values. Compared with running the convolution operations on every candidate region separately, the RoI pooling layer greatly reduces the amount of convolution computation;

6. Using the position regressor to regress the position of the target logo region improves the detection and recognition performance by 3 percentage points.
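To illustrate the saving described in item 5: the feature map is computed once for the whole image, and each candidate region is then reduced to a fixed-size grid by max pooling over the part of the feature map it covers. The sketch below is a simplified single-channel version; the function name, the integer bin boundaries, and the assumption that the region is already expressed in feature-map cells are all illustrative choices, not details taken from the source.

```python
def roi_max_pool(feature_map, roi, out_h=2, out_w=2):
    """Max-pool one region of a single-channel feature map to out_h x out_w.

    feature_map: 2-D list of numbers (one channel).
    roi: (x1, y1, x2, y2) in feature-map cells, x2 and y2 exclusive; the
    region must lie inside the map and span at least one cell per side.
    """
    x1, y1, x2, y2 = roi
    roi_h, roi_w = y2 - y1, x2 - x1
    pooled = []
    for i in range(out_h):
        # Row range of bin i; every bin covers at least one cell.
        y_start = y1 + (i * roi_h) // out_h
        y_end = max(y_start + 1, y1 + ((i + 1) * roi_h) // out_h)
        row = []
        for j in range(out_w):
            x_start = x1 + (j * roi_w) // out_w
            x_end = max(x_start + 1, x1 + ((j + 1) * roi_w) // out_w)
            row.append(max(
                feature_map[y][x]
                for y in range(y_start, y_end)
                for x in range(x_start, x_end)
            ))
        pooled.append(row)
    return pooled
```

Every candidate region reuses the same `feature_map`, so the convolutional layers run once per image rather than once per region.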

Description of the Drawings

Fig. 1 is a flowchart of the method of the invention;

Fig. 2 is a flowchart of step 3 of the invention;

Fig. 3 is a flowchart of step 3 of the invention;

Fig. 4 shows transparent-background logos in an embodiment of the invention;

Fig. 5 shows the image compositing result of step 14 in an embodiment of the invention;

Fig. 6 shows the recognition and detection results of an embodiment of the invention.

Detailed Description

To make the purpose, technical solution, and advantages of the invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the invention and do not limit it.

Step 1. Expand the original dataset to obtain an expanded dataset large enough to train the region-based convolutional neural network model, specifically:

Step 12. Obtain transparent-background images of all logos contained in the original dataset. Using transparent-background logos for image compositing ensures that the background of the logo region in the composited images is varied, and therefore closer to real conditions;

Step 13. Obtain images that do not contain any logo from the original dataset, and normalize them to a specified pixel size; here, length and width are no greater than 800 pixels;

Step 14. Apply affine transformations to each logo obtained in step 12; from several hundred to tens of thousands of affine transformations combining scaling, rotation, and translation may be applied. Then composite the transformed logos onto the images obtained in step 13 and merge the composited images into the original dataset to obtain the expanded dataset. The purpose of the affine transformations is to simulate, as closely as possible, the deformation of logos in naturally photographed images.
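The scaling, rotation, and translation in step 14 can be combined into a single 2x3 affine matrix. The sketch below is illustrative only: the function names and the sampling ranges in `random_affine` are hypothetical, since the source does not state which ranges were used. It also shows how the axis-aligned bounding box of the transformed logo could be derived for use as the annotation.

```python
import math
import random


def affine_matrix(scale, angle_deg, tx, ty):
    """2x3 matrix that scales, rotates about the origin, then translates."""
    cos_a = math.cos(math.radians(angle_deg))
    sin_a = math.sin(math.radians(angle_deg))
    return [
        [scale * cos_a, -scale * sin_a, tx],
        [scale * sin_a, scale * cos_a, ty],
    ]


def apply_affine(matrix, point):
    """Map an (x, y) point through the 2x3 affine matrix."""
    x, y = point
    return (
        matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
        matrix[1][0] * x + matrix[1][1] * y + matrix[1][2],
    )


def transformed_bbox(matrix, width, height):
    """Axis-aligned box (x1, y1, x2, y2) around a transformed w x h logo."""
    corners = [(0, 0), (width, 0), (width, height), (0, height)]
    points = [apply_affine(matrix, corner) for corner in corners]
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


def random_affine(rng, max_tx, max_ty):
    """Sample one transformation; the ranges are assumptions of this sketch."""
    return affine_matrix(
        scale=rng.uniform(0.5, 1.5),
        angle_deg=rng.uniform(-30.0, 30.0),
        tx=rng.uniform(0.0, max_tx),
        ty=rng.uniform(0.0, max_ty),
    )
```

Calling `random_affine(random.Random(0), 600, 600)` repeatedly would yield the many distinct placements of one logo that the step describes.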

Step 2. Train the region-based convolutional neural network model on the expanded dataset, specifically:

Step 23. Train the region-based convolutional neural network model with the positive and negative samples from step 22. The output dimension of the Softmax classifier in the model is the number of logo classes plus 1; that is, the background region is treated as one more logo class. CaffeNet and VGG16 can be selected here as the two network models to train.

Step 3. Perform logo detection and recognition using the region-based convolutional neural network; the flowcharts are shown in Fig. 2 and Fig. 3. Specifically:

Step 33. Feed the entire image into the convolutional neural network to compute and extract the features of the entire image;

Step 34. Using the candidate regions from step 32 and the image features from step 33, use the RoI pooling layer to map each image candidate region to its candidate-region features. Compared with running the convolution operations on every candidate region separately, the RoI pooling layer greatly reduces the amount of convolution computation;

Step 35. Classify the candidate-region features with the softmax classifier to obtain the logo class, and output the probability vector of each candidate region. The softmax classifier determines which logo class each candidate region belongs to; a region that is not a logo is assigned to background, that is, background is treated as a special logo class. The final output of softmax is a probability vector, and the dimension with the largest probability is selected as the result;

Step 36. Finally, use the position regressor to regress the position of the target logo region and extract the target logo region. That is, a coordinate translation or scaling transformation is applied to each candidate region so that the region containing the logo becomes more accurate. For example, a candidate region may contain only 3/4 of a logo; after regression, the region covers the whole logo as far as possible.
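The translation and scaling applied in step 36 can be sketched with the box parameterization commonly used by R-CNN-style detectors, in which the regressor predicts four offsets `(dx, dy, dw, dh)` per region. Whether the patent uses exactly this parameterization is an assumption; the source states only that each candidate region undergoes a coordinate translation or scaling.

```python
import math


def apply_box_deltas(box, deltas):
    """Refine a candidate box with predicted offsets (assumed R-CNN style).

    box: (x1, y1, x2, y2).
    deltas: (dx, dy, dw, dh), where dx and dy shift the center by a fraction
    of the box width and height, and dw and dh rescale the size by exp().
    """
    x1, y1, x2, y2 = box
    dx, dy, dw, dh = deltas
    width, height = x2 - x1, y2 - y1
    center_x, center_y = x1 + 0.5 * width, y1 + 0.5 * height

    new_cx = center_x + dx * width        # translation
    new_cy = center_y + dy * height
    new_w = width * math.exp(dw)          # scaling
    new_h = height * math.exp(dh)

    return (
        new_cx - 0.5 * new_w,
        new_cy - 0.5 * new_h,
        new_cx + 0.5 * new_w,
        new_cy + 0.5 * new_h,
    )
```

For the 3/4 case mentioned above: if the logo occupies `(0, 0, 40, 40)` and the candidate region is `(0, 0, 30, 40)`, the offsets `dx = 1/6` and `dw = log(4/3)` (with `dy = dh = 0`) move and widen the region so that it covers the whole logo.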

The flowchart of the method is shown in Fig. 1.

Embodiment 1

A brand logo detection and recognition method based on a region-based convolutional neural network model is applied to detect and recognize sports brand logos. The steps are as follows:

Step 11. Construct the original sports brand logo dataset by combining web crawling with manual annotation; the constructed dataset is FlickrSprotLogos-10;

Step 12. Obtain transparent-background images of all logos contained in the original dataset; the transparent-background logos are shown in Fig. 4;

Step 13. Obtain images that do not contain any logo from the original dataset (the SUN397 dataset is used), and normalize them so that length and width are no greater than 800 pixels;

Step 14. Apply ten thousand affine transformations combining scaling, rotation, and translation to each logo obtained in step 12, then composite the transformed logos onto the images obtained in step 13 (the compositing result is shown in Fig. 5), and merge the composited images into the original dataset to obtain the expanded dataset.
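The size normalization in step 13 can be sketched as a computation of target dimensions. The source requires only that length and width end up no greater than 800 pixels; preserving the aspect ratio and leaving smaller images unchanged are assumptions of this sketch, and the function name is hypothetical.

```python
def fit_within(width, height, max_side=800):
    """Return (new_width, new_height) with neither side above max_side.

    Images already within the limit are left unchanged; larger images are
    scaled down uniformly so the longer side equals max_side (assumed policy).
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))
```

The returned size would then be passed to whichever image library performs the actual resampling.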

Step 23. Train the region-based convolutional neural network model with the positive and negative samples from step 22. The output dimension of the Softmax classifier in the model is the number of logo classes plus 1; that is, the background region is treated as one more logo class. CaffeNet and VGG16 are selected here as the two network models to train.

Step 3. Perform logo detection and recognition using the region-based convolutional neural network, specifically:

Step 35. Classify the candidate-region features with the softmax classifier to obtain the logo class, and output the probability vector of each candidate region. The softmax classifier determines which logo class each candidate region belongs to; a region that is not a logo is assigned to background, that is, background is treated as a special logo class. The final output of softmax is a probability vector, and the dimension with the largest probability is selected as the result;
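The classification in step 35 can be sketched as a softmax over the classifier's raw scores followed by an argmax. The class names below are hypothetical placeholders, and placing `background` at index 0 is an assumed convention; the source says only that background is treated as one additional class.

```python
import math


def softmax(scores):
    """Convert raw scores into a probability vector (numerically stable)."""
    highest = max(scores)
    exps = [math.exp(score - highest) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def classify_region(scores, class_names):
    """Return (class_name, probability) for the most probable class.

    class_names has one entry per logo class plus one for background, so its
    length equals the output dimension of the softmax classifier.
    """
    probabilities = softmax(scores)
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return class_names[best], probabilities[best]
```

For example, with `class_names = ["background", "brand_a", "brand_b"]`, a region whose highest score belongs to index 0 is reported as background and can be discarded from the detections.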

The recognition and detection results of Embodiment 1 are shown in Fig. 6.

The above are only preferred embodiments of the invention and do not limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the invention shall fall within its scope of protection.

Claims


1. A brand logo detection and recognition method based on a region-based convolutional neural network, characterized in that the method steps are as follows:

Step 3. Perform logo detection and recognition on the input image using the region-based convolutional neural network model.

2. The brand logo detection and recognition method based on a region-based convolutional neural network model according to claim 1, characterized in that step 1 is specifically:

3. The brand logo detection and recognition method based on a region-based convolutional neural network model according to claim 1, characterized in that step 2 is specifically:

4. The brand logo detection and recognition method based on a region-based convolutional neural network model according to claim 3, characterized in that, in step 23, two network models, CaffeNet and VGG16, are selected for the region-based convolutional neural network model.

5. The brand logo detection and recognition method based on a region-based convolutional neural network model according to claim 1, characterized in that step 3 is specifically: