CN114973161A - Method and system for enhancing online data of vehicle real-time detection deep neural network input end - Google Patents

Method and system for enhancing online data of vehicle real-time detection deep neural network input end

Info

Publication number
CN114973161A
CN114973161A
Authority
CN
China
Prior art keywords
image
samples
random
sample
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210623973.5A
Other languages
Chinese (zh)
Other versions
CN114973161B (en)
Inventor
侯福金
刘轶鹏
王术剑
田源
李涛
吴建清
刘群
李利平
栗剑
马川义
李利娜
薛宇翾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong High Speed Construction Management Group Co ltd
Shandong University
Original Assignee
Shandong High Speed Construction Management Group Co ltd
Shandong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong High Speed Construction Management Group Co ltd, Shandong University
Priority to CN202210623973.5A
Publication of CN114973161A
Application granted
Publication of CN114973161B
Active
Anticipated expiration

Abstract

The invention relates to an online data augmentation method for the input end of a deep neural network for real-time vehicle detection. The method comprises: step S1, acquiring a highway vehicle data set and image samples; step S2, performing random and non-repeating operations of keeping the original unchanged, HSV (hue, saturation, value) transformation, translation, shear/non-orthogonal projection, and perspective transformation on the image samples to obtain new image samples. Specifically, a random and non-repeating keep-unchanged operation is applied to 1/5 of the image samples, a random and non-repeating HSV transformation to 1/4 of the remaining image samples, a random and non-repeating translation to 1/3 of the remaining image samples, a random and non-repeating shear/non-orthogonal projection to 1/2 of the remaining image samples, and a random and non-repeating perspective transformation to the remaining image samples, each result serving as new image samples. The method and device perform data augmentation on the images in the training data set and, without changing the network structure, strengthen the ability of the target recognition algorithm to recognize overlapping vehicle targets on expressways.

Description

Translated from Chinese
A method and system for online data augmentation at the input end of a deep neural network for real-time vehicle detection

Technical Field

The invention relates to the technical field of computer vision, and in particular to a method and system for online data augmentation at the input end of a deep neural network for real-time vehicle detection.

Background Art

With the upgrading of highway infrastructure, regions have become more closely connected, which has driven social and economic development. At the same time, with the development of a new generation of information technology, pilot construction of informatized, data-driven smart highways has begun: surveillance now covers entire routes, provincial boundary toll stations have been removed, ETC free-flow tolling gantries have been deployed, the intelligent and digital upgrading of basic roadside facilities continues to advance, and road networks are managed comprehensively with big data. Smart highways improve road safety and traffic efficiency on the one hand and, on the other hand, build efficient information and communication systems that connect cloud platforms and big-data centers, enabling vehicle-road cooperation and coordinated, intelligent management of the whole network. Although the expressway network is becoming increasingly large-scale, structured, and intelligent, and integrated vehicle-road management technology is maturing, the challenge of vehicle detection must be solved in the early stage of applying these new technologies. Vehicle target detection in highway scenes is a key technology in intelligent traffic management and safety monitoring, is the foundation of intelligent and diversified traffic management, and has significant research value. The vehicle detection problem is to detect vehicle targets within the monitored area, classify the vehicle type accurately, and locate the vehicle position precisely. Target detection, including the recognition of occluded targets, has long been a research hotspot in computer vision, and vehicle detection remains one of the most fundamental and difficult tasks in this field, because vehicles differ widely in appearance and state during detection and are further affected by objective interference factors.

Traditional vehicle detection algorithms extract features in a relatively simple way: features are extracted manually, moving vehicles are first extracted from the video sequence, and the extracted features are then fed to a classifier to recognize vehicles. However, manually selected features rely too heavily on prior knowledge, while real scenes contain many objective interference factors such as illumination changes and deformation, so traditional vehicle detection algorithms are difficult to apply in practice and can hardly reach the accuracy and robustness required by real applications. Many target detection algorithms have since been developed on the basis of the Convolutional Neural Network (CNN); by network structure they fall mainly into two categories, the One-Stage structure and the Two-Stage structure. In recent years, a semi-supervised mode combining unsupervised and supervised learning has gradually developed. In this mode, the identification system completes initial training with a small number of labeled samples, further trains itself through self-learning while identifying unknown samples, and finally classifies a large number of unlabeled samples. The semi-supervised mode saves the cost of collecting labeled samples, but the small number of training samples leads to serious overfitting, insufficient generalization ability, and poor overall identification results. To address this, semi-supervised modes based on data augmentation have been proposed: data augmentation algorithms or methods generate a large number of pseudo-labeled samples from the known samples, expanding the training set and enhancing the generalization ability of the identification system. Data augmentation algorithms typically downsample the original data and then construct new data by interpolation, so they generally require complex algorithmic support to ensure that the constructed data retains the characteristics of the original data without being identical to it. Therefore, for highway vehicle detection and recognition, and for the problems of small targets and vehicle occlusion, access to high-quality data remains an obstacle, and a new real-time vehicle detection method is urgently needed.

Summary of the Invention

In view of the deficiencies of the prior art, the present invention provides an online data augmentation method and system for the input end of a deep neural network for real-time vehicle detection, so as to solve the problems raised in the background above.

Terminology:

1. batch_size: the number of samples passed to the program in a single training pass.

2. batch: the number of batches, where number of batches = number of samples / batch_size.

3. HSV transformation: HSV (Hue, Saturation, Value) is a color space created by A. R. Smith in 1978 based on the intuitive attributes of color, also known as the hexcone model.

4. Translation: moving every point of a figure by the same distance along a given straight-line direction within the same plane; such motion is called translation of the figure, or translation for short.

5. Shear: a planar transformation in which, along a given direction, each point of a figure is displaced in proportion to its signed distance from a line parallel to that direction.

6. Non-orthogonal projection: a geometric projection method in which the projection direction is not perpendicular to the projection plane.

7. Perspective transformation: a transformation that uses the collinearity of the perspective center, the image point, and the target point and, according to the law of perspective rotation, rotates the image-bearing plane (perspective plane) around the trace line (perspective axis) by a certain angle, altering the original bundle of projection rays while keeping the projected geometry on the image-bearing plane unchanged.

The technical solution of the present invention is as follows:

An online data augmentation method for the input end of a deep neural network for real-time vehicle detection, comprising:

Step S1: acquiring a highway vehicle data set and image samples;

Step S2: performing random and non-repeating operations of keeping the original unchanged, HSV transformation, translation, shear/non-orthogonal projection, and perspective transformation on the image samples to obtain new image samples;

specifically: performing a random and non-repeating keep-unchanged operation on 1/5 of the image samples, which serve as new image samples;

performing a random and non-repeating HSV transformation on 1/4 of the remaining image samples, which serve as new image samples;

performing a random and non-repeating translation on 1/3 of the remaining image samples, which serve as new image samples;

performing a random and non-repeating shear/non-orthogonal projection on 1/2 of the remaining image samples, which serve as new image samples;

performing a random and non-repeating perspective transformation on the remaining image samples, which serve as new image samples;

Step S3: randomly selecting several pictures from the new image samples, segmenting them by random scaling, stitching them according to a random layout, and correcting the result to form new samples;

Step S4: applying uniformly distributed vertical flipping, horizontal flipping, or keep-as-is operations to the new samples, randomly adding Gaussian noise, and finally obtaining the augmented images.

Further, acquiring the highway vehicle data set and image samples in step S1 includes collecting image samples from highway surveillance video at different angles, on different road sections, and under different lighting conditions.

Further, performing the random and non-repeating HSV transformation on the image samples includes normalizing the pixel values of the image sample, obtaining the maximum value, the minimum value, and their difference, and calculating the H, S, and V values in the color space respectively.

Further, performing the random and non-repeating translation on the image samples specifically includes: letting the coordinates of a selected single image sample be (x, y), then

$$M_T=\begin{bmatrix}M_{11}&M_{12}&M_{13}\\M_{21}&M_{22}&M_{23}\\M_{31}&M_{32}&M_{33}\end{bmatrix}$$

$$dst(x,y)=src\left(M_{11}x+M_{12}y+M_{13},\;M_{21}x+M_{22}y+M_{23}\right)$$

In the formula, M_T is the translation matrix and M11, M12, ..., M33 are the translation-matrix parameters, where M11, M22, and M33 are fixed to 1; src(x, y) denotes the selected single image sample and dst(x, y) denotes the translated image.

Further, performing the random and non-repeating shear/non-orthogonal projection on the image samples specifically includes: letting the coordinates of a selected single image sample be (x, y), then

$$M_S=\begin{bmatrix}M_{11}&M_{12}&M_{13}\\M_{21}&M_{22}&M_{23}\\M_{31}&M_{32}&M_{33}\end{bmatrix}$$

$$dst(x,y)=src\left(M_{11}x+M_{12}y+M_{13},\;M_{21}x+M_{22}y+M_{23}\right)$$

In the formula, M_S is the shear/non-orthogonal projection matrix and M11, M12, ..., M33 are the shear/non-orthogonal projection matrix parameters, where M11, M22, and M33 are fixed to 1; src(x, y) denotes the selected single image sample and dst(x, y) denotes the image after the shear/non-orthogonal projection transformation.

Further, performing the random and non-repeating perspective transformation on the image samples specifically includes: letting the coordinates of a selected single image sample be (x, y), then

$$M_P=\begin{bmatrix}M_{11}&M_{12}&M_{13}\\M_{21}&M_{22}&M_{23}\\M_{31}&M_{32}&M_{33}\end{bmatrix}$$

$$dst(x,y)=src\left(\frac{M_{11}x+M_{12}y+M_{13}}{M_{31}x+M_{32}y+M_{33}},\;\frac{M_{21}x+M_{22}y+M_{23}}{M_{31}x+M_{32}y+M_{33}}\right)$$

In the formula, M_P is the perspective transformation matrix; src(x, y) denotes the selected single image sample and dst(x, y) denotes the image after the perspective transformation.

Further, the image correction includes removing any detection box of the stitched image that lies entirely outside the region of its corresponding picture and, for a detection box that lies partly inside and partly outside that region, replacing the out-of-bounds box edges with the region boundary line of that picture.

An online data augmentation system for the input end of a deep neural network for real-time vehicle detection, comprising:

a data acquisition module, configured to acquire a highway vehicle data set and image samples;

an image processing module, configured to obtain new image samples by performing random and non-repeating keep-unchanged, HSV transformation, translation, shear/non-orthogonal projection, and perspective transformation operations on the original samples; to randomly select several pictures from the transformed images, segment them by random scaling, stitch them according to a random layout, and correct the result to form new samples; and to apply uniformly distributed vertical flipping, horizontal flipping, or keep-as-is operations to the new samples, randomly add Gaussian noise, and finally obtain the augmented images.

A computer-readable storage medium in which a plurality of instructions are stored, the instructions being adapted to be loaded by a processor of a terminal device to execute the above online data augmentation method for the input end of a deep neural network for real-time vehicle detection.

A terminal device comprising a processor and a computer-readable storage medium, the processor being configured to implement the instructions, and the computer-readable storage medium being configured to store a plurality of instructions adapted to be loaded by the processor to execute the above online data augmentation method for the input end of a deep neural network for real-time vehicle detection.

The beneficial effects of the present invention are:

1. The present invention performs data augmentation on the images in the training data set and, without changing the network structure, strengthens the ability of the target recognition algorithm to recognize overlapping vehicle targets on expressways.

2. For small camera platforms operating offline, a suitable lightweight algorithm is selected that improves recognition accuracy while maintaining recognition speed and reduces the impact of input images of different resolutions on recognition performance.

3. Real-time detection tests on the vehicle data set in different expressway scenes show that the algorithm trained with the improved data augmentation method achieves higher accuracy in detecting overlapping vehicle targets, which demonstrates the effectiveness and robustness of the improved method.

Brief Description of the Drawings

The accompanying drawings, which form a part of this application, are provided for a further understanding of the application; the illustrative embodiments of the application and their description are used to explain the application and do not constitute an undue limitation on it.

Fig. 1 is a schematic flow chart of the online data augmentation method of the present invention;

Fig. 2 is a schematic diagram of mosaic data augmentation according to the present invention;

Fig. 3 is a schematic diagram of the enhanced mosaic of the present invention;

Fig. 4 is a statistical diagram of the number of annotated categories in the samples;

Fig. 5 shows the PR curve of the deep learning network with the data augmentation of the present invention;

Fig. 6 is a schematic diagram of the recognition accuracy of the deep learning network before the data augmentation of the present invention;

Fig. 7 is a schematic diagram of the recognition accuracy of the deep learning network after the data augmentation of the present invention.

Detailed Description of the Embodiments

The present invention is further described below with reference to the accompanying drawings and embodiments, but is not limited thereto.

Embodiment 1

As shown in Fig. 1, an online data augmentation method for the input end of a deep neural network for real-time vehicle detection specifically includes the following steps (a brief Python sketch follows the list):

S1. Define the highway vehicle data set, set the number of samples, and initialize batch_size;

S2. Obtain the number of batches;

S3. Arrange the samples within each batch according to the order in which the file paths are read;

S4. Shuffle the order of samples within the batch;

S5. Randomly apply HSV transformation, translation, shear/non-orthogonal projection, and perspective transformation to the data in the batch from step S4;

S6. Randomly take 4 pictures from step S5, scale them randomly, and stitch them according to a random layout;

S7. Apply uniformly distributed vertical flipping, horizontal flipping, or keep-as-is operations to the new samples, and randomly add noise;

S8. Check whether all samples in the batch have been read; if so, finish, otherwise return to step S6.
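To make the S1-S8 flow concrete, the following is a minimal Python sketch of one epoch over the data set. It is an illustration only, not code from the patent; the function names (run_online_augmentation, augment_batch, dataset_paths) and the use of math.ceil for the batch count are assumptions introduced here.

```python
import math
import random

def run_online_augmentation(dataset_paths, batch_size, augment_batch):
    """One epoch of the S1-S8 flow: split the data set into batches, shuffle
    within each batch, and hand every batch to the augmentation routine
    (steps S5-S7), which returns the augmented samples."""
    data_size = len(dataset_paths)                    # S1: number of samples
    num_batches = math.ceil(data_size / batch_size)   # S2: batches = samples / batch_size
    augmented = []
    for b in range(num_batches):
        batch = dataset_paths[b * batch_size:(b + 1) * batch_size]  # S3: file-path order
        random.shuffle(batch)                         # S4: shuffle within the batch
        augmented.extend(augment_batch(batch))        # S5-S7: per-batch augmentation
    return augmented                                  # S8: stop once every batch is read
```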

In step S1, the specific method of defining the highway vehicle data set, setting the number of samples, and initializing batch_size is:

Analyzing the characteristics of highway surveillance, such as camera angle and resolution, for the highway scene, multi-angle surveillance videos were collected from a point on the Dongying-Jinan section of the Dongying Expressway in Shandong and from the Yiyuan Tunnel section of the Zibo Expressway in Shandong. To ensure the diversity of vehicle samples in the images, the original video data were processed by extracting one frame every 40 frames as an image sample. As shown in Fig. 2, a total of 1493 pictures containing vehicle information were selected and, together with 1060 selected multi-type vehicle pictures, combined with high-definition surveillance video from different angles, different road sections, and different lighting conditions to build the detection-network training data set, forming the original vehicle detection image sample library. The highway vehicle data set is denoted by dataset, data_size denotes the number of samples, and batch_size denotes the number of samples passed to the program in a single training pass.

The method for obtaining the number of batches in step S2 is: from the number of samples data_size in step S1 and batch_size, the number of samples passed to the program in a single training pass,

batch = data_size / batch_size

The specific method of step S3 is: specify the path of the highway vehicle data set dataset to be read through Python's built-in data reader function reader().

The specific method of step S4 is: import the random module and then call it through the random static object to shuffle the order of samples within the batch. Only the first-level elements of the list are randomly reordered; the elements of the list are themselves lists, and the elements inside them are not reordered.
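A minimal sketch of the shuffle described for step S4, assuming the batch is held as a nested Python list with one inner list per sample record; only the first-level order is shuffled and the inner lists are left untouched. The file names are hypothetical and the snippet is an illustration, not the patent's code.

```python
import random

batch = [["img_001.jpg", "img_001.txt"],
         ["img_002.jpg", "img_002.txt"],
         ["img_003.jpg", "img_003.txt"]]   # hypothetical [image, label] records

random.shuffle(batch)   # reorders only the first-level elements of the list
print(batch)            # each inner [image, label] pair stays intact
```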

The specific method of step S5 is to randomly apply HSV transformation, translation, shear/non-orthogonal projection, and perspective transformation to the sample data in the batch from step S4.

Specifically, this includes the following (a partition sketch is given after this list):

performing a random and non-repeating keep-unchanged operation on 1/5 of the image samples, which serve as new image samples;

performing a random and non-repeating HSV transformation on 1/4 of the remaining image samples, which serve as new image samples;

performing a random and non-repeating translation on 1/3 of the remaining image samples, which serve as new image samples;

performing a random and non-repeating shear/non-orthogonal projection on 1/2 of the remaining image samples, which serve as new image samples;

performing a random and non-repeating perspective transformation on the remaining image samples, which serve as new image samples;
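The fractions above split a shuffled batch into five roughly equal groups: 1/5 of the whole batch is kept unchanged, 1/4 of the remaining 4/5 receives the HSV transformation, and so on, so that each operation ends up covering about 1/5 of the batch. Below is a minimal sketch of that partition, under the assumption that samples are simply sliced in order after shuffling; it is an illustration, not code from the patent.

```python
import random

def partition_batch(samples):
    """Split a batch into five groups, one per operation: keep-unchanged,
    HSV, translation, shear/non-orthogonal projection, perspective."""
    samples = samples[:]
    random.shuffle(samples)
    groups, remaining = [], samples
    for frac in (5, 4, 3, 2):            # take 1/5, then 1/4 of the rest, ...
        k = len(remaining) // frac
        groups.append(remaining[:k])
        remaining = remaining[k:]
    groups.append(remaining)             # the rest receives the perspective transform
    return groups

keep, hsv, shift, shear, persp = partition_batch(list(range(20)))
print(len(keep), len(hsv), len(shift), len(shear), len(persp))   # 4 4 4 4 4
```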

As a further embodiment, the specific steps are:

Step S5-1: convert a single image sample from RGB to HSV. Let the RGB representation of the sample image be (X_R, X_G, X_B), with 3 channels. When describing a pixel in an image sample, a grayscale pixel needs only one value, i.e. the image is single-channel; if a pixel is described by the three RGB colors, the image is three-channel. First normalize the RGB pixel values:

X'_R = X_R / 255    (1)

X'_G = X_G / 255    (2)

X'_B = X_B / 255    (3)

Equations (1), (2), and (3) give the normalized pixel values.

Then compare the three values, record the maximum as Cmax and the minimum as Cmin, and compute their difference Δ:

Cmax = max(X'_R, X'_G, X'_B)    (4)

Cmin = min(X'_R, X'_G, X'_B)    (5)

Δ = Cmax − Cmin    (6)

Finally, the H, S, and V values in the color space are calculated. H is:

$$H=\begin{cases}0^{\circ}, & \Delta=0\\ 60^{\circ}\times\left(\dfrac{X'_G-X'_B}{\Delta}\bmod 6\right), & C_{max}=X'_R\\ 60^{\circ}\times\left(\dfrac{X'_B-X'_R}{\Delta}+2\right), & C_{max}=X'_G\\ 60^{\circ}\times\left(\dfrac{X'_R-X'_G}{\Delta}+4\right), & C_{max}=X'_B\end{cases}\qquad(7)$$

S is:

$$S=\begin{cases}0, & C_{max}=0\\ \dfrac{\Delta}{C_{max}}, & C_{max}\neq 0\end{cases}\qquad(8)$$

V is:

V = Cmax    (9)

Step S5-2: let the coordinates of the selected single image sample be (x, y), then

$$M_T=\begin{bmatrix}M_{11}&M_{12}&M_{13}\\M_{21}&M_{22}&M_{23}\\M_{31}&M_{32}&M_{33}\end{bmatrix}$$

$$dst(x,y)=src\left(M_{11}x+M_{12}y+M_{13},\;M_{21}x+M_{22}y+M_{23}\right)$$

In the formula, M_T is the translation matrix and M11, M12, ..., M33 are the translation-matrix parameters, where M11, M22, and M33 are fixed to 1; src(x, y) denotes the selected single image sample and dst(x, y) denotes the translated image.
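A minimal OpenCV sketch of the translation in step S5-2. The random offset range and the use of cv2.warpAffine are assumptions of this illustration; the patent only fixes the form of M_T.

```python
import cv2
import numpy as np

def random_translate(img, max_shift=50):
    """Shift the image content by a random (tx, ty); the 2x3 matrix is the top
    two rows of M_T with M11 = M22 = 1 and the translation in the last column."""
    h, w = img.shape[:2]
    tx = np.random.randint(-max_shift, max_shift + 1)
    ty = np.random.randint(-max_shift, max_shift + 1)
    M = np.float32([[1, 0, tx],
                    [0, 1, ty]])
    return cv2.warpAffine(img, M, (w, h))   # uncovered borders are filled with black
```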

Step S5-3: let the coordinates of the selected single image sample be (x, y), then

$$M_S=\begin{bmatrix}M_{11}&M_{12}&M_{13}\\M_{21}&M_{22}&M_{23}\\M_{31}&M_{32}&M_{33}\end{bmatrix}$$

$$dst(x,y)=src\left(M_{11}x+M_{12}y+M_{13},\;M_{21}x+M_{22}y+M_{23}\right)$$

In the formula, M_S is the shear/non-orthogonal projection matrix and M11, M12, ..., M33 are the shear/non-orthogonal projection matrix parameters, where M11, M22, and M33 are fixed to 1; src(x, y) denotes the selected single image sample and dst(x, y) denotes the image after the shear/non-orthogonal projection transformation.
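A corresponding sketch of the shear/non-orthogonal projection in step S5-3; the shear factor range is an assumption of this illustration.

```python
import cv2
import numpy as np

def random_shear(img, max_shear=0.2):
    """Apply a random shear: the off-diagonal entries (M12, M21) slant the rows
    and columns while the diagonal stays M11 = M22 = 1."""
    h, w = img.shape[:2]
    sx = np.random.uniform(-max_shear, max_shear)   # horizontal shear factor
    sy = np.random.uniform(-max_shear, max_shear)   # vertical shear factor
    M = np.float32([[1, sx, 0],
                    [sy, 1, 0]])
    return cv2.warpAffine(img, M, (w, h))
```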

Step S5-4: let the coordinates of the selected single image sample be (x, y), then

$$M_P=\begin{bmatrix}M_{11}&M_{12}&M_{13}\\M_{21}&M_{22}&M_{23}\\M_{31}&M_{32}&M_{33}\end{bmatrix}$$

$$dst(x,y)=src\left(\frac{M_{11}x+M_{12}y+M_{13}}{M_{31}x+M_{32}y+M_{33}},\;\frac{M_{21}x+M_{22}y+M_{23}}{M_{31}x+M_{32}y+M_{33}}\right)$$

In the formula, M_P is the perspective transformation matrix and M11, M12, ..., M33 are the perspective-transformation-matrix parameters, where M11, M22, and M33 are fixed to 1; src(x, y) denotes the selected single image sample and dst(x, y) denotes the image after the perspective transformation.
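A sketch of the perspective transformation in step S5-4, building the 3x3 matrix M_P with cv2.getPerspectiveTransform from four randomly jittered corner points; the jitter range is an assumption of this illustration.

```python
import cv2
import numpy as np

def random_perspective(img, max_jitter=0.1):
    """Warp the image with a 3x3 perspective matrix M_P obtained by moving the
    four corners by up to max_jitter of the image size."""
    h, w = img.shape[:2]
    src_pts = np.float32([[0, 0], [w - 1, 0], [w - 1, h - 1], [0, h - 1]])
    jitter = np.random.uniform(-max_jitter, max_jitter, size=(4, 2))
    dst_pts = (src_pts + jitter * np.float32([w, h])).astype(np.float32)
    M_P = cv2.getPerspectiveTransform(src_pts, dst_pts)   # 3x3 matrix with M33 = 1
    return cv2.warpPerspective(img, M_P, (w, h))
```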

The specific method of step S6 is as follows. First, the stitching reference point coordinates (x_c, y_c) are chosen at random, and four pictures from step S5 are selected at random. Then, according to the reference point, the four selected pictures are resized and scaled and placed at the upper-left, upper-right, lower-left, and lower-right positions of a large canvas of the specified size. Next, according to the size transformation applied to each picture, the mapping is applied to the corresponding picture labels. Finally, the large canvas is stitched according to the specified reference-point coordinates, and detection-box coordinates that cross the boundaries are processed. The specific steps are as follows (a condensed sketch follows these steps):

Step S6-1, image segmentation: after randomly selecting four pictures from step S5, let the size of an input picture be (i_w, i_h), the specified picture size be (w, h) with w = h = 416 (pixels), and the size of the scaled picture be (n_w, n_h):

(1) Convert the picture size from (i_w, i_h) to (w, h) by calling the cv2.resize() function of the opencv library in Python; then multiply the picture size by the scaling ratio Rscale, where Rscale is a random number between 0.6 and 0.8; this yields the compressed image size (n_w, n_h);

(2) Generate a three-channel canvas of size (w, h); place the first compressed picture at the upper left of the canvas, the second at the upper right, the third at the lower left, and the fourth at the lower right;

(3) At this point the center coordinates of the detection boxes are (x_c, y_c), and the box coordinates are adjusted according to the scaling of each picture together with its placement offsets d_y and d_x, where d_y denotes the distance along the y axis from the canvas border to the border of the scaled picture, and d_x denotes the distance along the x axis from the canvas border to the border of the scaled picture.

Step S6-2, image merging:

First set the stitching lines: x_cut divides the image into two regions along the x axis, and y_cut divides the image into two regions along the y axis. Then create a new three-channel canvas Newimage of size (416, 416, 3). Finally, combine the four cut pictures together.

Step S6-3, processing the detection-box boundaries:

(1) Remove all detection boxes that are not within the region of their corresponding picture;

(2) For a detection box that is partly inside and partly outside the picture region, replace the out-of-bounds box edges with the region boundary lines (x_cut, y_cut) of that picture;

(3) If the height or width of a corrected detection box is too small, discard the corrected box.
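A condensed sketch of the four-picture mosaic of step S6: each picture is fitted into one quadrant of a 416x416 canvas and its boxes are remapped and clipped to that quadrant. The fixed central stitching lines, the folding of the Rscale compression into the quadrant fit, and the pixel box format [x1, y1, x2, y2] are simplifications and assumptions of this illustration, not the patent's implementation.

```python
import cv2
import numpy as np

CANVAS = 416   # w = h = 416 pixels, as in step S6-1

def mosaic4(images, boxes_per_image):
    """Stitch 4 images into one CANVASxCANVAS picture and remap/clip their
    detection boxes. boxes_per_image[i] holds pixel boxes [x1, y1, x2, y2]."""
    xc = yc = CANVAS // 2          # stitching lines x_cut, y_cut (fixed here for brevity)
    canvas = np.zeros((CANVAS, CANVAS, 3), dtype=np.uint8)
    # quadrants: upper-left, upper-right, lower-left, lower-right
    regions = [(0, 0, xc, yc), (xc, 0, CANVAS, yc),
               (0, yc, xc, CANVAS), (xc, yc, CANVAS, CANVAS)]
    new_boxes = []
    for img, boxes, (x1, y1, x2, y2) in zip(images, boxes_per_image, regions):
        rw, rh = x2 - x1, y2 - y1
        ih, iw = img.shape[:2]
        canvas[y1:y2, x1:x2] = cv2.resize(img, (rw, rh))   # fit the picture into its quadrant
        sx, sy = rw / iw, rh / ih                           # coordinate adjustment ratios
        for bx1, by1, bx2, by2 in boxes:
            # map the box into canvas coordinates, then clip it to this picture's region
            nx1 = max(x1, x1 + bx1 * sx); ny1 = max(y1, y1 + by1 * sy)
            nx2 = min(x2, x1 + bx2 * sx); ny2 = min(y2, y1 + by2 * sy)
            if nx2 - nx1 > 2 and ny2 - ny1 > 2:             # discard boxes that became too small
                new_boxes.append([nx1, ny1, nx2, ny2])
    return canvas, new_boxes
```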

The specific method of step S7 is: apply uniformly distributed vertical flipping, horizontal flipping, or keep-as-is operations to the new samples and randomly add noise, as shown in Fig. 4. The specific steps are as follows (a brief sketch follows):

Step S7-1,

flip the new image sample vertically and flip the box y coordinates:

im_up = np.flipud(im), labels[:, 2] = 1 - labels[:, 2].

Step S7-2,

flip the new image sample horizontally and flip the box x coordinates:

im_right = np.fliplr(im), labels[:, 1] = 1 - labels[:, 1].

Step S7-3, randomly add Gaussian noise to the new image sample; the probability density function of the Gaussian random variable z is given by:

$$p(z)=\frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-\frac{(z-\mu)^{2}}{2\sigma^{2}}}$$
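A sketch of step S7: a uniform choice among vertical flip, horizontal flip, and keep-as-is, with the matching label flips shown above, followed by additive Gaussian noise. The label layout [class, x_center, y_center, w, h] with normalized coordinates and the noise standard deviation are assumptions of this illustration.

```python
import numpy as np

def flip_and_noise(im, labels, sigma=10.0):
    """im: HxWx3 uint8 image; labels: float array of [cls, x, y, w, h] rows."""
    choice = np.random.randint(3)          # uniform over {flip up-down, flip left-right, keep}
    if choice == 0:
        im = np.flipud(im)
        labels[:, 2] = 1 - labels[:, 2]    # mirror the y centers (step S7-1)
    elif choice == 1:
        im = np.fliplr(im)
        labels[:, 1] = 1 - labels[:, 1]    # mirror the x centers (step S7-2)
    noise = np.random.normal(0.0, sigma, im.shape)            # Gaussian noise, step S7-3
    im = np.clip(im.astype(np.float32) + noise, 0, 255).astype(np.uint8)
    return im, labels
```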

The specific method of step S8 is: check through the DataLoader function whether all samples in the batch have been read; if so, finish, otherwise return to step S6.

In a specific implementation, as shown in Fig. 5, with the same hyperparameters and the same number of training epochs, the experimental results before and after optimizing the data augmentation method were compared. The Mosaic method in the deep learning network, which reuses data while increasing sample richness, already achieves very good performance, with large improvements in both mAP50 and mAP50:95. Compared with the original Mosaic augmentation, the enhanced Mosaic augmentation method of the present invention brings a more pronounced improvement under the stricter mAP50:95 metric; it performs slightly better overall and can almost eliminate the bottleneck that few-sample categories impose on overall accuracy, which demonstrates the effectiveness of the improved data augmentation optimization method. As shown in Fig. 6 and Fig. 7, the deep learning network performs better in recognizing the SUV and Family Sedan classes; since the two vehicle types are similar, especially when the annotation boxes are small, these two classes are improved by 0.5% and 0.3% respectively, and the overall performance is better.

In summary, the present invention performs data augmentation on the images in the training data set and, without changing the network structure, strengthens the ability of the target recognition algorithm to recognize overlapping vehicle targets on expressways. The algorithm trained with the improved data augmentation method achieves higher accuracy in detecting overlapping vehicle targets on expressways, which demonstrates the effectiveness and robustness of the improved method.

Embodiment 2

An online data augmentation system for the input end of a deep neural network for real-time vehicle detection, comprising:

a data acquisition module, configured to acquire a highway vehicle data set and image samples;

an image processing module, configured to obtain new image samples by performing random and non-repeating keep-unchanged, HSV transformation, translation, shear/non-orthogonal projection, and perspective transformation operations on the original samples; to randomly select several pictures from the transformed images, segment them by random scaling, stitch them according to a random layout, and correct the result to form new samples; and to apply uniformly distributed vertical flipping, horizontal flipping, or keep-as-is operations to the new samples, randomly add Gaussian noise, and finally obtain the augmented images.

Embodiment 3

A computer-readable storage medium in which a plurality of instructions are stored, the instructions being adapted to be loaded by a processor of a terminal device to execute the online data augmentation method for the input end of a deep neural network for real-time vehicle detection provided in this embodiment.

Embodiment 4

A terminal device comprising a processor and a computer-readable storage medium, the processor being configured to implement the instructions, and the computer-readable storage medium being configured to store a plurality of instructions adapted to be loaded by the processor to execute the online data augmentation method for the input end of a deep neural network for real-time vehicle detection provided in this embodiment.

Those skilled in the art should understand that the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, and the like) containing computer-usable program code.

The present application is described with reference to flowcharts and/or block diagrams of the method, device (system), and computer program product according to the embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus that implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be loaded onto a computer or other programmable data processing device, such that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

The above are only preferred embodiments of the present application and are not intended to limit it; for those skilled in the art, the present application may have various modifications and changes. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present application shall be included within the scope of protection of the present application.

Although the specific embodiments of the present invention have been described above with reference to the accompanying drawings, they do not limit the scope of protection of the present invention. Those skilled in the art should understand that, on the basis of the technical solutions of the present invention, various modifications or variations that can be made without creative effort are still within the scope of protection of the present invention.

Claims (10)

Translated from Chinese
1. An online data augmentation method for the input end of a deep neural network for real-time vehicle detection, characterized by comprising:

Step S1: acquiring a highway vehicle data set and image samples;

Step S2: performing random and non-repeating operations of keeping the original unchanged, HSV transformation, translation, shear/non-orthogonal projection, and perspective transformation on the image samples to obtain new image samples;

specifically comprising: performing a random and non-repeating keep-unchanged operation on a portion of the image samples, which serve as new image samples;

performing a random and non-repeating HSV transformation on a portion of the remaining image samples, which serve as new image samples;

performing a random and non-repeating translation on a portion of the remaining image samples, which serve as new image samples;

performing a random and non-repeating shear/non-orthogonal projection on a portion of the remaining image samples, which serve as new image samples;

performing a random and non-repeating perspective transformation on the remaining image samples, which serve as new image samples;

Step S3: randomly selecting several pictures from the new image samples, segmenting them by random scaling, stitching them according to a random layout, and correcting the result to form new samples;

Step S4: applying uniformly distributed vertical flipping, horizontal flipping, or keep-as-is operations to the new samples, randomly adding Gaussian noise, and finally obtaining the augmented images.

2. The online data augmentation method for the input end of a deep neural network for real-time vehicle detection according to claim 1, characterized in that acquiring the highway vehicle data set and image samples in step S1 includes collecting image samples from highway surveillance video at different angles, on different road sections, and under different lighting conditions.

3. The online data augmentation method for the input end of a deep neural network for real-time vehicle detection according to claim 1, characterized in that performing the random and non-repeating HSV transformation on the image samples includes normalizing the pixel values of the image sample, obtaining the maximum value, the minimum value, and their difference, and calculating the H, S, and V values in the color space respectively.

4. The online data augmentation method for the input end of a deep neural network for real-time vehicle detection according to claim 1, characterized in that performing the random and non-repeating translation on the image samples specifically includes: letting the coordinates of a selected single image sample be (x, y), then

$$M_T=\begin{bmatrix}M_{11}&M_{12}&M_{13}\\M_{21}&M_{22}&M_{23}\\M_{31}&M_{32}&M_{33}\end{bmatrix}$$

$$dst(x,y)=src\left(M_{11}x+M_{12}y+M_{13},\;M_{21}x+M_{22}y+M_{23}\right)$$
where M_T is the translation matrix and M11, M12, ..., M33 are the translation-matrix parameters, where M11, M22, and M33 are fixed to 1; src(x, y) denotes the selected single image sample and dst(x, y) denotes the translated image.

5. The online data augmentation method for the input end of a deep neural network for real-time vehicle detection according to claim 1, characterized in that performing the random and non-repeating shear/non-orthogonal projection on the image samples specifically includes: letting the coordinates of a selected single image sample be (x, y), then

$$M_S=\begin{bmatrix}M_{11}&M_{12}&M_{13}\\M_{21}&M_{22}&M_{23}\\M_{31}&M_{32}&M_{33}\end{bmatrix}$$

$$dst(x,y)=src\left(M_{11}x+M_{12}y+M_{13},\;M_{21}x+M_{22}y+M_{23}\right)$$
where M_S is the shear/non-orthogonal projection matrix and M11, M12, ..., M33 are the shear/non-orthogonal projection matrix parameters, where M11, M22, and M33 are fixed to 1; src(x, y) denotes the selected single image sample and dst(x, y) denotes the image after the shear/non-orthogonal projection transformation.

6. The online data augmentation method for the input end of a deep neural network for real-time vehicle detection according to claim 1, characterized in that performing the random and non-repeating perspective transformation on the image samples specifically includes: letting the coordinates of a selected single image sample be (x, y), then

$$M_P=\begin{bmatrix}M_{11}&M_{12}&M_{13}\\M_{21}&M_{22}&M_{23}\\M_{31}&M_{32}&M_{33}\end{bmatrix}$$

$$dst(x,y)=src\left(\frac{M_{11}x+M_{12}y+M_{13}}{M_{31}x+M_{32}y+M_{33}},\;\frac{M_{21}x+M_{22}y+M_{23}}{M_{31}x+M_{32}y+M_{33}}\right)$$
where M_P is the perspective transformation matrix and M11, M12, ..., M33 are the perspective-transformation-matrix parameters, where M11, M22, and M33 are fixed to 1; src(x, y) denotes the selected single image sample and dst(x, y) denotes the image after the perspective transformation.

7. The online data augmentation method for the input end of a deep neural network for real-time vehicle detection according to claim 1, characterized in that the image correction includes removing any detection box of the stitched image that lies entirely outside the region of its corresponding picture and, for a detection box that lies partly inside and partly outside that region, replacing the out-of-bounds box edges with the region boundary line of that picture.

8. An online data augmentation system for the input end of a deep neural network for real-time vehicle detection, characterized by comprising:

a data acquisition module, configured to acquire a highway vehicle data set and image samples;

an image processing module, configured to obtain new image samples by performing random and non-repeating keep-unchanged, HSV transformation, translation, shear/non-orthogonal projection, and perspective transformation operations on the original samples; to randomly select several pictures from the transformed images, segment them by random scaling, stitch them according to a random layout, and correct the result to form new samples; and to apply uniformly distributed vertical flipping, horizontal flipping, or keep-as-is operations to the new samples, randomly add Gaussian noise, and finally obtain the augmented images.

9. A computer-readable storage medium, characterized in that a plurality of instructions are stored therein, the instructions being adapted to be loaded by a processor of a terminal device to execute the online data augmentation method for the input end of a deep neural network for real-time vehicle detection according to any one of claims 1-7.

10. A terminal device, characterized by comprising a processor and a computer-readable storage medium, the processor being configured to implement the instructions, and the computer-readable storage medium being configured to store a plurality of instructions adapted to be loaded by the processor to execute the online data augmentation method for the input end of a deep neural network for real-time vehicle detection according to any one of claims 1-7.
CN202210623973.5A | 2022-06-02 | 2022-06-02 | A method and system for online data enhancement at the input end of a deep neural network for real-time vehicle detection | Active | CN114973161B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202210623973.5A | CN114973161B (en) | 2022-06-02 | 2022-06-02 | A method and system for online data enhancement at the input end of a deep neural network for real-time vehicle detection

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202210623973.5A | CN114973161B (en) | 2022-06-02 | 2022-06-02 | A method and system for online data enhancement at the input end of a deep neural network for real-time vehicle detection

Publications (2)

Publication Number | Publication Date
CN114973161A | 2022-08-30
CN114973161B (en) | 2024-11-05

Family

ID=82959827

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202210623973.5A | Active | CN114973161B (en) | 2022-06-02 | 2022-06-02 | A method and system for online data enhancement at the input end of a deep neural network for real-time vehicle detection

Country Status (1)

Country | Link
CN (1) | CN114973161B (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN104036323A (en) * | 2014-06-26 | 2014-09-10 | 叶茂 | Vehicle detection method based on convolutional neural network
US20190146313A1 (en) * | 2017-11-14 | 2019-05-16 | Texas Instruments Incorporated | Camera-assisted arbitrary surface characterization and correction
CN111881720A (en) * | 2020-06-09 | 2020-11-03 | Shandong University | Data automatic enhancement expansion method, data automatic enhancement identification method and data automatic enhancement expansion system for deep learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
NING Xiao; ZHAO Peng: "Tree-ring image segmentation algorithm based on U-Net convolutional neural network", Chinese Journal of Ecology, no. 05, 15 May 2019 (2019-05-15) *

Also Published As

Publication number | Publication date
CN114973161B (en) | 2024-11-05

Similar Documents

Publication | Publication Date | Title
Wei et al. | Multi-vehicle detection algorithm through combining Harr and HOG features
CN110414507B (en) | License plate recognition method and device, computer equipment and storage medium
Bautista et al. | Convolutional neural network for vehicle detection in low resolution traffic videos
Gou et al. | Vehicle license plate recognition based on extremal regions and restricted Boltzmann machines
Ap et al. | Automatic number plate detection in vehicles using faster R-CNN
Abedin et al. | License plate recognition system based on contour properties and deep learning model
Polishetty et al. | A next-generation secure cloud-based deep learning license plate recognition for smart cities
CN112686812A (en) | Bank card inclination correction detection method and device, readable storage medium and terminal
CN107895492A (en) | A kind of express highway intelligent analysis method based on conventional video
CN107230202A (en) | The automatic identifying method and system of pavement disease image
CN107315990B (en) | A Pedestrian Detection Algorithm Based on XCS-LBP Features
CN112686248A (en) | Certificate increase and decrease type detection method and device, readable storage medium and terminal
Hua et al. | Pedestrian- and vehicle-detection algorithm based on improved aggregated channel features
Babbar et al. | A new approach for vehicle number plate detection
CN118865426A (en) | A method for extracting key information from airport luggage tags
CN116363105A (en) | Method for identifying and positioning high-speed rail contact net parts based on Faster R-CNN
Sarker et al. | A fast and robust license plate detection algorithm based on two-stage cascade adaboost
CN112733851A (en) | License plate recognition method for optimizing grain warehouse truck based on convolutional neural network
Tayo et al. | Vehicle license plate recognition using edge detection and neural network
CN114973161A (en) | Method and system for enhancing online data of vehicle real-time detection deep neural network input end
CN106257494B (en) | Detection method of license plate and device under complex scene
Castillo et al. | Vsion: Vehicle occlusion handling for traffic monitoring
Nguyen | License plate detection and refinement based on deep convolutional neural network
Ge et al. | Vehicle type classification based on improved HOG_SVM
Lipetski et al. | A combined HOG and deep convolution network cascade for pedestrian detection

Legal Events

Date | Code | Title | Description
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
