CN115482523B - Small object target detection method and system of lightweight multi-scale attention mechanism

Info

Publication number
CN115482523B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211241968.4A
Other languages
Chinese (zh)
Other versions
CN115482523A (en)
Inventor
鲁慧民
马菘哲
王贵增
薛涵
桑鹏程
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changchun University of Technology
Original Assignee
Changchun University of Technology

Abstract

Translated from Chinese

The invention provides a small object target detection method and system with a lightweight multi-scale attention mechanism. The method comprises the following steps: Step 1, extract features using GhostNet as the backbone feature extraction network of the YOLOv4 target detection architecture; Step 2, apply a multi-scale attention module to the features extracted in step 1 to capture, in both the spatial and channel dimensions, the discriminative features of the small-target image; Step 3, apply the Soft-NMS algorithm to the feature map output by step 2 to reduce the confidence of detection boxes that overlap the current best detection box. The network is small, detection is fast, and small targets are detected well, so the invention fully meets the requirements of real-time scenarios and has very high practical value.

Description

Small object target detection method and system of lightweight multi-scale attention mechanism
Technical Field
The invention belongs to the technical field of image processing and computer vision, and particularly relates to a small object target detection method and system of a lightweight multi-scale attention mechanism.
Background
In recent years, with the rapid development of deep-learning-based computer vision, target detection has become a popular research direction and is widely applied in fields such as video surveillance, industrial inspection, and medical treatment. Using computer vision to reduce the consumption of manpower and material resources is of great practical significance.
Target detection is a basic and important task; image segmentation, object tracking, keypoint detection, and similar tasks typically rely on it. In target detection, the number, size, and pose of the objects differ from image to image, so the output is unstructured, which is a major difference from image classification.
In practical scenarios, however, deep-learning-based target detection is very sensitive to the scale and variation of targets, and small targets are especially hard to detect. There are three main reasons:
First, when the target is small, features such as edge and grayscale information are easily lost as the network deepens, so the high-level semantic layers retain few features of the target. In addition, noise in the image may mislead the network into learning wrong features;
Second, the size of the receptive field mapped back to the original image plays an important role in whether detection succeeds. A small receptive field preserves more spatial structure but may carry less abstract semantic information; conversely, a large receptive field carries richer semantic information but may lose the spatial structure of the target;
Third, convolutional neural networks extract features on a discrete grid, which makes sub-pixel accuracy difficult. For small targets, a deviation of one pixel in a deep layer of the network can correspond to 8, 16, or more pixels in a shallow layer, which matters little for large targets but greatly affects small ones. It is therefore important to improve the detection of small objects and to reduce the model size without decreasing accuracy.
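The third point can be made concrete with a little arithmetic. The following is a minimal sketch, assuming downsampling strides of 8, 16, and 32 and object sizes of 16 and 256 pixels; these values and the function names are illustrative assumptions, not figures from the patent.

```python
def input_pixel_error(feature_error_px, stride):
    """Map a localization error measured on a feature map back to input pixels."""
    return feature_error_px * stride


def relative_error(feature_error_px, stride, object_size_px):
    """Error in input pixels expressed as a fraction of the object's side length."""
    return input_pixel_error(feature_error_px, stride) / object_size_px


if __name__ == "__main__":
    # Strides and object sizes are assumed values chosen for illustration.
    for stride in (8, 16, 32):
        small = relative_error(1, stride, object_size_px=16)
        large = relative_error(1, stride, object_size_px=256)
        print(f"stride {stride:2d}: 1 px on the feature map = "
              f"{input_pixel_error(1, stride):2d} input px, "
              f"{small:.0%} of a 16 px object, {large:.1%} of a 256 px object")
```

Under these assumptions the same one-pixel deviation that is a small fraction of a large object can be as large as the small object itself.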
At present, target detection methods for small targets mainly follow these directions:
First, image pyramids. The input image is rescaled (enlarged or reduced) to build a pyramid whose image scale gradually increases or decreases from top to bottom, and a fixed-size window is then slid over each layer to detect targets of interest. However, because images at every resolution must pass through the convolutional neural network, the computational cost is high and detection is slow;
Second, feature fusion, which enriches shallow features with semantic information and deep features with spatial structure information. However, because feature-level fusion uses extracted image features as the fused information, many detail features are lost;
Third, adjusting the scale and distribution of anchor boxes. In practice, however, a large number of anchor boxes is typically needed to ensure sufficient overlap with the ground-truth boxes, and only a small fraction of them actually overlap a ground-truth box. This creates a severe imbalance between positive and negative anchor boxes and slows training.
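The imbalance described in the third direction can be illustrated with a toy count. This is a minimal sketch, assuming one square anchor per grid cell, a 64×64 image, and an IoU threshold of 0.5 for a positive match; the grid layout, box coordinates, and function names are hypothetical and not taken from the patent.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def grid_anchors(image_size, stride, anchor_size):
    """One square anchor centred on every cell of a stride-spaced grid."""
    half = anchor_size / 2
    centres = [stride / 2 + stride * k for k in range(image_size // stride)]
    return [(cx - half, cy - half, cx + half, cy + half)
            for cy in centres for cx in centres]


def count_positive(anchors, gt_box, iou_threshold=0.5):
    """Number of anchors whose IoU with the ground-truth box reaches the threshold."""
    return sum(1 for a in anchors if iou(a, gt_box) >= iou_threshold)


anchors = grid_anchors(image_size=64, stride=8, anchor_size=16)
gt_box = (20, 20, 32, 32)  # hypothetical 12x12 ground-truth box
positives = count_positive(anchors, gt_box)
print(f"{positives} positive anchor(s) out of {len(anchors)}")
```

In this toy setting almost every anchor is a negative sample, which is the imbalance the text refers to.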
Existing research addresses only the detection of small targets as such; improving the robustness of the algorithm to changes in target scale while keeping small-target detection lightweight remains a difficult problem in target detection.
Disclosure of Invention
The invention aims to provide a small object target detection method and system with a lightweight multi-scale attention mechanism that improve the detection accuracy for small objects, reduce the model size, and solve the problem that existing methods cannot combine high detection accuracy with a lightweight network.
The invention solves the above problems by the following technical means:
A first aspect of the invention provides a small object target detection method based on a YOLOv4 lightweight multi-scale attention mechanism, comprising the following steps:
Step 1: extract features using GhostNet as the backbone feature extraction network of the YOLOv4 target detection architecture;
Step 2: apply a multi-scale attention module to the features extracted in step 1 to capture, in both the spatial and channel dimensions, the discriminative features of the small-target image;
Step 3: for the feature map with discriminative small-target features obtained in step 2, apply the Soft-NMS algorithm to reduce the confidence of detection boxes that overlap the current best detection box.
A second aspect of the invention provides a small object target detection system based on a YOLOv4 lightweight multi-scale attention mechanism, comprising:
a first feature extraction module, which extracts features using GhostNet as the backbone feature extraction network of the YOLOv4 target detection architecture;
a second feature extraction module, connected to the first feature extraction module, which applies the multi-scale attention module to the features extracted by the first feature extraction module to capture, in both the spatial and channel dimensions, the discriminative features of the small-target image;
a detection output module, connected to the second feature extraction module, which applies the Soft-NMS algorithm to reduce the confidence of detection boxes in the feature map output by the second feature extraction module that overlap the current best detection box.
A third aspect of the present invention provides a small object target detection apparatus, comprising:
Memory, and
A processor coupled to the memory, the processor configured to execute, based on instructions stored in the memory, the small object target detection method based on the YOLOv4 lightweight multi-scale attention mechanism.
A fourth aspect of the present invention provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the small object target detection method based on the YOLOv4 lightweight multi-scale attention mechanism.
Compared with the prior art, the invention has the beneficial effects that:
The invention uses GhostNet as the backbone feature extraction network of the YOLOv4 target detection architecture, which makes the network lighter for the first time while preserving accuracy. It further provides a multi-scale attention module that makes the network lighter a second time and captures the important discriminative features of small-target images in both the spatial and channel dimensions. The Soft-NMS algorithm then reduces the confidence of detection boxes that overlap the current best detection box. The classes of small objects in a picture can thus be obtained in real time, efficiently and accurately, by modifying only a few parameters. The method applies to images captured by different acquisition devices and in different scenes, and is therefore robust.
Drawings
To describe the technical solutions in the embodiments of the invention more clearly, the drawings used in the embodiments are briefly introduced below. The drawings show only some embodiments of the invention; a person skilled in the art can apply the method to real-time target detection in other scenes by simply reproducing the invention.
Fig. 1 is a flow chart of the method of the present invention.
Fig. 2 is a diagram of a multi-scale attention module referred to in the present invention.
Fig. 3 is a diagram showing the effect of the invention applied to different image detection of small objects.
Detailed Description
To make the above objects, features, and advantages of the invention clearer, the invention is described in more detail below with reference to the drawings and specific embodiments. The embodiments described are only some, not all, of the embodiments of the invention; all other embodiments obtained without creative effort fall within the scope of the invention.
Example 1
This embodiment provides a small object target detection method based on a YOLOv4 lightweight multi-scale attention mechanism, comprising the following steps:
Step 1: extract features using GhostNet as the backbone feature extraction network of the YOLOv4 target detection architecture;
As shown in Fig. 1, feature extraction with the YOLOv4 target detection architecture comprises:
Step 1.1: preliminary features are extracted from the original image by the YOLOv4 target detection architecture with GhostNet as the backbone network;
Step 1.2: for the extracted preliminary features, strong semantic features are passed top-down through the FPN layer, strong localization features are then passed bottom-up through the PAN structure, and features from different backbone layers are aggregated for the different detection layers.
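The two passes of step 1.2 can be sketched on toy single-channel maps. This is a minimal illustration under stated assumptions: three pyramid levels, nearest-neighbour upsampling, 2×2 max pooling for downsampling, and fusion by element-wise addition. The patent does not specify these operators, and YOLOv4-style networks commonly fuse by concatenation followed by convolution, so the function names and fusion rule here are hypothetical simplifications.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2-D map stored as a list of rows."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def downsample2x(fmap):
    """2x2 max pooling with stride 2."""
    return [[max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
             for j in range(0, len(fmap[0]), 2)]
            for i in range(0, len(fmap), 2)]


def add(a, b):
    """Element-wise sum of two maps of equal size (assumed fusion rule)."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def fpn_pan(c3, c4, c5):
    # Top-down pass (FPN): deep, semantically strong maps flow to shallow levels.
    p5 = c5
    p4 = add(c4, upsample2x(p5))
    p3 = add(c3, upsample2x(p4))
    # Bottom-up pass (PAN): shallow, well-localized maps flow back to deep levels.
    n3 = p3
    n4 = add(p4, downsample2x(n3))
    n5 = add(p5, downsample2x(n4))
    return n3, n4, n5


def ones(size):
    return [[1.0] * size for _ in range(size)]


# Backbone outputs at three resolutions (4x4, 2x2, 1x1), all filled with ones.
n3, n4, n5 = fpn_pan(ones(4), ones(2), ones(1))
print(n3[0][0], n4[0][0], n5[0][0])
```

Each detection level ends up combining information that originated at every backbone level, which is the aggregation the step describes.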
Step 2: apply a multi-scale attention module to the features extracted in step 1 to capture, in both the spatial and channel dimensions, the discriminative features of the small-target image;
As shown in Fig. 2, the specific steps of capturing the discriminative features of the small-target image in the spatial and channel dimensions with the multi-scale attention module are as follows:
A spatial attention module and a channel attention module are constructed. The spatial attention module lets the convolutional neural network efficiently learn which regions to focus on, mapping the spatial information of the original image into another space so that the important features of the image are preserved; by combining max pooling and average pooling it adaptively learns discriminative global and local features. The channel attention module assigns a weight to the feature map of each of the n channels to represent the correlation between that channel's feature map and the important features; the larger the weight, the more important features the channel's feature map contains.
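The two attention modules can be sketched in a few lines. This is a deliberately simplified illustration, not the patented modules: the weights are computed as a sigmoid of the sum of average pooling and max pooling, and the learned layers (such as a shared MLP or a convolution) that attention blocks normally apply before the sigmoid are omitted. The function names and the toy feature map are assumptions.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_weights(fmap):
    """One weight per channel from average- and max-pooled activations.

    fmap is a list of c channels, each an h x w list of rows.
    """
    weights = []
    for channel in fmap:
        values = [v for row in channel for v in row]
        weights.append(sigmoid(sum(values) / len(values) + max(values)))
    return weights


def spatial_mask(fmap):
    """One weight per spatial position from average and max pooling across channels."""
    height, width = len(fmap[0]), len(fmap[0][0])
    mask = []
    for i in range(height):
        row = []
        for j in range(width):
            column = [channel[i][j] for channel in fmap]
            row.append(sigmoid(sum(column) / len(column) + max(column)))
        mask.append(row)
    return mask


def apply_channel_attention(fmap):
    """Scale every channel by its weight."""
    weights = channel_weights(fmap)
    return [[[v * wk for v in row] for row in channel]
            for channel, wk in zip(fmap, weights)]


def apply_spatial_attention(fmap):
    """Scale every position by the shared spatial mask."""
    mask = spatial_mask(fmap)
    return [[[v * mask[i][j] for j, v in enumerate(row)]
             for i, row in enumerate(channel)]
            for channel in fmap]


fmap = [
    [[0.0, 0.0], [0.0, 0.0]],  # channel 0: no response
    [[0.0, 2.0], [0.0, 0.0]],  # channel 1: strong response at position (0, 1)
]
print(channel_weights(fmap))
print(spatial_mask(fmap))
```

The channel with the strong response receives the larger weight, and the position where it responds receives the larger mask value, matching the roles the text assigns to the two modules.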
The constructed spatial attention module and channel attention module are combined into a multi-scale attention mechanism, which uses four branches to extract multi-scale features from the input feature map. The first branch uses a 1×1 convolution to adjust the number of channels to match the feature maps output by the other three branches. The second branch uses two cascaded operations, a 1×3 convolution followed by a 3×1 convolution. The third branch uses two cascaded operations, a 1×5 convolution followed by a 5×1 convolution. Cascading two asymmetric convolutions in this way effectively reduces the number of network parameters and also allows more nonlinear activation layers to be introduced, improving the nonlinear learning capacity. The fourth branch first applies a 3×3 max pooling operation to extract feature textures and then a 1×1 convolution to adjust the number of channels to match the other three branches.
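The parameter saving from cascaded asymmetric convolutions can be checked by counting weights. The sketch below assumes bias-free convolutions and 64 input and output channels per branch; the channel counts and function names are hypothetical, chosen only to make the comparison concrete.

```python
def conv_params(kernel_h, kernel_w, c_in, c_out):
    """Weight count of a bias-free convolution layer."""
    return kernel_h * kernel_w * c_in * c_out


def branch_params(c_in, c_out):
    """Weights in each of the four branches described in the text."""
    return {
        "branch 1 (1x1)": conv_params(1, 1, c_in, c_out),
        "branch 2 (1x3 then 3x1)": conv_params(1, 3, c_in, c_out)
        + conv_params(3, 1, c_out, c_out),
        "branch 3 (1x5 then 5x1)": conv_params(1, 5, c_in, c_out)
        + conv_params(5, 1, c_out, c_out),
        # Max pooling has no weights, so only the 1x1 convolution counts.
        "branch 4 (3x3 max pool then 1x1)": conv_params(1, 1, c_in, c_out),
    }


c_in = c_out = 64  # assumed channel counts, not given in the source
params = branch_params(c_in, c_out)
full_3x3 = conv_params(3, 3, c_in, c_out)
full_5x5 = conv_params(5, 5, c_in, c_out)

for name, count in params.items():
    print(f"{name}: {count}")
print(f"full 3x3 for comparison: {full_3x3}")
print(f"full 5x5 for comparison: {full_5x5}")
```

With equal input and output channels, a k×k kernel costs k² weights per channel pair while the 1×k and k×1 pair costs 2k, so the saving grows with kernel size.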
First, the input feature map tensor is fed into the spatial attention module, which adds spatial attention and yields the feature map tensor S, where w, h, and c are the width, height, and number of channels of the feature map;
Then, 1×1 convolution kernels are applied to the feature map tensor S to obtain the feature map tensor D;
Next, the four branches of the multi-scale attention mechanism each extract multi-scale features from the feature map tensor D, giving the multi-scale feature map tensors P1, P2, P3, and P4. These are fused with a Concat operation to obtain the feature map tensor Q, which is fed into the channel attention module to add channel attention and obtain the feature map tensor C;
Finally, the feature map tensors S and C are fused with an Add operation to obtain the feature map tensor that serves as the output of the multi-scale attention mechanism.
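The order of operations can be followed by tracing tensor shapes through the module. The tensor dimensions and the number of 1×1 kernels are not preserved in this text, so the sketch below rests on assumptions: both attention modules keep the shape unchanged, all branches use padding that preserves width and height, and the four branches output equal channel counts that sum to c so that the final Add is valid. The parameter `d` and the function name are hypothetical.

```python
def multi_scale_attention_shapes(w, h, c, d):
    """Trace (width, height, channels) through the multi-scale attention module.

    d is a hypothetical number of 1x1 kernels applied to S; the source text
    does not preserve the actual value.
    """
    if c % 4 != 0:
        raise ValueError("c must be divisible by 4 to split evenly over four branches")
    branches = ("P1", "P2", "P3", "P4")
    shapes = {"input": (w, h, c)}
    shapes["S"] = (w, h, c)            # spatial attention re-weights positions only
    shapes["D"] = (w, h, d)            # 1x1 convolution changes only the channel count
    for name in branches:
        shapes[name] = (w, h, c // 4)  # assumed: equal channel counts per branch
    shapes["Q"] = (w, h, sum(shapes[name][2] for name in branches))  # Concat on channels
    shapes["C"] = shapes["Q"]          # channel attention re-weights channels only
    if shapes["C"] != shapes["S"]:
        raise ValueError("Add requires S and C to have identical shapes")
    shapes["output"] = shapes["S"]     # element-wise Add of S and C
    return shapes


shapes = multi_scale_attention_shapes(w=52, h=52, c=256, d=64)
for name, shape in shapes.items():
    print(f"{name}: {shape}")
```

The Add of S and C acts as a residual connection around the multi-scale branches, which is why the concatenated branch outputs must return to c channels.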
Step 3: for the feature map with discriminative small-target features obtained in step 2, apply the Soft-NMS algorithm to reduce the confidence of detection boxes that overlap the current best detection box;
The decay formula with which the Soft-NMS algorithm reduces the confidence of detection boxes that overlap the current best detection box is as follows:
[formula not preserved in this text]
where Si is the confidence, bi is the detection box, and a further parameter, whose symbol is likewise not preserved, adjusts the degree of attenuation.
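The decay formula itself is not reproduced in this text. As a minimal sketch, the code below assumes the widely used Gaussian form of Soft-NMS, in which the score of a box b_i is multiplied by exp(-IoU(M, b_i)² / σ) for the current best box M. Whether the patent uses this exact form is an assumption, as are the parameter names `sigma` and `score_threshold` and the example boxes.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS (assumed form): decay overlapping scores, do not delete boxes."""
    candidates = list(zip(boxes, scores))
    kept = []
    while candidates:
        # Select the current best detection box M.
        best = max(range(len(candidates)), key=lambda k: candidates[k][1])
        best_box, best_score = candidates.pop(best)
        kept.append((best_box, best_score))
        # Decay the confidence of every remaining box by its overlap with M.
        decayed = []
        for box, score in candidates:
            score *= math.exp(-(iou(best_box, box) ** 2) / sigma)
            if score >= score_threshold:
                decayed.append((box, score))
        candidates = decayed
    return kept


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]  # hypothetical detections
scores = [0.9, 0.8, 0.7]
for box, score in soft_nms(boxes, scores):
    print(box, round(score, 4))
```

Unlike hard NMS, the heavily overlapping second box is kept with a reduced score rather than discarded, which helps when small objects are densely packed or occlude one another.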
Effect contrast
With the small object target detection method based on the YOLOv4 lightweight multi-scale attention mechanism provided by this embodiment, the correct class of small objects can be obtained by running detection on an image. Fig. 3 compares the method of this embodiment with other algorithms in different scenes: the first column shows the labeled pictures, the second column shows the detection results of YOLOv4 with GhostNet added, the third column shows the detection results of YOLOv4 with Soft-NMS added, and the fourth column shows the detection results of the present method. In the second and third columns many small objects, such as people, vehicles, and animals, are missed, whereas the method of this embodiment detects all of them. For small blurred targets, targets occluding one another, and densely distributed targets, the method of this embodiment still identifies the object classes correctly where the other algorithms fail. The results show that the method of this embodiment outperforms YOLOv4.
Example 2
This embodiment provides a small object target detection system based on a YOLOv4 lightweight multi-scale attention mechanism, comprising:
a first feature extraction module, which extracts features using GhostNet as the backbone feature extraction network of the YOLOv4 target detection architecture;
a second feature extraction module, connected to the first feature extraction module, which applies the multi-scale attention module to the features extracted by the first feature extraction module to capture, in both the spatial and channel dimensions, the discriminative features of the small-target image;
a detection output module, connected to the second feature extraction module, which applies the Soft-NMS algorithm to reduce the confidence of detection boxes in the feature map output by the second feature extraction module that overlap the current best detection box.
For the specific implementation of the system of this embodiment, refer to the method described in embodiment 1; it is not repeated here.
Example 3
The present embodiment provides a small object target detection apparatus, including:
Memory, and
A processor coupled to the memory, the processor configured to execute, based on instructions stored in the memory, the small object target detection method based on the YOLOv4 lightweight multi-scale attention mechanism described in embodiment 1.
The memory may include, for example, system memory and fixed non-volatile storage media. The system memory stores, for example, an operating system, application programs, a boot loader, and other programs.
The small object target detection device may also include an input-output interface, a network interface, a storage interface, etc. These interfaces and the memory and processor may be connected by a bus, for example. The input/output interface provides a connection interface for input/output devices such as a display, a mouse, a keyboard, a touch screen and the like. The network interface provides a connection interface for various networking devices. The storage interface provides a connection interface for external storage devices such as an SD card and a U disk.
Example 4
This embodiment provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the small object target detection method based on the YOLOv4 lightweight multi-scale attention mechanism described in embodiment 1.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more non-transitory computer-readable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flowchart and/or block of the flowchart illustrations and/or block diagrams, and combinations of flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing merely illustrates the invention and does not limit it; variations or substitutions readily conceived by a person skilled in the art fall within the scope of the invention. The scope of protection of the invention shall therefore be that of the claims.

Claims (8)

Translated fromChinese
1.一种基于YOLOv4的轻量级多尺度注意力机制的小物体目标检测方法,其特征在于,包括如下步骤:1. A small object detection method based on a lightweight multi-scale attention mechanism of YOLOv4, characterized by comprising the following steps:步骤1,利用GhostNet作为YOLOv4目标检测架构的主干特征提取网络提取特征;Step 1: Use GhostNet as the backbone feature extraction network of the YOLOv4 target detection architecture to extract features;步骤2,对步骤1所提取到的特征使用多尺度注意力模块捕获从空间和通道两个维度上对小目标图像中具有鉴别性的特征;Step 2: Use a multi-scale attention module to capture the discriminative features of the small target image in both spatial and channel dimensions on the features extracted in step 1.使用多尺度注意力模块捕获从空间和通道两个维度上对小目标图像中具有鉴别性的特征的具体步骤为:The specific steps of using the multi-scale attention module to capture discriminative features of small object images in both spatial and channel dimensions are as follows:构造一个空间注意力机制模块和一个通道注意力机制模块;其中,通道注意力机制模块的通道注意力机制通过给n个通道的特征图都增加一个权重来表示该通道的特征图与重要特征的相关性,权重越大,表示该通道的特征图包含较多的重要特征;Construct a spatial attention mechanism module and a channel attention mechanism module; the channel attention mechanism of the channel attention mechanism module adds a weight to the feature maps of n channels to indicate the correlation between the feature map of the channel and the important features. The larger the weight, the more important features the feature map of the channel contains.将构造的空间注意力机制模块和通道注意力机制模块结合,构造多尺度注意力机制;其中,多尺度注意力机制采用4条支路对输入的特征图进行多尺度特征提取,第一条支路使用一个1×1的卷积运算,第二条支路使用两个级联的1×3卷积运算和3×1卷积运算,第三条支路使用两个级联的1×5卷积运算和5×1卷积运算,第四条支路使用两个级联的3×3的最大池化运算和1×1的卷积运算;The constructed spatial attention mechanism module and channel attention mechanism module are combined to construct a multi-scale attention mechanism. The multi-scale attention mechanism uses four branches to extract multi-scale features from the input feature map. 
The first branch uses a 1×1 convolution operation, the second branch uses two cascaded 1×3 convolution operations and a 3×1 convolution operation, the third branch uses two cascaded 1×5 convolution operations and a 5×1 convolution operation, and the fourth branch uses two cascaded 3×3 maximum pooling operations and a 1×1 convolution operation.首先,将特征图张量输入到空间注意力机制模块进行计算以添加空间注意力,得到特征图张量,其中w、h、c分别为特征图的宽度、高度和通道数;First, the feature map tensor Input into the spatial attention mechanism module for calculation to add spatial attention and obtain the feature map tensor , where w, h, and c are the width, height, and number of channels of the feature map respectively;然后,使用个1×1卷积核对特征图张量S进行卷积运算,得到特征图张量Then, use A 1×1 convolution kernel performs convolution operation on the feature map tensor S to obtain the feature map tensor ;接着,使用多尺度注意力机制的4条支路分别对特征图张量D进行多尺度特征提取,得到多尺度的特征图张量;采用Concat操作对特征图张量P1、P2、P3和P4进行特征融合,得到特征图张量;再将特征图张量Q输入到通道注意力机制模块进行计算以添加通道注意力,得到特征图张量Then, the four branches of the multi-scale attention mechanism are used to extract multi-scale features from the feature map tensor D to obtain the multi-scale feature map tensor ; Use Concat operation to fuse the feature map tensors P1 , P2 , P3 and P4 to obtain the feature map tensor ; Then input the feature map tensor Q into the channel attention mechanism module for calculation to add channel attention and obtain the feature map tensor ;最后采用Add操作对特征图张量S和C进行特征融合,得到特征图张量,作为多尺度注意力机制的输出;Finally, the Add operation is used to fuse the feature map tensors S and C to obtain the feature map tensor , as the output of the multi-scale attention mechanism;步骤3,对步骤2中获得的对小目标具有鉴别性特征的特征图采用Soft-NMS算法降低与当前最佳检测框重叠的检测框的置信度。Step 3: Use the Soft-NMS algorithm to reduce the confidence of the detection box that overlaps with the current best detection box for the feature map with discriminative features for small targets obtained in step 2.2.根据权利要求1所述的基于YOLOv4的轻量级多尺度注意力机制的小物体目标检测方法,其特征在于,利用GhostNet作为YOLOv4目标检测架构的主干特征提取网络提取特征的具体步骤为:2. 
The small object target detection method based on the lightweight multi-scale attention mechanism of YOLOv4 according to claim 1 is characterized in that the specific steps of extracting features using GhostNet as the backbone feature extraction network of the YOLOv4 target detection architecture are as follows:步骤1.1,原始图像经过以GhostNet为主干网络的YOLOv4目标检测架构提取到初步特征;Step 1.1: The original image is passed through the YOLOv4 target detection architecture with GhostNet as the backbone network to extract preliminary features;步骤1.2,对提取到的初步特征通过FPN层自顶向下传达强语义特征,再通过PAN结构自底向上传达强定位特征,从不同的主干层对不同的检测层进行特征聚合。In step 1.2, the extracted preliminary features are transmitted from top to bottom through the FPN layer to convey strong semantic features, and then from bottom to top through the PAN structure to convey strong positioning features, and feature aggregation is performed on different detection layers from different backbone layers.3.根据权利要求1所述的基于YOLOv4的轻量级多尺度注意力机制的小物体目标检测方法,其特征在于,采用Soft-NMS算法降低与当前最佳检测框重叠的检测框的置信度的Soft-NMS算法的衰减公式为:3. The small object detection method based on the lightweight multi-scale attention mechanism of YOLOv4 according to claim 1 is characterized in that the Soft-NMS algorithm is used to reduce the confidence of the detection box overlapping with the current best detection box. The attenuation formula of the Soft-NMS algorithm is:其中,Si为置信度,bi为检测框,用于调节衰减程度。Among them,Si is the confidence,bi is the detection box, Used to adjust the attenuation level.4.一种基于YOLOv4的轻量级多尺度注意力机制的小物体目标检测系统,其特征在于,包括:4. 
A small object detection system based on YOLOv4 with a lightweight multi-scale attention mechanism, characterized by comprising:

a first feature extraction module, which uses GhostNet as the backbone feature extraction network of the YOLOv4 object detection architecture to extract features;

a second feature extraction module, connected to the first feature extraction module, which applies the multi-scale attention of a multi-scale attention module to the features extracted by the first feature extraction module in order to capture, along both the spatial and the channel dimension, the features that are discriminative for small-object images;

wherein the specific steps of using the multi-scale attention module to capture the discriminative features of small-object images along the spatial and channel dimensions are as follows:

constructing a spatial attention module and a channel attention module, wherein the channel attention mechanism of the channel attention module assigns a weight to the feature map of each of the n channels to represent the correlation between that channel's feature map and the important features; the larger the weight, the more important features that channel's feature map contains;

combining the constructed spatial attention module and channel attention module into a multi-scale attention mechanism, wherein the multi-scale attention mechanism uses four branches to extract multi-scale features from the input feature map: the first branch uses a 1×1 convolution; the second branch uses a cascaded 1×3 convolution and 3×1 convolution; the third branch uses a cascaded 1×5 convolution and 5×1 convolution; and the fourth branch uses a cascaded 3×3 max pooling and 1×1 convolution;

first, the input feature map tensor is fed into the spatial attention module to add spatial attention, yielding the feature map tensor S, where w, h and c are the width, height and number of channels of the feature map, respectively;

then, 1×1 convolution kernels are applied to the feature map tensor S to obtain a further feature map tensor;

next, the four branches of the multi-scale attention mechanism each perform multi-scale feature extraction on the feature map tensor, yielding the multi-scale feature map tensors P1, P2, P3 and P4; a Concat operation fuses the feature map tensors P1, P2, P3 and P4 into the feature map tensor Q; the feature map tensor Q is then fed into the channel attention module to add channel attention, yielding the feature map tensor C;

finally, an Add operation fuses the feature map tensors S and C to obtain the feature map tensor that serves as the output of the multi-scale attention mechanism;

a detection output module, connected to the second feature extraction module, which uses the Soft-NMS algorithm to lower the confidence of those detection boxes, in the feature map output by the second feature extraction module, that overlap the current best detection box.

5. The small object detection system based on YOLOv4 with a lightweight multi-scale attention mechanism according to claim 4, characterized in that the specific steps of using GhostNet as the backbone feature extraction network of the YOLOv4 object detection architecture for feature extraction are:

Step 1.1: preliminary features are extracted from the original image by the YOLOv4 object detection architecture with GhostNet as its backbone network;

Step 1.2: the extracted preliminary features are passed top-down through the FPN layer to convey strong semantic features, then bottom-up through the PAN structure to convey strong localization features, so that features from different backbone layers are aggregated for the different detection layers.

6. The small object detection system based on YOLOv4 with a lightweight multi-scale attention mechanism according to claim 4, characterized in that the attenuation formula of the Soft-NMS algorithm, which lowers the confidence of detection boxes overlapping the current best detection box, is:

[formula image not reproduced in the text]

where Si is the confidence, bi is the detection box, and a tunable parameter adjusts the degree of attenuation.

7. A small object detection device, comprising:

a memory; and

a processor coupled to the memory, the processor being configured to execute, based on instructions stored in the memory, the small object detection method based on YOLOv4 with a lightweight multi-scale attention mechanism according to any one of claims 1 to 3.

8. A non-transitory computer-readable storage medium storing a computer program which, when executed by a processor, implements the small object detection method based on YOLOv4 with a lightweight multi-scale attention mechanism according to any one of claims 1 to 3.
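The data flow of the multi-scale attention mechanism recited in claim 4 can be traced at the level of tensor shapes. The sketch below is illustrative only and assumes stride-1 operations with "same" padding, so that every branch preserves the width and height of its input. The names `Shape`, `conv_same`, `max_pool_same`, `reduced` and `per_branch` are hypothetical; the text does not state how many 1×1 kernels are applied to S or how many channels each branch produces. The one constraint the sketch checks follows from the claim itself: an element-wise Add of S and C is only possible when both tensors have the same shape.

```python
from typing import NamedTuple


class Shape(NamedTuple):
    w: int  # feature-map width
    h: int  # feature-map height
    c: int  # number of channels


def conv_same(x: Shape, out_channels: int) -> Shape:
    """Assumed stride-1 convolution with 'same' padding: w and h are unchanged."""
    return Shape(x.w, x.h, out_channels)


def max_pool_same(x: Shape) -> Shape:
    """Assumed stride-1 3x3 max pooling with padding 1: shape is unchanged."""
    return x


def multi_scale_attention_shapes(inp: Shape, reduced: int, per_branch: int) -> dict:
    """Trace tensor shapes through the claimed pipeline (hypothetical channel counts)."""
    s = inp                                   # spatial attention re-weights positions only
    r = conv_same(s, reduced)                 # 1x1 convolution applied to S
    p1 = conv_same(r, per_branch)                         # branch 1: 1x1 conv
    p2 = conv_same(conv_same(r, per_branch), per_branch)  # branch 2: 1x3 then 3x1 conv
    p3 = conv_same(conv_same(r, per_branch), per_branch)  # branch 3: 1x5 then 5x1 conv
    p4 = conv_same(max_pool_same(r), per_branch)          # branch 4: 3x3 max pool, 1x1 conv
    q = Shape(r.w, r.h, p1.c + p2.c + p3.c + p4.c)        # Concat along the channel axis
    c = q                                     # channel attention re-weights channels only
    if c != s:
        raise ValueError(f"Add needs matching shapes, got S={s} and C={c}")
    return {"S": s, "P": [p1, p2, p3, p4], "Q": q, "C": c, "out": s}


shapes = multi_scale_attention_shapes(Shape(52, 52, 256), reduced=64, per_branch=64)
print(shapes["Q"], shapes["out"])
```

Under these assumptions the four branch outputs must together supply exactly as many channels as S has, which is why the example uses 64 channels per branch for a 256-channel input.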
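The confidence decay recited in claim 6 can be illustrated with a small Soft-NMS routine. The attenuation formula itself is not reproduced in the text, so the sketch below assumes the Gaussian penalty commonly used with Soft-NMS, in which the score of each remaining box is multiplied by exp(-IoU² / sigma). Whether the claimed formula takes exactly this form is an assumption, as are the parameter name `sigma`, the `score_threshold` cut-off and the (x1, y1, x2, y2) box layout.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS (assumed form).

    Returns (box index, decayed score) pairs in the order the boxes are selected.
    """
    scores = list(scores)                 # work on a copy, leave the caller's list intact
    remaining = list(range(len(boxes)))
    kept = []
    while remaining:
        # Pick the current best detection box.
        best = max(remaining, key=lambda i: scores[i])
        remaining.remove(best)
        if scores[best] < score_threshold:
            break                         # every box still remaining scores even lower
        kept.append((best, scores[best]))
        # Decay, rather than discard, the boxes that overlap the best box.
        for i in remaining:
            overlap = iou(boxes[best], boxes[i])
            scores[i] *= math.exp(-(overlap ** 2) / sigma)
    return kept


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
for index, score in soft_nms(boxes, [0.9, 0.8, 0.7]):
    print(index, round(score, 4))
```

Unlike hard NMS, the heavily overlapping second box is kept with a reduced confidence instead of being removed, so it is ranked after the non-overlapping third box.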
CN202211241968.4A | 2022-10-11 | 2022-10-11 | Small object target detection method and system of lightweight multi-scale attention mechanism | Active | CN115482523B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202211241968.4A, CN115482523B (en) | 2022-10-11 | 2022-10-11 | Small object target detection method and system of lightweight multi-scale attention mechanism

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202211241968.4A, CN115482523B (en) | 2022-10-11 | 2022-10-11 | Small object target detection method and system of lightweight multi-scale attention mechanism

Publications (2)

Publication Number | Publication Date
CN115482523A (en) | 2022-12-16
CN115482523B (en) | 2025-08-01

Family

ID=84393954

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202211241968.4A, Active, CN115482523B (en) | Small object target detection method and system of lightweight multi-scale attention mechanism | 2022-10-11 | 2022-10-11

Country Status (1)

Country | Link
CN (1) | CN115482523B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN115631422B (en) * | 2022-12-23 | 2023-04-28 | 国家海洋局东海信息中心 | Enteromorpha identification method based on attention mechanism
CN116863419A (en) * | 2023-09-04 | 2023-10-10 | 湖北省长投智慧停车有限公司 | Method and device for lightening target detection model, electronic equipment and medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN114283321A (en) * | 2021-12-24 | 2022-04-05 | 公安部道路交通安全研究中心 | Target vehicle detection method and device and computer
CN114708566A (en) * | 2022-04-05 | 2022-07-05 | 哈尔滨理工大学 | An automatic driving target detection method based on improved YOLOv4

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN111967538B (en) * | 2020-09-25 | 2024-03-15 | 北京康夫子健康技术有限公司 | Feature fusion methods, devices, equipment and storage media applied to small target detection
CN113065558B (en) * | 2021-04-21 | 2024-03-22 | 浙江工业大学 | Lightweight small target detection method combined with attention mechanism


Also Published As

Publication number | Publication date
CN115482523A (en) | 2022-12-16

Similar Documents

Publication | Title
CN113286194B (en) | Video processing method, device, electronic device and readable storage medium
CN111178183B (en) | Face detection method and related device
CN111931764B (en) | A target detection method, target detection framework and related equipment
CN113591968A (en) | Infrared weak and small target detection method based on asymmetric attention feature fusion
CN110427905A (en) | Pedestrian tracting method, device and terminal
WO2019218824A1 (en) | Method for acquiring motion track and device thereof, storage medium, and terminal
CN111626163B (en) | Human face living body detection method and device and computer equipment
CN110443210A (en) | A kind of pedestrian tracting method, device and terminal
CN107545263B (en) | Object detection method and device
WO2023082784A1 (en) | Person re-identification method and apparatus based on local feature attention
CN115482523B (en) | Small object target detection method and system of lightweight multi-scale attention mechanism
WO2014001610A1 (en) | Method, apparatus and computer program product for human-face features extraction
WO2022205937A1 (en) | Feature information extraction method and apparatus, model training method and apparatus, and electronic device
CN113139896A (en) | Target detection system and method based on super-resolution reconstruction
CN107248174A (en) | A kind of method for tracking target based on TLD algorithms
CN107563290A (en) | A kind of pedestrian detection method and device based on image
CN103761747B (en) | Target tracking method based on weighted distribution field
CN108229281B (en) | Neural network generation method, face detection device and electronic equipment
CN118736010A (en) | A dynamic visual SLAM method for low-light scenes
CN120147619A (en) | Deformable object detection method based on adaptive feature extraction network and attention mechanism
WO2023160061A1 (en) | Method and apparatus for determining moving object in image, electronic device, and storage medium
Chebbi et al. | Deepsim-nets: Deep similarity networks for stereo image matching
Zhang et al. | Single image dehazing using deep convolution neural networks
CN114764936A (en) | Image key point detection method and related equipment
CN111435448A (en) | Image saliency object detection method, device, equipment and medium

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
