CN120318607B

Info

Publication number: CN120318607B
Application number: CN202510804742.8A
Authority: CN
Inventors: 许海霞; 戴浩华; 周维; 朱江; 张东波; 印峰; 刘奇鹏; 许宇婷; 张可欣
Original assignee: Xiangtan University
Current assignee: Xiangtan University
Priority date: 2025-06-17
Filing date: 2025-06-17
Publication date: 2025-09-19
Anticipated expiration: 2045-06-17
Also published as: CN120318607A

Abstract

The invention discloses a penicillin bottle body defect detection method based on an improved YOLOv8, which belongs to the technical field of computer vision and comprises the following steps: obtaining penicillin bottle body defect data and preprocessing the data to obtain a training set and a test set; establishing a defect target detection model based on the improved YOLOv8; training the defect target detection model with the training set, optimizing the loss function, and updating the model weight parameters until the loss function converges; and testing the defect target detection model with the test set. The invention introduces a spatially separable pooling attention module SSPA, which strengthens the network's perception of fine features such as bottle body scratches, and introduces a feature enhancement module MMFF, which uses convolution kernels of different sizes to extract dirt, scratch and abnormal loading defect features of different sizes under a complex background and improves the feature fusion effect by module stacking.

Description

Penicillin bottle body defect detection method based on improved YOLOv8

Technical Field

The invention relates to the technical field of computer vision, in particular to a penicillin bottle body defect detection method based on an improved YOLOv8.

Background

With the rapid development of the medical field, the demand for defect detection of medical products has gradually increased, and defect detection methods based on computer vision have gradually become mainstream. Conventional object detection algorithms rely on manually designed feature extractors and machine learning algorithms: they detect and classify objects by combining manually defined features with classifiers, which requires a great deal of expertise and experience and is complex. In contrast, deep learning-based object detection methods perform better in production applications. A deep neural network learns target features from the original image through end-to-end training and thereby completes the detection task under the corresponding background, which reduces manual intervention and improves detection precision and efficiency in complex environments.

A penicillin bottle is a glass medicament container and a common medical packaging product. The appearance integrity and internal condition of the bottle body directly or indirectly affect the quality and safety of the medicament inside, and therefore the life and health of patients. During bottling, various problems can occur, such as foreign matter residue, a damaged bottle body, incomplete sealing, and re-dissolution of the medicament, which pose challenges for efficient and accurate quality inspection of penicillin bottles.

The production line inspection flow for penicillin-bottled medicament is divided into four parts: plastic top cover detection, aluminum cover detection, medicament top detection and glass bottle body detection. Detecting and screening out defective products promptly and accurately in this process effectively reduces the waste of production resources and lowers economic cost. In practical applications, YOLO series models balance detection speed and detection precision and are therefore well suited to real-time inspection on the penicillin bottle production line. The YOLOv8-n base model achieves high-precision detection on three parts of the penicillin bottle: the plastic top cover, the aluminum cover and the medicament top. However, the defects of the glass bottle body are complex: scratches on the bottle body are not visually obvious, dirt varies in size and is discretely distributed, and abnormal filling quantity produces irregularly shaped defect features. These phenomena increase the detection difficulty, so the overall detection precision of the base network on the glass bottle body is not high.

Disclosure of Invention

In order to solve the above technical problems, the invention provides a penicillin bottle body defect detection method based on an improved YOLOv8 with a simple algorithm and high detection precision.

The technical scheme adopted to solve the above technical problems is a penicillin bottle body defect detection method based on an improved YOLOv8, comprising the following steps:

S1, acquiring and preprocessing penicillin bottle body defect data to obtain a training set and a test set;

S2, establishing a defect target detection model based on the improved YOLOv8;

taking YOLOv8 as the basic framework, introducing a spatially separable pooling attention module SSPA at a shallow layer of the backbone network (Backbone), and introducing a feature enhancement module MMFF at the neck (Neck) to replace a C2f module, the MMFF using convolution kernels of different sizes to extract defect features and being stacked to improve the feature fusion effect;

S3, training the defect target detection model with the training set, optimizing the loss function, and updating the model weight parameters until the loss function converges;

S4, testing the defect target detection model with the test set.

In the above penicillin bottle body defect detection method based on the improved YOLOv8, the specific process of step S1 is as follows:

S11, the penicillin bottles enter the image acquisition area on a conveyor belt; during constant-speed conveying and inspection, a sensor is triggered and an industrial camera samples images of the penicillin bottles from multiple angles;

S12, performing target area positioning, image segmentation, defect labeling and data enhancement on the acquired image samples to complete image preprocessing, and dividing them into a training set and a test set according to a proportion.

In the above penicillin bottle body defect detection method based on the improved YOLOv8, in step S12, the specific procedures of target area positioning, image segmentation and data enhancement are as follows:

Target area positioning: performing a binarization operation on the acquired original image so that the outline of the bottle body is displayed against a white background, calculating the centroid of the white background area, and finding the positions of the bottle wall and the bottle bottom along the gray-level changes in the horizontal and vertical directions;

Image segmentation: finely adjusting and shrinking the rectangular frame, segmenting the image and removing the background area to obtain the bottle body area to be detected, which facilitates detection of defects within that area;

Data enhancement: performing data enhancement operations on the processed samples to be detected to expand the amount of data.
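The positioning and segmentation steps can be illustrated with a minimal sketch. It assumes a grayscale image stored as a nested list, a fixed binarization threshold of 128 and a shrink margin of 1 pixel; these values and all function names are hypothetical and do not come from the patent text.

```python
def binarize(img, thresh=128):
    """Return a 0/1 mask: 1 where the pixel is at least `thresh` (assumed threshold)."""
    return [[1 if p >= thresh else 0 for p in row] for row in img]

def centroid(mask):
    """Centroid (row, col) of the white (1) pixels, rounded to integer indices."""
    pts = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    n = len(pts)
    return round(sum(r for r, _ in pts) / n), round(sum(c for _, c in pts) / n)

def scan(mask, r, c, dr, dc):
    """Walk from (r, c) in direction (dr, dc) until the mask value changes."""
    h, w = len(mask), len(mask[0])
    while 0 <= r + dr < h and 0 <= c + dc < w and mask[r + dr][c + dc] == mask[r][c]:
        r, c = r + dr, c + dc
    return r, c

def locate_and_crop(img, thresh=128, margin=1):
    """Find the region around the centroid, shrink it by `margin`, and crop it."""
    mask = binarize(img, thresh)
    cr, cc = centroid(mask)
    top = scan(mask, cr, cc, -1, 0)[0]
    bottom = scan(mask, cr, cc, 1, 0)[0]
    left = scan(mask, cr, cc, 0, -1)[1]
    right = scan(mask, cr, cc, 0, 1)[1]
    # Shrink the rectangle slightly so that border pixels are excluded.
    top, bottom = top + margin, bottom - margin
    left, right = left + margin, right - margin
    crop = [row[left:right + 1] for row in img[top:bottom + 1]]
    return (top, left, bottom, right), crop
```

Scanning outward from the centroid until the binary value flips is one simple way to read "gray-level changes in the horizontal and vertical directions"; a production system would likely need more robust edge handling.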

In the above penicillin bottle body defect detection method based on the improved YOLOv8, in step S2, the spatially separable pooling attention module SSPA comprises two parts, attention weight calculation and information aggregation, wherein the attention weight calculation comprises three parts: local feature extraction, global feature generation and spatial feature separation;

Given an input feature $X \in \mathbb{R}^{C \times H \times W}$, where $\mathbb{R}$ denotes the real number field and $C$, $H$, $W$ denote the three dimensions of channel number, height and width respectively, $X$ is divided into $g$ groups along the channel dimension, denoted $X = [X_1, X_2, \dots, X_g]$, where $X_i \in \mathbb{R}^{(C // g) \times H \times W}$ denotes the $i$-th group of features of $X$ and $//$ denotes integer division. Each group of features undergoes feature extraction and fusion through the three-part calculation, and finally the groups of features are aggregated.

In the above penicillin bottle body defect detection method based on the improved YOLOv8, in step S2, the process of attention weight calculation comprises:

Local feature extraction: for each input group of features, pooling windows of three sizes are deployed in the horizontal direction for global average pooling, and pooling windows of three sizes are deployed in the vertical direction for global average pooling. Deploying pooling windows in a single direction embeds spatial position information into the channel dimension; pooling windows of different sizes balance short-distance dependencies while long-distance dependencies are established across discrete areas, so that multi-scale features under different distance dependencies are captured. Two local feature maps are obtained as follows:

;

where the two concatenation operators denote splicing in the horizontal and vertical directions respectively, the two coordinate indices denote the horizontal and vertical coordinates of a pixel, and the window parameter is the size of the global average pooling window;

Global feature generation: the two local feature maps are each processed by a 1×1 standard convolution to obtain two feature maps, and a matrix multiplication is performed on these two feature maps to generate the global feature map:

;

where the convolution operator denotes a 1×1 convolution operation and the product operator denotes a matrix multiplication operation;

Information fusion is performed on the global feature map by applying a 1×1 standard convolution, and the global feature attention weight is obtained through a Sigmoid function;

Spatial feature separation: the two local feature maps are upsampled to the original feature map size to obtain two feature maps, and one of them is transposed between two of its dimensions;

;

where one operator denotes the transpose operation and the other denotes the upsampling operation;

The two resulting feature maps are spliced along one dimension, and the spliced feature map is passed through a 3×3 standard convolution for feature extraction to obtain a feature enhancement map. This realizes local feature recombination across spatial dimensions, so that feature information in the horizontal and vertical directions is transferred complementarily;

;

where the convolution operator denotes a 3×3 standard convolution operation, the merge function performs the splicing, and the dimension index denotes the 3rd dimension;

A 1×1 standard convolution followed by a Sigmoid function generates new weights; the new weights and the corresponding feature map are both split along the H dimension to obtain the local feature weight in the horizontal direction, the local feature map in the horizontal direction, the local feature weight in the vertical direction, and the local feature map in the vertical direction;

;

where one operator denotes the separation operation and the other is the Sigmoid function;

The local feature maps are multiplied by the corresponding local feature weights, so that the features in different directions are adjusted separately and independently in the spatial dimension, and the attention weights are generated through a Sigmoid function;

;

where the product operator denotes an element-by-element multiplication operation.
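The interplay of directional pooling, matrix multiplication and Sigmoid gating can be illustrated with a deliberately simplified sketch. It is not the SSPA module itself: it assumes a single-channel feature map, uses one full-length pooling window per direction instead of the multi-size windows, and omits all convolutions, upsampling and splitting steps. All function names are hypothetical.

```python
import math

def pool_horizontal(x):
    """Average each row over the width: H x W -> H values."""
    return [sum(row) / len(row) for row in x]

def pool_vertical(x):
    """Average each column over the height: H x W -> W values."""
    h = len(x)
    return [sum(x[r][c] for r in range(h)) / h for c in range(len(x[0]))]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def directional_attention(x):
    """Gate x with weights built from its two directional pooled descriptors."""
    ph = pool_horizontal(x)  # treated as a column vector (H x 1)
    pv = pool_vertical(x)    # treated as a row vector (1 x W)
    # Matrix product (H x 1) @ (1 x W) gives an H x W global map.
    g = [[a * b for b in pv] for a in ph]
    w = [[sigmoid(v) for v in row] for row in g]
    # Element-by-element multiplication re-weights the input features.
    return [[x[r][c] * w[r][c] for c in range(len(x[0]))] for r in range(len(x))]
```

The point of the sketch is only that two one-dimensional descriptors can be recombined into a full-resolution weight map, which is the role the matrix multiplication plays in the description above.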

In the above penicillin bottle body defect detection method based on the improved YOLOv8, in step S2, the information aggregation process is as follows:

For the attention weights already generated, the input group feature is multiplied by them in sequence for spatial weighting to obtain the enhanced feature map, with the following formula:

;

Wherein the enhanced feature map;

;

where one function is the batch normalization function and the other is a merging function that merges the g groups of features along the channel dimension.

In the above penicillin bottle body defect detection method based on the improved YOLOv8, in step S2, the feature enhancement module MMFF uses convolution kernels of different sizes to extract defect features and is stacked to improve the feature fusion effect, specifically as follows:

MMFF takes a feature map as input and produces a feature map as output, capturing spatial feature information and performing feature fusion through two parallel branches. The first branch comprises a convolution layer Conv1×1, a batch normalization layer BN and an activation function layer SiLU, together denoted CBS; it performs a weighted sum over all channels at the same position to fuse channel features, and generates the first branch output from the input. The second branch comprises a multi-scale depth-wise separable convolution module MDSConv and N re-parameterized convolution modules RepConv-Block; it uses convolution kernels of different sizes to capture defect features, which helps the network focus on local details and context information, while stacking multiple RepConv-Blocks enhances the feature representation capability, and it generates the second branch output from the input. After the two branch outputs are added element by element, the final output is obtained through an additional CBS, and the number of channels does not change.

In the above penicillin bottle body defect detection method based on the improved YOLOv8, RepConv-Block uses multi-branch convolution layers in the training stage and merges the multiple calculation modules into one in the inference stage by re-parameterizing the branch parameters onto the main branch; RepConv-Block extracts and enhances target features by repeatedly applying convolution layers and activation functions. MDSConv comprises 3×3, 5×5 and 7×7 depth-wise convolutions and a 1×1 point-wise convolution: the input feature map passes through the parallel multi-scale depth-wise convolutions, which extract spatial features of different sizes by grouped convolution along the channel dimension, and after the branches are added element by element, the point-wise convolution performs feature aggregation.

In the above penicillin bottle body defect detection method based on the improved YOLOv8, in step S3, the training set images are input into the defect target detection model; after feature extraction by the backbone network (Backbone) and feature fusion by the neck (Neck), the detection head (Head) outputs the prediction boxes of the defect targets;

The error function $L$ between the target prediction result and the real label is a weighted combination of the bounding box regression loss $L_{box}$, the target class loss $L_{cls}$ and the confidence loss $L_{obj}$:

$$L = \lambda_{box} L_{box} + \lambda_{cls} L_{cls} + \lambda_{obj} L_{obj}$$

where $\lambda_{box}$ is the weighting coefficient of the bounding box regression loss, $\lambda_{cls}$ is the weighting coefficient of the target class loss, and $\lambda_{obj}$ is the weighting coefficient of the confidence loss;

The bounding box regression loss $L_{box}$ measures the difference in position and shape between the predicted box and the real box:

$$L_{box} = 1 - IoU + \frac{\rho^2(b, b^{gt})}{c^2} + \alpha v, \qquad IoU = \frac{|B \cap B^{gt}|}{|B \cup B^{gt}|}$$

where $IoU$ denotes the intersection over union, $B$ is the predicted box, $B^{gt}$ is the real box, $b$ is the center point of the predicted box, and $b^{gt}$ is the center point of the real box; $\rho(b, b^{gt})$ is the Euclidean distance between $b$ and $b^{gt}$; $c$ is the diagonal length of the minimum rectangle enclosing the predicted box and the real box; $v$ is the aspect ratio difference between the predicted box and the real box; $\alpha$ is an adjustment factor. The target class loss function is given by the following formula:

;

where the quantities in the formula are the total number of categories, the category index, the true category label and the predicted class probability;

The confidence loss is given by the following formula:

;

where the quantities in the formula are the true confidence and the confidence value predicted by the network;

The error function is optimized by iteratively updating the model parameters so that it converges to its minimum value and the model performance reaches its optimum;

After training, the weights with the highest precision are selected to obtain the optimal model for detecting penicillin bottle body defects.

In the above penicillin bottle body defect detection method based on the improved YOLOv8, in step S4, the performance of the defect target detection model in detecting penicillin bottle body defects is evaluated with evaluation indexes, which comprise $mAP$, $AP$, precision $P$ and recall $R$; $AP$ denotes the average precision of a single class, and $mAP$ is the mean of the $AP$ over all classes, calculated as follows:

$$P = \frac{TP}{TP + FP}, \qquad R = \frac{TP}{TP + FN}, \qquad AP = \int_0^1 P(r)\,dr, \qquad mAP = \frac{1}{N}\sum_{i=1}^{N} AP_i$$

where $TP$ is the number of correctly detected targets (true positives); $FP$ is the number of falsely detected targets (false positives); $FN$ is the number of real targets that are not detected (false negatives); $P$ denotes precision and $R$ denotes recall; $r$ denotes a recall value and $P(r)$ the precision when the recall is $r$; $N$ is the total number of categories; $i$ is the category index, and $AP_i$ denotes the $AP$ value of the $i$-th category.
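The evaluation indexes can be computed as in the following minimal sketch. It assumes the precision-recall curve is already available as sorted sample points and approximates the integral with the trapezoidal rule; common evaluation toolkits use an interpolated precision envelope instead, so the numbers here are illustrative only. Function names are hypothetical.

```python
def precision_recall(tp, fp, fn):
    """Precision P = TP / (TP + FP) and recall R = TP / (TP + FN)."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return p, r

def average_precision(recalls, precisions):
    """Area under the P(r) curve by the trapezoidal rule.

    Assumes `recalls` is sorted in ascending order and aligned with `precisions`.
    """
    ap = 0.0
    for k in range(1, len(recalls)):
        width = recalls[k] - recalls[k - 1]
        ap += width * (precisions[k] + precisions[k - 1]) / 2.0
    return ap

def mean_average_precision(ap_per_class):
    """mAP is the mean of the per-class AP values."""
    return sum(ap_per_class) / len(ap_per_class)
```

For example, 8 true positives, 2 false positives and 2 false negatives give a precision and recall of 0.8 each.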

The beneficial effects of the invention are that it solves the problems of difficult detection and low precision for defects of the penicillin bottle body, specifically:

1. The spatially separable pooling attention module SSPA is introduced, which enhances the network's perception of fine features such as bottle body scratches;

2. The feature enhancement module MMFF is introduced, which uses convolution kernels of different sizes to extract defect features of different sizes, such as dirt, scratches and abnormal loading, under a complex background, and improves the feature fusion effect by module stacking.

Drawings

Fig. 1 is an overall flow chart of the present invention.

FIG. 2 is an overall frame diagram of a defect target detection model of the present invention.

Fig. 3 shows defect samples of the penicillin bottle body according to the present invention, wherein the left, middle and right images are samples of dirt (Stain), scratch (Scratch) and abnormal loading (Load anomaly) respectively.

Fig. 4 is a schematic diagram of the operation of the spatially separable pooled attention module SSPA of the present invention.

Fig. 5 is a schematic structural diagram of a feature enhancement module MMFF according to the present invention.

FIG. 6 is a schematic diagram of RepConv-Block according to the present invention.

Fig. 7 is a schematic structural diagram of MDSConv of the present invention.

Detailed Description

The invention is further described below with reference to the drawings and examples.

As shown in Fig. 1, a penicillin bottle body defect detection method based on an improved YOLOv8 comprises the following steps:

S1, acquiring and preprocessing penicillin bottle body defect data to obtain a training set and a test set.

The specific process of step S1 is as follows:

S11, the penicillin bottles enter the image acquisition area on a conveyor belt; during constant-speed conveying and inspection, a sensor is triggered and an industrial camera samples images of the penicillin bottles from multiple angles, acquiring 1241 original images at a resolution of 2448×2048;

The specific processes of target area positioning, image segmentation and data enhancement are as follows:

Image segmentation: on an actual production line the position of a penicillin bottle may deviate slightly, so the rectangular frame is finely adjusted and shrunk, the image is segmented and the background area is removed to obtain the bottle body area to be detected, which facilitates detection of any dirt, scratches and abnormal loading present;

Data enhancement: data enhancement operations are performed on the processed sample images, expanding the data volume to 3723 images by shear transformation (Shear: ±10° horizontal, ±10° vertical), noise (Noise: 1.05%) and hue transformation (Hue: between -15° and +15°).
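The three augmentation operations can be sketched with the standard library alone. The sketch makes several assumptions that the text does not state: the 1.05% noise is interpreted as salt-and-pepper noise on that fraction of pixels, shear is shown as a coordinate mapping rather than a full image warp, and all function names are hypothetical. Note that 3723 is exactly 3 × 1241.

```python
import colorsys
import math
import random

def shift_hue(rgb, degrees):
    """Rotate the hue of an (R, G, B) pixel with 0-255 channels by `degrees`."""
    r, g, b = (v / 255.0 for v in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    h = (h + degrees / 360.0) % 1.0
    return tuple(round(c * 255) for c in colorsys.hsv_to_rgb(h, s, v))

def shear_point(x, y, deg_x=0.0, deg_y=0.0):
    """Map (x, y) under a horizontal shear of deg_x and a vertical shear of deg_y."""
    new_x = x + math.tan(math.radians(deg_x)) * y
    new_y = y + math.tan(math.radians(deg_y)) * x
    return new_x, new_y

def add_noise(pixels, fraction=0.0105, rng=None):
    """Set `fraction` of the grayscale pixels to black or white (assumed noise type)."""
    rng = rng or random.Random(0)
    out = list(pixels)
    k = round(len(out) * fraction)
    for i in rng.sample(range(len(out)), k):
        out[i] = rng.choice((0, 255))
    return out

def random_augmentation_params(rng):
    """Draw parameters within the ranges given in the description."""
    return {
        "shear_x": rng.uniform(-10.0, 10.0),
        "shear_y": rng.uniform(-10.0, 10.0),
        "hue": rng.uniform(-15.0, 15.0),
    }
```

When shear is applied to labeled images, the bounding box corners must be mapped with the same transform so that annotations stay aligned.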

As shown in Fig. 3, the penicillin bottle body defects are of three types: dirt (Stain), scratch (Scratch) and abnormal loading (Load anomaly).

S2, establishing a defect target detection model based on the improved YOLOv8, as shown in Fig. 2;

YOLOv8 is used as the basic framework. A spatially separable pooling attention module SSPA is introduced at a shallow layer of the backbone network (Backbone) to enhance the network's perception of fine features such as bottle body scratches; a feature enhancement module MMFF is introduced at the neck (Neck) to replace a C2f module, using convolution kernels of different sizes to extract features of defects such as dirt, scratches and abnormal loading and being stacked to improve the feature fusion effect; and the detection head (Head) outputs the defect targets.

The Backbone contains CBS (Conv-BN-SiLU: convolution layer, batch normalization layer, SiLU activation function layer) modules, C2f feature enhancement modules, the spatially separable pooling attention module SSPA, and a spatial pyramid pooling module SPPF. The Backbone receives the original image and gradually extracts high-order semantic features from it through alternately stacked CBS and C2f modules; SSPA provides feature enhancement; finally, the SPPF module aggregates multi-scale context information in the deep network to generate multi-scale feature maps.

The CBS module performs preliminary feature extraction with a 3×3 standard convolution block, followed by a batch normalization layer BN and a SiLU activation function, which accelerates training convergence and improves calculation efficiency.

The C2f module consists of three stages, splitting, processing and splicing, and preserves both shallow details and deep semantics. The C2f module takes the feature map output by the CBS module as input and splits it evenly into two parts along the channel dimension, which are joined by a residual connection: the first part is passed through directly, and the second part is processed by stacked feature fusion bottleneck structures (Bottleneck). The two parts are then spliced along the channel dimension, and a 1×1 convolution module adjusts the number of channels to give the final output feature map. The Bottleneck structure comprises a 1×1 convolution module, a 3×3 convolution module and a residual connection. The 1×1 convolution module compresses the number of channels of the feature map to reduce the amount of calculation, the 3×3 convolution module performs spatial feature extraction, and the residual connection adds the input features to the extracted features to avoid vanishing gradients.

In step S2, as shown in Fig. 4, the spatially separable pooling attention module SSPA comprises two parts, attention weight calculation and information aggregation, wherein the attention weight calculation comprises three parts: local feature extraction, global feature generation and spatial feature separation;

Given an input feature $X \in \mathbb{R}^{C \times H \times W}$, where $\mathbb{R}$ denotes the real number field and $C$, $H$, $W$ denote the three dimensions of channel number, height and width respectively, $X$ is divided into $g$ groups along the channel dimension, denoted $X = [X_1, X_2, \dots, X_g]$, where $X_i \in \mathbb{R}^{(C // g) \times H \times W}$ denotes the $i$-th group of features of $X$ and $//$ denotes integer division. Each group of features undergoes feature extraction and fusion through the three-part calculation; in this process the channel dimension is not changed and the amount of calculation is reduced. Finally, the groups of features are aggregated, which enhances the network's ability to capture various defect features such as dirt, scratches and abnormal loading.

The process of attention weight calculation includes:

;

where the convolution operator denotes a 1×1 convolution operation and the product operator denotes a matrix multiplication operation. In the two feature maps being multiplied, each pixel represents a locally refined feature in the current direction; the matrix multiplication of the two feature maps performs cross-dimension information interaction between local features of different spatial dimensions, which better establishes the relevance of the global feature information.

Information fusion is performed on the global feature map by applying a 1×1 standard convolution, and the global feature attention weight is obtained through a Sigmoid function, expressed as:

;

A 1×1 standard convolution followed by a Sigmoid function generates new weights; the new weights and the corresponding feature map are both split along the H dimension to obtain the local feature weight in the horizontal direction, the local feature map in the horizontal direction, the local feature weight in the vertical direction, and the local feature map in the vertical direction;

;

where one operator denotes the separation operation and the other is the Sigmoid function;

;

where the product operator denotes an element-by-element multiplication operation.

The information aggregation process comprises the following steps:

;

Wherein the enhanced feature map;

;

The SPPF module is located at the end of the backbone network and is mainly used for multi-scale feature fusion. The SPPF module first compresses the number of channels of the input feature map with a 1×1 convolution, then applies max pooling three times in series with 5×5 pooling windows, splices the original feature map and the three pooling results along the channel dimension, and restores the number of channels with a 1×1 convolution.
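Why three serial 5×5 poolings provide multi-scale context can be seen in a small single-channel sketch. It assumes stride 1 and size-preserving padding, which the text does not state, and omits the two 1×1 convolutions; function names are hypothetical.

```python
def maxpool2d(x, k):
    """k x k max pooling with stride 1; windows are clipped at the borders,
    which is equivalent to padding with negative infinity."""
    h, w, p = len(x), len(x[0]), k // 2
    return [[max(x[rr][cc]
                 for rr in range(max(0, r - p), min(h, r + p + 1))
                 for cc in range(max(0, c - p), min(w, c + p + 1)))
             for c in range(w)]
            for r in range(h)]

def sppf_pools(x, k=5):
    """Return the four maps that SPPF splices along the channel dimension."""
    y1 = maxpool2d(x, k)
    y2 = maxpool2d(y1, k)
    y3 = maxpool2d(y2, k)
    return [x, y1, y2, y3]
```

Under these assumptions the second and third serial outputs equal single 9×9 and 13×13 max poolings of the input, so the spliced result mixes receptive fields of three sizes while only ever computing 5×5 windows.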

The Neck adopts an FPN-PAN bidirectional structure and comprises an up-sampling module, a 3×3 convolution module, a C2f module, a Concat splicing module and the feature enhancement module MMFF. The Neck receives the output of the backbone network as input. First, the up-sampling module in the feature pyramid network FPN enlarges the feature map; second, the 3×3 convolution module of the path aggregation network PAN reduces the feature map size; features are then spliced by the Concat splicing module and features of different scales are fused by the C2f module, which enhances semantic information and spatial information.

FPN structure: the top-down transfer path. The deep feature map is upsampled and passed to the shallow feature map, the feature maps are spliced along the channel dimension, and MMFF performs feature enhancement.

The feature enhancement module MMFF uses convolution kernels of different sizes to extract defect features and is stacked to improve the feature fusion effect, specifically as follows:

As shown in Fig. 5, MMFF takes a feature map as input and produces a feature map as output, capturing spatial feature information and performing feature fusion through two parallel branches. The first branch comprises a convolution layer Conv1×1, a batch normalization layer BN and an activation function layer SiLU, together denoted CBS; it performs a weighted sum over all channels at the same position to fuse channel features, and generates the first branch output from the input. The second branch comprises a multi-scale depth-wise separable convolution module MDSConv and N re-parameterized convolution modules RepConv-Block; it uses convolution kernels of different sizes to capture defect features, which helps the network focus on local details and a larger range of context information, while stacking multiple RepConv-Blocks enhances the feature representation capability and extracts more complex, higher-level features, and it generates the second branch output from the input. After the two branch outputs are added element by element, the final output is obtained through an additional CBS, and the number of channels does not change. CBS consists of a 1×1 standard convolution, BN and the activation function SiLU.

As shown in Fig. 6, RepConv-Block uses multi-branch convolution layers in the training stage and merges the multiple calculation modules into one in the inference stage by re-parameterizing the branch parameters onto the main branch, which reduces the amount of calculation and memory consumption and improves inference speed. RepConv-Block extracts and enhances target features by repeatedly applying convolution layers and activation functions.
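The idea of re-parameterization can be illustrated on a single channel. The sketch assumes the training-time branches are a 3×3 convolution and a 1×1 convolution whose outputs are summed; the actual branch layout of RepConv-Block is given only in Fig. 6, and batch normalization folding is omitted here. Function names are hypothetical.

```python
def conv2d_same(x, k):
    """Single-channel 2D cross-correlation with zero padding and an odd square kernel."""
    h, w, n = len(x), len(x[0]), len(k)
    p = n // 2
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            s = 0.0
            for i in range(n):
                for j in range(n):
                    rr, cc = r + i - p, c + j - p
                    if 0 <= rr < h and 0 <= cc < w:
                        s += x[rr][cc] * k[i][j]
            out[r][c] = s
    return out

def fuse_branches(k3, k1):
    """Fold a 1x1 kernel into the centre tap of a 3x3 kernel."""
    fused = [row[:] for row in k3]
    fused[1][1] += k1[0][0]
    return fused

def two_branch_forward(x, k3, k1):
    """Training-time form: run both branches and add their outputs."""
    a, b = conv2d_same(x, k3), conv2d_same(x, k1)
    return [[a[r][c] + b[r][c] for c in range(len(x[0]))] for r in range(len(x))]
```

Because convolution is linear, `conv2d_same(x, fuse_branches(k3, k1))` produces the same output as `two_branch_forward(x, k3, k1)` with a single kernel, which is why the merged form is cheaper at inference time.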

As shown in Fig. 7, MDSConv comprises 3×3, 5×5 and 7×7 depth-wise convolutions and a 1×1 point-wise convolution. The input feature map passes through the parallel multi-scale depth-wise convolutions, which extract spatial features of different sizes by grouped convolution along the channel dimension and reduce the amount of calculation; the residual connection benefits information flow in the deep network, avoids vanishing gradients and enhances feature expression capability; and after the branches are added element by element, the point-wise convolution performs feature aggregation.
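The saving from depth-wise separable convolution can be made concrete by counting weights. The sketch assumes each depth-wise branch covers all input channels (consistent with the branches being added element by element) and ignores biases and normalization parameters; function names and the 64-channel example are hypothetical.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights of a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k

def mdsconv_params(c_in, c_out, kernels=(3, 5, 7)):
    """Weights of parallel depth-wise k x k branches plus one 1x1 point-wise convolution."""
    depthwise = sum(c_in * k * k for k in kernels)  # one k x k filter per channel
    pointwise = c_in * c_out                        # 1x1 convolution mixes channels
    return depthwise + pointwise
```

With 64 input and 64 output channels, the three depth-wise branches plus the point-wise layer need 9408 weights, whereas a single standard 7×7 convolution alone needs 200704.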

PAN structure: the bottom-up transfer path. The shallow feature map is downsampled by a 3×3 convolution module with stride=2, shallow information is passed to the deep feature map, the maps are spliced along the channel dimension, and a C2f module performs feature fusion. By applying this bidirectional path structure and combining feature maps of different levels, the Neck avoids the lack of spatial detail in deep features and the insufficient semantic information in shallow features, and enhances the effect of multi-scale feature fusion.

Head: comprises a decoupled head and an Anchor-Free mechanism. The decoupled head separates the classification task and the regression task into two independent branches, which avoids interference between the tasks and improves classification precision and the stability of regression positioning. The Head adopts an Anchor-Free mechanism and consists of three detection layers, which use feature maps of different scales to detect target objects of different sizes. Each detection layer outputs the corresponding vectors through three branches (classification, confidence and regression); the results are screened by non-maximum suppression (NMS), and the predicted bounding boxes and categories of the targets are finally generated and marked in the original image. The Head is mainly responsible for converting the multi-scale feature maps output by the Neck into target detection results, and it outputs a vector containing the class probability of the target, the target confidence and the bounding box information.
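The NMS screening step can be sketched in a few lines. The sketch assumes boxes in (x1, y1, x2, y2) format and an IoU threshold of 0.5, neither of which is specified in the text, and it handles a single class; function names are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression; returns indices of the boxes that are kept."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_thresh for j in keep):
            keep.append(i)
    return keep
```

For multi-class detection such as the three defect types here, the same routine is typically run per class so that overlapping boxes of different categories do not suppress each other.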

S3, training the defect target detection model with the training set, optimizing the loss function, and updating the model weight parameters until the loss function converges.

In the invention, because the penicillin bottle defect dataset is custom-built, no pre-trained weights are used and the network is trained from scratch. A stochastic gradient descent (SGD) optimizer is used for 100 epochs, with an initial learning rate of 0.01, momentum of 0.937 and weight decay of 0.0005. Training uses a warm-up strategy with a warm-up period of 3 epochs; the input image size is set to 640×640 and the batch size to 16.
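The stated hyperparameters can be gathered into a configuration, together with a sketch of a warm-up schedule and a single SGD update. The linear shape of the warm-up and the constant rate afterwards are assumptions (the text gives only the warm-up length), and the update rule follows the common momentum-SGD form with L2 weight decay; all names are hypothetical.

```python
# Values taken from the description; the key names are illustrative.
TRAIN_CONFIG = {
    "epochs": 100,
    "initial_lr": 0.01,
    "momentum": 0.937,
    "weight_decay": 0.0005,
    "warmup_epochs": 3,
    "image_size": 640,
    "batch_size": 16,
    "pretrained": False,
}

def learning_rate(epoch, base_lr=0.01, warmup_epochs=3):
    """Linear warm-up over the first epochs, then the base rate (assumed shape)."""
    if epoch < warmup_epochs:
        return base_lr * (epoch + 1) / warmup_epochs
    return base_lr

def sgd_step(w, grad, velocity, lr, momentum=0.937, weight_decay=0.0005):
    """One scalar SGD update with momentum and L2 weight decay."""
    g = grad + weight_decay * w          # weight decay adds a pull toward zero
    velocity = momentum * velocity + g   # momentum accumulates past gradients
    return w - lr * velocity, velocity
```

Warm-up keeps the first updates small, which is helpful when training from scratch because the randomly initialized weights produce large, noisy gradients early on.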

In step S3, the training set images are input into the defect target detection model; after feature extraction by the backbone network (Backbone) and feature fusion by the neck (Neck), the detection head (Head) outputs the prediction boxes of the defect targets;

;

wherein: the cross-over ratio is represented by the ratio,In order to predict the frame of a picture,As a real frame of the image, the image is displayed,Is the center point of the prediction block,Is the center point of the real frame; Is thatAndIs a Euclidean distance of (2); is the diagonal length surrounding the prediction box and the true minimum rectangle; Is the aspect ratio difference of the predicted frame and the real frame; is an adjustment factor;
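The quantities defined above (intersection over union, center distance, enclosing-rectangle diagonal, aspect ratio difference and adjustment factor) match the widely used CIoU formulation. The sketch below computes that loss for a single pair of boxes. The expressions used for the aspect ratio term `v` and the factor `alpha` are taken from the standard CIoU definition and are an assumption here, since the source does not spell them out; the function name `ciou_loss` and the `(x1, y1, x2, y2)` box format are likewise hypothetical.

```python
import math


def ciou_loss(pred, gt, eps=1e-9):
    """CIoU loss for two boxes given as (x1, y1, x2, y2) with positive size."""
    # Intersection over union.
    iw = max(0.0, min(pred[2], gt[2]) - max(pred[0], gt[0]))
    ih = max(0.0, min(pred[3], gt[3]) - max(pred[1], gt[1]))
    inter = iw * ih
    w, h = pred[2] - pred[0], pred[3] - pred[1]
    wg, hg = gt[2] - gt[0], gt[3] - gt[1]
    iou = inter / (w * h + wg * hg - inter + eps)

    # Squared Euclidean distance between the two center points.
    rho2 = ((pred[0] + pred[2]) / 2 - (gt[0] + gt[2]) / 2) ** 2 \
         + ((pred[1] + pred[3]) / 2 - (gt[1] + gt[3]) / 2) ** 2

    # Squared diagonal of the smallest rectangle enclosing both boxes.
    cw = max(pred[2], gt[2]) - min(pred[0], gt[0])
    ch = max(pred[3], gt[3]) - min(pred[1], gt[1])
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect ratio difference and adjustment factor (standard CIoU, assumed).
    v = (4 / math.pi ** 2) * (math.atan(wg / hg) - math.atan(w / h)) ** 2
    alpha = v / ((1 - iou) + v + eps)

    return 1 - iou + rho2 / c2 + alpha * v


same = ciou_loss((10, 10, 50, 50), (10, 10, 50, 50))
apart = ciou_loss((0, 0, 10, 10), (20, 20, 30, 30))
```

For identical boxes the loss is essentially zero; for disjoint boxes the center-distance term keeps the loss informative even though the IoU is zero, which is the reason for preferring this form over a plain IoU loss.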

The target classification loss function is given by:

;

The confidence loss is given by:

;

After the 100 epochs of training are completed, the weights with the highest precision are selected to obtain the optimal model for penicillin bottle defect detection.

S4, testing the defect target detection model with the test set.

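The performance of the defect target detection model on penicillin bottle defect detection is evaluated with the following evaluation indexes: $mAP$, precision $P$ and recall $R$. $AP$ denotes the average precision of a single class, and $mAP$ is the mean of the $AP$ values over all classes. They are calculated as follows:

$$P = \frac{TP}{TP + FP}, \quad R = \frac{TP}{TP + FN}, \quad AP = \int_{0}^{1} P(R)\,dR, \quad mAP = \frac{1}{N} \sum_{i=1}^{N} AP_{i}$$

where $TP$, $FP$ and $FN$ are the numbers of true positives, false positives and false negatives, respectively, and $N$ is the number of defect classes.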
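As a rough illustration of how the average precision of one class and the mean over classes can be computed, the sketch below takes detections already sorted by descending confidence and flagged as correct or incorrect. The all-point interpolation of the precision-recall curve, the function names, and the numeric values are assumptions for illustration only; they are not results from the invention.

```python
def average_precision(is_tp, num_gt):
    """Area under the precision-recall curve for one class.

    is_tp  : booleans for detections sorted by descending confidence,
             True when the detection matches a ground-truth defect.
    num_gt : number of ground-truth defects of this class.
    """
    tp = fp = 0
    precisions, recalls = [], []
    for hit in is_tp:
        if hit:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_gt)

    # Make precision non-increasing from right to left (curve envelope).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Sum rectangle areas between consecutive recall values.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(ap_per_class):
    """Mean of the per-class AP values."""
    return sum(ap_per_class.values()) / len(ap_per_class)


# Illustrative inputs only, not measured data.
ap_example = average_precision([True, False, True], num_gt=2)
map_example = mean_average_precision({"stain": 1.0, "scratch": 0.5, "load": 0.0})
```

The class names reuse the three defect categories of the dataset; the AP values passed to `mean_average_precision` are placeholders.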

The invention was trained and tested, and compared with existing advanced target detection models. The $AP$ of each defect category in the dataset and the overall $mAP@50$ are shown in Table 1, where stain, scratch and load denote the defect types of dirt, scratch and abnormal loading, respectively. $mAP@50$ is a specific form of $mAP$ in which a prediction with $IoU \geq 0.5$ is considered a correct detection.

Where Size represents the input image Size.

As can be seen from Table 1, the invention improves the detection precision for every type of penicillin bottle defect compared with the baseline network, and its overall detection performance is better than that of the existing advanced target detection networks.

Claims

1. A penicillin bottle defect detection method based on an improved YOLOv8, characterized by comprising the following steps:

S2, establishing a defect target detection model based on the improved YOLOv8;

In the step S2, the space separable pooling attention module SSPA comprises two parts of attention weight calculation and information aggregation, wherein the attention weight calculation comprises three parts of local feature extraction, global feature generation and space separation feature;

Given an input feature $X \in \mathbb{R}^{C \times H \times W}$, where $\mathbb{R}$ denotes the real number field and $C$, $H$ and $W$ denote the three dimensions of channel number, height and width, respectively, $X$ is divided into $G$ groups, denoted as $X = [X_{1}, X_{2}, \ldots, X_{G}]$, where $X_{i}$ denotes the $i$-th group of features of $X$, $i = 1, 2, \ldots, G$; feature extraction and fusion are performed on each group of features through the three-part calculation, and finally the groups of features are aggregated;

S4, testing the defect target detection model with the test set.

2. The penicillin bottle defect detection method based on the improved YOLOv8 according to claim 1, wherein the specific process of step S1 is as follows:

3. The penicillin bottle defect detection method based on the improved YOLOv8 according to claim 2, wherein in step S12, the specific processes of target area positioning, image segmentation and data enhancement are as follows:

4. The penicillin bottle defect detection method based on the improved YOLOv8 according to claim 3, wherein in step S2, the process of attention weight calculation includes:

;

Wherein, theIn order to perform the separation operation,Is a Sigmoid function;

;

Wherein, theRepresenting an element-by-element multiplication operation.

5. The penicillin bottle defect detection method based on the improved YOLOv8 according to claim 4, wherein in step S2, the information aggregation process is as follows:

;

Wherein the enhanced feature map;

;

6. The penicillin bottle defect detection method based on the improved YOLOv8 according to claim 5, wherein in step S2, the feature enhancement module MMFF uses convolution kernels of different sizes to extract defect features and improves the feature fusion effect through module stacking, specifically as follows:

7. The penicillin bottle defect detection method based on the improved YOLOv8 according to claim 6, wherein in step S2, the RepConv-Block uses multi-branch convolution layers in the training stage and merges the multiple computation modules into one in the inference stage, with the parameters of the branches re-parameterized onto the main branch; the RepConv-Block extracts and enhances target features by repeatedly applying convolution layers and activation functions; the MDSConv comprises 3×3, 5×5 and 7×7 depth-wise convolutions and a 1×1 point-wise convolution; the input feature map undergoes parallel multi-scale depth-wise convolution, in which grouped convolution extracts spatial features of different sizes along the channel dimension, and after the branches are combined by element-wise addition, feature aggregation is achieved by the point-wise convolution.

8. The penicillin bottle defect detection method based on the improved YOLOv8 according to claim 6, wherein in step S3, the training set images are input into the defect target detection model, and after feature extraction by the backbone network (Backbone) and feature fusion by the neck (Neck), the detection Head outputs the prediction boxes of the defect targets;

;

The target classification loss function is given by:

;

The confidence loss is given by:

;

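9. The penicillin bottle defect detection method based on the improved YOLOv8 according to claim 1, wherein in step S4, the performance of the defect target detection model on penicillin bottle defect detection is evaluated with evaluation indexes comprising $mAP$, precision $P$ and recall $R$; $AP$ denotes the average precision of a single class, and $mAP$ is the mean of the $AP$ values over all classes; they are calculated as follows:

$$P = \frac{TP}{TP + FP}, \quad R = \frac{TP}{TP + FN}, \quad AP = \int_{0}^{1} P(R)\,dR, \quad mAP = \frac{1}{N} \sum_{i=1}^{N} AP_{i}$$

where $TP$, $FP$ and $FN$ are the numbers of true positives, false positives and false negatives, respectively, and $N$ is the number of defect classes;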