CN110555839B

Movatterモバイル変換

Info

Publication number: CN110555839B
Application number: CN201910843972.XA
Authority: CN
Inventors: 高斌斌; 高立钊; 贾佳亚; 戴宇荣; 沈小勇
Original assignee: Tencent Cloud Computing Beijing Co Ltd
Current assignee: Tencent Cloud Computing Beijing Co Ltd
Priority date: 2019-09-06
Filing date: 2019-09-06
Publication date: 2024-11-15
Anticipated expiration: 2039-09-06
Also published as: CN110555839A

Abstract

The invention discloses a defect detection and identification method, a device, computer equipment and a medium, and belongs to the field of computer vision detection and identification. The method obtains a mask map by dividing the background and the foreground in the target product image, positions a defect target in the target product image according to the spatial position distribution and the number of connected domains in the mask map of the target product image, and further identifies a target product image block corresponding to a target positioning frame, the dividing method converts the prediction of the defect shape and the boundary into the division of the defect foreground and the background, the defect locating block comprises a defect foreground and an image background which meet target conditions, and meanwhile, the defect locating method provided by the invention can locate the defect position more accurately, is beneficial to extracting main defect characteristics, reduces influence of mask noise and target product image background on defect type identification, and improves accuracy of defect type identification.

Description

Defect detection and identification method, device, computer equipment and storage medium

Technical Field

The present invention relates to the field of computer vision inspection, and in particular, to a defect inspection and identification method, apparatus, computer device, and computer readable storage medium.

Background

The defect detection and identification are widely applied to the fields of industrial production and manufacturing, quality monitoring and the like, such as liquid crystal panel defect identification, workpiece surface quality detection, cloth surface flaw identification and the like. Through defect detection, defects existing on the surface of a product can be found for repair by maintenance personnel in time to ensure the quality of the product, but in order to accurately judge whether the quality of the product is qualified, select which procedures to repair, and the like, careful analysis and fine recognition of images are required after target product images suspected to comprise the defects of the product are obtained. Therefore, a defect detection and identification method is needed to realize automatic positioning of defects on the surface of a product and intelligent identification of defect types.

The existing defect detection and identification methods mainly comprise three types: firstly, scaling an original target product image of a product to a fixed size, then using a convolutional neural network (Convolutional Neural Networks, CNN) to identify the defect type in the original target product image, in the actual implementation process, firstly sampling the original target product image by using a sliding window mode to obtain target product image blocks, and then using the CNN to carry out defect positioning on each target product image block and realize the identification of the defect type; firstly, constructing a cascade detection network by utilizing a target detection algorithm such as a Single-shot Detector (SSD), unified real-time target detection (You Only Look Once, YOLO) and the like to locate a target product image block, and then identifying the defect type of the located target product image block by using CNN; firstly, designing an architecture of a cascade self-encoder so as to carry out image segmentation on an original target product image of the obtained product surface to obtain a mask image of the target product image, further obtaining a minimum peripheral boundary frame of the target product image, thereby realizing defect positioning, and finally, sending the positioned target product image block into a CNN to realize accurate identification of defect types.

In carrying out the invention, the inventors have found that the prior art has at least the following problems:

In the first method, original image input is adopted when a target product image is sampled by using a sliding window mode, accurate identification is difficult to achieve when defects are too small, and defect positioning is performed by using the sliding window mode, so that the positions of the defects can only be roughly obtained, and the defect positioning is inaccurate; in the second method, the method for constructing the cascade detection network through the target detection algorithm is difficult to realize accurate segmentation of the defect boundary and shape, and the accurate positioning of the core defect position is affected; in the third method, when the mask pattern of the target product image has noise points or scattered distribution, the result obtained by using the positioning method based on the mask minimum peripheral frame is difficult to accurately express the position information of the defect, and further, the performance of identifying the defect type is affected, so that the defect type identification result is inaccurate.

Disclosure of Invention

The embodiment of the invention provides a defect detection and identification method, a device, computer equipment and a computer readable storage medium, which can solve the problems of inaccurate defect positioning and poor defect type identification precision in the related technology. The technical scheme is as follows:

in one aspect, a defect detection and identification method is provided, the method comprising:

acquiring a mask image of a target product image based on the target product image;

determining a target positioning frame in the target product image according to the spatial position distribution and the number of the connected domains in the mask image of the target product image, wherein the connected domains and the image background contained in the target positioning frame meet target conditions;

And identifying the target product image block corresponding to the target positioning frame in the target product image.

In one aspect, there is provided a defect detection and identification apparatus, the apparatus comprising:

the segmentation module is used for acquiring a mask image of the target product image based on the target product image;

The positioning module is used for determining a target positioning frame in the target product image according to the spatial position distribution and the number of the connected domains in the mask image of the target product image;

and the identification module is used for identifying the target product image block corresponding to the target positioning frame in the target product image.

In one possible implementation, the positioning module is further configured to:

when only one connected domain exists in the mask map of the target product image, determining a positioning frame of the connected domain as the target positioning frame, wherein the first connected domain is the largest connected domain in the mask map;

when two or more connected domains exist in the mask map of the target product image, the target positioning frame is determined according to the area occupation ratio of the merging frame and the merging mask occupation ratio.

When the area ratio of the merging frame meets a first value range or the area ratio of the merging mask meets a second value range, determining a positioning frame of the first connected domain as a target positioning frame;

when the area ratio of the merging frame does not meet the first value range and the area ratio of the merging mask does not meet the second value range, determining a locating frame which is positioned in the locating frame of the first merging domain and comprises the locating frame of the first communicating domain in the mask map as the target locating frame, wherein the first merging domain is obtained by merging all the communicating domains.

In a possible implementation manner, the positioning module is further configured to determine a positioning frame of the first connected domain in the mask map as an initial positioning frame;

the positioning module is further configured to determine, based on the nearest connected domain of the first connected domain, a positioning frame of a second merged domain, where the second merged domain includes the first merged domain and the nearest connected domain of the first connected domain;

a calculating module, configured to calculate the merge frame area ratio and the merge mask ratio of the second merge domain;

And the positioning module is further used for determining the amplified positioning frame as the target positioning frame when the area ratio of the combined frame meets the first value range or the combined mask meets the second value range.

In one possible implementation, the apparatus further includes:

The extraction module is used for extracting a feature map of the target product image through a convolutional neural network of the defect detection model;

The pyramid module inputs the feature images into the space pyramid module of the defect detection model to obtain feature images with different granularities of the target product image;

the up-sampling module is used for up-sampling the feature graphs with different granularities through the space pyramid module to obtain a final feature graph;

and the segmentation mask extraction module is used for extracting the defect mask by using a convolution layer of 1x1 based on the final feature map, and obtaining a mask map of the target product image.

In one possible implementation, the apparatus further includes:

The intercepting module is used for intercepting a square target product image block by taking the longest side of the target positioning frame as the side length based on the center of the target positioning frame;

the identification module is also used for identifying the square target product image block.

In one aspect, a computer device is provided that includes one or more processors and one or more memories having stored therein at least one program code loaded and executed by the one or more processors to implement the operations performed by the defect detection identification method.

In one aspect, a computer readable storage medium having stored therein at least one program code loaded and executed by a processor to perform operations performed by the defect detection identification method is provided.

Obtaining a mask map by dividing the background and the foreground in the target product image, positioning a defect target frame in the target product image according to the spatial position distribution and the number of connected domains in the mask map of the target product image, further identifying a target product image block corresponding to the target positioning frame, converting the prediction of the defect shape and the boundary into the division of the defect foreground and the background by the dividing method, the method for locating the target can accurately determine the position of the defect, is favorable for extracting main defect characteristics, reduces influence of mask noise and target product image background on defect type identification, and improves accuracy of defect type identification.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a schematic diagram of an implementation environment of a defect detection and identification method according to an embodiment of the present invention;

FIG. 2 is a schematic structural diagram of a defect detection recognition model according to an embodiment of the present invention;

FIG. 3 is a flowchart of a defect detection and identification method according to an embodiment of the present invention;

Fig. 4 is a schematic structural diagram of a spatial pyramid module according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of connected domain merging according to an embodiment of the present invention;

FIG. 6 is a schematic diagram of a positioning result of a target positioning frame according to an embodiment of the present invention;

FIG. 7 is a schematic diagram of a defect detecting and identifying apparatus according to an embodiment of the present invention;

fig. 8 is a schematic structural diagram of a computer device according to an embodiment of the present invention.

Detailed Description

For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the embodiments of the present invention will be described in further detail with reference to the accompanying drawings.

Artificial intelligence (ARTIFICIAL INTELLIGENCE, AI) is the theory, method, technique, and application system that simulates, extends, and extends human intelligence using a digital computer or a machine controlled by a digital computer, perceives the environment, obtains knowledge, and uses the knowledge to obtain optimal results. In other words, artificial intelligence is an integrated technology of computer science that attempts to understand the essence of intelligence and to produce a new intelligent machine that can react in a similar way to human intelligence. Artificial intelligence, i.e. research on design principles and implementation methods of various intelligent machines, enables the machines to have functions of sensing, reasoning and decision.

The artificial intelligence technology is a comprehensive subject, and relates to the technology with wide fields, namely the technology with a hardware level and the technology with a software level. Artificial intelligence infrastructure technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and other directions.

Computer Vision (CV) is a science of studying how to "look" a machine, and more specifically, to replace a human eye with a camera and a Computer to perform machine Vision such as recognition and measurement on a target, and further perform graphic processing to make the Computer process an image more suitable for human eye observation or transmission to an instrument for detection. As a scientific discipline, computer vision research-related theory and technology has attempted to build artificial intelligence systems that can acquire information from images or multidimensional data. Computer vision techniques typically include image processing, image recognition, image semantic understanding, image retrieval, optical character recognition (Optical Character Recognition, OCR), video processing, video semantic understanding, video content/behavior recognition, three-Dimensional object reconstruction, three-Dimensional (3D) techniques, virtual reality, augmented reality, synchronous positioning, and map construction, among others, and also include common biometric recognition techniques such as face recognition, fingerprint recognition, and others.

In the field of images, the semantics refer to the content of the images, and the segmentation means that different objects in the image are segmented from the angle of pixels, and each pixel in the original image is identified.

Machine learning (MACHINE LEARNING, ML) is a multi-domain interdisciplinary, involving multiple disciplines such as probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and the like. It is specially studied how a computer simulates or implements learning behavior of a human to acquire new knowledge or skills, and reorganizes existing knowledge structures to continuously improve own performance. Machine learning is the core of artificial intelligence, a fundamental approach to letting computers have intelligence, which is applied throughout various areas of artificial intelligence. Machine learning and deep learning typically include techniques such as artificial neural networks, confidence networks, reinforcement learning, transfer learning, induction learning, teaching learning, and the like.

CNN is a feedforward neural network, whose artificial neurons can respond to surrounding units within a part of coverage, and has excellent performance for large-scale image processing, and specifically includes a convolution layer, a pooling layer, a normalization-coating layer, a random inactivation layer, an activation function layer, and the like.

With research and advancement of artificial intelligence technology, research and application of artificial intelligence technology is being developed in various fields, such as common smart home, smart wearable devices, virtual assistants, smart speakers, smart marketing, unmanned, automatic driving, unmanned aerial vehicles, robots, smart medical treatment, smart customer service, etc., and it is believed that with the development of technology, artificial intelligence technology will be applied in more fields and with increasing importance value.

The defect detection generally refers to detection of surface defects of an article, wherein the surface defects are detected by adopting advanced computer vision detection technology, such as spots, pits, scratches, chromatic aberration, defects and the like on the surface of a workpiece.

The scheme provided by the embodiment of the invention relates to the technologies of artificial intelligence, such as machine learning, computer vision technology and the like, and is specifically described by the following embodiments:

fig. 1 is a schematic diagram of an implementation environment of a defect detection and identification method according to an embodiment of the present invention, referring to fig. 1, the implementation environment includes: a computer device 101.

The computer device 101 may be at least one of a desktop graphics processor (Graphic Processing Unit, GPU) computer, a GPU computing cluster, a neural network computer, or the like. The related technician can use the computer equipment 101 to process the product image, find out the defective product, and ensure the product quality. The computer device 101 may process the image input therein, illustratively, the computer device 101 is connected to the camera assembly to automatically acquire the image and process the image, or the related technician may input the image into the computer device to process the image, which is not limited by the image acquisition mode of the present invention. Optionally, the computer device 101 may also have at least one image database, such as a defect type database, a defect image database, etc., for storing possible defect types and acquired defect images.

Fig. 3 is a flowchart of a defect detection and identification method according to an embodiment of the present invention, referring to fig. 3, the method includes:

301. The computer device obtains an image of the target product.

It should be noted that, the computer device may acquire the target product image through the camera component connected with the computer device, or the related technician may input the target product image into the computer device, and the embodiment of the present invention does not limit a specific manner of acquiring the target product image.

302. The computer device obtains a feature map of the target product image based on the target product image.

In step 302, multiple feature extraction layers may be used to implement feature extraction, and a network of more layers may iteratively extract more complex features from low-level features.

303. The computer equipment inputs the feature map into a space pyramid module to obtain feature maps with different granularities of the target product image.

It should be noted that the main purpose of the spatial pyramid module is to integrate context information of different levels to enrich the feature representation of the image. Fig. 4 is a schematic diagram of a specific structure of a spatial pyramid module according to an embodiment of the present invention, where the spatial pyramid module uses a plurality of hierarchical pooling kernels, and may obtain feature maps with different granularities, as shown in fig. 4, where the spatial pyramid module is a part of a defect detection model used in the mask prediction process in fig. 2.

The context information may be some or all of information that can affect the scene and the object in the image, and the context information is not directly obtained from the appearance of the object, but is obtained from data in the neighborhood, labels of the object, spatial position distribution of the object, or statistical data information. In the actual process, the interaction information between different objects can be captured, and the interaction information between the objects and the scene is used as a condition to identify and process the targets.

304. The computer device obtains a final feature map based on the feature maps of different granularities.

In one possible implementation manner, the computer device uses a convolution layer to perform dimension reduction processing on feature graphs with different granularities to obtain a feature graph after dimension reduction, uses bilinear interpolation to perform up-sampling processing on the feature graph after dimension reduction, and finally connects the feature graphs after up-sampling processing in series to obtain an output of the spatial pyramid module as a final feature graph.

The final feature map is a final feature representation, which contains information about local and global contexts.

305. The computer device obtains a mask map of the target product image based on the final feature map.

In one possible implementation, the computer device uses the final feature map as an input to a segmentation processing layer of a defect detection model to obtain a mask map of the target product image, the mask map being a mask prediction result at a pixel level.

It should be noted that, the defect detection model may be a neural network model based on a depth semantic segmentation algorithm, for example, the depth semantic segmentation algorithm may be a two-class semantic segmentation algorithm, through which a mask image of the target product image may be obtained, so as to implement prediction of a pixel level of the target product image.

It should be noted that, the above steps 302 to 305 may be replaced by other methods to predict the defect mask map, and the embodiment of the present invention does not limit what method is specifically adopted, for example, a template matching method may be used to implement the prediction of the defect mask map.

306. The computer device detects the spatial position distribution and the number of connected domains in the mask map of the target product image, performs step 307 when only one connected domain in the mask map of the target product image is detected, and performs step 308 when two or more connected domains in the mask map of the target product image are detected.

It should be noted that, the computer device may determine, according to pixel values of different pixel points in the mask map of the target product image, pixel points having the same or similar pixel values and being adjacent to each other in position, so as to determine positions of connected domains, and pixel points having different or less similar pixel values may form different connected domains.

In the embodiment of the present invention, it may be assumed that n connected domains exist in the mask map of the target product image, and the set of n connected domains may be denoted as c= { C₁,c₂,…,c_n }, where n may be any positive integer greater than or equal to 1.

It should be noted that, there may be a case where there is no connected domain in the mask map of the target product image, that is, C is an empty set, and in this case, the computer device may not execute the subsequent steps any more.

307. The computer device performs step 314 with the location frame of the connected domain as the target location frame.

In one possible implementation manner, when the computer device detects that only one connected domain exists in the mask image of the target product image, the positioning frame of the connected domain can include all defects in the mask image of the target product image to the greatest extent, so that the computer device can define a positioning frame with any size and any position in the mask image of the target product image as a target positioning frame m in advance, and further can update the position of the target positioning frame m as a positioning frame b₁ of the connected domain, namely m≡b₁.

It should be noted that, in the mask chart of the target product image, the positioning frame of any one of the connected domains c_i may be denoted as b_i, and the positioning frame of the connected domain may be denoted as b₁, and the target positioning frame may be denoted as m.

308. The computer device determines a first connected domain from the two or more connected domains, determines a positioning frame of the first connected domain as an initial target positioning frame, and the first connected domain is a largest connected domain in the mask map.

In one possible implementation manner, the computer device detects spatial position distribution and number of pixels included in each connected domain in the two or more connected domains, determines an area of each connected domain according to the spatial position and number of pixels in each connected domain, and then finds a maximum connected domain in the mask map, and determines a positioning frame of the first connected domain as an initial positioning frame as the first connected domain. For example, the computer device may take the largest connected domain in the mask image of the target product image as the first connected domain, i.eTaking a positioning frame of the maximum connected domain in the mask image of the target product image as an initial positioning frame, namely

Here, arg max_i area(c_i) may refer to the value of i when area (c_i) is maximized, and area (c_i) may represent the area of the connected domain c_i.

It should be noted that, when the computer device detects that there are two or more connected domains in the mask map of the target product image, the method for balancing and positioning defects between positioning frames based on the first connected domain and all connected domains provided by the embodiment of the invention has the core idea that the largest connected domain is adopted as an initial solution of a defect position, and adjacent connected domains are continuously absorbed until a certain balance is achieved.

309. The computer device determines a second connected domain closest to the first connected domain and a positioning frame of the second connected domain based on the first connected domain.

In one possible implementation manner, the computer device may detect the center and the boundary of each connected domain in the mask map, determine, in combination with the spatial position distribution information of the connected domain, a connected domain with the smallest distance between the center and the boundary of the first connected domain according to the detected result, and determine the connected domain as the second connected domain.

Fig. 5 is a schematic diagram of connected domain merging provided in the embodiment of the present invention, referring to fig. 5, a target positioning frame_m and a positioning frame b corresponding to a connected domain closest to the target positioning frame are respectively represented by rectangular frames as shown in fig. 5, and masks included in the rectangular frames are respectively circular and crescent. The circular mask region is a first communication region, and the crescent mask region is a second communication region.

310. The computer device determines a merge frame area ratio and a merge mask area ratio of the first and second connected domains, the merge frame area ratio being used to represent a ratio between an area and a value of a location frame of the two or more connected domains and an area of a location frame of the merged domain after the connected domains are merged. The merge mask ratio is used to represent the ratio between the area sum value of the two or more connected domains and the area of the positioning frame of the merged domain after the connected domains are merged.

It should be noted that, the area ratio of the merging frame is defined as the ratio of the area of the locating frame corresponding to the target locating frame and the connected domain closest to the target locating frame to the area of the locating frame of the merging domain, namely area (uni)/area (clo), wherein area () may represent the area, the connected domain closest to the target locating frame may represent c, the corresponding locating frame may represent b,_un i may represent the area where b and the target locating frame m merge, namely uni≡m per b, and clo may represent the locating frames of m and b, namely clo+m, b ].

It should be noted that, the merging mask ratio is defined as a ratio of a defect mask area in a target positioning frame and a positioning frame corresponding to a connected domain closest to the target positioning frame to an area of the merging domain, that is, mask (clo)/area (clo), where mask () may represent the defect mask area, area () may represent the area, the connected domain closest to the target positioning frame may be represented as c, its corresponding positioning frame may be represented as b, and clo may represent the positioning frames of m and b.

Referring to the connected domain merging schematic diagram shown in fig. 5, the merging frame area ratio is the ratio of the sum of the areas of the frames m and b to the locating frame area of the merging domain, and the merging mask ratio is the ratio of the sum of the circular and crescent mask areas to the locating frame area of the merging domain.

It should be noted that, according to the calculated values of the merging frame area ratio and the merging mask ratio, the computer device may determine the target positioning frame by adopting the defect positioning strategy based on the spatial distribution of the defect mask map provided by the present invention, and the specific implementation method is as follows, which is shown in steps 311 to 313:

311. When the computer device detects that the area ratio of the merging frame meets the first value range or the area ratio of the merging mask meets the second value range, step 312 is executed, otherwise, step 313 is executed.

It should be noted that, the computer device may preset two thresholds, which are respectively denoted as a merge frame area duty ratio threshold τ₁ and a merge mask duty ratio threshold τ₂, where the merge frame area duty ratio meeting a first value range may be that the merge frame area duty ratio is smaller than the merge frame area duty ratio threshold τ₁, and the merge mask duty ratio meeting a second value range may be that the merge mask duty ratio is smaller than the merge mask duty ratio threshold τ₂.

312. The computer device determines the location box of the first connected domain as a target location box, and performs step 314.

It should be noted that there may be a special case where the merge frame area ratio and the merge mask ratio are both 1, and when the computer device detects that the merge frame area ratio and the merge mask ratio are both 1, it may be determined that only one connected domain, which is a first connected domain, is included in the mask map, and therefore, the computer device may directly determine the positioning frame of the first connected domain as the target positioning frame. Fig. 6 is a schematic diagram of a positioning result of a target positioning frame according to an embodiment of the present invention, referring to fig. 6, a rectangular frame indicated by 601 in the figure is a target positioning frame.

313. The computer device uses the merged domain of the first and second connected domains as the first connected domain, and continues to execute the step 309 and the subsequent steps.

In one possible implementation manner, the computer device merges the second connected domain nearest to the first connected domain according to the comparison result, expands the representative region of the connected domain, and updates the target positioning frame, that is, m≡m, b ], [ m, b ] may represent the positioning frames of the frames m and b.

It should be noted that, the computer device may preset two thresholds, which are respectively denoted as a merge frame area duty ratio threshold τ₁ and a merge mask duty ratio threshold τ₂, when the computer device detects that the merge frame area duty ratio satisfies a first value range or the merge mask duty ratio satisfies a second value range, that is, the merge frame area duty ratio is smaller than the merge frame area duty ratio threshold τ₁ or the merge mask duty ratio is smaller than the merge mask duty ratio threshold τ₂, it is unnecessary to search for a connected domain closest to the currently determined connected domain, and the currently determined positioning frame of the connected domain is the target positioning frame, and referring to fig. 6, the rectangular frame indicated by 603 in the figure is the target positioning frame.

There may be a special case where the area ratio of the merge frame and the merge mask ratio are both 0, and when the computer device detects that the area ratio of the merge frame and the merge mask ratio are both 0, the computer device determines the positioning frame of the merge domain of the first connected domain and the second connected domain in the mask map as the target positioning frame, see fig. 6, where the rectangular frame indicated by 602 in the diagram is the target positioning frame.

It should be noted that, when the number of connected domains is 1, the above steps 308 to 313 provide a cyclic processing procedure, the target positioning frame may be directly determined as the positioning frame of the connected domain, and when the number of connected domains is plural, the target positioning frame may be determined through the cyclic processing procedure, the current largest connected domain and the nearest connected domain may be combined at a time, and judged based on the cyclic cutoff condition, if any one of the cyclic cutoff conditions is met, the positioning frame of the current combined connected domain may be used as the target positioning frame, and if the cyclic cutoff condition is not met, the step 309 and the subsequent steps may be continuously performed using the current combined connected domain as the first connected domain until the value of the combined frame area ratio is smaller than the combined frame area ratio threshold τ₁ or the combined mask ratio is smaller than the combined mask ratio threshold τ₂ or no un-combined connected domain is used as the target positioning frame.

It should be noted that, in the above process, the position relationship between the connected domains and the area of the connected domains may be determined by detecting the spatial position distribution of the connected domains in step 308, and the computer device may obtain the connected domain set c= { C₁,c₂,…,c_n } based on the detected connected domains, and in the execution of the cyclic process, each time the merging process is performed once, the merged connected domain may be deleted from the connected domain set, and when the connected domain set is an empty set, it may be determined that there is no non-merged connected domain at present, and the cyclic process may be stopped.

314. The computer equipment intercepts square target product image blocks based on the center of the target positioning frame and taking the longest side of the target positioning frame as the side length.

It should be noted that, when the segmentation processing layer of the defect detection model is used to process the target product image block, the target product image block is required to be a square image block, so that the square image block needs to be cut out from the mask image of the target product image, in addition, the square target product image block is cut out by taking the longest edge of the target positioning frame as the side length, so that all the connected domains for determining the target positioning frame can be included in the square target product image, and accuracy of defect type identification can be ensured.

315. The computer device identifies the square target product image block.

In one possible implementation, the computer device scales the target product image block to a fixed size, determines a boundary box of a defect mask in the target product image, and combines the defect type data obtained by training according to the boundary box of the detected defect mask to realize the identification of the defect type.

It should be noted that, the foregoing steps 317 to 318 may be replaced by other methods to identify the defect types, which method is not limited to the specific method adopted in the embodiment of the present invention, for example, manual features such as Scale-INVARIANT FEATURE TRANSFORM (SIFT), directional gradient histogram (Histogram of Oriented Gradient, HOG), gray level co-occurrence matrix, wavelet features, etc. may be adopted, machine learning methods such as multi-classification support vector machine, random forest, etc. may be adopted, and deep learning such as convolutional neural network may also be used to identify the defect types.

The method for dividing the background and the foreground in the target product image to obtain a mask image, positioning a defect target in the target product image according to the spatial position distribution and the number of connected domains in the mask image of the target product image, and further identifying the target product image block corresponding to the target positioning frame, the method for dividing converts the prediction of the defect shape and the boundary into the division of the defect foreground and the background, realizes the more accurate prediction of the defect mask, the defect positioning block comprises the defect foreground and the image background meeting the target condition, the defect positioning method provided by the invention can more accurately position the defect, is beneficial to extracting main defect characteristics, reduces the influence of mask noise and target product image background on defect type recognition, improves the accuracy of defect type recognition, supports the recognition of various morphological defects, improves the accuracy of defect type recognition, realizes high-precision recognition of fine defects, and has good classification performance particularly for defects with too small defects and similar appearance characterization.

Any combination of the above optional solutions may be adopted to form an optional embodiment of the present invention, which is not described herein.

Fig. 7 is a schematic diagram of a defect detecting and identifying device according to an embodiment of the present invention, referring to fig. 7, the device includes:

An obtaining module 701, configured to obtain a mask map of a target product image based on the target product image;

A determining module 702, configured to determine a target positioning frame in the target product image according to the spatial position distribution and the number of connected domains in the mask map of the target product image;

The identifying module 703 is configured to identify a target product image block corresponding to the target positioning frame in the target product image.

In one possible implementation, the determining module is further configured to:

In one possible implementation, the apparatus further includes:

And the segmentation extraction module is used for obtaining a mask image of the target product image based on the final feature image and the convolution layer.

In one possible implementation, the apparatus further includes:

The device obtains a mask map by dividing the background and the foreground in the target product image, positions a defect target in the target product image according to the spatial position distribution and the number of connected domains in the mask map of the target product image, further identifies the target product image block corresponding to the target positioning frame, converts the prediction of the defect shape and the boundary into the division of the defect foreground and the background, the defect locating block comprises a defect foreground and an image background which meet target conditions, and meanwhile, the defect locating method provided by the invention can locate the defect position more accurately, is beneficial to extracting main defect characteristics, reduces influence of mask noise and target product image background on defect type identification, and improves accuracy of defect type identification.

It should be noted that: in the defect detection and identification device provided in the above embodiment, only the division of the above functional modules is used for illustration, and in practical application, the above functional allocation may be performed by different functional modules according to needs, that is, the internal structure of the computer device is divided into different functional modules, so as to complete all or part of the functions described above. In addition, the defect detection and identification device and the defect detection and identification method provided in the foregoing embodiments belong to the same concept, and detailed implementation processes of the defect detection and identification device and the defect detection and identification method are detailed in the method embodiments and are not repeated here.

Fig. 8 is a schematic structural diagram of a computer device according to an embodiment of the present invention. The computer device 800 may be: a smart phone, a tablet computer, an MP3 player (Moving Picture Experts Group Audio Layer III, motion picture expert compression standard audio plane 3), an MP4 (Moving Picture Experts Group Audio Layer IV, motion picture expert compression standard audio plane 4) player, a notebook computer, or a desktop computer. Computer device 800 may also be referred to by other names as user device, portable computer device, laptop computer device, desktop computer device, etc.

In general, the computer device 800 includes: one or more processors 801, and one or more memories 802.

Processor 801 may include one or more processing cores, such as a 4-core processor, an 8-core processor, and the like. The processor 801 may be implemented in at least one hardware form of DSP (DIGITAL SIGNAL Processing), FPGA (Field-Programmable gate array), PLA (Programmable Logic Array ). The processor 801 may also include a main processor, which is a processor for processing data in an awake state, also referred to as a CPU (Central Processing Unit ), and a coprocessor; a coprocessor is a low-power processor for processing data in a standby state. In some embodiments, the processor 801 may integrate a GPU (Graphics Processing Unit, image processor) for rendering and drawing of content required to be displayed by the display screen. In some embodiments, the processor 801 may also include an AI (ARTIFICIAL INTELLIGENCE ) processor for processing computing operations related to machine learning.

Memory 802 may include one or more computer-readable storage media, which may be non-transitory. Memory 802 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in memory 802 is used to store at least one program code for execution by processor 801 to implement the defect detection identification method provided by the method embodiments of the present invention.

In some embodiments, the computer device 800 may optionally further include: a peripheral interface 803, and at least one peripheral. The processor 801, the memory 802, and the peripheral interface 803 may be connected by a bus or signal line. Individual peripheral devices may be connected to the peripheral device interface 803 by buses, signal lines, or a circuit board. Specifically, the peripheral device includes: at least one of radio frequency circuitry 804, a display 805, a camera 806, audio circuitry 807, and a power supply 809.

Peripheral interface 803 may be used to connect at least one Input/Output (I/O) related peripheral to processor 801 and memory 802. In some embodiments, processor 801, memory 802, and peripheral interface 803 are integrated on the same chip or circuit board; in some other embodiments, either or both of the processor 801, the memory 802, and the peripheral interface 803 may be implemented on separate chips or circuit boards, which is not limited in this embodiment.

The Radio Frequency circuit 804 is configured to receive and transmit RF (Radio Frequency) signals, also known as electromagnetic signals. The radio frequency circuit 804 communicates with a communication network and other communication devices via electromagnetic signals. The radio frequency circuit 804 converts an electrical signal into an electromagnetic signal for transmission, or converts a received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuit 804 includes: antenna systems, RF transceivers, one or more amplifiers, tuners, oscillators, digital signal processors, codec chipsets, subscriber identity module cards, and so forth. The radio frequency circuitry 804 may communicate with other computer devices via at least one wireless communication protocol. The wireless communication protocol includes, but is not limited to: metropolitan area networks, various generations of mobile communication networks (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (WIRELESS FIDELITY ) networks. In some embodiments, the radio frequency circuit 804 may further include NFC (NEAR FIELD Communication) related circuits, which is not limited by the present invention.

The display 805 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display 805 is a touch display, the display 805 also has the ability to collect touch signals at or above the surface of the display 805. The touch signal may be input as a control signal to the processor 801 for processing. At this time, the display 805 may also be used to provide virtual buttons and/or virtual keyboards, also referred to as soft buttons and/or soft keyboards. In some embodiments, the display 805 may be one, providing a front panel of the computer device 800; in other embodiments, the display 805 may be at least two, respectively disposed on different surfaces of the computer device 800 or in a folded design; in still other embodiments, the display 805 may be a flexible display disposed on a curved surface or a folded surface of the computer device 800. Even more, the display 805 may be arranged in an irregular pattern other than rectangular, i.e., a shaped screen. The display 805 may be made of LCD (Liquid CRYSTAL DISPLAY), OLED (Organic Light-Emitting Diode), or other materials.

The camera assembly 806 is used to capture images or video. Optionally, the camera assembly 806 includes a front camera and a rear camera. Typically, the front camera is disposed on a front panel of the computer device and the rear camera is disposed on a rear surface of the computer device. In some embodiments, the at least two rear cameras are any one of a main camera, a depth camera, a wide-angle camera and a tele camera, so as to realize that the main camera and the depth camera are fused to realize a background blurring function, and the main camera and the wide-angle camera are fused to realize a panoramic shooting and Virtual Reality (VR) shooting function or other fusion shooting functions. In some embodiments, the camera assembly 806 may also include a flash. The flash lamp can be a single-color temperature flash lamp or a double-color temperature flash lamp. The dual-color temperature flash lamp refers to a combination of a warm light flash lamp and a cold light flash lamp, and can be used for light compensation under different color temperatures.

Audio circuitry 807 may include a microphone and a speaker. The microphone is used for collecting sound waves of users and the environment, converting the sound waves into electric signals, inputting the electric signals to the processor 801 for processing, or inputting the electric signals to the radio frequency circuit 804 for voice communication. For purposes of stereo acquisition or noise reduction, the microphone may be multiple, each disposed at a different location of the computer device 800. The microphone may also be an array microphone or an omni-directional pickup microphone. The speaker is used to convert electrical signals from the processor 801 or the radio frequency circuit 804 into sound waves. The speaker may be a conventional thin film speaker or a piezoelectric ceramic speaker. When the speaker is a piezoelectric ceramic speaker, not only the electric signal can be converted into a sound wave audible to humans, but also the electric signal can be converted into a sound wave inaudible to humans for ranging and other purposes. In some embodiments, audio circuit 807 may also include a headphone jack.

The power supply 809 is used to power the various components in the computer device 800. The power supply 809 may be an alternating current, direct current, disposable battery, or rechargeable battery. When the power supply 809 includes a rechargeable battery, the rechargeable battery may support wired or wireless charging. The rechargeable battery may also be used to support fast charge technology.

In some embodiments, the computer device 800 also includes one or more sensors 810. The one or more sensors 810 include, but are not limited to: acceleration sensor 811, gyroscope sensor 812, pressure sensor 813, optical sensor 815, and proximity sensor 816.

The acceleration sensor 811 can detect the magnitudes of accelerations on three coordinate axes of the coordinate system established with the computer device 800. For example, the acceleration sensor 811 may be used to detect components of gravitational acceleration in three coordinate axes. The processor 801 may control the display screen 805 to display a user interface in a landscape view or a portrait view based on the gravitational acceleration signal acquired by the acceleration sensor 811. Acceleration sensor 811 may also be used for the acquisition of motion data of a game or user.

The gyro sensor 812 may detect a body direction and a rotation angle of the computer device 800, and the gyro sensor 812 may collect a 3D motion of the user on the computer device 800 in cooperation with the acceleration sensor 811. The processor 801 may implement the following functions based on the data collected by the gyro sensor 812: motion sensing (e.g., changing UI according to a tilting operation by a user), image stabilization at shooting, game control, and inertial navigation.

Pressure sensor 813 may be disposed on a side frame of computer device 800 and/or on an underlying layer of display 805. When the pressure sensor 813 is disposed on a side frame of the computer device 800, a grip signal of the computer device 800 by a user may be detected, and the processor 801 performs left-right hand recognition or quick operation according to the grip signal collected by the pressure sensor 813. When the pressure sensor 813 is disposed at the lower layer of the display screen 805, the processor 801 controls the operability control on the UI interface according to the pressure operation of the user on the display screen 805. The operability controls include at least one of a button control, a scroll bar control, an icon control, and a menu control.

The optical sensor 815 is used to collect the ambient light intensity. In one embodiment, the processor 801 may control the display brightness of the display screen 805 based on the intensity of ambient light collected by the optical sensor 815. Specifically, when the intensity of the ambient light is high, the display brightness of the display screen 805 is turned up; when the ambient light intensity is low, the display brightness of the display screen 805 is turned down. In another embodiment, the processor 801 may also dynamically adjust the shooting parameters of the camera module 806 based on the ambient light intensity collected by the optical sensor 815.

A proximity sensor 816, also referred to as a distance sensor, is typically provided on the front panel of the computer device 800. The proximity sensor 816 is used to collect the distance between the user and the front of the computer device 800. In one embodiment, when the proximity sensor 816 detects a gradual decrease in the distance between the user and the front of the computer device 800, the processor 801 controls the display 805 to switch from the bright screen state to the off screen state; when the proximity sensor 816 detects that the distance between the user and the front of the computer device 800 gradually increases, the processor 801 controls the display 805 to switch from the off-screen state to the on-screen state.

Those skilled in the art will appreciate that the architecture shown in fig. 8 is not limiting and that more or fewer components than shown may be included or that certain components may be combined or that a different arrangement of components may be employed.

In an exemplary embodiment, a computer readable storage medium, such as a memory including program code executable by a processor to perform the defect detection identification method of the above embodiment, is also provided. For example, the computer readable storage medium may be Read-Only Memory (ROM), random-access Memory (Random Access Memory, RAM), compact disc Read-Only Memory (CD-ROM), magnetic tape, floppy disk, optical data storage device, and the like.

It will be appreciated by those of ordinary skill in the art that all or part of the steps of implementing the above embodiments may be implemented by hardware, or may be implemented by program code related hardware, where the program may be stored in a computer readable storage medium, where the storage medium may be a read only memory, a magnetic disk or optical disk, etc.

The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, alternatives, and improvements that fall within the spirit and scope of the invention.

Claims

1. A defect detection and identification method, the method comprising:

When only one connected domain exists in the mask diagram of the target product image, determining a positioning frame of the connected domain as a target positioning frame, wherein the foreground and the background determined by the target positioning frame meet target conditions;

When two or more connected domains exist in a mask map of the target product image, determining a positioning frame of a first connected domain in the mask map as an initial target positioning frame;

determining a positioning frame of a second merging domain based on the nearest connected domain of the first connected domain, wherein the second merging domain comprises the first connected domain and the nearest connected domain of the first connected domain;

Calculating a merging frame area occupation ratio and a merging mask occupation ratio of the second merging domain; the area ratio of the merging frame is used for representing the ratio between the area sum value of the positioning frames of the two or more connected domains and the area of the positioning frames of the merging domain after the connected domains are merged; the merging mask ratio is used for representing the ratio between the area sum value of the two or more connected domains and the area of the positioning frame of the merging domain after the connected domains are merged;

When the area ratio of the merging frames meets a first value range or the area ratio of the merging masks meets a second value range, determining a positioning frame of a first connected domain as the target positioning frame;

When the area ratio of the merging frames does not meet a first value range and the area ratio of the merging mask does not meet a second value range, determining one locating frame which is positioned in the locating frame of a first merging domain and comprises the locating frame of a first communicating domain in the mask map as the target locating frame, wherein the first merging domain is obtained by merging all the communicating domains;

2. The method of claim 1, wherein the obtaining a mask map of the target product image based on the target product image comprises:

extracting a feature map of the target product image through a convolutional neural network of the defect detection model;

inputting the feature map into a space pyramid module of the defect detection model to obtain feature maps with different granularities of the target product image;

the spatial pyramid module is used for carrying out up-sampling processing on the feature graphs with different granularities to obtain a final feature graph;

And acquiring a mask image of the target product image based on the final feature image through a segmentation extraction layer of the defect detection model.

3. The method of claim 1, wherein identifying the target product image block corresponding to the target positioning frame in the target product image comprises:

based on the center of the target positioning frame, taking the longest side of the target positioning frame as the side length, and intercepting a square target product image block;

and identifying the square target product image block.

4. A defect detection and identification device, the device comprising:

The positioning module is used for determining a positioning frame of the connected domain as a target positioning frame when only one connected domain exists in the mask image of the target product image, and the foreground and the background determined by the target positioning frame accord with target conditions; when two or more connected domains exist in a mask map of the target product image, determining a positioning frame of a first connected domain in the mask map as an initial target positioning frame;

A determining module, configured to determine, based on a nearest connected domain of the first connected domain, a positioning frame of a second merged domain, where the second merged domain includes the first connected domain and the nearest connected domain of the first connected domain;

The calculation module is used for calculating the area ratio of the merging frame and the merging mask of the second merging domain; the area ratio of the merging frame is used for representing the ratio between the area sum value of the positioning frames of the two or more connected domains and the area of the positioning frames of the merging domain after the connected domains are merged; the merging mask ratio is used for representing the ratio between the area sum value of the two or more connected domains and the area of the positioning frame of the merging domain after the connected domains are merged;

the positioning module is further configured to determine a positioning frame of the first connected domain as the target positioning frame when the area ratio of the merging frames meets a first value range or the area ratio of the merging masks meets a second value range; when the area ratio of the merging frames does not meet a first value range and the area ratio of the merging mask does not meet a second value range, determining one locating frame which is positioned in the locating frame of a first merging domain and comprises the locating frame of a first communicating domain in the mask map as the target locating frame, wherein the first merging domain is obtained by merging all the communicating domains;

5. The apparatus of claim 4, wherein the apparatus further comprises:

The segmentation module is used for extracting a characteristic diagram of the target product image through a convolutional neural network of the defect detection model;

the pyramid module is used for inputting the space pyramid module of the defect detection model based on the feature map to obtain feature maps with different granularities of the target product image;

And the segmentation extraction module is used for extracting the defect mask by using a convolution layer of 1x1 based on the final feature map, and obtaining a mask map of the target product image.

6. The apparatus of claim 4, wherein the apparatus further comprises:

the intercepting module is used for intercepting a square target product image block by taking the longest edge of the target positioning frame as the edge length based on the center of the target positioning frame;

And the identification module is also used for identifying the square target product image block.

7. A computer device comprising one or more processors and one or more memories, the one or more memories having stored therein at least one program code that is loaded and executed by the one or more processors to implement the operations performed by the defect detection identification method of any of claims 1-3.

8. A computer readable storage medium having stored therein at least one program code loaded and executed by a processor to perform the operations performed by the defect detection identification method of any of claims 1 to 3.