CN113706440B - Image processing method, device, computer equipment and storage medium

Info

Publication number
CN113706440B
Authority
CN
China
Prior art keywords
image
target
target area
mask
defect
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110269969.9A
Other languages
Chinese (zh)
Other versions
CN113706440A
Inventor
高斌斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202110269969.9A
Publication of CN113706440A
Application granted
Publication of CN113706440B
Status: Active
Anticipated expiration

Abstract

The application discloses an image processing method, an image processing device, computer equipment, and a storage medium, belonging to the technical field of image processing. Using a first image and a first mask image that annotates it, and transforming and processing the target area where a defect is located, a target image with a second image as background and the target area as foreground can be synthesized on the basis of the second image, which contains no defect. The target image not only contains the defect but also naturally carries the corresponding annotation information, namely the second mask image. This avoids introducing additional manual annotation for the second image, greatly relaxes the strict requirement on annotation information for the second image, reduces the training cost of the defect segmentation model, and helps improve the accuracy and efficiency of defect segmentation.

Description

Image processing method, device, computer equipment and storage medium
Technical Field
The present application relates to the field of image processing technologies, and in particular, to an image processing method, an image processing device, a computer device, and a storage medium.
Background
With the development of image processing technology, segmenting defective pixel regions on a product's surface during the quality inspection of industrial products has become a basic yet challenging task. This may also be called the defect segmentation task of the product image, that is, identifying the target area where a defect is located in the product image.
In recent years, with the development of artificial intelligence technology, deep-learning-based semantic segmentation algorithms can be used for the defect segmentation task on product images. However, such algorithms usually require large-scale training data to be manually annotated in advance, and, constrained by factory production scheduling, it is impractical to obtain the defect data of all products at once. As a result, for products lacking annotations, defect segmentation suffers from low recognition efficiency and poor accuracy.
Disclosure of Invention
The embodiment of the application provides an image processing method, an image processing device, computer equipment and a storage medium, which can improve the identification efficiency and accuracy of defect segmentation. The technical scheme is as follows:
in one aspect, there is provided an image processing method, the method including:
Acquiring a first image and a first mask image, wherein the first image is a surface texture image of a first product, and the first mask image is used for marking a target area where a defect in the first image is located;
transforming the target area in the first mask image to obtain a second mask image;
fusing the target area with a second image based on the second mask image to obtain a target image, wherein the second image is a surface texture image of a second product;
And carrying out parameter adjustment on the initial segmentation model based on the target images and the second mask images to obtain a defect segmentation model, wherein the defect segmentation model is used for identifying a target area where a defect is located in the surface texture image of the second product.
In one aspect, there is provided an image processing apparatus including:
an acquisition module, configured to acquire a first image and a first mask image, wherein the first image is a surface texture image of a first product, and the first mask image is used for marking a target area where a defect in the first image is located;
the transformation module is used for transforming the target area in the first mask image to obtain a second mask image;
the fusion module is used for fusing the target area with a second image based on the second mask image to obtain a target image, wherein the second image is a surface texture image of a second product;
and the parameter adjustment module is used for carrying out parameter adjustment on the initial segmentation model based on the target images and the second mask images to obtain a defect segmentation model, wherein the defect segmentation model is used for identifying a target area where a defect is located in the surface texture image of the second product.
In one possible implementation, the transformation module is configured to:
Randomly sampling from the defect angle interval to obtain a target angle;
and in the first mask image, performing rotation transformation on the target area according to the target angle to obtain the second mask image.
In one possible implementation, the transformation module is configured to:
randomly sampling from the defect size interval to obtain a target size;
And in the first mask image, converting the size of the target area into the target size to obtain the second mask image.
In one possible implementation, the transformation module is configured to:
In response to the second image and the first mask image being the same in size, randomly sampling from the first mask image to obtain a target position; in the first mask image, translating the target area to the target position to obtain the second mask image; or alternatively
In response to the second image and the first mask image being different in size, randomly sampling in the second image to obtain a target position; scaling the size of the first mask image to the same size as the second image; and translating the target area to the target position in the first mask image after the size scaling to obtain the second mask image.
In one possible embodiment, the apparatus further comprises:
And the truncation module is used for truncating the translated target area based on the edge of the first mask image in response to the edge of the translated target area exceeding the edge of the first mask image.
In one possible embodiment, the fusion module includes:
An acquisition unit configured to acquire a transformed target region based on the second mask image;
the superposition unit is used for superposing the transformed target area to the second image to obtain a third image;
and the smoothing unit is used for carrying out smoothing processing on the third image to obtain the target image.
In one possible embodiment, the superposition unit is configured to:
and carrying out poisson image fusion on the transformed target area and the second image to obtain the third image.
In one aspect, a computer device is provided that includes one or more processors and one or more memories having stored therein at least one computer program loaded and executed by the one or more processors to implement an image processing method as in any of the possible implementations described above.
In one aspect, a storage medium is provided, in which at least one computer program is stored, which is loaded and executed by a processor to implement an image processing method as any one of the possible implementations described above.
In one aspect, a computer program product or computer program is provided, the computer program product or computer program comprising one or more program codes, the one or more program codes being stored in a computer readable storage medium. The one or more processors of the computer device are capable of reading the one or more program codes from the computer-readable storage medium, the one or more processors executing the one or more program codes so that the computer device can execute the image processing method of any one of the possible embodiments described above.
The technical scheme provided by the embodiment of the application has the beneficial effects that at least:
By using the first image and the first mask image that annotates it, and by transforming and processing the target area where the defect is located, a target image with the second image as background and the target area as foreground can be synthesized on the basis of the second image, which contains no defect. The target image not only contains the defect but also naturally carries the corresponding annotation information, namely the second mask image, so the second image needs no additional manual annotation. This greatly relaxes the strict requirement on the annotation information of the second image, reduces the training cost of the defect segmentation model, and improves the accuracy and efficiency of defect segmentation.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic view of an implementation environment of an image processing method according to an embodiment of the present application;
FIG. 2 is a flowchart of an image processing method according to an embodiment of the present application;
FIG. 3 is a schematic illustration of a composite target image according to an embodiment of the present application;
FIG. 4 is a flowchart of an image processing method according to an embodiment of the present application;
FIG. 5 is a schematic illustration of a surface texture image of a product provided by an embodiment of the present application;
FIGS. 6 and 7 are block diagrams of a performance comparison experiment provided by an embodiment of the present application;
Fig. 8 is a schematic structural view of an image processing apparatus according to an embodiment of the present application;
FIG. 9 is a schematic diagram of a computer device according to an embodiment of the present application;
Fig. 10 is a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the embodiments of the present application will be described in further detail with reference to the accompanying drawings.
The terms "first," "second," and the like in this disclosure are used for distinguishing between similar elements or items having substantially the same function and function, and it should be understood that there is no logical or chronological dependency between the terms "first," "second," and "n," and that there is no limitation on the amount and order of execution.
The term "at least one" in the present application means one or more, meaning "a plurality" means two or more, for example, a plurality of first positions means two or more first positions.
Embodiments of the present application relate to machine learning in the field of artificial intelligence, and therefore, prior to introducing embodiments of the present application, some basic concepts in the field of artificial intelligence are first introduced, and are described below.
Artificial Intelligence (AI): artificial intelligence is the theory, method, technique, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain optimal results. In other words, artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a way similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, giving machines the abilities of perception, reasoning, and decision-making.
Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, spanning both hardware-level and software-level technologies. Basic AI infrastructure technologies generally include sensors, dedicated AI chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. AI software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning.
Computer Vision (CV): within AI, computer vision is a rapidly developing branch. It is the science of how to make machines "see": using cameras, computers, and other machines in place of human eyes to identify, track, and measure targets, and to further process images so that the result is better suited for human observation or for transmission to instruments for inspection. As a scientific discipline, computer vision studies the theory and technology for building artificial intelligence systems that can acquire information from images or multidimensional data. Computer vision typically includes image segmentation, image recognition, image retrieval, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, optical character recognition (OCR), video processing, 3D technology, virtual reality, augmented reality, and simultaneous localization and mapping, as well as common biometric technologies such as face recognition and fingerprint recognition.
Machine Learning (ML): machine learning is a multi-disciplinary field involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory, and more. It studies how a computer can simulate or implement human learning behavior to acquire new knowledge or skills, and reorganize existing knowledge structures to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent; it is applied throughout all areas of AI. Machine learning and deep learning typically include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstration.
Image segmentation: representing an image as a collection of physically meaningful connected regions; that is, based on prior knowledge of targets and background, marking and locating the targets and background in the image and then separating the targets from the background or from other pseudo-targets. Image segmentation plays an important role in image understanding applications such as target recognition, object tracking, and behavior analysis: it greatly reduces the amount of data to be processed in subsequent analysis and recognition while preserving information about the structural characteristics of the image. In the quality inspection of industrial products, segmenting defects (targets) from the surface texture image (background) of the product is a basic yet challenging task; it enables pixel-level segmentation of product surface defects and thereby automated product quality inspection.
With research and advancement in artificial intelligence technology, it has been studied and applied in many fields, such as smart homes, smart wearable devices, virtual assistants, smart speakers, smart marketing, autonomous driving, unmanned aerial vehicles, robots, smart healthcare, and smart customer service. It is believed that with the development of technology, artificial intelligence will be applied in ever more fields and deliver increasing value. The solution provided by the embodiments of the present application relates to a machine-learning-based image segmentation technique for automated product quality inspection, described in the following embodiments.
In the quality inspection process of industrial products, segmenting defects on the product surface has become a basic yet challenging task, which may also be called the defect segmentation task of the product image, that is, identifying the target area where a defect is located in the product image.
In recent years, with the development of artificial intelligence technology, deep-learning-based semantic segmentation algorithms can be used for the defect segmentation task on product images. However, such algorithms usually require large-scale training data to be manually annotated in advance, and, constrained by factory production scheduling, it is impractical to obtain the defect data of all products at once; as a result, for products lacking annotations, defect segmentation suffers from low recognition efficiency and poor accuracy. Meanwhile, although the surface texture characteristics of different products vary, the texture of the defect itself tends to be stable, for example scratches, missing material, and the like.
In view of the above, the embodiments of the present application provide an image processing method that, for a new product lacking annotations, can synthesize annotated defect data for the new product (target domain) based on already-annotated defect data (source domain), so that a defect segmentation model can be trained for the new product to perform surface defect segmentation. This is of great significance for the large-scale deployment of automated quality inspection products.
Fig. 1 is a schematic view of an implementation environment of an image processing method according to an embodiment of the present application. Referring to fig. 1, the implementation environment includes a terminal 101 and a server 102, and is described below:
the terminal 101 installs and runs an application supporting automated quality inspection. Optionally, the user inputs surface texture images of a batch of products into the application, and the application invokes a defect segmentation model to perform defect identification on each surface texture image and outputs the target area where the defect in the image is located.
The terminal 101 may be directly or indirectly connected to the server 102 through a wired or wireless communication manner, and the embodiment of the present application is not limited herein.
The server 102 is configured to provide background services for the application programs, where the server 102 includes at least one of a server, a plurality of servers, a cloud computing platform, or a virtualization center. Optionally, the server 102 takes on primary processing work and the terminal 101 takes on secondary processing work; or the server 102 takes on secondary processing work and the terminal 101 takes on primary processing work; or a distributed computing architecture is employed between the server 102 and the terminal 101 for collaborative image processing.
Optionally, the server 102 is configured to train the above-mentioned defect segmentation model. For example, based on an annotated surface texture image and the target area where its annotated defect is located, the server 102 adds the defect to normal (i.e., defect-free) surface texture images of the product to be identified, obtaining defect-carrying surface texture images of that product that already carry the defect annotation information; the initial segmentation model can then be trained on these images to obtain the defect segmentation model.
In some embodiments, server 102 is a stand-alone physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, CDNs (Content Delivery Networks), and big data and artificial intelligence platforms.
In some embodiments, the terminal 101 is a smart phone, tablet computer, notebook computer, desktop computer, smart speaker, smart watch, MP3 (Moving Picture Experts Group Audio Layer III) player, MP4 (Moving Picture Experts Group Audio Layer IV) player, e-book reader, or the like, but is not limited thereto.
Those skilled in the art will appreciate that the number of terminals 101 may be greater or lesser. For example, the number of the terminals 101 may be only one, or the number of the terminals 101 may be several tens or hundreds, or more. The number and device type of the terminals 101 are not limited in the embodiment of the present application.
Fig. 2 is a flowchart of an image processing method according to an embodiment of the present application. Referring to fig. 2, the embodiment is applied to a computer device, taking the computer device as a server as an example, and includes:
201. the server acquires a first image and a first mask image, wherein the first image is a surface texture image of a first product, and the first mask image is used for identifying a target area where a defect in the first image is located.
The server may be any computer device supporting image processing and is configured to train the initial segmentation model to obtain the defect segmentation model. Optionally, the server maintains the defect segmentation model in the cloud, so that the terminal interacts with the server to complete product quality inspection, saving the terminal's computing resources; or the server delivers the defect segmentation model to the terminal via a cold or hot update, so that the terminal can complete product quality inspection locally, saving communication overhead.
The first image is a surface texture image of a first product; the first product includes, but is not limited to: a cell phone, a display, a computer, a television, a radio, and the like.
In some embodiments, after the server obtains the first image uploaded by the user, a technician annotates the first image, marking the target area where the defect is located, and the first mask image is then generated based on the position and shape of that target area. Optionally, the first mask image is a binary image in which the target area and the remaining background area have different pixel values: for example, the target area has pixel value 1 and the background area has pixel value 0, or the target area has pixel value 0 and the background area has pixel value 1.
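As an illustrative sketch (not part of the original patent text), the following Python code shows how such a binary first mask image might be generated with OpenCV, under the assumption that the technician's annotation is available as a polygon outlining the target area; all names and the example coordinates are hypothetical:

```python
import cv2
import numpy as np

def make_first_mask(image_shape, defect_polygon):
    """Build a binary first mask image (target area = 1, background = 0)
    from a hand-annotated polygon outlining the defect. Illustrative only."""
    mask = np.zeros(image_shape[:2], dtype=np.uint8)
    cv2.fillPoly(mask, [np.asarray(defect_polygon, dtype=np.int32)], 1)
    return mask

# e.g. a scratch annotated as a quadrilateral on a 512x512 first image
first_mask = make_first_mask((512, 512), [(100, 80), (180, 90), (175, 110), (95, 100)])
```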
202. The server transforms the target area in the first mask image to obtain a second mask image.
In some embodiments, the server may transform the target area based on at least one of the following, each of which is described below.
Mode one: target angle transformation
Optionally, the server randomly samples from the defect angle interval to obtain a target angle; and in the first mask image, performing rotation transformation on the target area according to the target angle to obtain the second mask image.
In some embodiments, the defect angle interval may be a fixed interval set by a technician; alternatively, it may vary with the characteristics of the second product, i.e., different second products may correspond to different defect angle intervals. This is not specifically limited in the embodiments of the present application.
Optionally, assuming the defect angle interval is [−θ, θ], the server randomly samples an angle from [−θ, θ] as the target angle, and then rotates the target area by the target angle in the first mask image to obtain the second mask image, where θ is any angle value greater than 0.
In the above process, since defects of the same type usually appear at different angles in product surface texture images, rotating the target area enriches the diversity of defect angles in the synthesized target images.
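As a concrete illustration of mode one, the sketch below (not from the patent; the function name and the choice of rotation centre are assumptions) samples a target angle from [−θ, θ] and rotates the target area in the first mask image about the defect's own centre:

```python
import cv2
import numpy as np

def rotate_target_area(first_mask, theta):
    """Mode one sketch: sample a target angle from the defect angle interval
    [-theta, theta] and rotate the target area about its own centre,
    yielding the second mask image."""
    target_angle = float(np.random.uniform(-theta, theta))  # random sampling
    ys, xs = np.nonzero(first_mask)                 # pixels of the target area
    center = (float(xs.mean()), float(ys.mean()))   # centre of the defect
    rot = cv2.getRotationMatrix2D(center, target_angle, 1.0)
    # nearest-neighbour interpolation keeps the mask strictly binary
    return cv2.warpAffine(first_mask, rot,
                          (first_mask.shape[1], first_mask.shape[0]),
                          flags=cv2.INTER_NEAREST)
```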
Mode two: target size scaling
Optionally, the server randomly samples from the defect size interval to obtain a target size; in the first mask image, the size of the target area is converted into the target size, and the second mask image is obtained.
In some embodiments, the defect size interval may be a fixed interval set by a technician; alternatively, it may vary with the characteristics of the second product, i.e., different second products may correspond to different defect size intervals. This is not specifically limited in the embodiments of the present application.
Optionally, assuming the defect size interval is [s1, s2], the server randomly samples a size from [s1, s2] as the target size, and then scales the target area to the target size in the first mask image to obtain the second mask image, where s1 is any size value greater than 0 and s2 is any size value greater than s1.
In the above process, since defects of the same type generally have different sizes in the surface texture image of the product, the diversity of defect sizes in the synthesized target image can be enriched by performing size scaling on the target area.
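A corresponding sketch for mode two (again not from the patent), under the added assumption that the defect size interval [s1, s2] is expressed as a scale factor relative to the current size of the target area:

```python
import cv2
import numpy as np

def scale_target_area(first_mask, s1, s2):
    """Mode two sketch: sample a target size from [s1, s2] (treated here as
    a scale-factor interval, an assumption) and rescale the target area
    about its own centre."""
    scale = float(np.random.uniform(s1, s2))
    ys, xs = np.nonzero(first_mask)
    center = (float(xs.mean()), float(ys.mean()))
    # angle 0 with a non-unit scale gives a pure scaling about `center`
    m = cv2.getRotationMatrix2D(center, 0.0, scale)
    return cv2.warpAffine(first_mask, m,
                          (first_mask.shape[1], first_mask.shape[0]),
                          flags=cv2.INTER_NEAREST)
```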
Mode three: target position randomization
In some embodiments, the server randomly samples a target location from the first mask image in response to the second image being the same size as the first mask image; and translating the target area to the target position in the first mask image to obtain the second mask image.
Because the target area needs to be migrated into the second image, optionally, when the second image and the first mask image have the same size, the server may directly randomly sample a position from the first mask image as a target position, obtain a position coordinate of the target position, and translate the target area in the first mask image, so that a center coordinate of the translated target area coincides with the position coordinate of the target position, to obtain the second mask image.
In some embodiments, the server randomly samples the second image to obtain the target location in response to the second image being different in size from the first mask image; scaling the size of the first mask image to the same size as the second image; and translating the target area to the target position in the first mask image after the size scaling to obtain the second mask image.
Optionally, when the second image and the first mask image are different in size, the server may randomly sample a position from the second image as a target position and acquire a position coordinate of the target position, further, scale the size of the first mask image to be the same as the second image, acquire a center coordinate of the target area, and translate the target area in the first mask image, so that the center coordinate of the translated target area coincides with the position coordinate of the target position, to obtain the second mask image.
In the above process, since the defect may appear at any position in the second image, the position of the target area is randomized, so that the diversity of the position where the defect appears in the synthesized target image can be enriched.
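A sketch of mode three (not from the patent; names are illustrative), covering both the same-size case and the case where the first mask image must first be scaled to the second image's size; uniform sampling is one plausible reading of "randomly sampling" a target position:

```python
import cv2
import numpy as np

def randomize_target_position(first_mask, second_image_shape):
    """Mode three sketch: scale the mask to the second image's size if
    needed, sample a target position uniformly at random, and translate the
    target area so its centre coincides with that position."""
    h, w = second_image_shape[:2]
    mask = first_mask
    if mask.shape[:2] != (h, w):
        mask = cv2.resize(mask, (w, h), interpolation=cv2.INTER_NEAREST)
    ty, tx = np.random.randint(0, h), np.random.randint(0, w)  # target position
    ys, xs = np.nonzero(mask)
    dx, dy = tx - xs.mean(), ty - ys.mean()                    # required shift
    m = np.float32([[1, 0, dx], [0, 1, dy]])
    # warpAffine clips anything shifted past the mask edges (see mode four)
    return cv2.warpAffine(mask, m, (w, h), flags=cv2.INTER_NEAREST)
```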
Mode four: target truncation
Optionally, after the server translates the target region, the server truncates the translated target region based on the edge of the first mask image in response to the translated target region edge exceeding the edge of the first mask image.
In some embodiments, after the server translates the target area, because the target position may be anywhere in the first mask image, when the distance between the target position and the edge of the first mask image is smaller than the size of the target area, the edge of the translated target area may exceed the edge of the first mask image; the translated target area then needs to be truncated to prevent the target area from spilling over the edge of the resulting second mask image.
In some embodiments, the server sets the following truncation conditions: the area of the target area after cutting is at least more than or equal to one third of the area of the target area before cutting, so that the discernability of defects can be ensured; alternatively, the following cutoff conditions may also be set: the area of the target area after the cutoff is at least greater than or equal to one half of the area of the target area before the cutoff, so that the defect can be identified, and of course, other cutoff conditions can be set, which is not particularly limited in the embodiment of the present application.
In the above process, a defect may likewise appear only partially in the second image, analogous to a half-visible object in a natural scene; therefore, by setting the target position near an edge or corner of the first mask image, target images containing truncated defects can be synthesized.
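One way to realize mode four, as a sketch (not from the patent): cv2.warpAffine in the mode-three sketch above already clips anything that leaves the image, so truncation reduces to enforcing the truncation condition on the surviving area. The one-third threshold below is the first condition mentioned above:

```python
import numpy as np

def truncation_ok(mask_before, mask_after, min_ratio=1.0 / 3.0):
    """Mode four sketch: check the truncation condition that at least
    `min_ratio` of the original target-area area survives the clipping,
    so the defect stays discernible."""
    area_before = int(np.count_nonzero(mask_before))
    area_after = int(np.count_nonzero(mask_after))
    return area_before > 0 and area_after >= min_ratio * area_before
```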
It should be noted that, the server may perform transformation on the target area by any one of the first to fourth modes, or may perform transformation on the target area by a combination of at least two of the first to fourth modes, which is not specifically limited in the embodiment of the present application.
203. And the server fuses the target area with a second image based on the second mask image to obtain a target image, wherein the second image is a surface texture image of a second product.
In some embodiments, the server obtains the transformed target region based on the second mask image; superposing the transformed target area on the second image to obtain a third image; and carrying out smoothing processing on the third image to obtain the target image.
Optionally, since the second mask image actually indicates the position, the size and the shape of the target area in the second image, the server may perform corresponding transformation on the target area based on the size and the shape indicated by the second mask image to obtain a transformed target area, and superimpose the transformed target area on the corresponding position in the second image based on the position indicated by the second mask image to obtain a primarily synthesized third image, and then perform smoothing processing on the third image to obtain the target image.
In some embodiments, when acquiring the third image, the server may perform Poisson image fusion on the transformed target area and the second image to obtain the third image. Optionally, Poisson image fusion is a method, introduced on the basis of the Poisson equation, that solves for optimal pixel values, fusing the transformed target area with the second image while preserving the gradient information of the second image. It solves a Poisson equation subject to user-specified boundary conditions, achieving continuity in the gradient domain and thus seamless fusion at the boundary. In Poisson image fusion, the boundary smoothing equation can be expressed as follows:

$$\min_{f} \iint_{\Omega} \left\| \nabla f - \mathbf{v} \right\|^{2}$$

wherein Ω represents the area covered by the transformed target area in the third image, f is the pixel representation function within the covered area Ω in the third image, ∇f is the gradient field of f, and v is the gradient field of the transformed target area.

The condition for keeping the boundaries consistent can be expressed as the following formula:

$$f\big|_{\partial \Omega} = f^{*}\big|_{\partial \Omega}$$

wherein ∂Ω represents the boundary of the covered area Ω, f is the pixel representation function within Ω in the third image, and f* is the pixel representation function outside Ω in the third image.
In the discrete space, the third image f can be obtained by minimizing the boundary smoothing equation subject to the condition for keeping the boundaries consistent.
In some embodiments, the server may smooth the third image as follows: the server performs smoothing on the third image based on a two-dimensional Gaussian filter function to obtain the target image. The "smoothing processing" or "smoothing" in the embodiments of the present application refers to a smoothing operation on an image, that is, modifying the pixel values of some pixels in the image so that the image becomes smooth and continuous, or reducing or removing noise points (outliers) in the image; for example, modifying the boundary pixels of the transformed target area in the third image makes the transition between the transformed target area and the remaining background area smoother and more natural. Smoothing is equivalent to low-pass filtering: it emphasizes broad areas, low-frequency components, and the main body of the image while suppressing image noise and high-frequency interference components, making image brightness vary gradually and reducing abrupt gradients, which improves image quality but usually blurs image edges.
Optionally, the expression of the two-dimensional Gaussian filter function G(x, y) is as follows:

$$G(x,y) = \frac{1}{2\pi\sigma^{2}} \, e^{-\frac{x^{2}+y^{2}}{2\sigma^{2}}}$$

wherein x and y represent the horizontal and vertical distances between a pixel in the convolution kernel and the kernel's center point, and σ controls the smoothness of the curve: the larger σ is, the smoother the curve and the lower its peak. The degree of blurring of the target image can therefore be controlled by setting σ; the larger the value of σ, the more blurred the target image. From this function it can be seen that the value is largest when x = 0 and y = 0, i.e., the center point of the convolution kernel has the largest weight.
In the above process, applying the two-dimensional Gaussian filter after the transformed target area (foreground) and the second image (background) have been fused into the third image amounts to de-noising the distorted edges in the third image: blurring the edges of the foreground weakens the negative effects caused by image distortion.
In some embodiments, in addition to the gaussian blur method described above, the server may also use the following smoothing approach: the neighborhood averaging method (mean filtering method), the over-limit pixel smoothing method, the edge-preserving filtering method, the median filtering method, the convolution method, and the like, which are not particularly limited in the embodiment of the present application.
In an exemplary embodiment, the server may call the seamlessClone function in the OpenCV library to perform Poisson image fusion and the GaussianBlur function in the OpenCV library to perform Gaussian blurring, or may use related functions in other native libraries, which is not specifically limited in the embodiments of the present application.
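Putting the fusion and smoothing steps together, a hedged end-to-end sketch (not the patent's code; it assumes the first image has already been transformed so that it is aligned with the second mask image, and that both images are 3-channel 8-bit of the same size):

```python
import cv2
import numpy as np

def synthesize_target_image(transformed_first_image, second_image,
                            second_mask, sigma=2.0):
    """Sketch: Poisson-fuse the transformed target area into the defect-free
    second image, then Gaussian-smooth the result. Assumes `second_mask` is
    binary and aligned with both images."""
    mask_u8 = (second_mask > 0).astype(np.uint8) * 255
    ys, xs = np.nonzero(mask_u8)
    center = (int(xs.mean()), int(ys.mean()))  # defect centre as an (x, y) point
    # Poisson image fusion: boundary-consistent, gradient-domain blending
    third_image = cv2.seamlessClone(transformed_first_image, second_image,
                                    mask_u8, center, cv2.NORMAL_CLONE)
    # two-dimensional Gaussian smoothing; a larger sigma gives a blurrier result
    return cv2.GaussianBlur(third_image, (5, 5), sigma)
```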
Fig. 3 is a schematic diagram of synthesizing a target image according to an embodiment of the present application. As shown in Fig. 3, after the server obtains a first image 300 and a first mask image 301, defect synthesis is performed on a defect-free second image 302: the target area where the defect in the first image 300 is located is migrated into the second image 302, yielding a target image 303 and a second mask image 304. That is, a defect in the surface texture image of the first product can be migrated into the defect-free surface texture image of the second product to synthesize a target image in which the second product carries the defect, and the position, size, and shape of the defect are carried naturally by the second mask image without any additional manual labeling.
204. And the server carries out parameter adjustment on the initial segmentation model based on the plurality of target images and the plurality of second mask images to obtain a defect segmentation model, wherein the defect segmentation model is used for identifying a target area where a defect is located in the surface texture image of the second product.
In some embodiments, the server repeatedly performs steps 201-203 to obtain a plurality of target images and a plurality of corresponding second mask images, which together form a training sample set. The server inputs the target images into an initial segmentation model, performs semantic segmentation on them through the model, and outputs a plurality of semantic segmentation images. These semantic segmentation images are the model's predictions for the target images, while the second mask images are the corresponding labels. The loss function value for the current iteration is obtained from the differences between the semantic segmentation images and the second mask images; if the loss function value does not meet the stop condition, the parameters of the initial segmentation model are adjusted and the step of obtaining the loss function value is executed iteratively, until the loss function value in some iteration meets the stop condition, yielding the defect segmentation model.
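The parameter-adjustment loop itself is not spelled out in the embodiment beyond a loss over the differences between predictions and the second mask images; the following PyTorch-style sketch shows one conventional reading, with per-pixel cross-entropy assumed as the loss and all names hypothetical:

```python
import torch
import torch.nn as nn

def train_defect_segmentation(model, loader, epochs=10, lr=1e-4):
    """Sketch of step 204: adjust the initial segmentation model's parameters
    on (target image, second mask image) pairs. Per-pixel cross-entropy is
    an assumption, not specified by the embodiment."""
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for target_images, second_masks in loader:
            logits = model(target_images)                  # (N, C, H, W) predictions
            loss = criterion(logits, second_masks.long())  # vs. (N, H, W) labels
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model  # the trained defect segmentation model
```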
In some embodiments, the defect segmentation model includes, but is not limited to: FCN (Fully Convolutional Network), SegNet, Dilated Convolutions, the DeepLab series of semantic segmentation networks (v1, v2, v3), PSPNet (Pyramid Scene Parsing Network), UperNet (Unified Perceptual Parsing Network), and the like; the structure of the defect segmentation model is not specifically limited in the embodiments of the present application.
In an exemplary scenario, the source domain defect image set (the dataset composed of the first images and their corresponding first mask images) is denoted {(S1, Y1), (S2, Y2), …, (Sn, Yn)}, and the target domain synthesized image set (the dataset composed of the target images and their corresponding second mask images) is denoted {(T1, Z1), (T2, Z2), …, (Tn, Zn)}.
Optionally, the server may first train the initial segmentation model with the source domain defect image set and then, by fine-tuning on the basis of the initial segmentation model, train the defect segmentation model with the target domain synthesized image set; this approach is more targeted and can improve defect recognition accuracy on the surface texture images of the second product.
Optionally, the server may instead merge the source domain defect image set and the target domain synthesized image set into one unified training sample set and train the defect segmentation model from scratch starting from an initialized segmentation model, so that the defect segmentation model has good generalization ability and deployment-stage management remains simple and practical.
All the above optional solutions can be combined to form an optional embodiment of the present disclosure, which is not described in detail herein.
According to the method provided by the embodiments of the present application, using the first image and the first mask image that annotates it, and transforming and processing the target area where the defect is located, a target image with the second image as background and the target area as foreground can be synthesized on the basis of the second image, which contains no defect. The target image not only contains the defect but also naturally carries the corresponding annotation information, namely the second mask image. This avoids introducing additional manual annotation for the second image, greatly relaxes the strict requirement on the annotation information of the second image, reduces the training cost of the defect segmentation model, and helps improve the accuracy and efficiency of defect segmentation.
Fig. 4 is a flowchart of an image processing method provided in an embodiment of the present application, please refer to fig. 4, the embodiment is applied to a computer device, taking the computer device as a server for example, and the embodiment includes:
401. The server acquires a first image and a first mask image, wherein the first image is a surface texture image of a first product, and the first mask image is used for identifying a target area where a defect in the first image is located.
Step 401 is similar to step 201, and will not be described again.
Fig. 5 is a schematic diagram of surface texture images of products according to an embodiment of the present application. As shown at 500, it presents the surface texture images obtained after processing 4 different products at 2 different stations (i.e., factories). It can be seen that there are large differences between the surface texture images of different products at the same factory, and between the surface texture images of the same product processed at different factories.
402. The server randomly samples from the defect angle interval to obtain a target angle.
In some embodiments, the defect angle interval may be a fixed interval set by a technician; alternatively, it may vary with the characteristics of the second product, i.e., different second products may correspond to different defect angle intervals. This is not specifically limited in the embodiments of the present application.
Optionally, assuming the defect angle interval is [−θ, θ], the server randomly samples an angle from [−θ, θ] as the target angle, where θ is any angle value greater than 0.
403. And the server rotates and transforms the target area according to the target angle in the first mask image to obtain a second mask image.
Optionally, the server rotates the target area by the target angle in the first mask image to obtain the second mask image.
In the above process, since defects of the same type usually appear at different angles in product surface texture images, rotating the target area enriches the diversity of defect angles in the synthesized target images.
In some embodiments, the above steps 402-403 may also be replaced as follows: the server randomly samples from the defect size interval to obtain a target size; in the first mask image, the size of the target area is converted into the target size, and the second mask image is obtained.
In some embodiments, the defect size interval may be a fixed interval set by a technician; alternatively, it may vary with the characteristics of the second product, i.e., different second products may correspond to different defect size intervals. This is not specifically limited in the embodiments of the present application.
Optionally, assuming the defect size interval is [s1, s2], the server randomly samples a size from [s1, s2] as the target size, and then scales the target area to the target size in the first mask image to obtain the second mask image, where s1 is any size value greater than 0 and s2 is any size value greater than s1.
In the above process, since defects of the same type generally have different sizes in the surface texture image of the product, the diversity of defect sizes in the synthesized target image can be enriched by performing size scaling on the target area.
In some embodiments, the above steps 402-403 may also be replaced as follows: in response to the second image and the first mask image being the same size, the server randomly samples from the first mask image to obtain a target position, and translates the target area to the target position in the first mask image to obtain the second mask image.
Because the target area needs to be migrated into the second image, optionally, when the second image and the first mask image have the same size, the server may directly randomly sample a position from the first mask image as a target position, obtain a position coordinate of the target position, and translate the target area in the first mask image, so that a center coordinate of the translated target area coincides with the position coordinate of the target position, to obtain the second mask image.
In some embodiments, the above steps 402-403 may also be replaced as follows: the server responds to the second image and the first mask image with different sizes, and randomly samples the second image to obtain a target position; scaling the size of the first mask image to the same size as the second image; and translating the target area to the target position in the first mask image after the size scaling to obtain the second mask image.
Because the target area needs to be migrated into the second image, optionally, when the second image is different from the first mask image in size, the server may randomly sample a position from the second image as a target position and acquire a position coordinate of the target position, further, scale the size of the first mask image to be the same as the second image, acquire a center coordinate of the target area, and translate the target area in the first mask image, so that the center coordinate of the translated target area coincides with the position coordinate of the target position, thereby obtaining the second mask image.
In the above process, since the defect may appear at any position in the second image, the position of the target area is randomized, so that the diversity of the position where the defect appears in the synthesized target image can be enriched.
In some embodiments, after the server translates the target region, the server truncates the translated target region based on the edge of the first mask image in response to the translated target region edge exceeding the edge of the first mask image.
In some embodiments, after the server translates the target area, because the target position may be anywhere in the first mask image, when the distance between the target position and the edge of the first mask image is smaller than the size of the target area, the edge of the translated target area may exceed the edge of the first mask image; the translated target area then needs to be truncated to prevent the target area from spilling over the edge of the resulting second mask image.
In some embodiments, the server sets the following truncation conditions: the area of the target area after cutting is at least more than or equal to one third of the area of the target area before cutting, so that the discernability of defects can be ensured; alternatively, the following cutoff conditions may also be set: the area of the target area after the cutoff is at least greater than or equal to one half of the area of the target area before the cutoff, so that the defect can be identified, and of course, other cutoff conditions can be set, which is not particularly limited in the embodiment of the present application.
In the above process, a defect may likewise appear only partially in the second image, analogous to a half-visible object in a natural scene; therefore, by setting the target position near an edge or corner of the first mask image, target images containing truncated defects can be synthesized.
As described above, the server has multiple transformation modes for transforming the target area in the first mask image to obtain the second mask image, including angle transformation, size scaling, position randomization, target truncation, and the like, which are not specifically limited in the embodiments of the present application.
404. The server acquires the transformed target region based on the second mask image.
Alternatively, since the second mask image actually indicates the position, size, and shape of the target region in the second image, the server may perform a corresponding transformation on the target region based on the size and shape indicated by the second mask image, resulting in a transformed target region.
405. And the server performs poisson image fusion on the transformed target area and a second image to obtain a third image, wherein the second image is a surface texture image of a second product.
Poisson image fusion is a method, introduced on the basis of the Poisson equation, that solves for optimal pixel values, fusing the transformed target area with the second image while preserving the gradient information of the second image. It solves a Poisson equation subject to user-specified boundary conditions, achieving continuity in the gradient domain and thus seamless fusion at the boundary. In Poisson image fusion, the boundary smoothing equation can be expressed as follows:

$$\min_{f} \iint_{\Omega} \left\| \nabla f - \mathbf{v} \right\|^{2}$$

wherein Ω represents the area covered by the transformed target area in the third image, f is the pixel representation function within the covered area Ω in the third image, ∇f is the gradient field of f, and v is the gradient field of the transformed target area.

The condition for keeping the boundaries consistent can be expressed as the following formula:

$$f\big|_{\partial \Omega} = f^{*}\big|_{\partial \Omega}$$

wherein ∂Ω represents the boundary of the covered area Ω, f is the pixel representation function within Ω in the third image, and f* is the pixel representation function outside Ω in the third image.
In the discrete space, the third image f can be obtained by minimizing the boundary smoothing equation subject to the condition for keeping the boundaries consistent.
In one exemplary embodiment, the server may call seamlessClone functions in the OpenCV library for poisson image fusion.
The above process is one possible implementation in which the server superimposes the transformed target area on the second image to obtain the third image; Poisson fusion improves the naturalness of the fusion and reduces the degree of distortion in the synthesized target image. Optionally, the server may superimpose the transformed target area at the corresponding position in the second image, based on the position indicated by the second mask image, to obtain the preliminarily synthesized third image.
In other embodiments, the server may not perform poisson fusion, but directly cover the transformed target area to a corresponding position in the second image based on the position indicated by the second mask image, so as to obtain a third image, thereby simplifying the image fusion process.
406. And the server performs smoothing processing on the third image to obtain a target image.
The server may smooth the third image as follows: the server performs smoothing on the third image based on a two-dimensional Gaussian filter function to obtain the target image. The "smoothing processing" or "smoothing" in the embodiments of the present application refers to a smoothing operation on an image, that is, modifying the pixel values of some pixels in the image so that the image becomes smooth and continuous, or reducing or removing noise points (outliers) in the image; for example, modifying the boundary pixels of the transformed target area in the third image makes the transition between the transformed target area and the remaining background area smoother and more natural. Smoothing is equivalent to low-pass filtering: it emphasizes broad areas, low-frequency components, and the main body of the image while suppressing image noise and high-frequency interference components, making image brightness vary gradually and reducing abrupt gradients, which improves image quality but usually blurs image edges.
Optionally, the expression of the two-dimensional Gaussian filter function G(x, y) is as follows:

$$G(x,y) = \frac{1}{2\pi\sigma^{2}} \, e^{-\frac{x^{2}+y^{2}}{2\sigma^{2}}}$$

wherein x and y represent the horizontal and vertical distances between a pixel in the convolution kernel and the kernel's center point, and σ controls the smoothness of the curve: the larger σ is, the smoother the curve and the lower its peak. The degree of blurring of the target image can therefore be controlled by setting σ; the larger the value of σ, the more blurred the target image. From this function it can be seen that the value is largest when x = 0 and y = 0, i.e., the center point of the convolution kernel has the largest weight.
In the above process, applying the two-dimensional Gaussian filter after the transformed target area (foreground) and the second image (background) have been fused into the third image amounts to de-noising the distorted edges in the third image: blurring the edges of the foreground weakens the negative effects caused by image distortion.
In some embodiments, in addition to the Gaussian blurring described above, the server may also use any of the following smoothing approaches: the neighborhood averaging method (mean filtering), the over-limit pixel smoothing method, the edge-preserving filtering method, the median filtering method, the convolution method, and the like; this is not specifically limited in the embodiments of the present application.
In one exemplary embodiment, the server may call the GaussianBlur function in the OpenCV library to perform Gaussian blurring.
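One possible sketch of this step, restricting the blur to a thin band around the foreground edge as described above (the band width, kernel size, and σ are assumptions; the original does not specify whether the whole image or only the boundary is blurred):

```python
import cv2
import numpy as np

# third_image: fused image from the previous step (8-bit, 3-channel)
# mask:        uint8 second mask image, 255 inside the target area
blurred = cv2.GaussianBlur(third_image, (5, 5), sigmaX=1.5)

# Blur only a thin band around the target-area boundary so that the
# interior of the defect and the rest of the background stay sharp.
kernel = np.ones((5, 5), np.uint8)
band = cv2.dilate(mask, kernel) - cv2.erode(mask, kernel)
target_image = np.where(band[..., None] > 0, blurred, third_image)
```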
In the above steps 404-406, the server fuses the target area with the second image based on the second mask image to obtain a target image. In some embodiments, the server may skip Poisson fusion and Gaussian blurring and instead directly paste the transformed target area at the corresponding position in the second image, based on the position indicated by the second mask image, to obtain a third image that is used directly as the target image, thereby simplifying the image synthesis process.
407. And the server carries out parameter adjustment on the initial segmentation model based on the plurality of target images and the plurality of second mask images to obtain a defect segmentation model, wherein the defect segmentation model is used for identifying a target area where a defect is located in the surface texture image of the second product.
Step 407 is similar to step 204, and will not be described here.
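Purely as an illustration of what such parameter adjustment could look like (a generic per-pixel cross-entropy training step; the model, loss, and optimizer choices here are assumptions, not the specific procedure of step 204):

```python
import torch
import torch.nn.functional as F

def train_step(model, optimizer, images, masks):
    """images: (N, 3, H, W) float tensor of synthesized target images;
    masks: (N, H, W) long tensor of per-pixel labels taken from the
    second mask images (0 = background, 1..C-1 = defect classes)."""
    model.train()
    optimizer.zero_grad()
    logits = model(images)                 # (N, C, H, W) class scores
    loss = F.cross_entropy(logits, masks)  # per-pixel classification loss
    loss.backward()
    optimizer.step()
    return loss.item()
```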
All the above optional solutions can be combined to form an optional embodiment of the present disclosure, which is not described in detail herein.
According to the method provided by the embodiment of the application, the first image and the first mask image after marking the first image are utilized, and the target area where the defect is located is transformed and processed, so that the target image taking the second image as the background and taking the target area as the foreground can be synthesized on the basis of the second image which does not contain the defect, and the target image not only contains the defect but also naturally carries the corresponding marking information, namely the second mask image, thereby avoiding the need of introducing additional manual marking to the second image, greatly relieving the strict requirement on the marking information of the second image, reducing the training cost of the defect segmentation model, and being beneficial to improving the accuracy and efficiency of defect segmentation.
In an exemplary scenario, UperNet is used as the defect segmentation model for testing. The source domain defect data (i.e., the data set formed by the plurality of first images and the corresponding plurality of first mask images) is kept unchanged, and three different kinds of target domain defect data are used for comparison experiments: normal images, the defect-synthesized images provided by the embodiments of the present application, and fully annotated images (images annotated manually). The evaluation index is mIoU (mean Intersection over Union), a standard semantic segmentation metric defined as the average, over all classes, of the intersection-over-union ratio between the prediction and the ground truth; a sketch of the computation follows the experiment list below. Four comparative experiments are described below:
(1) Benchmark experiment: the defect segmentation model is trained using only the source domain defect data.
(2) Normal samples: normal images are used as the target domain data; since the target domain normal images contain no defects, the corresponding defect masks consist entirely of the background class, so no manual annotation is needed. The source domain defect data and the target domain normal images are mixed to train the defect segmentation model.
(3) Defect synthesis (the technical scheme of the embodiments of the present application): using the image processing method provided by the embodiments of the present application, target domain defect data are synthesized from the source domain defect data and the target domain normal images; since the defect positions in the synthesized target domain defect data (i.e., the target images) are known by construction, no manual annotation is needed. The source domain defect data and the synthesized target domain defect data are mixed to train the defect segmentation model.
(4) Target domain defect data: the defect segmentation model is trained using real (manually annotated) target domain defect data together with the source domain defect data. The performance of this experiment is the upper bound on what the defect synthesis method can achieve as a domain-transfer segmentation approach.
After the defect segmentation models are trained, they are evaluated and compared on the same target domain test set.
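A sketch of the mIoU computation referenced above (names supplied here for illustration):

```python
import numpy as np

def miou(pred: np.ndarray, gt: np.ndarray, num_classes: int) -> float:
    """Mean Intersection over Union: average the per-class IoU between
    the predicted label map and the ground-truth label map."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:  # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))
```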
Fig. 6 and fig. 7 are diagrams of the performance comparison experiments provided by the embodiments of the present application. Referring to 600 and 700, it can be seen that the method of synthesizing target domain defect data provided by the embodiments of the present application significantly improves defect segmentation performance on the target domain. Moreover, across different products at the same site, the defect synthesis method approaches the performance obtained with the full amount of target domain defect data, clearly exceeding the benchmark experiment and the use of only target domain normal images. This fully verifies the great advantage of the defect synthesis method in domain-transfer segmentation. In addition, across sites for the same product, the method also improves markedly over the benchmark experiment.
Fig. 8 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present application, please refer to fig. 8, which includes:
an acquiring module 801, configured to acquire a first image and a first mask image, where the first image is a surface texture image of a first product, and the first mask image is used to identify a target area where a defect in the first image is located;
A transforming module 802, configured to transform the target area in the first mask image to obtain a second mask image;
A fusion module 803, configured to fuse the target area with a second image based on the second mask image, to obtain a target image, where the second image is a surface texture image of a second product;
The parameter adjustment module 804 is configured to perform parameter adjustment on the initial segmentation model based on the plurality of target images and the plurality of second mask images to obtain a defect segmentation model, where the defect segmentation model is used to identify a target region where a defect is located in the surface texture image of the second product.
According to the device provided by the embodiment of the application, the first image and the first mask image after marking the first image are utilized, and the target area where the defect is located is transformed and processed, so that the target image taking the second image as the background and taking the target area as the foreground can be synthesized on the basis of the second image which does not contain the defect, and the target image not only contains the defect but also naturally carries the corresponding marking information, namely the second mask image, so that the second image does not need to be additionally marked manually, the strict requirement on the marking information of the second image is greatly relieved, the training cost of a defect segmentation model is reduced, and the accuracy and the efficiency of defect segmentation are improved.
In one possible implementation, the transformation module 802 is configured to:
Randomly sampling from the defect angle interval to obtain a target angle;
and in the first mask image, performing rotation transformation on the target area according to the target angle to obtain the second mask image.
In one possible implementation, the transformation module 802 is configured to:
randomly sampling from the defect size interval to obtain a target size;
In the first mask image, the size of the target area is converted into the target size, and the second mask image is obtained.
In one possible implementation, the transformation module 802 is configured to:
In response to the second image and the first mask image being the same in size, randomly sampling from the first mask image to obtain a target position; in the first mask image, translating the target area to the target position to obtain the second mask image; or alternatively
In response to the second image and the first mask image being different in size, randomly sampling in the second image to obtain a target position; scaling the size of the first mask image to the same size as the second image; and translating the target area to the target position in the first mask image after the size scaling to obtain the second mask image.
In one possible embodiment, based on the apparatus composition of fig. 8, the apparatus further includes:
And the truncation module is used for truncating the translated target area based on the edge of the first mask image in response to the edge of the translated target area exceeding the edge of the first mask image.
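A compact sketch combining the rotation, scaling, translation, and truncation behaviors described above (the interval bounds, variable names, and nearest-neighbor interpolation are assumptions standing in for the sampled defect angle/size/position intervals):

```python
import cv2
import numpy as np

rng = np.random.default_rng()

def transform_mask(mask: np.ndarray,
                   angle_range=(-180.0, 180.0),
                   scale_range=(0.5, 1.5)) -> np.ndarray:
    """Rotate, scale, and translate the target area of a uint8 binary mask.
    Truncation at the image border happens by construction, since
    warpAffine only writes pixels that fall inside the output canvas."""
    h, w = mask.shape
    angle = rng.uniform(*angle_range)                 # target angle
    scale = rng.uniform(*scale_range)                 # target size factor
    tx, ty = rng.integers(0, w), rng.integers(0, h)   # target position

    ys, xs = np.nonzero(mask)
    cx, cy = float(xs.mean()), float(ys.mean())       # current center

    m = cv2.getRotationMatrix2D((cx, cy), angle, scale)
    m[0, 2] += tx - cx                                # move center to target
    m[1, 2] += ty - cy
    return cv2.warpAffine(mask, m, (w, h), flags=cv2.INTER_NEAREST)
```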
In one possible implementation, based on the apparatus composition of fig. 8, the fusion module 803 includes:
an acquisition unit configured to acquire a transformed target region based on the second mask image;
a superposition unit, configured to superimpose the transformed target area on the second image, to obtain a third image;
And the smoothing unit is used for carrying out smoothing processing on the third image to obtain the target image.
In one possible embodiment, the superposition unit is configured to:
And carrying out poisson image fusion on the transformed target area and the second image to obtain the third image.
All the above optional solutions can be combined to form an optional embodiment of the present disclosure, which is not described in detail herein.
It should be noted that: the image processing apparatus provided in the above embodiment is only exemplified by the division of the above functional modules when processing an image, and in practical application, the above functional allocation can be performed by different functional modules according to needs, that is, the internal structure of the computer device is divided into different functional modules to perform all or part of the functions described above. In addition, the image processing apparatus and the image processing method provided in the foregoing embodiments belong to the same concept, and specific implementation processes of the image processing apparatus and the image processing method are detailed in the image processing method embodiment, which is not described herein again.
Fig. 9 is a schematic structural diagram of a computer device according to an embodiment of the present application. Referring to fig. 9, the computer device is illustrated as a terminal 900. Optionally, the device types of the terminal 900 include: a smart phone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a notebook computer, or a desktop computer. Terminal 900 may also be called by other names such as user device, portable terminal, laptop terminal, or desktop terminal.
In general, the terminal 900 includes: a processor 901 and a memory 902.
Optionally, the processor 901 includes one or more processing cores, such as a 4-core processor or an 8-core processor. Optionally, the processor 901 is implemented in at least one hardware form of DSP (Digital Signal Processing), FPGA (Field-Programmable Gate Array), or PLA (Programmable Logic Array). In some embodiments, the processor 901 includes a main processor and a coprocessor; the main processor is a processor for processing data in the awake state, also referred to as a CPU (Central Processing Unit), while the coprocessor is a low-power processor for processing data in the standby state. In some embodiments, the processor 901 is integrated with a GPU (Graphics Processing Unit) responsible for rendering and drawing the content to be displayed by the display screen. In some embodiments, the processor 901 further includes an AI (Artificial Intelligence) processor for handling computing operations related to machine learning.
In some embodiments, memory 902 includes one or more computer-readable storage media, optionally non-transitory. The memory 902 also optionally includes high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in memory 902 is used to store at least one program code for execution by processor 901 to implement the image processing methods provided by the various embodiments of the present application.
In some embodiments, the terminal 900 may further optionally include: a peripheral interface 903, and at least one peripheral. The processor 901, the memory 902, and the peripheral interface 903 can be connected by a bus or signal lines. The individual peripheral devices can be connected to the peripheral device interface 903 via buses, signal lines, or circuit boards. Specifically, the peripheral device includes: at least one of radio frequency circuitry 904, a display 905, a camera assembly 906, audio circuitry 907, and a power source 909.
The peripheral interface 903 may be used to connect at least one peripheral device associated with an I/O (Input/Output) to the processor 901 and the memory 902. In some embodiments, the processor 901, memory 902, and peripheral interface 903 are integrated on the same chip or circuit board; in some other embodiments, any one or both of the processor 901, the memory 902, and the peripheral interface 903 are implemented on separate chips or circuit boards, which is not limited in this embodiment.
The Radio Frequency circuit 904 is configured to receive and transmit RF (Radio Frequency) signals, also known as electromagnetic signals. The radio frequency circuit 904 communicates with communication networks and other communication devices via electromagnetic signals, converting an electrical signal into an electromagnetic signal for transmission, or converting a received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuit 904 includes: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. Optionally, the radio frequency circuit 904 communicates with other terminals via at least one wireless communication protocol. The wireless communication protocol includes, but is not limited to: metropolitan area networks, various generations of mobile communication networks (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 904 further includes NFC (Near Field Communication) related circuits, which is not limited by the present application.
The display 905 is used to display a UI (User Interface). Optionally, the UI includes graphics, text, icons, video, and any combination thereof. When the display 905 is a touch display, the display 905 also has the ability to capture touch signals at or above its surface. The touch signal can be input to the processor 901 as a control signal for processing. Optionally, the display 905 is also used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, there is one display 905, providing the front panel of the terminal 900; in other embodiments, there are at least two displays 905, disposed on different surfaces of the terminal 900 or in a folded design; in still other embodiments, the display 905 is a flexible display disposed on a curved or folded surface of the terminal 900. Optionally, the display 905 may even be arranged in a non-rectangular irregular pattern, i.e., a shaped screen. Optionally, the display 905 is made of LCD (Liquid Crystal Display), OLED (Organic Light-Emitting Diode), or other materials.
The camera assembly 906 is used to capture images or video. Optionally, the camera assembly 906 includes a front camera and a rear camera. Typically, the front camera is disposed on the front panel of the terminal and the rear camera is disposed on the rear surface of the terminal. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth-of-field camera, a wide-angle camera, or a telephoto camera, so that the main camera and the depth-of-field camera can be fused to realize a background blurring function, or the main camera and the wide-angle camera can be fused to realize panoramic shooting, VR (Virtual Reality) shooting, or other fusion shooting functions. In some embodiments, the camera assembly 906 also includes a flash. Optionally, the flash is a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash refers to a combination of a warm-light flash and a cold-light flash, used for light compensation under different color temperatures.
In some embodiments, the audio circuit 907 includes a microphone and a speaker. The microphone is used to collect sound waves from users and the environment, convert the sound waves into electrical signals, and input them to the processor 901 for processing, or to the radio frequency circuit 904 for voice communication. For stereo acquisition or noise reduction, multiple microphones may be disposed at different positions of the terminal 900. Optionally, the microphone is an array microphone or an omnidirectional pickup microphone. The speaker is used to convert electrical signals from the processor 901 or the radio frequency circuit 904 into sound waves. Optionally, the speaker is a conventional thin-film speaker or a piezoelectric ceramic speaker. When the speaker is a piezoelectric ceramic speaker, it can convert an electrical signal not only into sound waves audible to humans, but also into sound waves inaudible to humans for purposes such as ranging. In some embodiments, the audio circuit 907 further comprises a headphone jack.
The power supply 909 is used to supply power to the various components in the terminal 900. Optionally, the power source 909 is alternating current, direct current, disposable battery, or rechargeable battery. When the power supply 909 includes a rechargeable battery, the rechargeable battery supports wired or wireless charging. The rechargeable battery is also used to support fast charge technology.
In some embodiments, terminal 900 can further include one or more sensors 910. The one or more sensors 910 include, but are not limited to: acceleration sensor 911, gyro sensor 912, pressure sensor 913, optical sensor 915, and proximity sensor 916.
In some embodiments, the acceleration sensor 911 detects the magnitudes of accelerations on three coordinate axes of the coordinate system established with the terminal 900. For example, the acceleration sensor 911 is used to detect components of gravitational acceleration on three coordinate axes. Optionally, the processor 901 controls the display 905 to display the user interface in a lateral view or a longitudinal view according to the gravitational acceleration signal acquired by the acceleration sensor 911. The acceleration sensor 911 is also used for acquisition of motion data of a game or a user.
In some embodiments, the gyro sensor 912 detects the body direction and the rotation angle of the terminal 900, and the gyro sensor 912 and the acceleration sensor 911 cooperate to collect the 3D motion of the user on the terminal 900. The processor 901 realizes the following functions according to the data collected by the gyro sensor 912: motion sensing (e.g., changing UI according to a tilting operation by a user), image stabilization at shooting, game control, and inertial navigation.
Optionally, the pressure sensor 913 is provided at a side frame of the terminal 900 and/or at a lower layer of the display 905. When the pressure sensor 913 is disposed on the side frame of the terminal 900, a grip signal of the user on the terminal 900 can be detected, and the processor 901 performs left-right hand recognition or quick operation according to the grip signal collected by the pressure sensor 913. When the pressure sensor 913 is provided at the lower layer of the display 905, the processor 901 performs control of the operability control on the UI interface according to the pressure operation of the user on the display 905. The operability controls include at least one of a button control, a scroll bar control, an icon control, and a menu control.
The optical sensor 915 is used to collect the intensity of ambient light. In one embodiment, processor 901 controls the display brightness of display panel 905 based on the intensity of ambient light collected by optical sensor 915. Specifically, when the ambient light intensity is high, the display luminance of the display screen 905 is turned up; when the ambient light intensity is low, the display luminance of the display panel 905 is turned down. In another embodiment, the processor 901 also dynamically adjusts the shooting parameters of the camera assembly 906 based on the intensity of ambient light collected by the optical sensor 915.
A proximity sensor 916, also referred to as a distance sensor, is typically provided on the front panel of the terminal 900. Proximity sensor 916 is used to collect the distance between the user and the front of terminal 900. In one embodiment, when the proximity sensor 916 detects that the distance between the user and the front face of the terminal 900 gradually decreases, the processor 901 controls the display 905 to switch from the bright screen state to the off screen state; when the proximity sensor 916 detects that the distance between the user and the front surface of the terminal 900 gradually increases, the processor 901 controls the display 905 to switch from the off-screen state to the on-screen state.
Those skilled in the art will appreciate that the structure shown in fig. 9 is not limiting of terminal 900 and can include more or fewer components than shown, or certain components may be combined, or a different arrangement of components may be employed.
Fig. 10 is a schematic structural diagram of a computer device according to an embodiment of the present application. The computer device 1000 may vary considerably depending on its configuration or performance, and includes one or more processors (Central Processing Units, CPU) 1001 and one or more memories 1002, where at least one computer program is stored in the memories 1002, and the at least one computer program is loaded and executed by the one or more processors 1001 to implement the image processing method provided by the above embodiments. Optionally, the computer device 1000 further includes a wired or wireless network interface, a keyboard, an input/output interface, and other components for implementing the functions of the device, which are not described herein.
In an exemplary embodiment, a computer readable storage medium is also provided, for example a memory comprising at least one computer program executable by a processor in a terminal to perform the image processing method in the respective embodiments described above. For example, the computer readable storage medium includes ROM (Read-Only Memory), RAM (Random-Access Memory), CD-ROM (Compact Disc Read-Only Memory), magnetic tape, floppy disk, optical data storage device, and the like.
In an exemplary embodiment, a computer program product or computer program is also provided, comprising one or more program codes stored in a computer-readable storage medium. One or more processors of the computer device can read the one or more program codes from the computer-readable storage medium and execute them, so that the computer device can perform the image processing method in the above embodiments.
Those of ordinary skill in the art will appreciate that all or part of the steps of the above embodiments can be implemented by hardware, or by a program instructing the relevant hardware; the program is optionally stored in a computer-readable storage medium, which is optionally a read-only memory, a magnetic disk, an optical disk, or the like.
The foregoing description covers only preferred embodiments of the present application and is not intended to limit the present application; the scope of protection of the present application is defined by the appended claims.
