US10824910B2 - Image processing method, non-transitory computer readable storage medium and image processing system - Google Patents

Image processing method, non-transitory computer readable storage medium and image processing system

Info

Publication number
US10824910B2
Authority
US
United States
Prior art keywords
image
processed
comparison result
reference images
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active, expires
Application number
US15/970,901
Other versions
US20180322367A1 (en)
Inventor
Fu-Chieh CHANG
Chun-Nan Chou
Edward Chang
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
HTC Corp
Original Assignee
HTC Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by HTC Corp
Priority to US15/970,901
Assigned to HTC CORPORATION. Assignment of assignors interest (see document for details). Assignors: CHANG, EDWARD; CHANG, FU-CHIEH; CHOU, CHUN-NAN
Publication of US20180322367A1
Application granted
Publication of US10824910B2
Legal status: Active
Adjusted expiration

Abstract

An image processing training method includes the following steps. A template label image is obtained, in which the template label image comprises a label corresponding to a target. A plurality of first reference images are obtained, in which each of the first reference images comprises object image data corresponding to the target. A target image according to the template label image and the first reference images is generated, in which the target image comprises a generated object, a contour of the generated object is generated according to the template label image, and a color or a texture of the target image is generated according to the first reference images.

Description

RELATED APPLICATIONS
This application claims priority to Provisional U.S. Application Ser. No. 62/501,100 filed May 4, 2017, which is herein incorporated by reference.
BACKGROUND
Technical Field
The present disclosure relates to an image processing method, a non-transitory computer readable storage medium and an image processing system. More particularly, the present disclosure relates to training an image processing model to generate a labeled image from an input image.
Description of Related Art
With the rapid development of machine learning, collecting or creating the huge amount of labeled data required for training has become a major burden for researchers: it is laborious and time-consuming.
Therefore, solving this issue is important.
SUMMARY
The disclosure provides an image processing method. The image processing method includes the following steps. A template label image is obtained, in which the template label image comprises a label corresponding to a target. A plurality of first reference images is obtained, in which each of the first reference images comprises object image data corresponding to the target. A target image according to the template label image and the first reference images is generated, in which the target image comprises a generated object, a contour of the generated object is generated according to the template label image, and a color or a texture of the target image is generated according to the first reference images.
The disclosure also provides a non-transitory computer readable storage medium with a computer program. The computer program is configured to execute aforesaid image processing method.
The disclosure also provides an image processing system. The image processing system includes a memory and a processor. The memory is coupled to the processor, and is configured to store a template label image, in which the template label image comprises a label corresponding to a target. A processor is operable to obtain a plurality of first reference images, in which each of the first reference images comprises object image data corresponding to a target. The processor is further operable to generate a target image according to the template label image and the first reference images, in which the target image comprises a generated object, a contour of the generated object is generated according to the template label image, and a color or a texture of the target image is generated according to the first reference images.
Through the operations of one embodiment described above, a large volume of pixel-wise labeled images can be automatically generated by the image processing system, enabling high accuracy in the task of object segmentation, i.e., segmenting an object out of an image.
It is to be understood that both the foregoing general description and the following detailed description are demonstrated by examples, and are intended to provide further explanation of the disclosure as claimed.
BRIEF DESCRIPTION OF THE DRAWINGS
The disclosure can be more fully understood by reading the following detailed description of the embodiments, with reference made to the accompanying drawings as follows:
FIG. 1 is a schematic diagram illustrating an image processing system according to an embodiment of the disclosure.
FIG. 2 is a flowchart of an image processing method of the image processing system in FIG. 1, in accordance with one embodiment of the present disclosure.
FIG. 3 is a schematic diagram illustrating the image processing method in a demonstrational example.
FIG. 4 is a partial flowchart of an image processing method in FIG. 2, in accordance with one embodiment of the present disclosure.
FIG. 5 is a partial flowchart of an image processing method of the image processing system in FIG. 1, in accordance with one embodiment of the present disclosure.
FIGS. 6A-6D are schematic diagrams illustrating the image processing method in a demonstrational example.
FIG. 7 is a partial flowchart of an image processing method in FIG. 2, in accordance with one embodiment of the present disclosure.
FIG. 8 is a partial flowchart of an image processing method of the image processing system in FIG. 1, in accordance with one embodiment of the present disclosure.
FIG. 9 is a schematic diagram illustrating the image processing method in a demonstrational example.
FIG. 10 is a schematic diagram illustrating the image processing method in a demonstrational example.
DETAILED DESCRIPTION
Reference will now be made in detail to the present embodiments of the disclosure, examples of which are illustrated in the accompanying drawings. Wherever possible, the same reference numbers are used in the drawings and the description to refer to the same or similar parts.
It will be understood that, although the terms first, second, third etc. may be used herein to describe various elements, components and/or sections, these elements, components and/or sections should not be limited by these terms. These terms are only used to distinguish one element, component or section from another element, component or section. Thus, a first element, component or section discussed below could be termed a second element, component or section without departing from the teachings of the present disclosure.
Reference is made to FIG. 1, which is a schematic diagram illustrating an image processing system 100 according to an embodiment of the disclosure. The image processing system 100 includes a processor 130 and a memory 140, in which the processor 130 is coupled to the memory 140.
In some embodiments, the memory 140 is configured to store a plurality of reference images (e.g., real images captured by a camera) and a template label image, and to provide the reference images and the template label image to the processor 130. In some embodiments, the reference images can be real images captured by a camera. For example, the real images are filmed by a photographer regarding a real scene, or collected from an image database.
In some embodiments, the template label image can be obtained from a three-dimensional (3D) model, a camera, or a generator of a learning model, or can be drawn by hand. For example, if the template label image contains hands in white and a background in black, the template label image can be generated by projecting 3D hand models onto a 2D image, by capturing hands wearing white gloves against a black background with a camera, or by a generator of the learning model given an input labeled image, in which the input labeled image consists of hands in white and a background in black.
In some embodiments, the memory 140 can be realized by, for example, a read-only memory (ROM), a flash memory, a floppy disk, a hard disk, an optical disc, a flash disk, a flash drive, a tape, a database accessible from a network, or any storage medium with the same functionality that can be contemplated by persons of ordinary skill in the art.
In some embodiments, the processor 130 is configured to run or execute various software programs and/or sets of instructions to perform various functions and to process data. In some embodiments, the processor 130 is configured to fetch the image stored in the memory 140 or fetch the image directly from a camera (not shown) and to generate a processed image based on the original image. In detail, the processor 130 is configured to process an input image without labels to generate a labeled image (i.e., a target image) according to objects and a background in the input image, in which the labeled image contains several labels related to the objects and the background respectively. In some embodiments, the processor 130 can be realized by, for example, one or more processors, such as central processors and/or microprocessors, but is not limited thereto.
In some embodiments, the processor 130 includes an image generating engine 110 and an image processing model 120. The image generating engine 110 is coupled to the image processing model 120, and the image generating engine 110 and the image processing model 120 are each coupled to the memory 140.
In some embodiments, the image generating engine 110 is configured to capture the data distribution of real images without labels and to generate, according to template label images and real images, labeled images (i.e., target images) corresponding to the template label images, in which the target images should be almost the same as real images. A pair consisting of a template label image and its corresponding target image is supplied to the image processing model 120. The template label image is represented by a color mask. In some embodiments, the color mask consists of pixels of specific colors, and these colors indicate which pixels in the labeled image belong to a segmented object and which do not.
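As a concrete illustration of such a color mask, the sketch below converts a white-object/black-background label image into per-pixel class labels. The function name, pixel encoding, and list-of-lists layout are our own illustrative assumptions, not details specified by the disclosure.

```python
# Hypothetical sketch: interpret a color-mask label image as per-pixel
# class labels. White pixels ([255, 255, 255]) mark the segmented object
# (e.g., hands); black pixels ([0, 0, 0]) mark the background.
def color_mask_to_labels(mask_rgb):
    """mask_rgb: H x W list of [R, G, B] pixels.
    Returns an H x W list of ints: 1 = object, 0 = background."""
    return [[1 if pixel == [255, 255, 255] else 0 for pixel in row]
            for row in mask_rgb]

# Tiny 2x2 mask: one white (object) pixel, three black (background) pixels.
mask = [[[255, 255, 255], [0, 0, 0]],
        [[0, 0, 0], [0, 0, 0]]]
labels = color_mask_to_labels(mask)
```

In a real pipeline the mask would be an image array rather than nested lists, but the mapping from mask colors to labels is the same idea.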
In some embodiments, the image generating engine 110 can be realized by software programs, firmware and/or hardware circuits based on a learning model (e.g., a generative adversarial network (GAN) model). Various learning models that can generate images similar to the real images inputted to the learning models are within the contemplated scope of the present disclosure.
In some embodiments, the image processing model 120 is configured to process an input image without labels and generate a label of the input image. In other words, the image processing model 120 is configured to perform image segmentation, generating a label image according to a background and an object in the input image, in which the label image includes a first label related to the object and a second label related to the background.
Reference is made to FIG. 1, FIG. 2 and FIG. 3. Details of the present disclosure are described in the paragraphs below with reference to the image processing method in FIG. 2, in which FIG. 2 is a flowchart of the image processing method 200 of the image processing system 100 in FIG. 1, in accordance with one embodiment of the present disclosure. FIG. 3 is a schematic diagram illustrating the image processing method 200 in a demonstrative example. However, the present disclosure is not limited to the embodiment below.
It should be noted that, in some embodiments, the image processing method 200 may be implemented as a computer program. When the computer program is executed by a computer, an electronic device, or the processor 130 in FIG. 1, this executing device performs the image processing method 200.
In addition, it should be noted that in the operations of the following image processing method 200, no particular sequence is required unless otherwise specified. Moreover, the following operations also may be performed simultaneously, or the execution times thereof may at least partially overlap.
Furthermore, the operations of the following image processing method 200 may be added to, replaced, and/or eliminated as appropriate, in accordance with various embodiments of the present disclosure.
In operation S210, the image generating engine 110 obtains a template label image 310 from the memory 140. In some embodiments, the template label image 310 contains a label associated with an object contour of a target (i.e., hands). For example, the target can be hands, a pen, a book, etc. As shown in FIG. 3, the template label image 310 contains two labels, a first label 311 (i.e., an area within the object contour of the target) and a second label 312 (i.e., an area outside the object contour of the target), in which the first label 311 is filled with white and the second label 312 is filled with black.
In operation S220, the image generating engine 110 obtains a plurality of real images 340, 350 and 360 from the memory 140. In some embodiments, the real images 340, 350 and 360 must contain objects of the same target (i.e., hands), such that the image generating engine 110 can generate images similar to the template label image 310. As shown in FIG. 3, the real image 340 contains object image data 341 (i.e., hands) and background image data 342 (i.e., a house), the real image 350 contains object image data 351 (i.e., hands) and background image data 352 (i.e., clouds), and the real image 360 contains object image data 361 (i.e., hands) and background image data 362 (i.e., mountains), in which the object image data 341, 351 and 361 are hands with different colors, textures, gestures and shapes. In this embodiment, the memory 140 contains, but is not limited to, three real images 340, 350 and 360. Storing more real images in the memory 140 can produce better results.
In operation S225, the image generating engine 110 is trained according to training data. In some embodiments, the training data used to train the image generating engine 110 includes the template label image and the real images. The image generating engine 110 is trained by the training data to be able to generate a pair of a target image and the template label image. The target image is desired to be similar to the real images. Details about how to train the image generating engine 110 will be explained in the following paragraphs.
In operation S230, the image generating engine 110, after training, is able to generate a target image 330. In an embodiment, the image generating engine 110 generates the target image 330 according to the template label image 310 and the real images 340, 350 and 360. In one embodiment, the image generating engine 110 generates, but is not limited to, one hundred target images according to the template label image and the real images. Various numbers of target images generated according to the template label image and the real images by the image generating engine 110 are within the contemplated scope of the present disclosure.
In some embodiments, the target image 330 contains a generated object 331, in which an object contour of the generated object 331 is generated according to the template label image 310, and a color or a texture of the target image 330 is generated according to the real images 340, 350 and 360. In this operation, the image generating engine 110 uses several steps to generate the target image 330; these steps will be discussed below with reference to FIG. 4 and FIG. 5.
In operation S240, the image generating engine 110 trains the image processing model 120 with the target image 330 and the template label image 310, such that the image processing model 120 can convert an input image without labels into an image with labels. In some embodiments, the image processing model 120 can be realized by software programs, firmware and/or hardware circuits based on a learning model.
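Because every generated target image shares the template label image as its ground truth, the supervised training set for the image processing model can be assembled as simple (image, label) pairs. A minimal sketch of this pairing (the helper function and placeholder strings are our own illustration, not part of the disclosure):

```python
def make_training_pairs(template_label, target_images):
    """Operation S240, sketched: pair each generated target image with
    the template label image it was generated from; these pairs
    supervise the segmentation model."""
    return [(img, template_label) for img in target_images]

# Example: three generated target images, one shared template label.
pairs = make_training_pairs("label_310", ["target_a", "target_b", "target_c"])
```

In practice the placeholders would be image tensors, and the pairs would feed a standard supervised training loop for the segmentation model.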
In some embodiments, the learning model may use an approach which combines a Conditional Random Field (CRF) method with an image classification method. Various learning models are within the contemplated scope of the present disclosure.
Reference is made to FIG. 1, FIG. 3 and FIG. 4. FIG. 4 is a partial flowchart illustrating further operations within operation S225 of the image processing method 200 in FIG. 2, in accordance with one embodiment of the present disclosure.
In some embodiments, the image generating engine 110 includes a generator and a discriminator, in which the generator is configured for generating a processed image 320 according to the template label image 310, and the discriminator is configured for determining whether the processed image 320 is sufficiently similar to the real images 340, 350 and 360 and updating the generator according to the result. In some embodiments, the generator can be realized by, but is not limited to, an image-to-image translation model.
In operation S410, the image generating engine 110 generates a processed image 320 according to the template label image 310, the real images 340, 350 and 360, and a random number. In this operation, the generator of the image generating engine 110 generates the processed image 320 with a generated object 321 (i.e., hands) and a generated background 322.
In some embodiments, the random number is supplied to the generator of the image generating engine 110 in order to generate different processed images 320 with the template label image 310 and the real images 340, 350 and 360.
In operation S420, the image generating engine 110 compares the processed image 320 with the real images 340, 350 and 360. In some embodiments, the discriminator of the image generating engine 110 compares a color, a texture or a content-object shape of the processed image 320 with those of the real images 340, 350 and 360. In some embodiments, after the discriminator of the image generating engine 110 determines that the color and the texture of the processed image 320 are similar to the real images 340/350/360, the discriminator of the image generating engine 110 compares the content-object shape of the processed image 320 to the content-object shape of the real images 340/350/360.
After the discriminator of the image generating engine 110 has compared the processed image 320 with the real images 340, 350 and 360, operation S430 is executed. In operation S430, the discriminator of the image generating engine 110 determines whether a comparison result is lower than a threshold. In some embodiments, the threshold is set based on the loss function in the GAN model, in order to determine whether the processed image 320 is sufficiently similar to the real images 340, 350 and 360. In an embodiment, the comparison result is represented by the output values of the loss function in the GAN model. If the comparison result includes fewer differences, the loss function outputs a lower value; if the comparison result includes more differences, the loss function outputs a higher value. The threshold can be set to a low value, indicating that the target image is sufficiently similar to the real images when the output of the loss function is lower than this value.
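The comparison result can be pictured as a generator-side adversarial loss: a score near 1 from the discriminator (i.e., "looks real") maps to a loss near 0. The particular loss form and threshold value below are illustrative assumptions; the disclosure does not fix them.

```python
import math

def comparison_result(d_score_processed):
    """Illustrative generator-side adversarial loss: -log D(processed).
    d_score_processed is the discriminator's belief (in (0, 1)) that the
    processed image is real. Fewer differences -> score near 1 -> low loss."""
    return -math.log(d_score_processed)

def training_complete(loss_value, threshold=0.05):
    """Operation S430, sketched: training stops once the loss (the
    comparison result) falls below the threshold."""
    return loss_value < threshold

# A processed image the discriminator rates 0.99 "real" yields a loss
# of about 0.01, which is below the (assumed) 0.05 threshold.
loss = comparison_result(0.99)
```

The same threshold test applies unchanged to the three comparison results introduced later for the foreground, background and whole-image branches.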
If the processed image 320 is sufficiently similar to the real images 340, 350 and 360 (i.e., the comparison result includes fewer differences, and the output of the loss function is lower than the threshold), the training of the image generating engine 110 is terminated (completed). In other words, the training stage of the generator of the image generating engine 110 is completed. On the other hand, if the processed image 320 is not sufficiently similar to the real images 340, 350 and 360 (i.e., the comparison result includes more differences, and the output of the loss function is higher than the threshold), operation S440 is executed.
In operation S440, the image generating engine 110 is updated according to the comparison result; the updated image generating engine 110 then generates a processed image again (i.e., operation S410), in which the processed image 320 is updated, and continues to compare the updated processed image with the real images 340, 350 and 360 (i.e., operation S420). In operation S440, both the generator and the discriminator of the image generating engine 110 are updated. The updated generator can generate an updated processed image (e.g., the target image 330) that is more similar to the real images 340, 350 and 360, and the updated discriminator has a better discriminative capability, so that the generator is forced to generate more realistic images to fool the discriminator.
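Operations S410-S440 thus form a loop: generate, compare, update, repeat until the comparison result drops below the threshold. The toy generator and discriminator below stand in for the real networks; their single numeric parameter, update rules and data are illustrative assumptions only.

```python
import random

class ToyGenerator:
    """Stand-in for the generator; its single parameter plays the role
    of the generated (processed) image."""
    def __init__(self):
        self.param = 0.0
    def generate(self, label_image, z):
        return self.param + 0.0 * z  # z would diversify a real generator
    def update(self, result):
        self.param += 0.1            # step toward the real data

class ToyDiscriminator:
    """Stand-in for the discriminator; the comparison result is the mean
    absolute difference from the real samples."""
    def compare(self, processed, real_images):
        return sum(abs(processed - r) for r in real_images) / len(real_images)
    def update(self, result):
        pass                         # a real discriminator is also trained

def train(generator, discriminator, label_image, real_images, threshold):
    while True:
        z = random.random()                                    # random number
        processed = generator.generate(label_image, z)         # S410
        result = discriminator.compare(processed, real_images) # S420
        if result < threshold:                                 # S430
            return processed                                   # training done
        generator.update(result)                               # S440
        discriminator.update(result)

target = train(ToyGenerator(), ToyDiscriminator(), None, [1.0, 1.0], 0.05)
```

The toy generator converges to the real-data value 1.0 in ten updates; the structure of the loop, not the arithmetic, is the point.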
In another embodiment, operation S225 in FIG. 2 includes different operations, as shown in FIG. 5. FIG. 5 is a partial flowchart of the image processing method 200 in FIG. 2, in accordance with one embodiment of the present disclosure. Compared with the operations in FIG. 4, the operations in FIG. 5 can further prevent the color or the texture of the generated background 332 from being filled into the generated object 331, and prevent the color or the texture of the generated object 331 from being filled into the generated background 332.
Reference is made to FIG. 1, FIG. 5 and FIGS. 6A-6D. FIGS. 6A-6D are schematic diagrams illustrating operation S225 of the image processing method 200 in a demonstrative example.
In operation S511, the generator of the image generating engine 110 generates a processed image 611 (as shown in FIG. 6A) according to the template label image 610 (as shown in FIG. 6A), real images 643, 644 and 645 (as shown in FIG. 6D) and a random number. In some embodiments, the template label image 610 includes the first label 610a and the second label 610b, in which the first label 610a is an area inside the object contour of the target (i.e., the hands).
In some embodiments, the random number is supplied to the generator of the image generating engine 110 in order to generate different processed images 611 with the same template label image 610 and the same real images 643, 644 and 645.
In operation S512, the image generating engine 110 separates the processed image 611 into the generated background 611b and the generated object 611a, in which the generated background 611b includes upside-down houses and an upside-down rainbow, and the generated object 611a includes hands filled with the texture of roads. After operation S512 is executed, operations S513 and S514 are executed simultaneously thereafter.
In operation S513, the image generating engine 110 forms the processed foreground image 612 (as shown in FIG. 6A) according to the generated object 611a in the processed image 611, in which the processed foreground image 612 includes the first generated object 612a with the same color, texture and shape as the generated object 611a in the processed image 611, and the first generated background 612b in black (or another single color, such as dark blue or dark brown). After operation S513 is executed, operation S515 is then executed.
In operation S515, the image generating engine 110 obtains the plurality of reference images (e.g., first real images 623/624/625 captured by a camera, as shown in FIG. 6B). In some embodiments, the reference images can be first real images 623/624/625 captured by a camera. For example, the first real images 623/624/625 are filmed by a photographer regarding a real scene, or collected from an image database.
In some embodiments, each of the first real images 623, 624 and 625 contains first object image data 623a/624a/625a corresponding to the target (i.e., the hands) in chromatic colors and first background image data 623b/624b/625b in black. For example, as shown in FIG. 6B, the first real image 623 contains first object image data 623a (i.e., hands) in beige and first background image data 623b in black, the first real image 624 contains first object image data 624a (i.e., hands) in dark brown and first background image data 624b in black, and the first real image 625 contains first object image data 625a (i.e., hands) in white and first background image data 625b in black.
In some embodiments, the first real images 623, 624 and 625 can be obtained by the following steps: recording hands waving in front of a black screen as a video stream, and taking a plurality of snapshots (i.e., the first real images 623, 624 or 625) from the video stream. After operation S515 is executed, operation S517 is then executed.
In operation S517, the discriminator of the image generating engine 110 compares the processed foreground image 612 with the first real images 623, 624 and 625 to produce a first comparison result. In detail, the discriminator of the image generating engine 110 determines whether the processed foreground image 612 is sufficiently similar to the first real images 623, 624 and 625. After operation S517 is executed, operation S520 is then executed to determine whether the first comparison result is lower than a threshold.
In operation S514, the image generating engine 110 forms a processed background image 613 (as shown in FIG. 6A) according to the generated background 611b in the processed image 611, in which the processed background image 613 includes the second generated object 613a in black, and the second generated background 613b with the same color, texture and shape as the generated background 611b in the processed image 611. Operation S514 is followed by operation S516.
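Operations S512-S514 can be pictured as masking: the template label says which pixels belong to the object, so the processed image splits into a foreground image (object kept, background blacked out) and a background image (the reverse). A pure-Python sketch on a toy grayscale image; the function and data layout are our own illustration, not the patent's implementation.

```python
def separate(processed, label_mask):
    """Split a processed image into a processed foreground image
    (object pixels kept, others set to 0/black) and a processed
    background image (the reverse), using the label as the mask."""
    foreground = [[pix if lab == 1 else 0
                   for pix, lab in zip(prow, lrow)]
                  for prow, lrow in zip(processed, label_mask)]
    background = [[0 if lab == 1 else pix
                   for pix, lab in zip(prow, lrow)]
                  for prow, lrow in zip(processed, label_mask)]
    return foreground, background

# 2x2 toy image; the label marks the top-left pixel as the object.
image = [[200, 50], [50, 50]]
label = [[1, 0], [0, 0]]
fg, bg = separate(image, label)
```

Each of the two masked images can then be compared against its own set of reference images, as operations S517 and S518 describe.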
In operation S516, the image generating engine 110 obtains a plurality of reference images (e.g., second real images 633/634/635 captured by a camera, as shown in FIG. 6C). In some embodiments, the reference images can be the second real images 633/634/635 captured by a camera. For example, the second real images 633/634/635 are filmed by a photographer regarding a real scene, or collected from an image database.
In some embodiments, each of the second real images 633, 634 and 635 contains second object image data 633a/634a/635a corresponding to the target (i.e., hands) in black (or another single color, such as dark blue or dark brown) and second background image data 633b/634b/635b in chromatic colors. For example, as shown in FIG. 6C, the second real image 633 contains the second object image data 633a (i.e., hands) in black and the second background image data 633b (i.e., groves) in chromatic colors, the second real image 634 contains the second object image data 634a (i.e., hands) in black and the second background image data 634b (i.e., trees) in chromatic colors, and the second real image 635 contains the second object image data 635a (i.e., hands) in black and the second background image data 635b (i.e., buildings) in chromatic colors.
In some embodiments, the second real images 633, 634 and 635 can be obtained by the following steps: putting black paper cut into hand shapes in front of a camera and taking photos with the camera. Another example of obtaining the second real images 633, 634 and 635 includes the following steps: taking photos without hands, then synthesizing black hands into the photos by a computer to form the second real images 633, 634 and 635.
In operation S518, the discriminator of the image generating engine 110 compares the processed background image 613 with the second real images 633, 634 and 635 to produce the second comparison result. In detail, the discriminator of the image generating engine 110 determines whether the processed background image 613 is sufficiently similar to the second real images 633, 634 and 635. After operation S518 is executed, operation S520 is then executed to determine whether the second comparison result is lower than the threshold.
In operation S519, the discriminator of the image generating engine 110 compares the processed image 611 with the real images 643, 644 and 645 to produce the third comparison result. In detail, the discriminator of the image generating engine 110 determines whether the processed image 611 is sufficiently similar to the real images 643, 644 and 645. After operation S519 is executed, operation S520 is then executed to determine whether the third comparison result is lower than the threshold.
In operation S520, the discriminator of the image generating engine 110 determines whether all of the first comparison result, the second comparison result and the third comparison result are lower than the threshold. In some embodiments, the threshold is set based on the loss function in the GAN model, in order to determine whether the processed foreground image 612 is sufficiently similar to the first real images 623, 624 and 625, whether the processed background image 613 is sufficiently similar to the second real images 633, 634 and 635, and whether the processed image 611 is sufficiently similar to the real images 643, 644 and 645. In an embodiment, each comparison result is represented by the output values of the loss function in the GAN model. If the comparison result includes fewer differences, the loss function outputs a lower value; if the comparison result includes more differences, the loss function outputs a higher value. The threshold can be set to a low value, indicating that the target image is sufficiently similar to the real images when the output of the loss function is lower than this value.
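Operation S520's stopping condition is a conjunction over the three branch results. A minimal sketch (function name and the numeric values are our own illustrative assumptions):

```python
def all_results_below_threshold(first, second, third, threshold):
    """Operation S520, sketched: the generator's training stage is
    complete only when the foreground, background and whole-image
    comparison results are all lower than the threshold."""
    return all(r < threshold for r in (first, second, third))

# Here the foreground result is still too high, so training would
# continue via operation S521.
done = all_results_below_threshold(0.40, 0.02, 0.03, threshold=0.05)
```

Requiring all three results to pass is what forces the generator to get the object, the background, and their composition right simultaneously.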
Regarding the first comparison result, for example, as shown in FIG. 6B, the discriminator of the image generating engine 110 may learn that the color of the first generated object 612a cannot be the color of roads (i.e., gray), based on the color of the first object image data 623a (i.e., beige), the color of the first object image data 624a (i.e., dark brown), and the color of the first object image data 625a (i.e., white). Therefore, the discriminator of the image generating engine 110 determines the processed foreground image 612 to be a fake image, since the color of the first generated object 612a cannot be gray. Consequently, the first comparison result is higher than the threshold.
Regarding the second comparison result, for example, as shown in FIG. 6C, the discriminator of the image generating engine 110 may learn that the content of the second generated background 613b cannot include houses and a rainbow presented upside down (i.e., standing from the top of an image), since the contents of the second background image data 633b, 634b and 635b teach the generator of the image generating engine 110 that such contents all stand from the bottom of an image. Therefore, the discriminator of the image generating engine 110 determines the processed background image 613 to be a fake image, since the content of the second generated background 613b cannot include an upside-down rainbow and upside-down houses. Consequently, the second comparison result is higher than the threshold.
Regarding the third comparison result, for example, as shown in FIG. 6D, since the real image 643 contains the object image data 643a (i.e., hands in beige) and background image data 643b (i.e., shrubs), the real image 644 contains the object image data 644a (i.e., hands in dark brown) and background image data 644b (i.e., coco palms), and the real image 645 contains the object image data 645a (i.e., hands in white) and background image data 645b (i.e., waves), the discriminator may determine that the processed image 611 is a fake image, since the texture of the generated object 611a is that of a road instead of a real hand, and the content of the generated background 611b is unreasonable. Consequently, the third comparison result is higher than the threshold.
In some embodiments, when all of the first comparison result, the second comparison result and the third comparison result are lower than the threshold, the training stage of the generator of the image generating engine 110 is completed. In some embodiments, in response to at least one of the first comparison result, the second comparison result and the third comparison result being higher than the threshold, the operation S521 is executed to update both the generator and the discriminator of the image generating engine 110, and the processed image 611 is updated.
In one embodiment, in response to the first comparison result being higher than a threshold, the processed image 611 is updated according to the first comparison result. In one embodiment, in response to the second comparison result being higher than the threshold, the processed image 611 is updated according to the second comparison result. In one embodiment, in response to the third comparison result being higher than the threshold, the processed image 611 is updated according to the third comparison result.
In practice, when the discriminator of the image generating engine 110 does not regard the processed image 611 as a real image captured by a camera, the generator of the image generating engine 110 is updated and generates a new processed image 611, and the discriminator is updated to have a better discriminative capability, so that the generator is forced to generate more realistic images to fool the discriminator.
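The alternating generate/compare/update flow described above can be pictured as a simple control loop. The sketch below is only an illustration of operations S511 to S521; the function names, toy scores and halving "update" are made-up stand-ins, not the actual engine 110 implementation:

```python
# Minimal sketch of the adversarial update loop: the three comparison
# results play the role of discriminator scores on the foreground, the
# background and the full image. Names and values are illustrative only.

def training_loop(generate, compare_all, update, threshold, max_steps=100):
    """Repeat generate/compare/update until every comparison result
    falls below the threshold (mirrors operations S517-S521)."""
    processed = generate()
    for _ in range(max_steps):
        results = compare_all(processed)          # S517, S518, S519
        if all(r < threshold for r in results):   # S520: all sufficiently real
            return processed                      # training stage completed
        processed = update(processed, results)    # S521: update G (and D)
    return processed

# Toy stand-ins: each "update" halves all comparison results.
state = {"scores": [0.8, 0.6, 0.9]}
out = training_loop(
    generate=lambda: state["scores"],
    compare_all=lambda p: p,
    update=lambda p, r: [x / 2 for x in p],
    threshold=0.1,
)
assert all(r < 0.1 for r in out)
```

The loop terminates only when the foreground, background and full-image comparisons all pass, matching the determination mechanism of operation S520.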
In some embodiments, after several rounds of updating the generator of the image generating engine 110, as shown in FIG. 6D, the generator of the image generating engine 110 generates a processed image 642 including the generated object 642a with a beige color and the generated background 642b with palms, shrubs and waves in operation S511. The image generating engine 110 then separates the processed image 642 into the generated background 642b and the generated object 642a in operation S512. The image generating engine 110 then forms the processed foreground image 622 and the processed background image 632 in operations S513 and S514, respectively. The discriminator of the image generating engine 110 compares the processed foreground image 622 with the first real images 623, 624 and 625, compares the processed background image 632 with the second real images 633, 634 and 635, and compares the processed image 642 with the real images 643, 644 and 645 in operations S517, S518 and S519, respectively. The discriminator of the image generating engine 110 then determines that all of the first comparison result, the second comparison result and the third comparison result are lower than the threshold in operation S520. In other words, the processed foreground image 622 is sufficiently similar to the first real images 623, 624 and 625, the processed background image 632 is sufficiently similar to the second real images 633, 634 and 635, and the processed image 642 is sufficiently similar to the real images 643, 644 and 645. Then, the training stage of the generator of the image generating engine 110 is completed, and the operation S230 is executed to generate a target image by the trained generator, in which the target image is the processed image 642.
Through the determination mechanism in operation S520, it can be ensured that the processed image 642 is not distorted. In detail, the operation S520 is utilized to prevent the color or the texture of the generated background 642b from being filled into the generated object 642a, and to prevent the color or the texture of the generated object 642a from being filled into the generated background 642b.
In some embodiments, a template label image may contain more than one object contour. For example, the template label image may contain three object contours of three targets (e.g., hand, pen and book), in which the areas within the three object contours are shown in red, green and blue respectively, and the color of the background is black. In detail, the area within the first object contour consists of red pixels, the area within the second object contour consists of green pixels, the area within the third object contour consists of blue pixels, and the remaining area consists of black pixels. In order to generate an image which is similar to a real image and contains the three targets, the image generating engine 110 can use the image processing method 200 to generate a target image which meets the above conditions. The detailed steps of the image processing method 200 applied to the template label image with three targets are described below.
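The color coding just described can be sketched directly as an array. The construction below is hypothetical: the image size and the rectangular regions are made up for illustration and merely stand in for real object contours:

```python
import numpy as np

# Hypothetical template label image with three object contours on a black
# background: red pixels for the first target, green for the second, blue
# for the third, and black everywhere else.
template = np.zeros((64, 64, 3), dtype=np.uint8)
template[5:20, 5:20] = (255, 0, 0)     # first target area (e.g., hand): red
template[25:40, 25:40] = (0, 255, 0)   # second target area (e.g., pen): green
template[45:60, 45:60] = (0, 0, 255)   # third target area (e.g., book): blue

# Count the pixels of the first label class.
red_pixels = (template == (255, 0, 0)).all(axis=-1).sum()
assert red_pixels == 15 * 15
```

Every pixel thus belongs to exactly one of the four label classes (three targets plus background), which is what lets the later training stages compare generated content region by region.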
In one embodiment, the operation S225 in the image processing method 200 is replaced with the operations in FIG. 4. First, the image generating engine 110 obtains a template label image which contains three object contours of three targets, obtains real images which contain object image data corresponding to the three targets, and generates a processed image according to the template label image and the real images. Then the image generating engine 110 compares the processed image with the real images, updates the processed image when the comparison result is higher than a threshold, outputs the updated processed image as the target image in response to the comparison result being lower than the threshold, and trains the image processing model 120 with the target image and the template label image.
In another embodiment, the operation S225 in the image processing method 200 is replaced with the operations in FIG. 5. First, the image generating engine 110 obtains a template label image which contains three object contours of three targets. The generator of the image generating engine 110 generates a processed image according to the template label image, real images and a random number. The processed image is then separated into three generated objects (i.e., a generated object corresponding to hands, a generated object corresponding to a pen, and a generated object corresponding to a book) and a generated background. Three processed foreground images (i.e., a first processed foreground image, a second processed foreground image and a third processed foreground image) are then formed according to the generated object corresponding to hands, the generated object corresponding to a pen, and the generated object corresponding to a book, respectively, along with a processed background image corresponding to the generated background.
Then, five types of real images (i.e., first real images, second real images, third real images, fourth real images and real images) are obtained. For example, each of the first real images contains first object image data corresponding to the first target (i.e., hands) with chromatic colors and first background image data with a black color; each of the second real images contains second object image data corresponding to the second target (i.e., pen) with chromatic colors and second background image data with a black color; each of the third real images contains third object image data corresponding to the third target (i.e., book) with chromatic colors and third background image data with a black color; each of the fourth real images contains fourth object image data corresponding to the three targets (i.e., hands, pen and book) with a black color and fourth background image data with chromatic colors; and each of the real images contains object image data corresponding to the three targets (i.e., hands, pen and book) and a background with chromatic colors. The first processed foreground image consists of a first generated object corresponding to the first target with chromatic colors and a first generated background with a black color; the second processed foreground image consists of a second generated object corresponding to the second target with chromatic colors and a second generated background with a black color; the third processed foreground image consists of a third generated object corresponding to the third target with chromatic colors and a third generated background with a black color; the processed background image consists of a fourth generated object corresponding to the three targets with a black color and a fourth generated background with chromatic colors; and each of the real images consists of a hand, a pen, a book, and a remaining area with chromatic colors.
Then the first real images, the second real images, the third real images, the fourth real images and the real images are compared with their corresponding processed images.
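Forming each processed foreground image (and the processed background image) amounts to masking: keep the chromatic pixels belonging to one generated object and paint everything else black. A minimal sketch under that reading follows; the function name, image and mask are illustrative assumptions, not the engine 110 implementation:

```python
import numpy as np

# Hypothetical masking step: given a generated image and a boolean mask
# selecting one generated object, keep that object's pixels and fill the
# rest with black, yielding a processed foreground image.
def form_foreground(image, object_mask):
    foreground = np.zeros_like(image)        # everything black by default
    foreground[object_mask] = image[object_mask]
    return foreground

image = np.full((4, 4, 3), 200, dtype=np.uint8)   # toy generated image
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True                             # toy object region
fg = form_foreground(image, mask)
assert fg[0, 0].tolist() == [0, 0, 0]             # background painted black
assert fg[1, 1].tolist() == [200, 200, 200]       # object pixels kept
```

Inverting the mask produces the complementary processed background image, so the same routine covers both the foreground and background comparisons.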
FIG. 7 is a partial flowchart illustrating further operations within the operation S240 of the image processing method 200 in FIG. 2, in accordance with one embodiment of the present disclosure.
After the trained generator of the image generating engine 110 generates the target image 330 (e.g., the processed image 642 can also be regarded as a target image) according to the template label image 310 (or the template label image 610), the training stage of the image processing model 120 starts.
In operation S710, a pair of the target image 330 (or the processed image 642) and the template label image 310 (or the template label image 610) is supplied to the image processing model 120, in which the labels of the target image 330 (or the processed image 642) are the same as the labels in the template label image 310 (or the template label image 610).
In operation S720, the processor (not shown) of the image processing model 120 generates a predicted label image (not shown) according to the target image 330 (or the processed image 642).
In operation S730, the processor (not shown) of the image processing model 120 compares the predicted label image (not shown) with the template label image 310 (or the template label image 610).
In operation S740, the processor (not shown) of the image processing model 120 determines whether a comparison result is lower than a threshold. When the comparison result is higher than the threshold, the operation S750 is executed. When the comparison result is lower than the threshold, i.e., the predicted label image (not shown) is sufficiently similar to the template label image 310 (or the template label image 610), the training stage of the image processing model 120 is completed.
In operation S750, the image processing model 120 is updated according to the comparison result obtained in the operation S730.
In detail, the image processing model 120 is trained by estimating the differences between the predicted label image (not shown) and the given template label image 310 (or the template label image 610), and further updating the predicted label image (not shown) to approximate the template label image 310 (or the template label image 610).
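One simple way to realize the comparison in operations S730/S740 is a pixel-wise mismatch rate between the predicted and template label images. This is only an illustrative metric chosen here for the sketch, not necessarily the loss actually used by the model 120:

```python
import numpy as np

# Hypothetical comparison for operation S730: the fraction of pixels whose
# predicted label color differs from the template label color. Training
# (operation S750) continues while this stays above a threshold.
def label_mismatch(predicted, template):
    return float(np.mean(np.any(predicted != template, axis=-1)))

template = np.zeros((8, 8, 3), dtype=np.uint8)
template[:4] = (255, 255, 255)          # top half labeled white (object)

predicted = template.copy()
predicted[0, 0] = (0, 0, 0)             # one wrongly predicted pixel

result = label_mismatch(predicted, template)
assert abs(result - 1 / 64) < 1e-9      # 1 of 64 pixels mismatched
```

As the model improves, this mismatch rate drops, and once it falls below the threshold the training stage of operation S740 terminates.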
In some embodiments, after the image processing model 120 is trained, an object of an input image can be separated from its background by the image processing model 120, whose detailed operations are discussed with reference to FIG. 8.
Reference is made to FIG. 8 and FIG. 9. FIG. 8 is a partial flowchart of an image processing method 800 of the image processing system 100 in FIG. 1, in accordance with one embodiment of the present disclosure. FIG. 9 is a schematic diagram illustrating the image processing method 800 in a demonstrative example.
In operation S810, the image processing model 120 is obtained. In this operation, an input image 910 is supplied to the image processing model 120 in order to acquire a label image (i.e., a label image 920).
In operation S820, the image processing model 120 separates the object 910a from the background 910b in the input image 910.
In operation S830, the image processing model 120 generates a label image 920 according to the background 910b and the object 910a of the input image 910, in which the label image 920 contains the first label 920a related to the object 910a (e.g., hands) with the first color (e.g., white) and the second label 920b related to the background 910b with the second color (e.g., black).
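Operation S830 can be pictured as painting the separated object region with the first color and the background with the second color. The sketch below is a hypothetical rendering of that step; the mask and default colors are illustrative assumptions:

```python
import numpy as np

# Hypothetical rendering of operation S830: label the separated object
# with a first color (white) and the background with a second color
# (black), producing a binary label image like label image 920.
def to_label_image(object_mask, object_color=(255, 255, 255),
                   background_color=(0, 0, 0)):
    h, w = object_mask.shape
    label = np.empty((h, w, 3), dtype=np.uint8)
    label[...] = background_color       # second label: background
    label[object_mask] = object_color   # first label: object
    return label

mask = np.zeros((4, 4), dtype=bool)
mask[2, 2] = True                       # toy "hand" pixel
label = to_label_image(mask)
assert label[2, 2].tolist() == [255, 255, 255]
assert label[0, 0].tolist() == [0, 0, 0]
```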
Reference is made to FIG. 10, in which FIG. 10 is a schematic diagram illustrating the image processing method 800 in a demonstrative example. As shown in FIG. 10, the input image 1010 consists of a first object 1010a (i.e., a hand), a second object 1010b (i.e., a pen), a third object 1010c (i.e., a book) and a background 1010d (i.e., roads).
In one embodiment, after the image processing model 120 is trained by the template label images and the target images obtained from the image generating engine 110, the input image 1010 supplied to the image processing model 120 can be separated into the background 1010d, the first object 1010a, the second object 1010b and the third object 1010c, and the label image 1020 can be generated according to the first object 1010a, the second object 1010b, the third object 1010c and the background 1010d, in which the label image 1020 contains the first label 1020a related to the first object 1010a (e.g., hand) with the first color (e.g., red), the second label 1020b related to the second object 1010b (e.g., pen) with the second color (e.g., green), the third label 1020c related to the third object 1010c (e.g., book) with the third color (e.g., blue), and the fourth label 1020d related to the background 1010d with the fourth color (e.g., black).
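The multi-target case extends the same idea: each separated object gets its own label color over a black background. A minimal sketch follows, using the red/green/blue/black convention above; the masks and image size are made-up illustrations:

```python
import numpy as np

# Hypothetical multi-class labeling: paint each object's mask with its own
# color (red for hand, green for pen, blue for book) over a black
# background, analogous to label image 1020.
def to_multi_label_image(shape, masks_and_colors):
    label = np.zeros((*shape, 3), dtype=np.uint8)   # background: black
    for mask, color in masks_and_colors:
        label[mask] = color
    return label

h = w = 6
hand = np.zeros((h, w), dtype=bool); hand[0, 0] = True
pen = np.zeros((h, w), dtype=bool);  pen[1, 1] = True
book = np.zeros((h, w), dtype=bool); book[2, 2] = True

multi = to_multi_label_image((h, w), [
    (hand, (255, 0, 0)),   # first label: red
    (pen, (0, 255, 0)),    # second label: green
    (book, (0, 0, 255)),   # third label: blue
])
assert multi[0, 0].tolist() == [255, 0, 0]
assert multi[5, 5].tolist() == [0, 0, 0]
```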
In summary, by using the image processing system 100, a large volume of pixel-wise labeled images can be automatically generated, which helps achieve high accuracy in the task of object segmentation, i.e., segmenting an object out of an image.
Another embodiment of the disclosure includes a non-transitory computer readable storage medium (e.g., the memory 140 shown in FIG. 1, a hard drive, or any equivalent storage unit) with a computer program to execute the aforesaid image processing method 200 and/or 800 shown in FIG. 2, FIG. 4, FIG. 7 and FIG. 8, respectively.
Although the present disclosure has been described in considerable detail with reference to certain embodiments thereof, other embodiments are possible. Therefore, the scope of the appended claims should not be limited to the description of the embodiments contained herein.
It will be apparent to those skilled in the art that various modifications and variations can be made to the structure of the present disclosure without departing from the scope or spirit of the disclosure. In view of the foregoing, it is intended that the present disclosure cover modifications and variations of this disclosure provided they fall within the scope of the following claims.

Claims (20)

What is claimed is:
1. An image processing method comprising:
obtaining a template label image, wherein the template label image comprises a label corresponding to a target;
obtaining a plurality of first reference images, wherein each of the first reference images comprises object image data corresponding to the target; and
generating a target image according to the template label image and the first reference images, wherein the target image comprises a generated object, a contour of the generated object is generated according to the template label image, and a color or a texture of the target image is generated according to the first reference images, wherein the target image and the template image are used for training an image processing model, and wherein the image processing model after training is configured to process an input image without labels for generating a label image related to the input image.
2. The image processing method as claimed in claim 1, further comprising:
obtaining a background and an object in the input image by the image processing model; and
generating the label image according to the background and the object, wherein the label image comprises a first label related to the object and a second label related to the background.
3. The image processing method as claimed in claim 1, wherein before generating the target image, the image processing method further comprises:
training an image generating engine, wherein the image generating engine is utilized to generate the target image.
4. The image processing method as claimed in claim 3, wherein the operation of training the image generating engine comprises:
generating a processed image according to the template label image and the first reference images;
comparing the processed image with the first reference images; and
in response to whether a comparison result is higher than a threshold, updating the processed image or terminating the training of the image generating engine.
5. The image processing method as claimed in claim 4, further comprising:
in response to the comparison result is higher than the threshold, updating the processed image according to the comparison result and comparing the updated processed image with the first reference images until the comparison result is lower than the threshold; and
in response to the comparison result is lower than the threshold, terminating the training of the image generating engine.
6. The image processing method as claimed in claim 4, wherein the operation of comparing the processed image with the first reference images comprises:
comparing a color, a texture or a content-object shape of the processed image with the first reference images.
7. The image processing method as claimed in claim 3, wherein the operation of training the image generating engine comprises:
generating a processed image according to the template label image and the first reference images;
generating a generated background and a generated object based on the processed image;
forming a processed foreground image according to the generated object;
obtaining a plurality of second reference images, wherein each of the second reference images comprises first object image data corresponding to the target with chromatic colors and first background image data with a single color;
comparing the processed foreground image with the second reference images as a first comparison result; and
updating the processed image according to whether the first comparison result is higher than a threshold.
8. The image processing method as claimed in claim 7, wherein the operation of training the image generating engine further comprises:
forming a processed background image according to the generated background;
obtaining a plurality of third reference images, wherein each of the third reference images comprises second object image data corresponding to the target with the single color and second background image data with chromatic colors;
comparing the processed background image with the third reference images as a second comparison result; and
updating the processed image according to whether the second comparison result is higher than the threshold.
9. The image processing method as claimed in claim 8, wherein the operation of training the image generating engine further comprises:
comparing the processed image with the first reference images as a third comparison result;
in response to the third comparison result is higher than the threshold, updating the processed image according to the third comparison result; and
terminating the training of the image generating engine according to all of the first comparison result, the second comparison result and the third comparison result are lower than the threshold.
10. The image processing method as claimed in claim 1, wherein the target image is generated by a generative adversarial network (GAN) model, and training data of the GAN model comprises the template label image and the first reference images.
11. A non-transitory computer readable storage medium storing one or more programs comprising instructions, which when executed, causes one or more processing components to perform the image processing method as claimed in claim 1.
12. An image processing system comprising:
a memory configured to store a template label image, wherein the template label image comprises a label corresponding to a target; and
a processor coupled to the memory and being operable to:
obtain a plurality of first reference images, wherein each of the first reference images comprises object image data corresponding to a target; and
generate a target image according to the template label image and the first reference images, wherein the target image comprises a generated object, a contour of the generated object is generated according to the template label image, and a color or a texture of the target image is generated according to the first reference images, wherein the target image and the template label image are used for training an image processing model, and the image processing model after training is configured to process an input image without labels for generating a label image related to the input image.
13. The image processing system as claimed in claim 12, wherein the processor is further operable to:
obtain a background and an object in the input image by the image processing model; and
generate the label image according to the background and the object, wherein the label image comprises a first label related to the object and a second label related to the background.
14. The image processing system as claimed in claim 12, wherein the processor is further operable to:
train an image generating engine before generating the target image, wherein the image generating engine is utilized to generating the target image;
generate a processed image according to the template label image and the first reference images;
compare the processed image with the first reference images; and
in response to whether a comparison result is higher than a threshold, update the processed image or terminate the training of the image generating engine.
15. The image processing system as claimed in claim 14, further comprising:
in response to the comparison result is higher than the threshold, updating the processed image according to the comparison result and comparing the updated processed image with the first reference images until the comparison result is lower than the threshold; and
in response to the comparison result is lower than the threshold, terminating the training of the image generating engine.
16. The image processing system as claimed in claim 14, wherein the processor is further operable to:
compare a color, a texture or a content-object shape of the processed image with the first reference images.
17. The image processing system as claimed in claim 12, wherein the processor is further operable to:
train an image generating engine before generating the target image, wherein the image generating engine is utilized to generating the target image;
generate a processed image according to the template label image and the first reference images;
generate a generated background and a generated object based on the processed image;
form a processed foreground image according to the generated object;
obtain a plurality of second reference images, wherein each of the second reference images comprises first object image data corresponding to the target with chromatic colors and first background image data with a single color;
compare the processed foreground image with the second reference images as a first comparison result; and
update the processed image according to whether the first comparison result is higher than a threshold.
18. The image processing system as claimed in claim 17, wherein the processor is further operable to:
form a processed background image according to the generated background;
obtain a plurality of third reference images, wherein each of the third reference images comprises second object image data corresponding to the target with a single color and second background image data with chromatic colors;
compare the processed background image with the third reference images as a second comparison result; and
update the processed image according to whether the second comparison result is higher than the threshold.
19. The image processing system as claimed in claim 18, wherein the processor is further operable to:
compare the processed image with the first reference images as a third comparison result;
in response to the third comparison result is higher than the threshold, update the processed image according to the third comparison result; and
terminate the training of the image generating engine according to all of the first comparison result, the second comparison result and the third comparison result are lower than the threshold.
20. The image processing system as claimed in claim 12, wherein the target image is generated by a generative adversarial network (GAN) model and training data of the GAN model comprises the template label image and the first reference images.
US15/970,901 | Priority: 2017-05-04 | Filed: 2018-05-04 | Image processing method, non-transitory computer readable storage medium and image processing system | Active, expires 2038-12-20 | US10824910B2 (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
US15/970,901 (US10824910B2) | 2017-05-04 | 2018-05-04 | Image processing method, non-transitory computer readable storage medium and image processing system

Applications Claiming Priority (2)

Application Number | Priority Date | Filing Date | Title
US201762501100P | 2017-05-04 | 2017-05-04 | (provisional application)
US15/970,901 | 2017-05-04 | 2018-05-04 | Image processing method, non-transitory computer readable storage medium and image processing system

Publications (2)

Publication Number | Publication Date
US20180322367A1 (en) | 2018-11-08
US10824910B2 (en) | 2020-11-03

Family

ID=64013716

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
US15/970,901 (Active, expires 2038-12-20; US10824910B2) | Image processing method, non-transitory computer readable storage medium and image processing system | 2017-05-04 | 2018-05-04

Country Status (3)

Country | Link
US | US10824910B2 (en)
CN | CN108805169B (en)
TW | TWI672638B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN113255911A (en)* | 2021-06-07 | 2021-08-13 | 杭州海康威视数字技术股份有限公司 | Model training method and device, electronic equipment and storage medium

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
WO2020047466A1 (en)* | 2018-08-30 | 2020-03-05 | The Government Of The United States Of America, As Represented By The Secretary Of The Navy | Human-assisted machine learning through geometric manipulation and refinement
CN111435432B (en)* | 2019-01-15 | 2023-05-26 | 北京市商汤科技开发有限公司 | Network optimization method and device, image processing method and device, storage medium
EP3742346A3 (en)* | 2019-05-23 | 2021-06-16 | HTC Corporation | Method for training generative adversarial network (GAN), method for generating images by using GAN, and computer readable storage medium
CN110414480A (en)* | 2019-08-09 | 2019-11-05 | 威盛电子股份有限公司 | Training image generation method and electronic device
CN110751630B (en)* | 2019-09-30 | 2020-12-08 | 山东信通电子股份有限公司 | Power transmission line foreign matter detection method and device based on deep learning and medium
CN110796673B (en)* | 2019-10-31 | 2023-02-24 | Oppo广东移动通信有限公司 | Image segmentation method and related products
CN112967338B (en)* | 2019-12-13 | 2024-05-31 | 宏达国际电子股份有限公司 | Image processing system and image processing method
EP3843038B1 (en)* | 2019-12-23 | 2023-09-20 | HTC Corporation | Image processing method and system
CN112613333A (en)* | 2019-12-27 | 2021-04-06 | 珠海大横琴科技发展有限公司 | Method for calculating difference between network output image and label
US11272097B2 (en)* | 2020-07-30 | 2022-03-08 | Steven Brian Demers | Aesthetic learning methods and apparatus for automating image capture device controls
US12020418B2 (en)* | 2021-04-22 | 2024-06-25 | Taiwan Semiconductor Manufacturing Company, Ltd. | Image processing method and system, and non-transitory computer readable medium
CN114332267B (en)* | 2021-11-05 | 2025-08-29 | 腾讯科技(深圳)有限公司 | Method, device, computer equipment and medium for generating ink painting images

Citations (5)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
EP0936682A1 (en) | 1996-07-29 | 1999-08-18 | Nichia Chemical Industries, Ltd. | Light emitting device and display device
CN102055873A (en) | 2009-11-02 | 2011-05-11 | 夏普株式会社 | Image processing apparatus, image processing method
CN106339997A (en) | 2015-07-09 | 2017-01-18 | 株式会社理光 | Image fusion method, device and system
CN106548208A (en) | 2016-10-28 | 2017-03-29 | 杭州慕锐科技有限公司 | A kind of quick, intelligent stylizing method of photograph image
US20180322679A1 (en)* | 2013-02-21 | 2018-11-08 | Dolby Laboratories Licensing Corporation | Systems and methods for appearance mapping for compositing overlay graphics

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CA2347181A1 (en)* | 2000-06-13 | 2001-12-13 | Eastman Kodak Company | Plurality of picture appearance choices from a color photographic recording material intended for scanning
US8606009B2 (en)* | 2010-02-04 | 2013-12-10 | Microsoft Corporation | High dynamic range image generation and rendering
US20140225991A1 (en)* | 2011-09-02 | 2014-08-14 | HTC Corporation | Image capturing apparatus and method for obtaining depth information of field thereof


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Corresponding Chinese office action dated Jun. 9, 2020.
Kwak, Hanock, and Byoung-Tak Zhang. "Generating images part by part with composite generative adversarial networks." arXiv preprint arXiv:1607.05387 (2016). (Year: 2016).*

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN113255911A (en)* | 2021-06-07 | 2021-08-13 | 杭州海康威视数字技术股份有限公司 | Model training method and device, electronic equipment and storage medium
CN113255911B (en)* | 2021-06-07 | 2023-10-13 | 杭州海康威视数字技术股份有限公司 | Model training method and device, electronic equipment and storage medium

Also Published As

Publication number | Publication date
CN108805169B (en) | 2021-06-01
US20180322367A1 (en) | 2018-11-08
CN108805169A (en) | 2018-11-13
TW201909028A (en) | 2019-03-01
TWI672638B (en) | 2019-09-21

Similar Documents

Publication | Title
US10824910B2 (en) | Image processing method, non-transitory computer readable storage medium and image processing system
US10762608B2 (en) | Sky editing based on image composition
US11475246B2 (en) | System and method for generating training data for computer vision systems based on image segmentation
US10019823B2 (en) | Combined composition and change-based models for image cropping
US20180357819A1 (en) | Method for generating a set of annotated images
US20150117783A1 (en) | Iterative saliency map estimation
CN109712145A (en) | A kind of image matting method and system
Beyeler | OpenCV with Python blueprints
US10249029B2 (en) | Reconstruction of missing regions of images
JP2016095854A (en) | Image processing method and apparatus
WO2020037881A1 (en) | Motion trajectory drawing method and apparatus, and device and storage medium
US20240386639A1 (en) | Video cover generation method, apparatus, electronic device and readable medium
US20200193611A1 (en) | Method, system and apparatus for segmenting an image of a scene
CN115063447A (en) | A video sequence-based target animal motion tracking method and related equipment
CN113012030A (en) | Image splicing method, device and equipment
CN111681270A (en) | A method, device and storage medium for realizing registration between image frames
CN110378250A (en) | Neural network training method, device and terminal equipment for scene recognition
CN118608926A (en) | Image quality evaluation method, device, electronic device and storage medium
JP6272071B2 (en) | Image processing apparatus, image processing method, and program
CN110751668B (en) | Image processing method, device, terminal, electronic equipment and readable storage medium
CN113724143A (en) | Method and device for image restoration
CN106469437B (en) | Image processing method and image processing device
CN111191580A (en) | Synthetic rendering method, device, electronic device and medium
US12322010B2 (en) | Logo labeling method and device, update method and system of logo detection model, and storage medium
CN112989924A (en) | Target detection method, target detection device and terminal equipment

Legal Events

Date | Code | Title | Description

AS | Assignment

Owner name: HTC CORPORATION, TAIWAN

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:CHANG, FU-CHIEH;CHOU, CHUN-NAN;CHANG, EDWARD;REEL/FRAME:045713/0011

Effective date: 20180502

FEPP | Fee payment procedure

Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

STPP | Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP | Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP | Information on status: patent application and granting procedure in general

Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS


STPP | Information on status: patent application and granting procedure in general

Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED

STCF | Information on status: patent grant

Free format text: PATENTED CASE

MAFP | Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 4

