Disclosure of Invention
The present disclosure provides an image segmentation method, apparatus, and storage medium, which can overcome problems in the related art, such as the large amount of calculation required for image segmentation.
According to a first aspect of embodiments of the present disclosure, there is provided an image segmentation method, the method including:
acquiring a trained pixel classification model, wherein the pixel classification model is used for determining a classification identifier of each pixel in any image, the classification identifier comprises a first identifier and a second identifier, the first identifier is used for indicating that the corresponding pixel belongs to a foreground region, and the second identifier is used for indicating that the corresponding pixel belongs to a background region;
inputting a target image to be processed into the pixel classification model, and determining a classification identifier of each pixel in the target image based on the pixel classification model;
and determining a foreground region and a background region of the target image according to the classification identifier of each pixel in the target image.
In one possible implementation, the method further includes:
obtaining a plurality of sample images and a classification identifier of each pixel in the plurality of sample images;
and performing model training according to the plurality of sample images and the classification identifier of each pixel in the plurality of sample images to obtain the pixel classification model.
In another possible implementation manner, the determining a foreground region and a background region of the target image according to the classification identifier of each pixel in the target image includes:
and generating a foreground probability map according to the classification identifier of each pixel in the target image, wherein the foreground probability map is used for indicating the positions of a foreground region and a background region of the target image.
In another possible implementation manner, the method further includes:
according to the foreground probability map, performing enhancement processing on a foreground region of the target image; or,
performing blurring processing on a background region of the target image according to the foreground probability map.
In another possible implementation manner, the performing blurring processing on a background region of the target image according to the foreground probability map includes:
extracting a foreground region of the target image according to the foreground probability map;
performing blurring processing on the target image to obtain a first image, and extracting a background region of the first image according to the foreground probability map;
and combining the foreground region of the target image and the background region of the first image to obtain a second image.
In another possible implementation manner, in the foreground probability map, a pixel value of a pixel in a foreground region is 1 and a pixel value of a pixel in a background region is 0, and the performing blurring processing on the background region of the target image according to the foreground probability map includes:
performing blurring processing on the background region of the target image according to the foreground probability map by using the following formula:
Target=Source*mask+Gaussian*(1-mask);
wherein Source represents the target image, Gaussian represents the image obtained after the entire target image is subjected to the blurring processing, mask represents the foreground probability map, and Target represents the image obtained after the background region of the target image is subjected to the blurring processing.
According to a second aspect of the embodiments of the present disclosure, there is provided an image segmentation apparatus, the apparatus comprising:
a model obtaining module configured to obtain a trained pixel classification model, where the pixel classification model is used to determine a classification identifier of each pixel in any image, where the classification identifier includes a first identifier and a second identifier, the first identifier is used to indicate that the corresponding pixel belongs to a foreground region, and the second identifier is used to indicate that the corresponding pixel belongs to a background region;
an identifier determining module configured to input a target image to be processed into the pixel classification model, and determine the classification identifier of each pixel in the target image based on the pixel classification model;
and a region determining module configured to determine a foreground region and a background region of the target image according to the classification identifier of each pixel in the target image.
In one possible implementation, the apparatus further includes:
a sample acquisition module configured to acquire a plurality of sample images and a classification identifier for each pixel in the plurality of sample images;
and the training module is configured to perform model training according to the plurality of sample images and the classification identifier of each pixel in the plurality of sample images to obtain the pixel classification model.
In another possible implementation manner, the region determining module includes:
a generating unit configured to generate a foreground probability map according to the classification identifier of each pixel in the target image, wherein the foreground probability map is used for indicating the positions of a foreground region and a background region of the target image.
In another possible implementation manner, the apparatus further includes:
a processing module configured to perform enhancement processing on a foreground region of the target image according to the foreground probability map; or, the processing module is further configured to perform blurring processing on a background region of the target image according to the foreground probability map.
In another possible implementation manner, the processing module includes:
an extraction unit configured to extract a foreground region of the target image according to the foreground probability map;
the extraction unit is further configured to perform blurring processing on the target image to obtain a first image, and to extract a background region of the first image according to the foreground probability map;
and a combining unit configured to combine the foreground region of the target image and the background region of the first image to obtain a second image.
In another possible implementation manner, in the foreground probability map, a pixel value of a pixel in a foreground region is 1 and a pixel value of a pixel in a background region is 0, and the processing module is further configured to perform blurring processing on the background region of the target image according to the foreground probability map by using the following formula:
Target=Source*mask+Gaussian*(1-mask);
wherein Source represents the target image, Gaussian represents the image obtained after the entire target image is subjected to the blurring processing, mask represents the foreground probability map, and Target represents the image obtained after the background region of the target image is subjected to the blurring processing.
According to a third aspect of the embodiments of the present disclosure, there is provided an image segmentation apparatus, the apparatus including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to:
acquiring a trained pixel classification model, wherein the pixel classification model is used for determining a classification identifier of each pixel in any image, the classification identifier comprises a first identifier and a second identifier, the first identifier is used for indicating that the corresponding pixel belongs to a foreground region, and the second identifier is used for indicating that the corresponding pixel belongs to a background region;
inputting a target image to be processed into the pixel classification model, and determining a classification identifier of each pixel in the target image based on the pixel classification model;
and determining a foreground region and a background region of the target image according to the classification identifier of each pixel in the target image.
According to a fourth aspect of embodiments of the present disclosure, there is provided a non-transitory computer-readable storage medium having instructions therein, which when executed by a processor of a processing apparatus, enable the processing apparatus to perform a method of image segmentation, the method comprising:
acquiring a trained pixel classification model, wherein the pixel classification model is used for determining a classification identifier of each pixel in any image, the classification identifier comprises a first identifier and a second identifier, the first identifier is used for indicating that the corresponding pixel belongs to a foreground region, and the second identifier is used for indicating that the corresponding pixel belongs to a background region;
inputting a target image to be processed into the pixel classification model, and determining a classification identifier of each pixel in the target image based on the pixel classification model;
and determining a foreground region and a background region of the target image according to the classification identifier of each pixel in the target image.
According to a fifth aspect of embodiments of the present disclosure, there is provided a computer program product including instructions that, when executed by a processor of a processing apparatus, enable the processing apparatus to perform a method of image segmentation, the method comprising:
acquiring a trained pixel classification model, wherein the pixel classification model is used for determining a classification identifier of each pixel in any image, the classification identifier comprises a first identifier and a second identifier, the first identifier is used for indicating that the corresponding pixel belongs to a foreground region, and the second identifier is used for indicating that the corresponding pixel belongs to a background region;
inputting a target image to be processed into the pixel classification model, and determining a classification identifier of each pixel in the target image based on the pixel classification model;
and determining a foreground region and a background region of the target image according to the classification identifier of each pixel in the target image.
The technical solutions provided by the embodiments of the present disclosure may have the following beneficial effects:
A trained pixel classification model is obtained, wherein the pixel classification model is used for determining a classification identifier of each pixel in any image, the classification identifier comprising a first identifier indicating that the corresponding pixel belongs to a foreground region and a second identifier indicating that the corresponding pixel belongs to a background region. A target image to be processed is input into the pixel classification model, the classification identifier of each pixel in the target image is determined based on the pixel classification model, and the foreground region and the background region of the target image are determined according to these classification identifiers, so that the foreground region or the background region of the target image can be extracted. Because the pixel classification model is trained in advance, image segmentation can be performed on the target image directly based on the model without a large amount of calculation, which reduces the amount of computation and saves time.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
Fig. 1 is a flowchart illustrating an image segmentation method according to an exemplary embodiment. As shown in Fig. 1, the method is used in a processing device and includes the following steps:
in step 101, a trained pixel classification model is obtained, where the pixel classification model is used to determine a classification identifier of each pixel in any image, where the classification identifier includes a first identifier and a second identifier, the first identifier is used to indicate that the corresponding pixel belongs to a foreground region, and the second identifier is used to indicate that the corresponding pixel belongs to a background region.
In step 102, a target image to be processed is input into a pixel classification model, and a classification identifier of each pixel in the target image is determined based on the pixel classification model.
In step 103, a foreground region and a background region of the target image are determined according to the classification identifier of each pixel in the target image.
The method provided by the embodiment of the present disclosure obtains a trained pixel classification model, wherein the pixel classification model is used for determining a classification identifier of each pixel in any image, the classification identifier comprising a first identifier indicating that the corresponding pixel belongs to a foreground region and a second identifier indicating that the corresponding pixel belongs to a background region. A target image to be processed is input into the pixel classification model, the classification identifier of each pixel in the target image is determined based on the pixel classification model, and the foreground region and the background region of the target image are determined according to these classification identifiers, so that the foreground region or the background region of the target image can be extracted. Because the pixel classification model is trained in advance, image segmentation can be performed on the target image directly based on the model without a large amount of calculation, which reduces the amount of computation and saves time.
In one possible implementation, the method further includes:
obtaining a plurality of sample images and a classification identifier of each pixel in the plurality of sample images;
and performing model training according to the plurality of sample images and the classification identifier of each pixel in the plurality of sample images to obtain the pixel classification model.
In another possible implementation manner, determining a foreground region and a background region of the target image according to the classification identifier of each pixel in the target image includes:
and generating a foreground probability map according to the classification identifier of each pixel in the target image, wherein the foreground probability map is used for indicating the positions of the foreground region and the background region of the target image.
In another possible implementation manner, the method further includes:
according to the foreground probability map, performing enhancement processing on a foreground region of the target image; or,
and performing blurring processing on the background region of the target image according to the foreground probability map.
In another possible implementation manner, the performing blurring processing on the background region of the target image according to the foreground probability map includes:
extracting a foreground region of the target image according to the foreground probability map;
performing blurring processing on the target image to obtain a first image, and extracting a background region of the first image according to the foreground probability map;
and combining the foreground region of the target image and the background region of the first image to obtain a second image.
In another possible implementation manner, in the foreground probability map, a pixel value of a pixel in a foreground region is 1 and a pixel value of a pixel in a background region is 0, and the performing blurring processing on the background region of the target image according to the foreground probability map includes:
performing blurring processing on the background region of the target image according to the foreground probability map by using the following formula:
Target=Source*mask+Gaussian*(1-mask);
wherein Source represents the target image, Gaussian represents the image obtained after the entire target image is subjected to the blurring processing, mask represents the foreground probability map, and Target represents the image obtained after the background region of the target image is subjected to the blurring processing.
Fig. 2 is a flowchart illustrating an image segmentation method according to an exemplary embodiment. As shown in Fig. 2, the image segmentation method is used in a processing apparatus, which may be a mobile phone, a computer, a server, a camera, a monitoring device, or another apparatus with an image processing function, and the method includes the following steps:
in step 201, the processing device acquires a target image to be processed and acquires a trained pixel classification model.
The target image may be captured by the processing device, extracted from a video shot by the processing device, downloaded from the Internet by the processing device, or sent to the processing device by another device. Alternatively, during a live video broadcast, the processing device may acquire each image in the video stream and treat each image as a target image, so as to process every image in the video stream.
The target image may be any of various types of images, such as a person image, a landscape picture, or a food picture. The target image includes a foreground region and a background region; the colors, object shapes, and the like displayed in the two regions may be the same or different. The essential difference between them is that the foreground region is the region occupied by objects close to the camera when the target image is captured, which amounts to the 'center of interest' of the target image, while the background region is the remaining region that serves as a backdrop for the foreground region.
For example, a photograph of a person includes the person and the environment behind the person; the region where the person is located may be regarded as the foreground region, and the region where the environment is located may be regarded as the background region. Similarly, a food picture may include a table and the food on it; the region where the table is located may be regarded as the background region, and the region where the food is located may be regarded as the foreground region.
In the embodiment of the present disclosure, the foreground region and the background region of the target image can be distinguished based on the pixel classification model. The pixel classification model is used for determining a classification identifier of each pixel in any image, the classification identifier comprising a first identifier and a second identifier: when the classification identifier of a pixel is the first identifier, the pixel belongs to the foreground region, and when the classification identifier of a pixel is the second identifier, the pixel belongs to the background region.
The first and second identifiers are two different identifiers, for example, the first identifier is 1 and the second identifier is 0, or the first identifier is 0 and the second identifier is 1.
The pixel classification model may be trained by the processing device and stored by the processing device, or the pixel classification model may be trained by another device and sent to the processing device and stored by the processing device.
When training the pixel classification model, a plurality of sample images and the classification identifier of each pixel in the plurality of sample images may be obtained first, and model training may be performed according to the plurality of sample images and these classification identifiers to obtain the pixel classification model.
In a possible implementation manner, when training the pixel classification model, an initial pixel classification model may be constructed, and a training data set and a test data set are obtained, where the training data set and the test data set each include a plurality of sample images and a classification identifier of each pixel in each sample image.
In the training process, the plurality of sample images in the training data set are used as the input of the pixel classification model, and the classification identifiers of the pixels in these sample images are used as the output of the pixel classification model. The pixel classification model is trained so that it learns the difference between the foreground region and the background region and acquires the capability of dividing an image into the two regions. Then, each sample image in the test data set is input into the pixel classification model, a test classification identifier of each pixel in each sample image is determined based on the pixel classification model, the test classification identifiers are compared with the actually labeled classification identifiers, and the pixel classification model is corrected according to the comparison result.
In one possible implementation, a preset training algorithm may be used in training the pixel classification model, and the preset training algorithm may be a deep learning network algorithm, a decision tree algorithm, an artificial neural network algorithm, or the like.
In a subsequent process, new sample images and the classification identifier of each pixel in those sample images can be obtained, and the pixel classification model can be trained continuously, so that the classification accuracy of the pixel classification model is improved.
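To make the training step concrete, the following is a minimal sketch assuming a PyTorch-style deep learning setup; the PixelClassifier network, the tensor shapes, and the random training data are illustrative assumptions, not the model prescribed by this disclosure:

```python
# Minimal training sketch (assumption: PyTorch). A tiny convolutional
# network predicts one foreground logit per pixel; real models, data
# loading, and hyperparameters will differ.
import torch
import torch.nn as nn

class PixelClassifier(nn.Module):
    """Maps an RGB image to one foreground/background logit per pixel."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 1),  # one logit per pixel
        )

    def forward(self, x):
        return self.net(x)

model = PixelClassifier()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()  # per-pixel binary classification loss

# Stand-ins for sample images and their per-pixel classification
# identifiers (first identifier 1 = foreground, second identifier 0 = background).
images = torch.rand(8, 3, 64, 64)
labels = (torch.rand(8, 1, 64, 64) > 0.5).float()

for epoch in range(10):
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optimizer.step()
```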
In step 202, the processing device inputs the target image into the pixel classification model, and determines the classification identifier of each pixel in the target image based on the pixel classification model.
The processing device inputs the target image into the pixel classification model, and the model processes the target image and outputs the classification identifier of each pixel, so that the pixels in the target image are classified.
When the classification identifier of a pixel is the first identifier, the pixel belongs to the foreground region; when the classification identifier of a pixel is the second identifier, the pixel belongs to the background region. By combining the classification identifier of each pixel with the position of that pixel in the target image, the foreground region and the background region of the target image can be determined. In effect, the pixel classification model is a semantic segmentation model that can segment the target image into a foreground region and a background region.
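Continuing the hypothetical sketch above, the inference of steps 201-202 might look as follows; thresholding the sigmoid output at 0.5 is an illustrative way of turning the model output into the first/second classification identifiers:

```python
# Inference sketch (assumption: the trained PixelClassifier from the
# previous sketch). Each pixel receives identifier 1 (foreground) or
# 0 (background).
import torch

model.eval()
target_image = torch.rand(1, 3, 64, 64)  # stand-in for the target image
with torch.no_grad():
    probs = torch.sigmoid(model(target_image))  # per-pixel foreground probability
    identifiers = (probs > 0.5).long()          # 1 = foreground, 0 = background
```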
In step 203, the processing device generates a foreground probability map according to the classification identifier of each pixel in the target image.
The foreground probability map indicates the positions of the foreground region and the background region in the target image. According to the foreground probability map, it can be determined whether the pixel at each position in the target image belongs to the foreground region or the background region, so the target image can be divided into a foreground region and a background region, and the foreground region or the background region can be matted out.
In one possible implementation, the classification identifier of each pixel in the target image is used as the pixel value of the pixel at the corresponding position in the foreground probability map, so as to generate the foreground probability map. In the foreground probability map, a pixel whose value is the first identifier indicates that the corresponding pixel of the target image belongs to the foreground region, and a pixel whose value is the second identifier indicates that the corresponding pixel of the target image belongs to the background region.
For example, when the first identifier is 1 and the second identifier is 0, a white area in the foreground probability map indicates that the corresponding position in the target image is a foreground region, and a black area in the foreground probability map indicates that the corresponding position in the target image is a background region.
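As a small illustration of this step (a sketch assuming numpy; the toy label map is not from this disclosure), the classification identifiers can be copied directly into the foreground probability map, and scaling by 255 merely renders it as a white/black image:

```python
# Build the foreground probability map from per-pixel classification
# identifiers (assumption: numpy; toy 2x3 label map for illustration).
import numpy as np

identifiers = np.array([[1, 1, 0],
                        [1, 0, 0]], dtype=np.uint8)

mask = identifiers          # foreground probability map (1 = fg, 0 = bg)
mask_visual = mask * 255    # white = foreground, black = background
```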
In step 204, the processing device processes the foreground region or the background region of the target image according to the foreground probability map.
In order to improve the definition of the foreground region, the foreground region may be enhanced; the background region may also be blurred. Accordingly, step 204 includes at least one of the following steps 2041 and 2042:
in step 2041, the foreground region of the target image is enhanced according to the foreground probability map.
According to the pixel value of each pixel in the foreground probability map, the processing device determines the image area formed by the pixels whose value is the first identifier and extracts the corresponding area of the target image as the foreground region; it likewise determines the image area formed by the pixels whose value is the second identifier and extracts the corresponding area of the target image as the background region.
Then, an image enhancement algorithm is used to enhance the foreground region, and the enhanced foreground region is combined with the background region to obtain the processed image. This improves the quality of the foreground region, making it clearer and more prominent.
For example, the image enhancement algorithm may be a histogram equalization algorithm, a contrast enhancement algorithm, or another algorithm capable of enhancing an image, which is not described in detail here.
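A minimal sketch of step 2041 follows, assuming OpenCV (cv2) and numpy; the input path "target.jpg", the dummy central-rectangle mask, and the choice of histogram equalization on the luminance channel are all illustrative assumptions:

```python
# Enhance only the foreground region (assumptions: OpenCV + numpy;
# "target.jpg" and the dummy mask are placeholders).
import cv2
import numpy as np

source = cv2.imread("target.jpg")  # hypothetical input image
h, w = source.shape[:2]

# Stand-in foreground probability map (1 = foreground, 0 = background);
# in practice it comes from the pixel classification model (step 203).
mask = np.zeros((h, w), np.uint8)
mask[h // 4: 3 * h // 4, w // 4: 3 * w // 4] = 1
mask3 = np.repeat(mask[:, :, None], 3, axis=2)

# Histogram equalization on the luminance channel as the enhancement step.
ycrcb = cv2.cvtColor(source, cv2.COLOR_BGR2YCrCb)
ycrcb[:, :, 0] = cv2.equalizeHist(ycrcb[:, :, 0])
enhanced = cv2.cvtColor(ycrcb, cv2.COLOR_YCrCb2BGR)

# Combine the enhanced foreground with the original background.
result = (enhanced * mask3 + source * (1 - mask3)).astype(np.uint8)
cv2.imwrite("enhanced_foreground.jpg", result)
```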
In step 2042, the background region of the target image is blurred according to the foreground probability map.
In one possible implementation, the processing device performs blurring processing on the whole target image to obtain a first image. According to the pixel value of each pixel in the foreground probability map, it determines the image area formed by the pixels whose value is the second identifier and extracts the corresponding area of the first image; this area is the blurred version of the background region of the target image. It then determines the image area formed by the pixels whose value is the first identifier and extracts the corresponding area of the target image as the foreground region. Finally, the foreground region and the blurred background region are combined to obtain the processed image. In this way, the background region of the target image is blurred, and the foreground region is made more prominent.
In another possible implementation manner, in the foreground probability map, the pixel value of a pixel in the foreground region is 1 and the pixel value of a pixel in the background region is 0, and the background region of the target image is blurred according to the foreground probability map by using the following formula:
Target=Source*mask+Gaussian*(1-mask).
wherein Source represents the target image, Gaussian represents the image obtained after the entire target image is subjected to the blurring processing, mask represents the foreground probability map, and Target represents the image obtained after the background region of the target image is subjected to the blurring processing. Source*mask keeps the foreground region of the target image, Gaussian*(1-mask) keeps the background region of the blurred image, and combining the two yields the image in which the background region of the target image has been blurred.
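The formula can be applied per pixel as in the following sketch (assumptions: OpenCV and numpy; "target.jpg", the dummy mask, and the (21, 21) Gaussian kernel size are illustrative, not values from this disclosure):

```python
# Blur only the background: Target = Source*mask + Gaussian*(1-mask)
# (assumptions: OpenCV + numpy; placeholder input path and mask).
import cv2
import numpy as np

source = cv2.imread("target.jpg")                 # hypothetical input image
gaussian = cv2.GaussianBlur(source, (21, 21), 0)  # blur the whole image first

h, w = source.shape[:2]
mask = np.zeros((h, w), np.uint8)                 # stand-in probability map
mask[h // 4: 3 * h // 4, w // 4: 3 * w // 4] = 1  # 1 = foreground, 0 = background
mask3 = np.repeat(mask[:, :, None], 3, axis=2)

# Foreground kept from the source, background taken from the blurred image.
target = (source * mask3 + gaussian * (1 - mask3)).astype(np.uint8)
cv2.imwrite("blurred_background.jpg", target)
```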
When processing the target image, the processing device in the embodiment of the present disclosure may first perform the enhancement processing and then perform the blurring processing. Alternatively, the processing device may perform only the enhancement processing, or only the blurring processing, on the target image.
The method provided by the embodiment of the present disclosure obtains a trained pixel classification model, wherein the pixel classification model is used for determining a classification identifier of each pixel in any image, the classification identifier comprising a first identifier indicating that the corresponding pixel belongs to a foreground region and a second identifier indicating that the corresponding pixel belongs to a background region. A target image to be processed is input into the pixel classification model, the classification identifier of each pixel in the target image is determined based on the pixel classification model, and the foreground region and the background region of the target image are determined according to these classification identifiers, so that the foreground region or the background region of the target image can be extracted. Because the pixel classification model is trained in advance, image segmentation can be performed on the target image directly based on the model without a large amount of calculation, which reduces the amount of computation, saves time, and speeds up background blurring.
Moreover, enhancing the foreground region makes it clearer, and blurring the background region weakens it; both enlarge the difference between the two regions, so that the foreground region becomes more prominent and conspicuous.
The embodiments of the present disclosure can be applied to scenarios such as face recognition, video monitoring, live video streaming, and picture beautification. For example, in a face recognition scenario, a face region can be extracted from the target image and the other regions ignored, so that face recognition is performed only on the face region. In a video monitoring scenario, a target object region can be extracted from each image in a video stream and the other regions ignored, so as to obtain the dynamic change information of the target object in the video stream and track the target object. In a live video scenario, the foreground region and the background region can be distinguished in real time in each image of the live video stream, and the background region can be blurred, so that the foreground region is highlighted and appears clearer, enhancing the live broadcast effect.
Fig. 3 is a block diagram illustrating a processing device according to an exemplary embodiment. Referring to Fig. 3, the apparatus includes a model obtaining module 301, an identifier determining module 302, and a region determining module 303.
A model obtaining module 301, configured to obtain a trained pixel classification model, where the pixel classification model is used to determine a classification identifier of each pixel in any image, and the classification identifier includes a first identifier and a second identifier, where the first identifier is used to indicate that the corresponding pixel belongs to a foreground region, and the second identifier is used to indicate that the corresponding pixel belongs to a background region;
an identifier determining module 302 configured to input a target image to be processed into a pixel classification model, and determine a classification identifier of each pixel in the target image based on the pixel classification model;
a region determining module 303 configured to determine a foreground region and a background region of the target image according to the classification identifier of each pixel in the target image.
In one possible implementation manner, the apparatus further includes:
a sample acquisition module configured to acquire a plurality of sample images and a classification identifier of each pixel in the plurality of sample images;
and a training module configured to perform model training according to the plurality of sample images and the classification identifier of each pixel in the plurality of sample images to obtain the pixel classification model.
In another possible implementation manner, the region determining module 303 includes:
and a generating unit configured to generate a foreground probability map according to the classification identifier of each pixel in the target image, wherein the foreground probability map is used for indicating the positions of the foreground region and the background region of the target image.
In another possible implementation manner, the apparatus further includes:
a processing module configured to perform enhancement processing on a foreground region of the target image according to the foreground probability map; or the processing module is further configured to perform blurring processing on the background region of the target image according to the foreground probability map.
In another possible implementation manner, the processing module includes:
an extraction unit configured to extract a foreground region of the target image according to the foreground probability map;
the extraction unit is further configured to perform blurring processing on the target image to obtain a first image, and to extract a background region of the first image according to the foreground probability map;
and a combining unit configured to combine the foreground region of the target image and the background region of the first image to obtain a second image.
In another possible implementation manner, in the foreground probability map, the pixel value of a pixel in the foreground region is 1 and the pixel value of a pixel in the background region is 0, and the processing module is further configured to perform blurring processing on the background region of the target image according to the foreground probability map by using the following formula:
Target=Source*mask+Gaussian*(1-mask);
wherein Source represents the target image, Gaussian represents the image obtained after the entire target image is subjected to the blurring processing, mask represents the foreground probability map, and Target represents the image obtained after the background region of the target image is subjected to the blurring processing.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
Fig. 4 is a block diagram illustrating a terminal 400 for image segmentation according to an exemplary embodiment. The terminal 400 is used for executing the steps executed by the processing device in the image segmentation method, and may be a portable mobile terminal, such as a smart phone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a notebook computer, or a desktop computer. The terminal 400 may also be referred to by other names such as user equipment, portable terminal, laptop terminal, or desktop terminal.
Generally, the terminal 400 includes: a processor 401 and a memory 402.
Processor 401 may include one or more processing cores, such as a 4-core processor or an 8-core processor. The processor 401 may be implemented in at least one hardware form of a DSP (Digital Signal Processor), an FPGA (Field-Programmable Gate Array), and a PLA (Programmable Logic Array). The processor 401 may also include a main processor and a coprocessor: the main processor, also called a CPU (Central Processing Unit), processes data in the awake state, while the coprocessor is a low-power processor that processes data in the standby state. In some embodiments, the processor 401 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content to be displayed on the display screen. In some embodiments, the processor 401 may further include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
Memory 402 may include one or more computer-readable storage media, which may be non-transitory. Memory 402 may also include high-speed random access memory as well as non-volatile memory, such as one or more magnetic disk storage devices or flash memory storage devices. In some embodiments, a non-transitory computer-readable storage medium in memory 402 is used to store at least one instruction to be executed by processor 401 to implement the image segmentation methods provided by the method embodiments of this disclosure.
In some embodiments, the terminal 400 may further optionally include: a peripheral interface 403 and at least one peripheral. The processor 401, memory 402 and peripheral interface 403 may be connected by bus or signal lines. Each peripheral may be connected to the peripheral interface 403 via a bus, signal line, or circuit board. Specifically, the peripheral device includes: at least one of radio frequency circuitry 404, touch screen display 405, camera 406, audio circuitry 407, positioning components 408, and power supply 409.
The peripheral interface 403 may be used to connect at least one peripheral related to I/O (Input/Output) to the processor 401 and the memory 402. In some embodiments, processor 401, memory 402, and peripheral interface 403 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 401, the memory 402 and the peripheral interface 403 may be implemented on a separate chip or circuit board, which is not limited by this embodiment.
The radio frequency circuit 404 is used for receiving and transmitting RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuit 404 communicates with communication networks and other communication devices via electromagnetic signals, converting an electrical signal into an electromagnetic signal for transmission, or converting a received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuit 404 includes an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuit 404 may communicate with other terminals via at least one wireless communication protocol, including but not limited to: metropolitan area networks, mobile communication networks of various generations (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 404 may further include NFC (Near Field Communication) related circuits, which is not limited in this application.
The display screen 405 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 405 is a touch display screen, it also has the ability to capture touch signals on or over its surface. The touch signal may be input to the processor 401 as a control signal for processing. At this point, the display screen 405 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 405, providing the front panel of the terminal 400; in other embodiments, there may be at least two display screens 405, respectively disposed on different surfaces of the terminal 400 or in a folded design; in still other embodiments, the display screen 405 may be a flexible display disposed on a curved surface or a folded surface of the terminal 400. The display screen 405 may even be arranged in a non-rectangular irregular pattern, i.e., an irregularly-shaped screen. The display screen 405 may be made of materials such as an LCD (Liquid Crystal Display) or an OLED (Organic Light-Emitting Diode).
The camera assembly 406 is used to capture images or video. Optionally, camera assembly 406 includes a front camera and a rear camera. Generally, a front camera is disposed at a front panel of the terminal, and a rear camera is disposed at a rear surface of the terminal. In some embodiments, the number of the rear cameras is at least two, and each rear camera is any one of a main camera, a depth-of-field camera, a wide-angle camera and a telephoto camera, so that the main camera and the depth-of-field camera are fused to realize a background blurring function, and the main camera and the wide-angle camera are fused to realize panoramic shooting and VR (Virtual Reality) shooting functions or other fusion shooting functions. In some embodiments, camera assembly 406 may also include a flash. The flash lamp can be a monochrome temperature flash lamp or a bicolor temperature flash lamp. The double-color-temperature flash lamp is a combination of a warm-light flash lamp and a cold-light flash lamp, and can be used for light compensation at different color temperatures.
The audio circuit 407 may include a microphone and a speaker. The microphone is used for collecting sound waves of a user and the environment, converting the sound waves into electric signals, and inputting the electric signals to the processor 401 for processing, or inputting the electric signals to the radio frequency circuit 404 for realizing voice communication. For the purpose of stereo sound collection or noise reduction, a plurality of microphones may be provided at different portions of the terminal 400. The microphone may also be an array microphone or an omni-directional pick-up microphone. The speaker is used to convert electrical signals from the processor 401 or the radio frequency circuit 404 into sound waves. The loudspeaker can be a traditional film loudspeaker or a piezoelectric ceramic loudspeaker. When the speaker is a piezoelectric ceramic speaker, the speaker can be used for purposes such as converting an electric signal into a sound wave audible to a human being, or converting an electric signal into a sound wave inaudible to a human being to measure a distance. In some embodiments, audio circuitry 407 may also include a headphone jack.
The positioning component 408 is used to locate the current geographic position of the terminal 400 for navigation or LBS (Location Based Service). The positioning component 408 may be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.
The power supply 409 is used to supply power to the various components in the terminal 400. The power supply 409 may be an alternating current power supply, a direct current power supply, a disposable battery, or a rechargeable battery. When the power supply 409 includes a rechargeable battery, the rechargeable battery may support wired or wireless charging, and may also support fast-charge technology.
In some embodiments, the terminal 400 also includes one or more sensors 410. The one or more sensors 410 include, but are not limited to: acceleration sensor 411, gyro sensor 412, pressure sensor 413, fingerprint sensor 414, optical sensor 415, and proximity sensor 416.
The acceleration sensor 411 may detect the magnitude of acceleration on each of the three coordinate axes of the coordinate system established with the terminal 400. For example, the acceleration sensor 411 may be used to detect the components of gravitational acceleration on the three coordinate axes. The processor 401 may control the touch display screen 405 to display the user interface in a landscape view or a portrait view according to the gravitational acceleration signal collected by the acceleration sensor 411. The acceleration sensor 411 may also be used to collect motion data of a game or of the user.
The gyro sensor 412 may detect a body direction and a rotation angle of the terminal 400, and the gyro sensor 412 may cooperate with the acceleration sensor 411 to acquire a 3D motion of the terminal 400 by the user. From the data collected by the gyro sensor 412, the processor 401 may implement the following functions: motion sensing (such as changing the UI according to a user's tilting operation), image stabilization at the time of photographing, game control, and inertial navigation.
The pressure sensor 413 may be disposed on a side bezel of the terminal 400 and/or a lower layer of the touch display screen 405. When the pressure sensor 413 is disposed on the side frame of the terminal 400, a user's holding signal to the terminal 400 can be detected, and the processor 401 performs left-right hand recognition or shortcut operation according to the holding signal collected by the pressure sensor 413. When the pressure sensor 413 is disposed at the lower layer of the touch display screen 405, the processor 401 controls the operability control on the UI interface according to the pressure operation of the user on the touch display screen 405. The operability control comprises at least one of a button control, a scroll bar control, an icon control and a menu control.
The fingerprint sensor 414 is used for collecting a fingerprint of the user, and the processor 401 identifies the identity of the user according to the fingerprint collected by the fingerprint sensor 414, or the fingerprint sensor 414 identifies the identity of the user according to the collected fingerprint. Upon recognizing that the user's identity is a trusted identity, the processor 401 authorizes the user to have relevant sensitive operations including unlocking the screen, viewing encrypted information, downloading software, paying, and changing settings, etc. The fingerprint sensor 414 may be disposed on the front, back, or side of the terminal 400. When a physical key or vendor Logo is provided on the terminal 400, the fingerprint sensor 414 may be integrated with the physical key or vendor Logo.
The optical sensor 415 is used to collect the ambient light intensity. In one embodiment, the processor 401 may control the display brightness of the touch display screen 405 based on the ambient light intensity collected by the optical sensor 415. Specifically, when the ambient light intensity is high, the display brightness of the touch display screen 405 is increased; when the ambient light intensity is low, the display brightness of the touch display screen 405 is turned down. In another embodiment, the processor 401 may also dynamically adjust the shooting parameters of the camera assembly 406 according to the ambient light intensity collected by the optical sensor 415.
A proximity sensor 416, also known as a distance sensor, is typically disposed on the front panel of the terminal 400. The proximity sensor 416 is used to collect the distance between the user and the front surface of the terminal 400. In one embodiment, when the proximity sensor 416 detects that the distance between the user and the front surface of the terminal 400 gradually decreases, the processor 401 controls the touch display screen 405 to switch from the bright screen state to the dark screen state; when the proximity sensor 416 detects that the distance gradually increases, the processor 401 controls the touch display screen 405 to switch from the dark screen state to the bright screen state.
Those skilled in the art will appreciate that the configuration shown in fig. 4 is not intended to be limiting of terminal 400 and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components may be used.
Fig. 5 is a schematic structural diagram of a server according to an exemplary embodiment. The server 500 may vary considerably in configuration and performance, and may include one or more processors (CPUs) 501 and one or more memories 502, where the memory 502 stores at least one instruction that is loaded and executed by the processor 501 to implement the methods provided by the above method embodiments. Of course, the server may also have components such as a wired or wireless network interface, a keyboard, and an input/output interface for input and output, and may further include other components for implementing device functions, which are not described here again.
The server 500 may be configured to perform the steps performed by the processing device in the image segmentation method.
In an exemplary embodiment, there is also provided a non-transitory computer readable storage medium having instructions therein, which when executed by a processor of a processing device, enable the processing device to perform a method of image segmentation, the method comprising:
acquiring a trained pixel classification model, wherein the pixel classification model is used for determining a classification identifier of each pixel in any image, the classification identifier comprises a first identifier and a second identifier, the first identifier is used for indicating that the corresponding pixel belongs to a foreground region, and the second identifier is used for indicating that the corresponding pixel belongs to a background region;
inputting a target image to be processed into a pixel classification model, and determining a classification identifier of each pixel in the target image based on the pixel classification model;
and determining a foreground region and a background region of the target image according to the classification identifier of each pixel in the target image.
In an exemplary embodiment, there is also provided a computer program product, instructions of which, when executed by a processor of a processing apparatus, enable the processing apparatus to perform a method of image segmentation, the method comprising:
acquiring a trained pixel classification model, wherein the pixel classification model is used for determining a classification identifier of each pixel in any image, the classification identifier comprises a first identifier and a second identifier, the first identifier is used for indicating that the corresponding pixel belongs to a foreground region, and the second identifier is used for indicating that the corresponding pixel belongs to a background region;
inputting a target image to be processed into a pixel classification model, and determining a classification identifier of each pixel in the target image based on the pixel classification model;
and determining a foreground region and a background region of the target image according to the classification identifier of each pixel in the target image.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.