Disclosure of Invention
The invention aims to provide an image composition quality evaluation method, an image composition quality evaluation device, electronic equipment and a computer readable storage medium, and aims to overcome the defects of the existing image quality evaluation method.
In a first aspect, an embodiment of the present invention provides an image composition quality evaluation method, including: 101: acquiring an integrity score and a position score of at least one specific target in an image; 102: evaluating the composition quality of the image according to the integrity score and the position score of the at least one specific target.
In an optimized scheme of this embodiment, step 101 is: acquiring an integrity score, a position score and a picture proportion score of at least one specific target in an image; and step 102 is: evaluating the composition quality of the image according to the integrity score, the position score and the picture proportion score of the at least one specific target.
In a specific scheme of this embodiment, the integrity score in step 101 is obtained by inputting the image to be detected into a specific target integrity detection model.
Further, the method for constructing the specific target integrity detection model includes: 201: acquiring a plurality of images containing a specific target and a plurality of images not containing the specific target; 202: labeling and scoring each image according to the integrity of the specific target in the image; 203: inputting the labeled images into a pre-constructed convolutional neural network for training to generate a trained specific target integrity detection model.
Further, in a specific scheme of this embodiment, step 202 includes: labeling images containing the complete specific target as a first class and setting a first score; labeling images containing a main part of the specific target as a second class and setting a second score; labeling images containing other parts of the specific target as a third class and setting a third score; and labeling images that do not contain the specific target as a fourth class and setting a fourth score.
Further, to facilitate the design of the specific target integrity detection model, step 203 further includes scaling all images to the same specific size.
Further, in a specific aspect of this embodiment, the position score is associated with a distance from the center position of the specific target to the center position of the image, that is, the position score can be obtained by calculating the distance from the center position of the specific target to the center position of the image.
In a second aspect, an embodiment of the present invention further provides a method for evaluating quality of a video composition, where the method includes: 301: acquiring each video frame of a video to be evaluated; 302: carrying out composition quality evaluation on each video frame according to the image composition quality evaluation method; 303: and evaluating the composition quality of the video according to the composition quality evaluation result of each video frame.
In a third aspect, an embodiment of the present invention further provides an image composition quality evaluation apparatus, including: the acquisition module is used for acquiring the integrity score and the position score of at least one specific target in the image; and the composition quality evaluation module is used for evaluating the composition quality of the image according to the integrity score and the position score of at least one specific target.
In a fourth aspect, the present invention provides an electronic device, comprising: a memory storing a computer program; a processor for executing the computer program to implement the image composition quality evaluation method described above.
In a fifth aspect, the present invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the above-described image composition quality evaluation method.
Compared with the prior art, the present invention evaluates the integrity and the positional reasonableness of a specific target in an image. It thereby solves the problem that traditional aesthetic evaluation based on low-level visual features does not fully consider factors such as the integrity and positional reasonableness of objects in the image, and it facilitates filtering out images with poor composition quality in automatic editing.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more clearly apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
In order to explain the technical means of the present invention, the following description will be given by way of specific examples.
Example 1
As shown in fig. 1, the present embodiment discloses an image composition quality evaluation method, which includes the following steps.
101: acquiring an integrity score and a position score of at least one specific target in the image.
The specific target in this embodiment is the target against which the composition quality of the image is evaluated. It may be a category of objects (e.g., birds or fish) or a single individual (e.g., a particular person or dog). Consequently, when an image contains several objects or persons, different choices of specific target yield different integrity scores and position scores.
The specific target in this embodiment may be obtained in various ways. For example, when the method is used on a camera or a mobile phone, a user may select one or more specific targets on a related interface, or one or more specific targets may be detected in the image through a standard detection model.
In this embodiment, the integrity score can be set for the image according to four cases: an image containing the complete specific target is given a first score, an image containing the main part of the specific target is given a second score, an image containing other parts of the specific target is given a third score, and an image not containing any part of the specific target is given a fourth score, wherein the first score to the fourth score are different.
In this embodiment, the position score is related to the distance from the center position (e.g., the geometric center) of the specific target to the center position of the image: the larger the distance, the lower the score, and the smaller the distance, the higher the score. The position score may be discrete, e.g., a first score is set when the distance from the center position of the specific target to the center position of the image is less than a first distance, a second score is set when the distance lies between the first distance and a second distance, and so on. The position score may also be generated based on a statistical model, for example by using a 2D Gaussian model to model the position score density of the specific target, the expression of the 2D Gaussian model being as follows:
$$p(\mathbf{x}) = \frac{1}{(2\pi)^{n/2}\,|\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(\mathbf{x}-\mu)^{T}\,\Sigma^{-1}\,(\mathbf{x}-\mu)\right)$$
where n is the dimension, n = 2 in this embodiment, μ is the coordinate of the center position of the picture or of a position near the center of the picture, and the attenuation rate of the position score can be adjusted by adjusting the covariance matrix Σ. For a human body whose rectangular frame is (x1, y1, x2, y2), where (x1, y1) are the coordinates of the upper left corner of the rectangular frame and (x2, y2) are the coordinates of the lower right corner of the rectangular frame, the position score $S_{position}$ of the human body frame can be obtained by integrating p(x) over the rectangular frame in which the human body is located:
$$S_{position} = \int_{x_1}^{x_2}\int_{y_1}^{y_2} p(x, y)\,dy\,dx$$
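As a non-limiting illustration, the integration of the 2D Gaussian over the rectangular frame can be sketched in Python as follows. The sketch assumes that μ is the exact image center and that Σ is diagonal, so that the double integral factorizes into two one-dimensional integrals; the function names and the `sigma_frac` parameter are hypothetical and are not part of this embodiment.

```python
import math


def _normal_cdf(value, mean, std):
    """Cumulative distribution function of a 1D normal distribution."""
    return 0.5 * (1.0 + math.erf((value - mean) / (std * math.sqrt(2.0))))


def gaussian_position_score(box, image_size, sigma_frac=(0.25, 0.25)):
    """Integrate an axis-aligned 2D Gaussian over the box (x1, y1, x2, y2).

    Assumptions (illustrative only): mu is the image center and Sigma is
    diagonal, with standard deviations sigma_frac * (width, height).
    """
    x1, y1, x2, y2 = box
    width, height = image_size
    mu_x, mu_y = width / 2.0, height / 2.0
    std_x, std_y = sigma_frac[0] * width, sigma_frac[1] * height
    # With a diagonal covariance the density is a product of two 1D normals,
    # so the integral over the rectangle is a product of two CDF differences.
    mass_x = _normal_cdf(x2, mu_x, std_x) - _normal_cdf(x1, mu_x, std_x)
    mass_y = _normal_cdf(y2, mu_y, std_y) - _normal_cdf(y1, mu_y, std_y)
    return mass_x * mass_y
```

Under these assumptions a frame near the image center receives a higher score than a frame of the same size near a corner, and a smaller `sigma_frac` makes the score decay faster with distance from the center. A non-diagonal Σ would require numerical integration instead.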
The integrity scoring process in this embodiment is described below, taking a person as the specific target. The person integrity score in this embodiment is obtained by inputting the image to be detected into the person integrity detection model. As shown in fig. 2, the person integrity detection model in this embodiment is constructed as follows.
201: a plurality of images containing the specific object and a plurality of images not containing the specific object are acquired.
Specifically, a plurality of images containing persons and a plurality of images not containing persons are obtained. It should be noted that the persons in the images should be sufficiently diverse to ensure the robustness of the model; for example, the images should cover enough variation in the size, skin color, clothing, position and number of persons.
202: labeling and scoring each image according to the integrity of the specific target in the image.
In this embodiment, the images are divided into four classes according to the integrity of the persons in the images. Specifically, images containing a complete person are labeled as a first class and a first score is set; images containing the main part of a person are labeled as a second class and a second score is set; images containing other parts of a person are labeled as a third class and a third score is set; and images not containing a person are labeled as a fourth class and a fourth score is set, wherein each score can be the same or different. The details are shown in the following table.
| Category | Description | Score |
| First class | The person's limbs are complete in the image, with no missing or occluded limbs | 10 |
| Second class | The person's head and torso are complete in the image, but the lower limbs are incomplete | 5 |
| Third class | The person's head, torso or the like is partially occluded or missing in the image | -10 |
| Fourth class | The image does not contain a person | 0 |
It should be noted that the score settings in the table are only one specific scheme of this embodiment. This scheme favors images containing a complete person and filters out images that contain no person or in which the person's head or torso is missing, which matches people's aesthetic preferences for the composition of persons.
203: inputting the labeled images into a pre-constructed convolutional neural network for training to generate a trained specific target integrity detection model.
Before the labeled images are input into the pre-constructed convolutional neural network for training, all the images are preferably scaled to the same specific size to facilitate the design of the person integrity detection model. Specifically, the input image is analyzed by a deep convolutional neural network: the input image is scaled to 224×224 to obtain the image representation $\mathbb{R}^{224\times 224}$, and the network outputs the human body integrity class. The network model parameters are trained with a gradient descent algorithm until a training completion condition is reached, for example when a preset number of training iterations is reached, or when the training accuracy or the loss value reaches the expected value, thereby completing the training of the person integrity detection model. During training in this embodiment, the network parameters can be optimized by calculating the cross-entropy loss between the output probability distribution of the deep neural network and the true class.
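For illustration only, the cross-entropy loss between the output probability distribution and the true class can be written out in plain Python for a single image. The sketch assumes that the network produces four raw outputs (logits), one per integrity class, which are converted to probabilities by a softmax; the function names are hypothetical, and an actual implementation would normally rely on a deep learning framework.

```python
import math


def softmax(logits):
    """Convert raw network outputs into a probability distribution."""
    peak = max(logits)
    # Subtracting the maximum keeps exp() numerically stable.
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, true_class):
    """Cross-entropy loss for one sample whose true class index is given."""
    probs = softmax(logits)
    return -math.log(probs[true_class])
```

The loss is small when the network assigns a high probability to the labeled class and large otherwise, which is what the gradient descent algorithm minimizes over the training images.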
After the person integrity detection model is constructed, the image to be detected can be input into the model to obtain the probability distribution over the integrity classes output by the model; the class with the maximum probability is selected and converted into a score. Specifically, the image to be detected is scaled to 224×224 by a bilinear interpolation scaling algorithm and then processed by the deep neural network to obtain a class probability distribution vector in $\mathbb{R}^{4}$ representing the integrity of the human body in the image. The output of the deep neural network is used directly as the prediction probability vector of the picture, the index with the maximum probability is selected as the class prediction result, and the person integrity score $S_{category}$ is then obtained.
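The conversion from the predicted probability vector to the integrity score can be sketched as follows. The scores are those of the table in this embodiment; the assumption that class index 0 corresponds to the first class, index 1 to the second class, and so on, as well as the names `CLASS_SCORES` and `integrity_score`, are illustrative.

```python
# Scores from the table in this embodiment, indexed by class
# (assumed order: 0 = first class, ..., 3 = fourth class).
CLASS_SCORES = (10, 5, -10, 0)


def integrity_score(probabilities, class_scores=CLASS_SCORES):
    """Pick the most probable class and return the score assigned to it."""
    if len(probabilities) != len(class_scores):
        raise ValueError("expected one probability per integrity class")
    predicted = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return class_scores[predicted]
```

For example, a probability vector whose largest entry is at the second position is mapped to the second class and therefore to a score of 5.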
Likewise, the position score can also be obtained by inputting the image into a relevant model or by computer calculation.
102: the composition quality of the image is evaluated based on the integrity score and the location score of the at least one specific target.
As shown in fig. 3, step 102 comprises the following two substeps in this embodiment.
1021: weighting and summing the integrity score and the position score of each specific target to obtain the composition quality score of the specific target.
The integrity score and the position score of a person in the image are obtained in step 101, and the two scores are then weighted and summed to obtain the composition quality score of the person. The composition quality score $S_i$ of the i-th person is calculated as: $S_i = \alpha \cdot S_{\text{category-}i} + \beta \cdot S_{\text{position-}i}$, where α and β are weight parameters with α + β = 1 that can be adjusted according to actual conditions, $S_{\text{category-}i}$ is the integrity score of the i-th person, and $S_{\text{position-}i}$ is the position score of the i-th person.
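A minimal sketch of this weighted sum is given below. Only α is passed in, and β is derived as 1 − α so that the constraint α + β = 1 always holds; the default value of 0.5 and the function name are assumptions made for illustration.

```python
def target_composition_score(s_category, s_position, alpha=0.5):
    """Weighted sum of the integrity score and the position score of one target."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    beta = 1.0 - alpha  # enforces alpha + beta = 1
    return alpha * s_category + beta * s_position
```

Since the integrity scores in the table range from -10 to 10 while an integrated Gaussian position score lies between 0 and 1, it may be desirable in practice to bring the two scores to a comparable scale before weighting; this embodiment leaves that choice to the adjustment of α and β.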
1022: the composition quality of the image is evaluated based on an average score or a weighted score of the composition quality scores of each particular target in the image.
When composition quality evaluation is performed on only one person, only the image composition quality score of that person is calculated; when composition quality evaluation is performed on a plurality of persons, an average score or a weighted score of the image composition quality scores of the plurality of persons can be calculated as the evaluation basis. In a specific scheme of this embodiment, the image composition quality evaluation score S for n persons is:
$$S = \frac{1}{n}\sum_{i=1}^{n} S_i$$
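The averaging over several persons can be sketched as follows. The plain mean corresponds to the average score of this embodiment; the optional `weights` argument illustrates the weighted score, and normalizing by the sum of the weights is an assumption of this sketch rather than something this embodiment specifies.

```python
def image_composition_score(target_scores, weights=None):
    """Average (or weighted average) of the per-target composition quality scores."""
    if not target_scores:
        raise ValueError("at least one target score is required")
    if weights is None:
        return sum(target_scores) / len(target_scores)
    if len(weights) != len(target_scores):
        raise ValueError("expected one weight per target score")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Assumed normalization: divide by the weight sum so equal weights
    # reduce to the plain mean.
    return sum(w * s for w, s in zip(weights, target_scores)) / total
```

With a single person the function simply returns that person's composition quality score, which matches the single-person case described above.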
It is to be noted that when composition quality evaluations are performed for a plurality of specific targets, the plurality of specific targets may belong to different categories, such as a person and a pet, respectively.
Example 2
As shown in fig. 2, the present embodiment discloses another image composition quality evaluation method.
The manner of obtaining the integrity score and the position score of the specific target in the image in this embodiment is substantially the same as in embodiment 1, except that, in addition to obtaining the integrity score and the position score of the specific target in the image, this embodiment also obtains a picture proportion score $S_{proportion}$ of the specific target in the image. It should be noted that the picture proportion score depends on the specific target; that is, different specific targets have different picture proportion scores. For example, for a person, a picture proportion in the image of 1/6 to 1/3 is more suitable, and the picture proportion score is higher in this case; for another example, for a bird, a picture proportion in the image of 1/24 to 1/12 is more suitable, and the picture proportion score is higher in this case. A specific implementation is as follows: according to the category and the area of the specific target detected by the detector, the ratio or difference between the actual picture proportion and the ideal picture proportion of the specific target in the image is calculated to generate the picture proportion score.
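One possible ratio-based picture proportion score is sketched below. The ideal ranges for a person and a bird are taken from this embodiment; the mapping to a score in [0, 1] (1 inside the ideal range, otherwise the ratio between the actual proportion and the nearest bound) and all names are assumptions made for illustration.

```python
# Ideal picture proportion ranges from this embodiment, keyed by category.
IDEAL_PROPORTION = {"person": (1 / 6, 1 / 3), "bird": (1 / 24, 1 / 12)}


def picture_proportion_score(box, image_size, category, ideal=IDEAL_PROPORTION):
    """Score how close the target's share of the picture area is to its ideal range."""
    x1, y1, x2, y2 = box
    width, height = image_size
    # Actual proportion: area of the detected frame divided by the image area.
    actual = max(0.0, x2 - x1) * max(0.0, y2 - y1) / (width * height)
    low, high = ideal[category]
    if actual <= 0.0:
        return 0.0
    if low <= actual <= high:
        return 1.0
    # Assumed ratio rule: smaller/larger of the actual proportion and nearest bound.
    return actual / low if actual < low else high / actual
```

A person filling a quarter of the picture thus receives the full score, whereas a person filling the whole picture or only a small fraction of it receives a proportionally lower score.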
Similarly, taking a person as the specific target, the composition quality score $S_i$ of the i-th person is calculated as: $S_i = \alpha \cdot S_{\text{category-}i} + \beta \cdot S_{\text{position-}i} + \gamma \cdot S_{\text{proportion-}i}$, where α, β and γ are weight parameters with α + β + γ = 1 that can be adjusted according to actual conditions, $S_{\text{category-}i}$ is the integrity score of the i-th person, $S_{\text{position-}i}$ is the position score of the i-th person, and $S_{\text{proportion-}i}$ is the picture proportion score of the i-th person.
Example 3
As shown in fig. 3, the present embodiment discloses a video composition quality evaluation method, which includes the following steps.
301: acquiring each video frame of the video to be evaluated.
Video frames of a video captured by a capture device, including but not limited to a camera, a smartphone, etc., are acquired.
302: carrying out composition quality evaluation on each video frame according to an image composition quality evaluation method.
Composition quality evaluation is carried out on each video frame according to the composition quality evaluation method in embodiment 1 or embodiment 2 to obtain a composition quality score of each video frame.
303: evaluating the composition quality of the video according to the composition quality evaluation result of each video frame.
Through step 302, the composition quality score of each video frame can be obtained, and the composition quality of the video is evaluated according to the average score of the composition quality scores of the video frames.
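Steps 301 to 303 can be sketched as follows. The `score_frame` argument is a hypothetical callable standing in for the image composition quality evaluation method of embodiment 1 or embodiment 2; how frames are decoded from the video is outside the scope of this sketch.

```python
def video_composition_score(frames, score_frame):
    """Average the per-frame composition quality scores of a video.

    `frames` is any iterable of video frames and `score_frame` is a
    hypothetical callable returning the composition quality score of one frame.
    """
    scores = [score_frame(frame) for frame in frames]
    if not scores:
        raise ValueError("the video contains no frames")
    return sum(scores) / len(scores)
```

For instance, passing a list of already computed frame scores together with the identity function `lambda s: s` returns their mean.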
Example 4
As shown in fig. 4, the present embodiment discloses an image composition quality evaluation apparatus, including: an acquisition module for acquiring the integrity score and the position score of at least one specific target in the image; and a composition quality evaluation module for evaluating the composition quality of the image according to the integrity score and the position score of the at least one specific target. For the integrity score and the position score, reference may be made to the related description in embodiment 1. In an optimized scheme of this embodiment, the acquisition module may further be configured to acquire a picture proportion score of the at least one specific target in the image, and the composition quality evaluation module is configured to evaluate the composition quality of the image according to the integrity score, the position score and the picture proportion score of the at least one specific target.
Example 5
As shown in fig. 5, the present embodiment discloses an electronic device, including: a memory storing a computer program; a processor for executing the computer program to implement the image composition quality evaluation method in embodiment 1 or embodiment 2, or to implement the video composition quality evaluation method in embodiment 3. The electronic device in this embodiment may specifically be a camera or a mobile phone.
Example 6
Provided in this embodiment is a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the image composition quality evaluation method in embodiment 1 or embodiment 2, or implements the video composition quality evaluation method in embodiment 3.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by a program instructing associated hardware, and the storage medium may be a computer-readable storage medium, such as a ferroelectric Memory (FRAM), a Read Only Memory (ROM), a Programmable Read Only Memory (PROM), an Erasable Programmable Read Only Memory (EPROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a flash Memory, a magnetic surface Memory, an optical disc, or a Compact disc Read Only Memory (CD-ROM), etc.; or may be various devices including one or any combination of the above memories.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.