Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments that a person skilled in the art can derive from the embodiments given herein without creative effort fall within the protection scope of the present invention.
In video monitoring, a large number of face images need to be classified so that the face images of the same person are grouped into a single category.
Referring to fig. 1, which is a schematic flow chart of a first embodiment of the image clustering method according to the present invention, the image clustering method of the present embodiment includes the following steps.
S11, acquiring a plurality of pieces of image data.
In a specific embodiment, the image data may be acquired in real time. Specifically, a large number of cameras capture images in real time, feature processing is performed on the captured images to obtain the image data, and the image data is stored in a preset database. A preset number of pieces of image data are then retrieved from the database periodically. The period may be preset, for example 1 h, 12 h, or 24 h, and is not limited herein. The preset number may likewise be preset and may be set with reference to the computing capacity of the calculation unit. When the database holds at least the preset number of pieces of image data, the preset number of pieces is retrieved; when it holds fewer, all of the image data in the database is retrieved. Image data that has been retrieved is marked in the database as already obtained, so that it is not retrieved again the next time.
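As a non-limiting illustration, the batch retrieval described above may be sketched as follows. The in-memory list standing in for the preset database, the record layout, the "fetched" flag, and the function name fetch_batch are assumptions made for this sketch and are not part of the described embodiment.

```python
from typing import Dict, List


def fetch_batch(database: List[Dict], preset_number: int) -> List[Dict]:
    """Return up to preset_number records that were not retrieved before.

    Each returned record is marked as already obtained, so that a later
    call does not return it again.
    """
    pending = [rec for rec in database if not rec.get("fetched", False)]
    batch = pending[:preset_number]  # all pending records if fewer exist
    for rec in batch:
        rec["fetched"] = True  # hypothetical "already obtained" identification
    return batch


# Hypothetical database holding five pieces of image data.
db = [{"id": i} for i in range(5)]
first = fetch_batch(db, 3)   # the preset number of records is available
second = fetch_batch(db, 3)  # fewer than the preset number remain
```

In a real deployment the call would be triggered by a timer with the preset period, and the flag would be a column or field in the database.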
S12, clustering the image data to obtain at least one image cluster.
The plurality of pieces of image data are clustered to obtain at least one image cluster. Specifically, an image cluster is a set of pieces of image data whose similarity reaches a certain threshold.
Referring to fig. 2, which shows the sub-steps of step S12 in the first embodiment of the image clustering method according to the present invention, the clustering includes the following steps:
S121, calculating the similarity between each piece of image data and the other pieces of image data.
The similarity is calculated between each piece of image data and every other piece of image data. For n pieces of image data, n × (n-1) similarity results are obtained. For example, if there are five pieces of image data A, B, C, D and E, the similarities A-B, A-C, A-D, A-E, B-A, B-C and so on are obtained in sequence, up to E-D, giving 5 × 4 = 20 similarity results.
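As a non-limiting illustration, the pairwise calculation of step S121 may be sketched as follows. The embodiment does not specify the similarity measure; cosine similarity over feature vectors, the two-dimensional feature values, and the function names are assumptions made for this sketch.

```python
import math
from itertools import permutations
from typing import Dict, List, Tuple


def cosine_similarity(u: List[float], v: List[float]) -> float:
    """Cosine similarity of two feature vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pairwise_similarities(
    features: Dict[str, List[float]]
) -> Dict[Tuple[str, str], float]:
    """Similarity of every ordered pair (i, j) with i != j: n * (n - 1) results."""
    return {
        (i, j): cosine_similarity(features[i], features[j])
        for i, j in permutations(features, 2)
    }


# Hypothetical feature vectors for the five pieces of image data.
features = {
    "A": [1.0, 0.0],
    "B": [0.9, 0.1],
    "C": [0.7, 0.7],
    "D": [0.0, 1.0],
    "E": [0.1, 0.9],
}
sims = pairwise_similarities(features)
```

Because a measure such as cosine similarity is symmetric, an implementation could compute only the n × (n-1) / 2 unordered pairs and reuse each result in both directions.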
S122, taking the other image data whose similarity is greater than the similarity threshold as the similar image set of each piece of image data.
Specifically, a similarity judgment is performed for each piece of image data: every other piece of image data whose similarity to that piece is greater than the similarity threshold is placed in the similar image set of that piece.
For example, among the five pieces of image data A, B, C, D and E, suppose the similarities A-B, B-A, B-C, C-B, D-E and E-D are greater than the similarity threshold. Then the similar image set of image data A is (B), the similar image set of image data B is (A, C), and so on; the similar image set of image data E is (D).
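As a non-limiting illustration, the construction of the similar image sets in step S122 may be sketched as follows for the five-image example. The similarity values 0.9, the threshold 0.8, and the function name similar_sets are assumptions made for this sketch.

```python
from typing import Dict, List, Tuple


def similar_sets(
    ids: List[str],
    sims: Dict[Tuple[str, str], float],
    threshold: float,
) -> Dict[str, List[str]]:
    """For each image, list the other images whose similarity exceeds the threshold."""
    return {
        i: [j for j in ids if j != i and sims.get((i, j), 0.0) > threshold]
        for i in ids
    }


ids = ["A", "B", "C", "D", "E"]
# Pairs assumed to exceed the threshold; every other pair defaults to 0.0.
high = [("A", "B"), ("B", "A"), ("B", "C"), ("C", "B"), ("D", "E"), ("E", "D")]
sims = {pair: 0.9 for pair in high}
neighbors = similar_sets(ids, sims, threshold=0.8)
```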
S123, determining in sequence whether each piece of image data has been classified into an image cluster.
Each piece of image data is then examined in sequence to judge whether it has already been classified into an image cluster. For example, the pieces of image data may be given sequence identifiers and examined in that order: the five pieces A, B, C, D and E are marked 1 to 5 and then examined in the order 1 to 5. Other traversal methods may also be used, provided that each piece of image data is examined exactly once.
S124, if the image data has not been classified into an image cluster, taking the image data and its similar image set as a new image cluster.
Taking the five pieces of image data A, B, C, D and E as an example, A is examined first. Because no clustering has been performed yet, A does not belong to any image cluster, so A and the similar image set of A are taken as a new image cluster (A, (B)).
Likewise, when D is examined, D has not been classified into any existing image cluster, so (D, (E)) is taken as a new image cluster.
S125, if the image data has been classified into an image cluster, classifying its similar image set into that image cluster.
Taking the five pieces of image data A, B, C, D and E as an example, when the image cluster (A, (B)) already exists and B is examined, B is already included in the image cluster (A, (B)), so the similar image set of B, namely (A, C), is added to that image cluster, giving the image cluster (A, (B), (A, C)).
In a specific embodiment, deduplication is performed on each image cluster so that each piece of image data is stored only once in the cluster. For example, after deduplication the image cluster (A, (B), (A, C)) becomes (A, B, C).
In the above manner, each piece of image data is examined in turn, so that every piece of image data is classified into an image cluster.
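As a non-limiting illustration, steps S123 to S125 together with the deduplication may be sketched as follows. The function name cluster_images is hypothetical. The embodiment does not state what happens when a member of a similar image set already belongs to a different image cluster; this sketch assumes that the first assignment is kept.

```python
from typing import Dict, List


def cluster_images(ids: List[str], neighbors: Dict[str, List[str]]) -> List[List[str]]:
    """Examine each image once, in order, and grow clusters via similar image sets."""
    clusters: List[List[str]] = []
    assigned: Dict[str, int] = {}  # image id -> index of its cluster

    for image in ids:  # S123: sequential traversal, each image exactly once
        if image not in assigned:  # S124: start a new cluster
            clusters.append([image])
            assigned[image] = len(clusters) - 1
        idx = assigned[image]
        # S124/S125: add the similar image set to the image's cluster.
        for other in neighbors.get(image, []):
            # Assumption: an image keeps its first cluster. Skipping images
            # that are already assigned also performs the deduplication.
            if other not in assigned:
                clusters[idx].append(other)
                assigned[other] = idx
    return clusters


ids = ["A", "B", "C", "D", "E"]
neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B"], "D": ["E"], "E": ["D"]}
clusters = cluster_images(ids, neighbors)
```

In this run, C joins the cluster of A through B even though C is not in the similar image set of A.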
S13, acquiring a representative image of the image cluster, wherein the representative image is the image data with the highest quality value in the image cluster.
After at least one image cluster is obtained, a representative image is acquired for each image cluster. The representative image of an image cluster is the image data with the highest quality value in that cluster.
In another embodiment, the representative image may instead be the cluster center of the image cluster, that is, the first piece of image data classified into the image cluster. For the image cluster (A, (B)) in step S124, A is the cluster center of the image cluster and therefore its representative image.
The representative image may be a single piece of image data or a set of several pieces of image data, which is not limited herein.
The quality value is a weighted sum of the occlusion coefficient, the blur coefficient, the illumination coefficient, and the three-dimensional angles of the image data, and may be calculated as follows:
f=occlusion*k1+blur*k2+illumination*k3+Pitch*k4+Roll*k5+Yaw*k6;
wherein f is the quality value of the image data, occlusion is the occlusion coefficient, blur is the blur coefficient, illumination is the illumination coefficient, Pitch is the pitch angle in the three-dimensional angle, in the range [-90 (up), 90 (down)], Roll is the in-plane rotation angle in the three-dimensional angle, in the range [-180 (counterclockwise), 180 (clockwise)], and Yaw is the left-right rotation angle in the three-dimensional angle, in the range [-90 (left), 90 (right)]. k1 to k6 are weighting coefficients, which may be preset according to the specific situation.
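As a non-limiting illustration, the quality value and the selection of the representative image may be sketched as follows. The class FaceAttributes, the function names, and all numeric weights and attribute values are assumptions made for this sketch; the embodiment leaves the values and signs of k1 to k6 to be preset according to the specific situation.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass
class FaceAttributes:
    occlusion: float     # occlusion coefficient
    blur: float          # blur coefficient
    illumination: float  # illumination coefficient
    pitch: float         # degrees, [-90 (up), 90 (down)]
    roll: float          # degrees, [-180 (counterclockwise), 180 (clockwise)]
    yaw: float           # degrees, [-90 (left), 90 (right)]


def quality_value(a: FaceAttributes, k: Sequence[float]) -> float:
    """f = occlusion*k1 + blur*k2 + illumination*k3 + Pitch*k4 + Roll*k5 + Yaw*k6."""
    k1, k2, k3, k4, k5, k6 = k
    return (a.occlusion * k1 + a.blur * k2 + a.illumination * k3
            + a.pitch * k4 + a.roll * k5 + a.yaw * k6)


def representative_image(cluster: Dict[str, FaceAttributes], k: Sequence[float]) -> str:
    """Return the id of the image data with the highest quality value."""
    return max(cluster, key=lambda image_id: quality_value(cluster[image_id], k))


# Hypothetical weights and attribute values, chosen only to exercise the formula.
weights = (1.0, 1.0, 1.0, 0.0, 0.0, 0.0)
cluster = {
    "A": FaceAttributes(1.0, 1.0, 1.0, 10.0, 0.0, 5.0),
    "B": FaceAttributes(2.0, 2.0, 2.0, 0.0, 0.0, 0.0),
}
best = representative_image(cluster, weights)
```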
S14, determining the set center in the image set library that matches the representative image.
Whether a set center in the image set library matches the representative image is determined. Specifically, the similarity between the representative image and each set center in the image set library is calculated, and a set center is determined to match the representative image if the similarity is greater than a similarity threshold.
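As a non-limiting illustration, the matching of step S14 may be sketched as follows. Cosine similarity, the feature vectors, the threshold 0.8, and the function name find_matching_center are assumptions made for this sketch. The embodiment does not state which set center is chosen when several exceed the threshold; this sketch assumes the most similar one.

```python
import math
from typing import Dict, List, Optional


def cosine_similarity(u: List[float], v: List[float]) -> float:
    """Cosine similarity of two feature vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def find_matching_center(
    rep_feature: List[float],
    centers: Dict[str, List[float]],
    threshold: float,
) -> Optional[str]:
    """Return the id of the best set center above the threshold, or None."""
    best_id, best_sim = None, threshold
    for center_id, feature in centers.items():
        sim = cosine_similarity(rep_feature, feature)
        if sim > best_sim:  # strictly greater than the similarity threshold
            best_id, best_sim = center_id, sim
    return best_id


# Hypothetical set centers in the image set library.
centers = {"set_1": [1.0, 0.0], "set_2": [0.0, 1.0]}
match = find_matching_center([0.9, 0.1], centers, threshold=0.8)
no_match = find_matching_center([0.7, 0.7], centers, threshold=0.8)
empty_library = find_matching_center([0.9, 0.1], {}, threshold=0.8)
```

An empty library yields no match, which corresponds to the first batch described for step S15.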
S15, storing the image cluster in the image set library according to the determination result.
In one embodiment, if no set center in the image set library matches the representative image, the image cluster is saved to the image set library as a new image set.
Specifically, the image set library is a collection of a plurality of image sets.
In a specific embodiment, the image set library is initially empty. The representative images of the image clusters obtained from the first batch through steps S11 to S13 are compared with the set centers of the image set library. Because the library is empty at this time, no set center matching a representative image can be found, so each image cluster is saved as a new image set in the image set library, with its representative image as the set center.
When the image clusters obtained through steps S11 to S13 in a subsequent batch are processed, the image set library already contains some image sets and set centers. The representative image of each image cluster is then compared in sequence with the set centers in the image set library, and whether the similarity is greater than a similarity threshold is determined. This similarity threshold may be the same as or different from the similarity threshold used for clustering above, which is not limited herein.
If the similarity to every set center is smaller than the threshold, no set center in the image set library matches the representative image; the image cluster is then saved as a new image set in the image set library, with the representative image as its set center.
In another embodiment, if a set center in the image set library matches the representative image, the image cluster is saved to the image set corresponding to the matching set center.
That is, if the image set library contains an image set whose set center has a similarity to the representative image of the image cluster greater than the similarity threshold, the image cluster is stored in that image set.
Specifically, when the image cluster is stored in the image set corresponding to the matching set center, the quality value of the representative image of the image cluster is compared with the quality value of the set center of that image set. The calculation of the quality value has been described above and is not repeated here.
If the quality value of the representative image is greater than that of the set center, the representative image becomes the set center of the image set after the image cluster is added, so that the set center is the image data of the best quality in the whole image set.
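As a non-limiting illustration, the storing of step S15, including the update of the set center, may be sketched as follows. The list-of-dictionaries layout of the image set library, the precomputed "quality" field, and the function name store_cluster are assumptions made for this sketch; the index of the matching image set is taken as an input, with None meaning that no set center matched.

```python
from typing import Dict, List, Optional


def store_cluster(
    library: List[Dict],
    cluster: List[Dict],
    matched_index: Optional[int],
) -> int:
    """Store a cluster in the library and return the index of its image set."""
    # Representative image: the image data with the highest quality value.
    rep = max(cluster, key=lambda img: img["quality"])

    if matched_index is None:
        # No matching set center: the cluster becomes a new image set.
        library.append({"center": rep, "members": list(cluster)})
        return len(library) - 1

    image_set = library[matched_index]
    image_set["members"].extend(cluster)
    # Replace the set center only if the representative image is of higher quality.
    if rep["quality"] > image_set["center"]["quality"]:
        image_set["center"] = rep
    return matched_index


library: List[Dict] = []  # the image set library starts empty
first = store_cluster(
    library, [{"id": "A", "quality": 0.6}, {"id": "B", "quality": 0.4}], None
)
center_after_first = library[0]["center"]["id"]
store_cluster(library, [{"id": "C", "quality": 0.9}], first)
center_after_second = library[0]["center"]["id"]
store_cluster(library, [{"id": "D", "quality": 0.1}], first)
center_after_third = library[0]["center"]["id"]
```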
In the above embodiment, the acquired image data is processed periodically, and primary clustering and secondary clustering are performed on it in sequence. In primary clustering, the image data is pre-clustered so that image data whose similarity is greater than a similarity threshold forms an image cluster, and the image data with the highest quality value in the image cluster is determined as the representative image. Because primary clustering examines each piece of image data in turn and adds its similar image set to its cluster, image data can be placed in the same cluster even when a direct comparison would fail. Taking the five pieces of image data A, B, C, D and E as an example, suppose the similarity between A and B is greater than the similarity threshold and the similarity between B and C is greater than the similarity threshold, but the similarity between A and C, compared directly, is smaller than the similarity threshold. In the above manner, C and A are still stored in the same image cluster. Similar image data is thus stored in one cluster as far as possible, which greatly reduces errors.
In secondary clustering, the representative image is compared with the set centers of the image set library, and the image data is saved accordingly, so that image data whose similarity reaches a certain threshold is stored in the same image set in the image set library. Because the number of image sets in the image set library is generally large, only the representative images are compared after primary clustering. This greatly reduces the amount of calculation, so that batches of image data can be classified rapidly and stored in the image set library.
Furthermore, compared with the prior art, the method provided by the application is suitable for systems with a large number of cameras, and optimizes the classification and storage of image data by establishing an image set library connected to each camera.
As shown in fig. 3, the present application further provides an apparatus 300 for determining clusters of images. The apparatus 300 includes an acquiring module 31, a processing module 32, a confirming module 33, and a storing module 34. The acquiring module 31 is configured to acquire a plurality of pieces of image data; the processing module 32 is configured to cluster the plurality of pieces of image data to obtain at least one image cluster and to acquire a representative image of each image cluster, where the representative image is the image data with the highest quality value in the image cluster; the confirming module 33 is configured to determine a set center in the image set library that matches the representative image; and the storing module 34 is configured to store the image cluster in the image set library according to the determination result. The specific steps of the method have been described in the above embodiments and are not repeated here.
The image cluster determination method is generally implemented by an image cluster determining device, so the invention also provides such a device. Referring to fig. 4, which is a schematic structural diagram of an embodiment of the image cluster determining device of the present invention, the image cluster determining device 100 of the present embodiment includes a processor 12 and a memory 11. The memory 11 stores a computer program to be executed by the processor 12 to implement the steps of the image cluster determination method described above.
The logical processes of the above image cluster determination method are embodied as a computer program. If the computer program is sold or used as a stand-alone software product, it may be stored in a computer storage medium, and the present invention therefore also provides a computer storage medium. Referring to fig. 5, which is a schematic structural diagram of a computer storage medium 200 according to an embodiment of the present invention, the computer storage medium 200 stores a computer program 21 that, when executed by a processor, implements the image cluster determination method described above.
The computer storage medium 200 may be a medium that can store a computer program, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk. It may also be a server that stores the computer program; the server may send the stored computer program to another device for running, or may run the stored computer program itself. From a physical point of view, the computer storage medium 200 may be a combination of a plurality of entities, for example, a plurality of servers, a server plus a memory, or a memory plus a removable hard disk.
The above description is only an embodiment of the present invention and is not intended to limit its scope. All equivalent structures or equivalent process modifications made using the contents of this specification and the drawings, whether applied directly or indirectly in other related technical fields, are likewise included in the scope of protection of the present invention.