Method for searching videos by using pictures
Technical Field
The invention relates to the technical field of video searching, in particular to a method for searching videos by using pictures.
Background
Picture-based video search draws on classical pattern recognition and deep learning, and aims to balance precision and speed when searching massive video collections by fusing the two. However, current picture-based video search has several shortcomings. First, computation is slow: each search takes minutes or even hours, during which the user cannot operate the software and can only wait for the computation to finish. Second, resource utilization is poor: the multi-core capability of modern CPUs is not exploited, so only one processing core is used no matter how many are available, while the deep learning techniques in particular require dedicated GPU resources to accelerate the learning process. Third, scalability is insufficient: pattern recognition and deep learning require massive training data to be prepared in advance, and the training results adapt poorly because they depend strongly on the choice of samples and suit only a limited range of scenes.
Disclosure of Invention
The invention aims to provide a method for searching videos by using pictures, so as to solve the problems described in the Background.
In order to achieve the above object, the present invention provides a method for searching videos by using pictures, comprising a data index creation stage and a video search stage, wherein the data index creation stage includes the following steps:
S1.1, reading video frame pictures;
S1.2, calculating the digital fingerprint of each picture frame;
S1.3, dividing the fingerprint into a plurality of segments of 16 bits each;
S1.4, looping over all the segments, and placing the fingerprint into the index directory corresponding to each segment;
S1.5, adding the fingerprint data to an index file;
the video search stage includes the following steps:
S2.1, reading the video screenshot data to be searched;
S2.2, calculating the screenshot fingerprint;
S2.3, obtaining, in a loop, the data indexes under the different fingerprint segments;
S2.4, searching for fingerprints through the data index;
S2.5, obtaining the video information and the corresponding frame through the found fingerprint.
Preferably, in S1.1, the video frame pictures are read as follows: the compressed video and audio data are restored to uncompressed video, and uncompressed video color data are obtained by decoding.
Preferably, in S1.2, the digital fingerprint of the picture frame is calculated with a perceptual Hash algorithm: the original image is first converted to grayscale and reduced to 8x8 pixels, and the resulting 64 bits of binary data are stored in an array as the 64-bit image fingerprint.
Preferably, in S1.4, the fingerprint is placed into the index directories corresponding to its segments as follows: four directories are created in the file system, one for each of the 4 segments of a fingerprint, and are named by the segment numbers 1, 2, 3 and 4; under each segment-number directory, 2^16 = 65536 Hash directories are created, named 0 to 65535 after the possible values of a 16-bit segment; 10 files are created under each Hash directory, and the complete 64-bit Hash values are stored in these files.
Preferably, in S1.4, all segments are traversed as follows: for each segment, the complete 64-bit Hash value is stored in a file through the uniquely determined write path "/segment number/current segment Hash directory/Hash file".
Preferably, in S2.2, the screenshot fingerprint is calculated as follows: a digital fingerprint is obtained using the perceptual Hash algorithm.
Preferably, in S2.3, the data indexes under the different fingerprint segments are obtained in a loop as follows: the fingerprint is divided into 4 segments, and segment indexing is then performed starting from the first segment.
Preferably, in S2.4, the fingerprint is searched through the data index as follows: all files under the directory "/segment number/current segment Hash value/" are read, the entire file content is traversed, the Hamming distance to each stored fingerprint is calculated, and the fingerprint with the minimum Hamming distance is returned.
Preferably, in S2.5, the video information and the corresponding frame are obtained through the found fingerprint as follows: after the fingerprint is identified, the database is queried to obtain the video and the frame number within that video.
Preferably, the segment indexing method is as follows: for each of the 4 segments, fingerprints that share the same value of the current segment are stored under the directory corresponding to the segment number, in the files corresponding to that segment value.
Compared with the prior art, the invention has the following beneficial effects: by segmenting the frame image fingerprints, the method effectively narrows the search range and improves search speed; at the same time, distributed processing of the multiple segments locates the final result quickly, so that the target frame, the specific video containing it, and its frame number can be found quickly and accurately.
Drawings
FIG. 1 is a schematic diagram of a frame image fingerprint segmentation of the present invention;
FIG. 2 is a flow chart of the multi-node real-time search of the present invention;
FIG. 3 is a schematic diagram of video frame fingerprint computation and segmentation of the present invention;
FIG. 4 is a diagram of a video frame fingerprint segment storage format of the present invention;
FIG. 5 is a diagram of the search process for searching videos by using pictures according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Evidently, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on these embodiments without inventive effort fall within the scope of the invention.
Referring to FIGS. 1-5, the present invention provides the following technical solution:
a method for searching videos by using pictures, comprising a data index creation stage and a video search stage, wherein the data index creation stage includes the following steps:
S1.1, reading video frame pictures;
S1.2, calculating the digital fingerprint of each picture frame;
S1.3, dividing the fingerprint into a plurality of segments of 16 bits each;
S1.4, looping over all the segments, and placing the fingerprint into the index directory corresponding to each segment;
S1.5, adding the fingerprint data to an index file;
the video search stage includes the following steps:
S2.1, reading the video screenshot data to be searched;
S2.2, calculating the screenshot fingerprint;
S2.3, obtaining, in a loop, the data indexes under the different fingerprint segments;
S2.4, searching for fingerprints through the data index;
S2.5, obtaining the video information and the corresponding frame through the found fingerprint.
In this embodiment, in S1.1, the video frame pictures are read as follows: the compressed video and audio data are restored to uncompressed raw video and audio data. Audio compression coding standards include AAC, MP3, AC-3 and the like, and video compression coding standards include H.264, MPEG2, VC-1 and the like. Decoding yields uncompressed video color data such as YUV420P or RGB, and uncompressed audio data such as PCM.
Further, in S1.2, the digital fingerprint of the picture frame is calculated with a perceptual Hash algorithm: the original image is first converted to grayscale and reduced to 8x8 pixels, and the resulting 64 bits of binary data are stored in an array as the 64-bit image fingerprint.
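The fingerprint calculation can be illustrated with a minimal Python sketch. It is not the patented implementation: the text does not say how each of the 64 bits is derived, so the sketch assumes the common average-hash variant, in which a bit is 1 when the reduced pixel is at least as bright as the mean. The function names, the block-averaging reduction and the Rec. 601 gray weights are likewise assumptions made for illustration.

```python
def to_gray(pixel):
    """Convert an (R, G, B) pixel to a gray level (Rec. 601 luma weights, assumed)."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


def shrink(gray, size=8):
    """Reduce a 2-D gray image to size x size by averaging equal blocks.

    Simplification: height and width are assumed to be multiples of `size`.
    """
    block_h = len(gray) // size
    block_w = len(gray[0]) // size
    small = []
    for by in range(size):
        row = []
        for bx in range(size):
            block = [gray[y][x]
                     for y in range(by * block_h, (by + 1) * block_h)
                     for x in range(bx * block_w, (bx + 1) * block_w)]
            row.append(sum(block) / len(block))
        small.append(row)
    return small


def image_fingerprint(rgb_image):
    """Return a 64-bit integer fingerprint for rows of (R, G, B) tuples."""
    gray = [[to_gray(p) for p in row] for row in rgb_image]
    flat = [v for row in shrink(gray) for v in row]
    mean = sum(flat) / len(flat)
    fingerprint = 0
    for value in flat:
        # Assumption: bit is 1 when the pixel is at least as bright as the mean.
        fingerprint = (fingerprint << 1) | (1 if value >= mean else 0)
    return fingerprint


# A 16x16 test image whose left half is white and right half is black.
WHITE, BLACK = (255, 255, 255), (0, 0, 0)
half_image = [[WHITE] * 8 + [BLACK] * 8 for _ in range(16)]
print(f"{image_fingerprint(half_image):016x}")
```

Each row of the reduced image contributes one byte, so the left-white, right-black test image yields the repeating bit pattern 11110000 in every row.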
Specifically, in S1.3, the fingerprint is divided into 16-bit segments for the following reason: the video frame to be searched and the picture used for searching have the same or mostly the same content (ignoring watermarks, station logos and the like), so it is assumed that their fingerprints differ in at most 3 bits. The 64-bit fingerprint is therefore divided into 4 segments of 16 bits each; since at most 3 bits differ, at most 3 segments can be affected, and at least one of the 4 segments is identical in both fingerprints.
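The segmentation and the fault-tolerance argument can be sketched in Python as follows. The function names and the choice to number segments from the most significant bits are assumptions for illustration; the text only fixes the segment width (16 bits) and the segment count (4).

```python
SEGMENT_BITS = 16
SEGMENT_COUNT = 4


def split_fingerprint(fingerprint):
    """Split a 64-bit fingerprint into four 16-bit segments.

    Assumption: segment 1 holds the most significant 16 bits.
    """
    mask = (1 << SEGMENT_BITS) - 1
    return [(fingerprint >> (SEGMENT_BITS * (SEGMENT_COUNT - 1 - i))) & mask
            for i in range(SEGMENT_COUNT)]


def matching_segments(a, b):
    """Return the 1-based numbers of the segments in which a and b agree."""
    pairs = zip(split_fingerprint(a), split_fingerprint(b))
    return [number for number, (x, y) in enumerate(pairs, start=1) if x == y]


stored = 0x0123456789ABCDEF
# Flip 3 bits, each in a different segment (segments 1, 2 and 4).
query = stored ^ (1 << 60) ^ (1 << 40) ^ (1 << 5)
print(matching_segments(stored, query))
```

Even in this worst case, where the 3 differing bits fall into 3 different segments, one segment (here segment 3) still matches exactly and can serve as the index key.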
In S1.4, the fingerprint is placed into the index directories corresponding to its segments as follows: four directories are created in the file system, one for each of the 4 segments of a fingerprint, and are named by the segment numbers 1, 2, 3 and 4. Under each segment-number directory, 2^16 = 65536 Hash directories are created, named 0 to 65535 after the possible values of a 16-bit segment. 10 files are created under each Hash directory, and the complete 64-bit Hash values are stored in these files. As an estimate: assuming 800,000 one-hour videos, about 72 billion frames are generated in total; each Hash directory then needs to store 72 billion / 65536 ≈ 1.1 million fingerprints (taken as about 1.2 million in this estimate), and each file stores about 1.2 million / 10 = 120,000 Hash values.
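The capacity estimate can be checked with a few lines of Python. The frame rate of 25 frames per second is an assumption: it is not stated in the text, but it is the rate at which 800,000 one-hour videos produce the quoted total of 72 billion frames.

```python
VIDEOS = 800_000
SECONDS_PER_VIDEO = 3600
FRAMES_PER_SECOND = 25          # assumption inferred from the 72 billion total
HASH_DIRECTORIES = 2 ** 16      # one directory per 16-bit segment value
FILES_PER_DIRECTORY = 10

total_frames = VIDEOS * SECONDS_PER_VIDEO * FRAMES_PER_SECOND
# Average load, assuming fingerprints spread evenly over the directories.
per_directory = total_frames // HASH_DIRECTORIES
per_file = per_directory // FILES_PER_DIRECTORY

print(total_frames)    # 72000000000
print(per_directory)   # 1098632
print(per_file)        # 109863
```

The exact quotient is roughly 1.1 million fingerprints per Hash directory and 110,000 per file, the same order of magnitude as the rounded figures in the estimate. Real fingerprints are unlikely to be spread evenly, so these numbers are averages rather than bounds.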
In addition, in S1.4, all segments are traversed as follows: for each segment, the complete 64-bit Hash value is stored in a file through the uniquely determined write path "/segment number/current segment Hash directory/Hash file". The rationale is as follows. The scheme assumes that the video frame to be searched and the picture used for searching have the same or mostly the same content (ignoring watermarks, station logos and the like), and that their fingerprints differ in at most 3 bits; that is, when the fingerprint is divided into 4 segments, at least one segment is identical. The scheme can therefore examine all Hash values under "/segment number/current segment Hash directory/Hash file" and output the one with the smallest Hamming distance to the query as the nearest similar Hash value.
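The write path can be sketched in Python as follows. Several details are assumptions, since the text does not specify them: directories are named by the decimal segment value, the 10 files are named 0.txt to 9.txt, a fingerprint is assigned to a file by its value modulo 10, and each fingerprint is written as one hexadecimal line.

```python
import os

FILES_PER_DIRECTORY = 10


def split_fingerprint(fingerprint):
    """Split a 64-bit fingerprint into four 16-bit segments, high bits first."""
    return [(fingerprint >> shift) & 0xFFFF for shift in (48, 32, 16, 0)]


def index_path(root, segment_number, segment_value, fingerprint):
    """Build "/segment number/current segment Hash directory/Hash file"."""
    # Assumption: the file is chosen by the fingerprint value modulo 10.
    file_number = fingerprint % FILES_PER_DIRECTORY
    return os.path.join(root, str(segment_number), str(segment_value),
                        f"{file_number}.txt")


def add_to_index(root, fingerprint):
    """Append the full fingerprint under each of its four segment directories."""
    paths = []
    for number, value in enumerate(split_fingerprint(fingerprint), start=1):
        path = index_path(root, number, value, fingerprint)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "a", encoding="ascii") as handle:
            handle.write(f"{fingerprint:016x}\n")
        paths.append(path)
    return paths
```

Each fingerprint is therefore written four times, once per segment, which trades storage for the ability to find it through whichever segment happens to match the query.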
It should be noted that the video search stage proceeds as follows. A digital fingerprint of the screenshot is obtained using the perceptual Hash algorithm (as described for S1.2), and the fingerprint is divided into 4 segments (as described for S1.3). Segment indexing starts from the first segment: all files under the directory "/segment number/current segment Hash value/" are read. Since these files store all fingerprints whose current segment has the same Hash value, it is only necessary to traverse the entire file content, calculate the Hamming distances, and return the fingerprint with the smallest Hamming distance, recorded as "A1". The remaining three segments are traversed in the same way to find the fingerprints "A2", "A3" and "A4" with the smallest Hamming distances. These four fingerprints are then compared to find the one with the smallest Hamming distance to the query, for example "A2", and the video frame corresponding to "A2" is determined to be the frame being sought. Because the traversed files contain a large number of Hash values, the Hash values are deliberately stored across multiple files, so that multiple processing nodes can traverse multiple files at the same time and the search time is reduced.
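The search procedure can be sketched in Python as follows. To keep the example self-contained, an in-memory dictionary keyed by (segment number, segment value) stands in for the directory tree, and the scan is sequential rather than spread over multiple processing nodes; both are simplifications, and all names are hypothetical.

```python
def hamming_distance(a, b):
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def split_fingerprint(fingerprint):
    """Split a 64-bit fingerprint into four 16-bit segments, high bits first."""
    return [(fingerprint >> shift) & 0xFFFF for shift in (48, 32, 16, 0)]


def build_index(fingerprints):
    """Group fingerprints by (segment number, segment value)."""
    index = {}
    for fingerprint in fingerprints:
        for number, value in enumerate(split_fingerprint(fingerprint), start=1):
            index.setdefault((number, value), []).append(fingerprint)
    return index


def search(index, query):
    """Return the stored fingerprint closest to the query, or None."""
    best_per_segment = []  # plays the role of "A1" to "A4"
    for number, value in enumerate(split_fingerprint(query), start=1):
        candidates = index.get((number, value), [])
        if candidates:
            best_per_segment.append(
                min(candidates, key=lambda fp: hamming_distance(fp, query)))
    if not best_per_segment:
        return None
    return min(best_per_segment, key=lambda fp: hamming_distance(fp, query))


stored = [0x0123456789ABCDEF, 0xFFFF0000FFFF0000, 0x0123000000000000]
index = build_index(stored)
# Query differs from stored[0] in 3 bits, spread over segments 1, 2 and 4.
query = stored[0] ^ (1 << 60) ^ (1 << 40) ^ (1 << 5)
print(hex(search(index, query)))
```

Only segment 3 of the query matches an indexed value, so a single candidate list is scanned and stored[0] is returned at Hamming distance 3. A query that shares no segment with any stored fingerprint returns None, which reflects the scheme's assumption that matches differ in at most 3 bits.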
Still further, the video information and the corresponding frame are obtained through the found fingerprint as follows: when fingerprint data is entered, the association between the fingerprint Hash value, the video, and the frame number within that video is stored in the database, so that once the fingerprint has been identified, a database query alone yields the video and the frame number.
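The association between fingerprints and frames can be sketched with Python's built-in sqlite3 module. The text does not name a database, so SQLite, the table layout and the function names are assumptions. The fingerprint is stored as hexadecimal text because SQLite integers are signed 64-bit values and cannot hold every unsigned 64-bit fingerprint.

```python
import sqlite3


def create_database():
    """Create an in-memory database with a hypothetical frames table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE frames (fingerprint TEXT, video TEXT, frame_number INTEGER)")
    conn.execute("CREATE INDEX idx_fingerprint ON frames (fingerprint)")
    return conn


def record_frame(conn, fingerprint, video, frame_number):
    """Store the association made when fingerprint data is entered."""
    conn.execute("INSERT INTO frames VALUES (?, ?, ?)",
                 (f"{fingerprint:016x}", video, frame_number))


def lookup_frames(conn, fingerprint):
    """Return (video, frame_number) pairs recorded for a fingerprint."""
    rows = conn.execute(
        "SELECT video, frame_number FROM frames WHERE fingerprint = ? "
        "ORDER BY video, frame_number",
        (f"{fingerprint:016x}",))
    return rows.fetchall()


conn = create_database()
record_frame(conn, 0xF0F0F0F0F0F0F0F0, "example.mp4", 42)
print(lookup_frames(conn, 0xF0F0F0F0F0F0F0F0))
```

The lookup returns a list because identical frames, such as black frames or repeated title cards, can share one fingerprint across several videos.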
Specifically, the segment indexing method is as follows: for each of the 4 segments, fingerprints that share the same value of the current segment are stored under the directory corresponding to the segment number, in the files corresponding to that segment value.
The foregoing shows and describes the basic principles, principal features and advantages of the invention. Those skilled in the art will understand that the invention is not limited to the above embodiments, which, together with the description, merely illustrate preferred embodiments of the invention; various changes and modifications may be made without departing from the spirit and scope of the invention, and all such changes fall within the scope of the invention as claimed. The scope of the invention is defined by the appended claims and their equivalents.