CN114155456A - Repeated video recognition method and related device - Google Patents

Repeated video recognition method and related device

Info

Publication number
CN114155456A
Authority
CN
China
Prior art keywords
multimedia data
frame
video file
matching
video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111298828.6A
Other languages
Chinese (zh)
Inventor
王国彬
牟锟伦
卢铄波
林虔赐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tubatu Group Co Ltd
Original Assignee
Tubatu Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tubatu Group Co Ltd
Priority to CN202111298828.6A
Publication of CN114155456A


Abstract

The present application relates to the field of data processing and provides a repeated video identification method and a related device for identifying repeated videos among a huge number of videos. The repeated video identification method mainly includes: separating a first multimedia data stream of a video file to be identified; extracting a first multimedia data feature set from the first multimedia data stream, the first multimedia data feature set including several first multimedia data frames; matching the first multimedia data frames with second multimedia data frames of a comparison video file to obtain a matching sequence pair set, the matching sequence pair set including several matching sequence pairs; judging whether the proportion occupied by the matching sequence pair set exceeds a preset threshold; if the proportion exceeds the preset threshold, determining that the video file to be identified and the comparison video file are repeated; and if the proportion does not exceed the preset threshold, determining that the video file to be identified and the comparison video file are not repeated.

Figure 202111298828

Description

Duplicate video identification method and related device
Technical Field
The present application relates to the field of data processing, and in particular, to a method and a related apparatus for identifying duplicate videos.
Background
With the rapid development of the internet, video has become one of its most popular applications. Videos spread easily online and carry richer information than other kinds of content, and with the spread of smartphones users can shoot, edit and upload video clips at any time. For example, the YouTube website has tens of millions of new videos every day. For a video platform, more repeated videos inevitably appear as the number of videos grows, so a keyword search returns many repeated hits, which degrades both the user experience and the efficiency of video search.
Given the huge number of videos on a video platform, identifying the repeated videos among them is of great significance.
Disclosure of Invention
The application aims to provide a repeated video identification method and a related device that can identify repeated videos among a huge number of videos.
The application is realized as follows:
A first aspect of the application provides a repeated video identification method, which includes:
separating a first multimedia data stream of a video file to be identified;
extracting a first multimedia data feature set in the first multimedia data stream, wherein the first multimedia data feature set comprises a plurality of first multimedia data frames;
matching the first multimedia data frame with a second multimedia data frame of a comparison video file to obtain a matching sequence pair set, wherein the matching sequence pair set comprises a plurality of matching sequence pairs;
judging whether the proportion occupied by the matching sequence pair set exceeds a preset threshold;
if the proportion exceeds the preset threshold, determining that the video file to be identified and the comparison video file are repeated;
and if the proportion does not exceed the preset threshold, determining that the video file to be identified and the comparison video file are not repeated.
Optionally, the determining whether the proportion occupied by the matching sequence pair set exceeds a preset threshold includes:
comparing the duration of the video file to be identified with the duration of the comparison video file to obtain a target video file, wherein the target video file is whichever of the two has the shorter duration, or either of the two when their durations are the same;
and judging whether the proportion of the duration of the target video file occupied by the duration of the matching sequence pair set exceeds the preset threshold.
Optionally, after obtaining the matching sequence pair set, before determining whether the proportion of the matching sequence pair set exceeds a preset threshold, the method further includes:
obtaining a valid matching frame sequence pair from a plurality of matching sequence pairs, wherein the matching sequence pair set comprises the valid matching frame sequence pair.
Optionally, the obtaining of valid matching frame sequence pairs of the matching sequence pairs includes:
respectively extracting a first frame number of the first multimedia data frame and a second frame number of the second multimedia data frame in each pair of the matching frame sequence pairs;
judging whether the first frame numbers and the second frame numbers of adjacent matching frame sequence pairs are in incremental correspondence or not;
and if the first frame number and the second frame number of the adjacent matching frame sequence pair are in incremental correspondence, determining that the first multimedia data frame and the second multimedia data frame are effective matching frame sequence pairs.
Optionally, after extracting a first frame number of the first multimedia data frame and a second frame number of the second multimedia data frame in each pair of the matching frame sequence pairs, respectively, the method further includes:
extracting the label of the first frame number of the first multimedia data frame in each pair of the matching frame sequence pairs as a coordinate value of a first coordinate axis;
extracting the label of the second frame number of the second multimedia data frame in each pair of the matched frame sequence pairs as a coordinate value of a second coordinate axis;
and establishing a target coordinate system by using the first coordinate axis and the second coordinate axis.
Optionally, after establishing the target coordinate system by using the first coordinate axis and the second coordinate axis, the method further includes:
if the first frame number of the first multimedia data frame in each pair of the matching frame sequence pairs is effectively matched with the second frame number of the second multimedia data frame, forming an identifier at a target coordinate point on the target coordinate system, wherein the target coordinate point takes the first frame number and the second frame number as coordinate values.
Optionally, the first multimedia data stream includes: a first video stream and/or a first audio stream;
the first multimedia data frame comprises: a first video frame and/or a first audio frame;
the second multimedia data frame comprises: a second video frame and/or a second audio frame.
Optionally, the first video frame includes a home decoration domain feature.
A second aspect of the present application provides a duplicate video identification apparatus, including:
a separation unit, configured to separate a first multimedia data stream of a video file to be identified;
an extracting unit, configured to extract a first multimedia data feature set in the first multimedia data stream, where the first multimedia data feature set includes a plurality of first multimedia data frames;
the matching unit is used for matching the first multimedia data frame with a second multimedia data frame of a comparison video file to obtain a matching sequence pair set, and the matching sequence pair set comprises a plurality of matching sequence pairs;
a judging unit, configured to judge whether the proportion occupied by the matching sequence pair set exceeds a preset threshold;
a first determining unit, configured to determine that the video file to be identified and the comparison video file are repeated if the proportion exceeds the preset threshold;
and a second determining unit, configured to determine that the video file to be identified and the comparison video file are not repeated if the proportion does not exceed the preset threshold.
Optionally, when determining whether the proportion occupied by the matching sequence pair set exceeds a preset threshold, the determining unit is specifically configured to:
comparing the duration of the video file to be identified with the duration of the comparison video file to obtain a target video file, wherein the target video file is whichever of the two has the shorter duration, or either of the two when their durations are the same;
and judging whether the proportion of the duration of the target video file occupied by the duration of the matching sequence pair set exceeds the preset threshold.
Optionally, the apparatus further comprises:
an obtaining unit, configured to obtain an effective matching frame sequence pair from among a plurality of matching frame sequence pairs, where the matching frame sequence pair set includes the effective matching frame sequence pair.
Optionally, when obtaining valid matching frame sequence pairs in a plurality of matching sequence pairs, the obtaining unit is specifically configured to:
respectively extracting a first frame number of the first multimedia data frame and a second frame number of the second multimedia data frame in each pair of the matching frame sequence pairs;
judging whether the first frame numbers and the second frame numbers of adjacent matching frame sequence pairs are in incremental correspondence or not;
and if the first frame number and the second frame number of the adjacent matching frame sequence pair are in incremental correspondence, determining that the first multimedia data frame and the second multimedia data frame are effective matching frame sequence pairs.
Optionally, the apparatus further comprises:
the extracting unit is further used for extracting the label of the first frame number of the first multimedia data frame in each pair of the matching frame sequence pairs as a coordinate value of a first coordinate axis;
the extracting unit is further used for extracting the label of the second frame number of the second multimedia data frame in each pair of the matching frame sequence pairs as a coordinate value of a second coordinate axis;
and the establishing unit is used for establishing a target coordinate system by using the first coordinate axis and the second coordinate axis.
Optionally, the apparatus further comprises:
a forming unit, configured to form an identifier at a target coordinate point on the target coordinate system if the first frame number of the first multimedia data frame in each pair of the matching frame sequence pairs is effectively matched with the second frame number of the second multimedia data frame, where the target coordinate point uses the first frame number and the second frame number as coordinate values.
Optionally, the first multimedia data stream includes: a first video stream and/or a first audio stream;
the first multimedia data frame comprises: a first video frame and/or a first audio frame;
the second multimedia data frame comprises: a second video frame and/or a second audio frame.
Optionally, the first video frame includes a home decoration domain feature.
A third aspect of the present application provides a computer device comprising:
the system comprises a processor, a memory, a bus, an input/output interface and a wireless network interface;
the processor is connected with the memory, the input/output interface and the wireless network interface through a bus;
the memory stores a program;
the processor, when executing the program stored in the memory, implements the duplicate video identification method of any of the preceding first aspects.
A fourth aspect of the present application provides a computer-readable storage medium having stored therein instructions which, when executed on a computer, cause the computer to perform the duplicate video identification method of any one of the preceding first aspects.
A fifth aspect of the present application provides a computer program product which, when executed on a computer, causes the computer to perform the duplicate video identification method of any one of the preceding first aspects.
According to the technical scheme, the embodiment of the application has the following advantages:
In the repeated video identification method, a first multimedia data stream of a video file to be identified is separated, and a first multimedia data feature set is extracted from it; the feature set comprises a plurality of first multimedia data frames and serves as the video feature set of the video file to be identified. The first multimedia data frames are then matched with second multimedia data frames of the comparison video file to obtain a matching sequence pair set comprising a plurality of matching sequence pairs; this set represents the part of the video file to be identified that repeats the comparison video file. Whether the proportion occupied by the matching sequence pair set exceeds a preset threshold is then judged: if it does, the video file to be identified and the comparison video file are determined to be repeated; if it does not, they are determined not to be repeated. Because the preset threshold is the criterion for this judgment, its value can be adjusted according to actual needs. The method can therefore identify repeated videos among a large number of videos, which improves the user experience on the one hand and the efficiency of video search on the other.
Drawings
FIG. 1 is a schematic flow chart diagram illustrating an embodiment of a repetitive video recognition method according to the present application;
FIG. 2 is a schematic flow chart illustrating another embodiment of a repetitive video recognition method according to the present application;
FIG. 3 is a schematic structural diagram of an embodiment of a duplicate video recognition apparatus according to the present application;
FIG. 4 is a schematic structural diagram of another embodiment of a duplicate video recognition apparatus according to the present application;
FIG. 5 is a schematic structural diagram of an embodiment of a computer apparatus of the present application;
FIG. 6 is a schematic diagram of an embodiment of a target coordinate system established in the repeated video recognition method of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
It will be understood that when an element is referred to as being "secured to" or "disposed on" another element, it can be directly on the other element or intervening elements may also be present. When an element is referred to as being "connected to" another element, it can be directly connected to the other element or intervening elements may also be present.
It should be noted that the terms of orientation such as left, right, up, down, etc. in the present embodiment are only relative concepts or reference to the normal use state of the product, and should not be considered as limiting.
The repeated video identification method can be deployed in any system that needs to identify repeated videos. For example, it can be deployed on a server of a large video website to identify repeated videos among the large number of videos the server manages, giving administrators a clear overall comparison result on which to base decisions.
Referring to FIG. 1, an embodiment of the repeated video identification method of the present application includes:
101. Separate the first multimedia data stream of the video file to be identified.
This step first separates the first multimedia data stream of the video file to be identified. The first multimedia data stream contains the multimedia data of that file, such as a video stream and an audio stream. Separating the streams makes it easier to compare and analyze each type of multimedia data in the subsequent steps.
102. Extract a first multimedia data feature set from the first multimedia data stream, wherein the first multimedia data feature set comprises a plurality of first multimedia data frames.
After the first multimedia data stream of the video file to be identified has been separated in step 101, this step extracts a first multimedia data feature set from the first multimedia data stream, where the feature set includes a number of first multimedia data frames. For example, the first multimedia data frames may be frames showing furniture (a sofa, a television cabinet) or building materials (tiles, a ceiling) that reflect features of the home decoration field; the set of these first multimedia data frames is the first multimedia data feature set.
103. Match the first multimedia data frames with second multimedia data frames of the comparison video file to obtain a matching sequence pair set, wherein the matching sequence pair set comprises a plurality of matching sequence pairs.
In this step, the first multimedia data frames from step 102 are matched with the second multimedia data frames of the comparison video file to determine how many multimedia data frames are repeated between the video file to be identified and the comparison video file.
104. Judge whether the proportion occupied by the matching sequence pair set exceeds a preset threshold; if it does, execute step 105; if it does not, execute step 106.
For example, this step judges whether the proportion of the duration of the entire video file to be identified (or of the comparison video file) that is occupied by the duration of the matching sequence pair set from step 103 exceeds the preset threshold. If the proportion exceeds the preset threshold, many first multimedia data frames in the video file to be identified are the same as second multimedia data frames in the comparison video file; if it does not, few (or no) first multimedia data frames are the same as second multimedia data frames of the comparison video file.
105. Determine that the video file to be identified and the comparison video file are repeated.
When step 104 finds that the proportion occupied by the matching sequence pair set exceeds the preset threshold, meaning that many first multimedia data frames in the video file to be identified are the same as second multimedia data frames of the comparison video file, this step determines that the two video files are repeated.
106. Determine that the video file to be identified and the comparison video file are not repeated.
When step 104 finds that the proportion occupied by the matching sequence pair set does not exceed the preset threshold, meaning that few (or no) first multimedia data frames in the video file to be identified are the same as second multimedia data frames of the comparison video file, this step determines that the two video files are not repeated.
Therefore, the repeated video identification method can be deployed in any system that needs to identify repeated videos, for example on a server of a large video website, to identify repeated videos among the large number of videos the server manages. Administrators obtain a clear overall comparison result on which to base decisions, repeated videos are identified accurately among a large number of videos, the user experience is improved, and video search becomes more efficient.
Referring to FIG. 2, another embodiment of the repeated video identification method of the present application includes:
201. Separate the first multimedia data stream of the video file to be identified.
This step is performed in the same way as step 101 in the embodiment of FIG. 1, and the repeated parts are not described again here.
It should be noted that many mature prior-art techniques exist for separating the audio and video in a video file: the audio can be saved separately as an audio stream and the video as a video stream. Both are multimedia data streams, and the first multimedia data stream of this embodiment may be an audio stream and/or a video stream; that is, the first multimedia data stream includes a first video stream and/or a first audio stream.
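As an illustration of one such existing technique, the sketch below builds command lines for the ffmpeg tool that copy the video and audio streams of a file into separate outputs without re-encoding. The application does not name any tool; the choice of ffmpeg, the function name `build_demux_commands`, and the `.mkv`/`.mka` output names are assumptions made here for illustration. The function only builds the argument lists; running them (for example with `subprocess.run`) requires ffmpeg to be installed.

```python
from pathlib import Path


def build_demux_commands(source, out_dir):
    """Return ffmpeg argument lists that split `source` into a video-only
    file and an audio-only file (hypothetical output naming)."""
    stem = Path(source).stem
    out = Path(out_dir)
    return {
        # -an / -sn drop audio and subtitles; -c:v copy avoids re-encoding.
        "video": ["ffmpeg", "-i", source, "-an", "-sn", "-c:v", "copy",
                  str(out / f"{stem}.video.mkv")],
        # -vn / -sn drop video and subtitles; -c:a copy avoids re-encoding.
        "audio": ["ffmpeg", "-i", source, "-vn", "-sn", "-c:a", "copy",
                  str(out / f"{stem}.audio.mka")],
    }
```

Stream copying keeps the separation step cheap, since no frames are decoded until feature extraction.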
202. Extract a first multimedia data feature set from the first multimedia data stream, wherein the first multimedia data feature set comprises a plurality of first multimedia data frames.
This step is performed in the same way as step 102 in the embodiment of FIG. 1, and the repeated parts are not described again here.
It is worth noting that the extraction precision of the first multimedia data feature set directly affects the result of repeated video detection. This step may use a local feature method, namely the scale-invariant feature transform (SIFT), a scale-space-based local feature point extraction and matching algorithm proposed by Lowe. SIFT remains robust under strong noise and under changes in brightness, viewing angle and rotation, and its feature points are regarded as more stable than those of other currently popular feature point extraction algorithms. Each SIFT feature point is described by a 128-dimensional feature vector. A video file to be identified can be regarded as consisting of n multimedia data frames; feature points are computed for each multimedia data frame to obtain the first multimedia data feature set of the first multimedia data stream. The first multimedia data feature set includes a plurality of first multimedia data frames, and the first multimedia data frames include a first video frame and/or a first audio frame.
203. Match the first multimedia data frames with second multimedia data frames of the comparison video file to obtain a matching sequence pair set, wherein the matching sequence pair set comprises a plurality of matching sequence pairs.
This step is performed in the same way as step 103 in the embodiment of FIG. 1, and the repeated parts are not described again here.
For example, let the first multimedia data frames include f11, f12, f13, f14, ..., and let the second multimedia data frames include f21, f22, f23, f24, .... One matching sequence pair is denoted F1(f11, f21), another F2(f12, f22), another F3(f13, f23), and so on, where F1, F2, F3, ... denote the matching sequence pairs with sequence numbers 1, 2, 3, .... The pair (f11, f21) in F1(f11, f21) indicates that the first multimedia data frame with frame number f11 in the video file to be identified matches, that is, repeats, the second multimedia data frame with frame number f21 in the comparison video file. The second multimedia data frame includes a second video frame and/or a second audio frame.
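The application does not spell out how two frames are judged to match. As a minimal sketch, assume that each frame has already been reduced to a single feature vector and that two frames match when the Euclidean distance between their vectors is at most `max_distance`. The one-vector-per-frame representation, the function name `match_frames`, and the default threshold are all illustrative assumptions, not part of the application.

```python
import math


def match_frames(first, second, max_distance=0.5):
    """Pair each frame of the video to be identified with its closest
    frame in the comparison video.

    `first` and `second` map frame numbers to feature vectors (assumed
    representation). Returns (first_frame_number, second_frame_number)
    pairs, i.e. the matching sequence pairs.
    """
    pairs = []
    for n1, v1 in sorted(first.items()):
        best_n2, best_d = None, math.inf
        for n2, v2 in sorted(second.items()):
            d = math.dist(v1, v2)  # Euclidean distance between the vectors
            if d < best_d:
                best_n2, best_d = n2, d
        # Keep the pair only when the nearest frame is close enough.
        if best_n2 is not None and best_d <= max_distance:
            pairs.append((n1, best_n2))
    return pairs
```

This brute-force search is quadratic in the number of frames; a real deployment over many videos would likely need an index structure, which is outside the scope of this sketch.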
204. Obtain the valid matching frame sequence pairs among the plurality of matching sequence pairs, wherein the matching sequence pair set comprises the valid matching frame sequence pairs.
It should be understood that this step defines a repeated video as follows: the duration of the matching sequence pair set of the two video files exceeds the preset threshold, and the matching sequence pairs in the set also occur in the same playing order in both files. Under this scheme, forward playback and reverse playback of the same video file are therefore treated as two different video files. For this reason, this step eliminates the invalid matching frame sequence pairs from the matching sequence pair set and keeps only the valid ones. An invalid matching frame sequence pair is a matching sequence pair that would interfere with deciding whether the videos are repeated. For example, among the matching sequence pairs F1(f11, f21), F2(f12, f22), F3(f13, f23), ..., suppose F2 is instead matched as F2(f12, f29): the first multimedia data frame with frame number f12 in the video file to be identified matches the second multimedia data frame with frame number f29 in the comparison video file. Although the two frames match, their positions in the playing order do not correspond, so it can be concluded that this frame has been edited, or at least moved in the playing order, between the two video files. A match between frames whose playing order has been modified in this way is called an outlier match, and the frames of an outlier match are not regarded as repeated video frames. A match between frames whose playing order has not been modified is called a valid match.
Specifically, this step extracts the first frame number of the first multimedia data frame and the second frame number of the second multimedia data frame in each matching frame sequence pair: for example, f11 and f21 from F1(f11, f21), f12 and f29 from F2(f12, f29) (or, in the other example, f12 and f22 from F2(f12, f22)), f13 and f23 from F3(f13, f23), and so on. It then judges whether the first frame numbers and the second frame numbers of adjacent matching frame sequence pairs are in incremental correspondence. If the first frame numbers f11, f12, f13, ... are increasing and the corresponding second frame numbers f21, f22, f23, ... are also increasing, the first multimedia data frames and the second multimedia data frames form valid matching frame sequence pairs. If instead the first frame numbers f11, f12, f13, ... are increasing but the corresponding second frame numbers f21, f29, f23, ... are not, the matching sequence pair F2(f12, f29) is an invalid, non-repeated matching frame sequence pair and is eliminated in this step.
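One possible way to implement the incremental-correspondence check is to keep the longest chain of pairs in which both frame numbers strictly increase and to treat every other pair as an outlier. The application states only the criterion, not an algorithm; the longest-increasing-subsequence approach and the name `valid_pairs` are assumptions chosen here for illustration.

```python
def valid_pairs(pairs):
    """Keep the largest subset of matching pairs whose first and second
    frame numbers both strictly increase; the rest are outlier matches."""
    ordered = sorted(set(pairs))
    n = len(ordered)
    if n == 0:
        return []
    best_len = [1] * n   # length of the longest valid chain ending at i
    prev = [-1] * n      # predecessor of i in that chain
    for i in range(n):
        for j in range(i):
            if (ordered[j][0] < ordered[i][0]
                    and ordered[j][1] < ordered[i][1]
                    and best_len[j] + 1 > best_len[i]):
                best_len[i] = best_len[j] + 1
                prev[i] = j
    # Walk back from the end of the longest chain.
    i = max(range(n), key=best_len.__getitem__)
    chain = []
    while i != -1:
        chain.append(ordered[i])
        i = prev[i]
    return chain[::-1]
```

For the pairs (1, 1), (2, 9), (3, 3), (4, 4), the pair (2, 9) breaks the increasing order of the second frame numbers and is dropped. For a reversed video such as (1, 4), (2, 3), (3, 2), (4, 1), only a single pair survives, which is consistent with treating reverse playback as a different video.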
205. Establish a target coordinate system, and form identifiers at target coordinate points in the target coordinate system.
To make the result of step 204 more concrete, this step may establish a target coordinate system for all matching sequence pairs in the matching sequence pair set according to a given rule. Specifically, the index of the first frame number of the first multimedia data frame in each matching frame sequence pair is extracted as a coordinate value on the first coordinate axis; for example, the index of f11 in F1(f11, f21) is 1 (the leading "f1" in f11 denotes the video file to be identified), so 1 is the coordinate value on the first coordinate axis. Likewise, the index of the second frame number of the second multimedia data frame in each matching frame sequence pair is extracted as a coordinate value on the second coordinate axis; for example, the index of f21 in F1(f11, f21) is also 1 (the leading "f2" in f21 denotes the comparison video file), so 1 is the coordinate value on the second coordinate axis. A target coordinate system is established from the first and second coordinate axes, and each matching frame sequence pair is a coordinate point in it.
Further, if the first frame number of the first multimedia data frame in a matching frame sequence pair is validly matched with the second frame number of the second multimedia data frame, this step forms an identifier at the target coordinate point in the target coordinate system, the target coordinate point taking the first frame number and the second frame number as its coordinate values (see FIG. 6, where validly matched pairs are marked "+"). If the pair is an outlier match, this step also forms an identifier at its target coordinate point (see FIG. 6, where outlier matches are marked "x"). The coordinate points of valid matches can be fitted to a monotonically increasing straight line, as shown in FIG. 6, whereas the coordinate points of outlier matches form a set of randomly scattered points.
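The coordinate points and the fitted line can be sketched as follows. The function names `plot_points` and `fit_line` are hypothetical, and ordinary least squares is an assumed fitting method; the application says only that the valid points can be fitted to a monotonically increasing straight line.

```python
def plot_points(pairs, valid):
    """Return (x, y, marker) tuples: '+' for valid matches and 'x' for
    outlier matches, following the markers described for FIG. 6."""
    valid_set = set(valid)
    return [(x, y, "+" if (x, y) in valid_set else "x") for x, y in pairs]


def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x
```

A positive slope on the valid points corresponds to the monotonically increasing line of FIG. 6; the fitted intercept reflects a constant frame offset between the two files.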
206. Judging whether the proportion occupied by the matching sequence pair set exceeds a preset threshold; if the proportion exceeds the preset threshold, executing step 207; if the proportion does not exceed the preset threshold, executing step 208.
Specifically, the duration of the video file to be identified is compared with the duration of the comparison video file to obtain a target video file. The target video file is whichever of the two has the shorter duration or, if the durations are equal, either one of them. This step then judges whether the proportion of the target video file's duration that is occupied by the duration of the matching sequence pair set exceeds the preset threshold. In other words, it calculates the similarity between the matching sequence pair set and the multimedia data frames of the target video file; when the two durations differ, the similarity is calculated against the shorter video file. If the proportion exceeds the preset threshold, many first multimedia data frames of the video file to be identified are the same as second multimedia data frames of the comparison video file. If the proportion does not exceed the preset threshold, few (or no) first multimedia data frames of the video file to be identified are the same as second multimedia data frames of the comparison video file.
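The judgment of step 206 can be sketched as follows. This is a minimal example under stated assumptions: durations are given in seconds, the function names are hypothetical, and the threshold of 0.8 is chosen only for illustration, since the application does not fix a value.

```python
def target_duration(duration_to_identify: float, duration_comparison: float) -> float:
    """Duration of the target video file: the shorter of the two (either if equal)."""
    return min(duration_to_identify, duration_comparison)


def exceeds_threshold(
    matched_duration: float,
    duration_to_identify: float,
    duration_comparison: float,
    threshold: float,
) -> bool:
    """Return True if the matched duration's share of the target video exceeds the threshold."""
    target = target_duration(duration_to_identify, duration_comparison)
    if target <= 0:
        raise ValueError("video durations must be positive")
    proportion = matched_duration / target
    return proportion > threshold


# Hypothetical values: 50 s of matched content, videos of 60 s and 120 s.
repeated = exceeds_threshold(50.0, 60.0, 120.0, threshold=0.8)       # 50/60 > 0.8
not_repeated = exceeds_threshold(50.0, 100.0, 120.0, threshold=0.8)  # 50/100 <= 0.8
```

Measuring against the shorter video means a short clip fully contained in a long video is still reported as repeated.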
207. Determining that the video file to be identified is a duplicate of the comparison video file.
When it is determined in step 206 that the proportion occupied by the matching sequence pair set exceeds the preset threshold, many first multimedia data frames of the video file to be identified are the same as second multimedia data frames of the comparison video file, so this step determines that the video file to be identified is a duplicate of the comparison video file.
208. Determining that the video file to be identified is not a duplicate of the comparison video file.
When it is determined in step 206 that the proportion occupied by the matching sequence pair set does not exceed the preset threshold, few (or no) first multimedia data frames of the video file to be identified are the same as second multimedia data frames of the comparison video file, so this step determines that the video file to be identified is not a duplicate of the comparison video file.
Therefore, the repeated video identification method is time-efficient. Because local feature description is used, feature extraction may take more time, but the feature matching process can use the LSH (locality-sensitive hashing) hash statistics method, so matching can be completed within linear time complexity O(m + n). Actual measurement shows that the detection algorithm takes about 1/5 of the original video length, which meets application requirements.
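The hash-based matching mentioned above can be illustrated with the following sketch. It assumes a random-hyperplane variant of LSH over fixed-length frame feature vectors, which is one common choice and not necessarily the one used in the application; all function names, the number of hash bits, and the sample vectors are hypothetical.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def make_hyperplanes(dim: int, bits: int, seed: int = 0) -> List[List[float]]:
    """Draw `bits` random hyperplanes; a fixed seed keeps signatures reproducible."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]


def signature(vec: Vector, planes: List[List[float]]) -> Tuple[int, ...]:
    """One bit per hyperplane: which side of the plane the vector lies on."""
    return tuple(
        1 if sum(p * v for p, v in zip(plane, vec)) >= 0 else 0 for plane in planes
    )


def match_frames(
    first: List[Vector], second: List[Vector], planes: List[List[float]]
) -> List[Tuple[int, int]]:
    """Return candidate (first frame number, second frame number) pairs, 1-based."""
    buckets: Dict[Tuple[int, ...], List[int]] = defaultdict(list)
    for j, vec in enumerate(second, start=1):  # one pass over the n comparison frames
        buckets[signature(vec, planes)].append(j)
    pairs = []
    for i, vec in enumerate(first, start=1):   # one pass over the m frames to identify
        for j in buckets.get(signature(vec, planes), []):
            pairs.append((i, j))
    return pairs


planes = make_hyperplanes(dim=3, bits=8)
first_frames = [[1.0, 0.0, 0.5]]
second_frames = [[0.2, 0.9, 0.1], [1.0, 0.0, 0.5]]
candidate_pairs = match_frames(first_frames, second_frames, planes)
```

Each frame is hashed once, so the hashing work grows with m + n; the number of candidate pairs additionally depends on how many frames share a bucket. Identical feature vectors always fall into the same bucket, while dissimilar vectors may occasionally collide and need a later verification step.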
The above embodiments describe the repeated video identification method of the present application. The repeated video identification apparatus of the present application is described below. Referring to fig. 3, an embodiment of the repeated video identification apparatus includes:
a separation unit 301, configured to separate a first multimedia data stream of a video file to be identified;
an extracting unit 302, configured to extract a first multimedia data feature set in the first multimedia data stream, where the first multimedia data feature set includes a number of first multimedia data frames;
a matching unit 303, configured to match the first multimedia data frame with a second multimedia data frame of a comparison video file to obtain a matching sequence pair set, where the matching sequence pair set includes a plurality of matching sequence pairs;
a determining unit 304, configured to determine whether the proportion occupied by the matching sequence pair set exceeds a preset threshold;
a first determining unit 305, configured to determine that the video file to be identified is a duplicate of the comparison video file if the proportion exceeds the preset threshold;
a second determining unit 306, configured to determine that the video file to be identified is not a duplicate of the comparison video file if the proportion does not exceed the preset threshold.
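The cooperation of the units listed above can be sketched in Python as follows. This is only an illustrative arrangement: the class name, the callable-based interfaces, the stub functions, and the use of frame counts (rather than durations) to compute the proportion are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Frame = Sequence[float]
Pair = Tuple[int, int]


@dataclass
class RepeatedVideoRecognizer:
    separate: Callable[[str], object]                        # role of separation unit 301
    extract: Callable[[object], List[Frame]]                 # role of extracting unit 302
    match: Callable[[List[Frame], List[Frame]], List[Pair]]  # role of matching unit 303
    threshold: float                                         # preset threshold

    def recognize(self, video_path: str, comparison_frames: List[Frame]) -> bool:
        stream = self.separate(video_path)
        frames = self.extract(stream)
        pairs = self.match(frames, comparison_frames)
        shorter = min(len(frames), len(comparison_frames))
        if shorter == 0:
            return False
        # Role of determining unit 304: share of the shorter video that is matched.
        matched = min(len({i for i, _ in pairs}), len({j for _, j in pairs}))
        proportion = matched / shorter
        # Roles of determining units 305 and 306: repeated or not repeated.
        return proportion > self.threshold


# Stub units for demonstration only; real units would decode and describe the video.
recognizer = RepeatedVideoRecognizer(
    separate=lambda path: [[0.1], [0.2], [0.3], [0.4]],
    extract=lambda stream: list(stream),
    match=lambda a, b: [
        (i, j)
        for i, x in enumerate(a, start=1)
        for j, y in enumerate(b, start=1)
        if x == y
    ],
    threshold=0.5,
)
is_repeated = recognizer.recognize("to_identify.mp4", [[0.1], [0.2], [0.3], [0.9]])
is_not_repeated = recognizer.recognize("to_identify.mp4", [[0.9]])
```

Passing the units in as callables mirrors the remark in the description that the division into units is only a logical one and may be realized differently in practice.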
The operation performed by the video recognition apparatus according to the embodiment of the present application is similar to that performed in the embodiment of fig. 1, and is not repeated herein.
Therefore, the repeated video identification method can be deployed in any system that requires repeated video identification, for example in a server of a large video website, where it performs repeated video identification on the large number of videos managed by the server. It yields a clear overall comparison result on which an administrator can base decisions, so that repeated videos can be accurately identified among a large number of videos.
Referring to fig. 4, another embodiment of a duplicate video recognition apparatus includes:
a separating unit 401, configured to separate a first multimedia data stream of a video file to be identified;
an extracting unit 402, configured to extract a first multimedia data feature set in the first multimedia data stream, where the first multimedia data feature set includes a number of first multimedia data frames;
a matching unit 403, configured to match the first multimedia data frame with a second multimedia data frame of a comparison video file to obtain a matching sequence pair set, where the matching sequence pair set includes a plurality of matching sequence pairs;
a determining unit 404, configured to determine whether the proportion occupied by the matching sequence pair set exceeds a preset threshold;
a first determining unit 405, configured to determine that the video file to be identified is a duplicate of the comparison video file if the proportion exceeds the preset threshold;
a second determining unit 406, configured to determine that the video file to be identified is not a duplicate of the comparison video file if the proportion does not exceed the preset threshold.
Optionally, when determining whether the proportion occupied by the matching sequence pair set exceeds a preset threshold, the determining unit 404 is specifically configured to:
compare the duration of the video file to be identified with the duration of the comparison video file to obtain a target video file, where the target video file is whichever of the video file to be identified and the comparison video file has the shorter duration or, if the durations are the same, either one of them;
and judge whether the proportion of the target video file's duration that is occupied by the duration of the matching sequence pair set exceeds the preset threshold.
Optionally, the apparatus further comprises:
an obtaining unit 407, configured to obtain valid matching frame sequence pairs from a plurality of matching sequence pairs, where the matching sequence pair set includes the valid matching frame sequence pairs.
Optionally, when obtaining valid matching frame sequence pairs from a plurality of matching sequence pairs, the obtaining unit 407 is specifically configured to:
respectively extract a first frame number of the first multimedia data frame and a second frame number of the second multimedia data frame in each pair of the matching frame sequence pairs;
judge whether the first frame numbers and the second frame numbers of adjacent matching frame sequence pairs are in incremental correspondence;
and if the first frame numbers and the second frame numbers of adjacent matching frame sequence pairs are in incremental correspondence, determine that the first multimedia data frame and the second multimedia data frame form a valid matching frame sequence pair.
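One possible reading of the incremental correspondence check above is sketched below: the valid matching frame sequence pairs are taken to be the longest chain of pairs in which both the first frame number and the second frame number strictly increase, and every other pair is treated as an outlier match. This interpretation, the function name `valid_pairs`, and the sample data are assumptions for illustration; the application does not prescribe a particular algorithm.

```python
from typing import List, Tuple

Pair = Tuple[int, int]  # (first frame number, second frame number)


def valid_pairs(pairs: List[Pair]) -> List[Pair]:
    """Return the longest chain of pairs whose frame numbers increase together."""
    ordered = sorted(set(pairs))
    n = len(ordered)
    if n == 0:
        return []
    best = [1] * n   # best[i]: length of the longest chain ending at ordered[i]
    prev = [-1] * n  # prev[i]: index of the preceding pair in that chain
    for i in range(n):
        for j in range(i):
            increases = ordered[j][0] < ordered[i][0] and ordered[j][1] < ordered[i][1]
            if increases and best[j] + 1 > best[i]:
                best[i] = best[j] + 1
                prev[i] = j
    end = max(range(n), key=best.__getitem__)
    chain = []
    while end != -1:  # walk back from the end of the longest chain
        chain.append(ordered[end])
        end = prev[end]
    chain.reverse()
    return chain


# Hypothetical matches: (3, 9) and (6, 2) break the increasing order.
matches = [(1, 1), (2, 2), (3, 9), (4, 3), (5, 4), (6, 2)]
valid = valid_pairs(matches)
outliers = [pair for pair in matches if pair not in valid]
```

The quadratic dynamic program is used here for clarity; for long videos a faster longest-increasing-subsequence method could be substituted without changing the result's meaning.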
Optionally, the apparatus further comprises:
the extracting unit 402, further configured to extract the index of the first frame number of the first multimedia data frame in each pair of the matching frame sequence pairs as a coordinate value of a first coordinate axis;
the extracting unit 402, further configured to extract the index of the second frame number of the second multimedia data frame in each pair of the matching frame sequence pairs as a coordinate value of a second coordinate axis;
an establishing unit 408, configured to establish a target coordinate system using the first coordinate axis and the second coordinate axis.
Optionally, the apparatus further comprises:
a forming unit 409, configured to form an identifier at a target coordinate point on the target coordinate system if the first frame number of the first multimedia data frame in a pair of the matching frame sequence pairs is validly matched with the second frame number of the second multimedia data frame, where the target coordinate point uses the first frame number and the second frame number as coordinate values.
Optionally, the first multimedia data stream includes: a first video stream and/or a first audio stream;
the first multimedia data frame comprises: a first video frame and/or a first audio frame;
the second multimedia data frame comprises: a second video frame and/or a second audio frame.
Optionally, the first video frame includes a home decoration domain feature.
The operation performed by the video recognition apparatus according to the embodiment of the present application is similar to that performed in the embodiment of fig. 2, and is not repeated herein.
A computer device in an embodiment of the present application is described below. Referring to fig. 5, an embodiment of the computer device includes:
The computer device 500 may include one or more processors (CPUs) 501 and a memory 502, where the memory 502 stores one or more applications or data. The memory 502 may be volatile storage or persistent storage. The program stored in the memory 502 may include one or more modules, each of which may include a series of instruction operations for the computer device. Further, the processor 501 may be configured to communicate with the memory 502 and to execute, on the computer device 500, the series of instruction operations in the memory 502. The computer device 500 may also include one or more wireless network interfaces 503, one or more input/output interfaces 504, and/or one or more operating systems, such as Windows Server, Mac OS, Unix, Linux, or FreeBSD. The processor 501 may perform the operations of the embodiments shown in fig. 1 or fig. 2, which are not described again here.
In the several embodiments provided in the present application, those skilled in the art should understand that the disclosed system, apparatus and method can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in practice. For instance, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or in another form.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit can be implemented in the form of hardware or in the form of a software functional unit. If the integrated unit is implemented in the form of a software functional unit and sold or used as a stand-alone product, it may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, and the like.
The above description is only exemplary of the present application and should not be taken as limiting the present application, as any modification, equivalent replacement or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (10)

1. A repeated video identification method, comprising:
separating a first multimedia data stream of a video file to be identified;
extracting a first multimedia data feature set in the first multimedia data stream, the first multimedia data feature set comprising a plurality of first multimedia data frames;
matching the first multimedia data frames with second multimedia data frames of a comparison video file to obtain a matching sequence pair set, the matching sequence pair set comprising a plurality of matching sequence pairs;
judging whether the proportion occupied by the matching sequence pair set exceeds a preset threshold;
if the proportion exceeds the preset threshold, determining that the video file to be identified is a duplicate of the comparison video file;
if the proportion does not exceed the preset threshold, determining that the video file to be identified is not a duplicate of the comparison video file.
2. The repeated video identification method according to claim 1, wherein the judging whether the proportion occupied by the matching sequence pair set exceeds a preset threshold comprises:
comparing the duration of the video file to be identified with the duration of the comparison video file to obtain a target video file, wherein the target video file is whichever of the video file to be identified and the comparison video file has the shorter duration, or, when the durations are the same, either one of the video file to be identified and the comparison video file;
judging whether the proportion of the duration of the target video file that is occupied by the duration of the matching sequence pair set exceeds the preset threshold.
3. The repeated video identification method according to claim 1, wherein after obtaining the matching sequence pair set and before judging whether the proportion occupied by the matching sequence pair set exceeds the preset threshold, the method further comprises:
obtaining valid matching frame sequence pairs from the plurality of matching sequence pairs, the matching sequence pair set comprising the valid matching frame sequence pairs.
4. The repeated video identification method according to claim 3, wherein the obtaining valid matching frame sequence pairs from the plurality of matching sequence pairs comprises:
respectively extracting a first frame number of the first multimedia data frame and a second frame number of the second multimedia data frame in each pair of the matching frame sequence pairs;
judging whether the first frame numbers and the second frame numbers of adjacent matching frame sequence pairs are in incremental correspondence;
if the first frame numbers and the second frame numbers of adjacent matching frame sequence pairs are in incremental correspondence, determining that the first multimedia data frame and the second multimedia data frame form a valid matching frame sequence pair.
5. The repeated video identification method according to claim 4, wherein after respectively extracting the first frame number of the first multimedia data frame and the second frame number of the second multimedia data frame in each pair of the matching frame sequence pairs, the method further comprises:
extracting the index of the first frame number of the first multimedia data frame in each pair of the matching frame sequence pairs as a coordinate value of a first coordinate axis;
extracting the index of the second frame number of the second multimedia data frame in each pair of the matching frame sequence pairs as a coordinate value of a second coordinate axis;
establishing a target coordinate system using the first coordinate axis and the second coordinate axis.
6. The repeated video identification method according to claim 5, wherein after establishing the target coordinate system using the first coordinate axis and the second coordinate axis, the method further comprises:
if the first frame number of the first multimedia data frame in a pair of the matching frame sequence pairs is validly matched with the second frame number of the second multimedia data frame, forming an identifier at a target coordinate point on the target coordinate system, the target coordinate point taking the first frame number and the second frame number as coordinate values.
7. The repeated video identification method according to claim 1, wherein the first multimedia data stream comprises: a first video stream and/or a first audio stream;
the first multimedia data frame comprises: a first video frame and/or a first audio frame;
the second multimedia data frame comprises: a second video frame and/or a second audio frame.
8. The repeated video identification method according to claim 7, wherein the first video frame comprises home decoration domain features.
9. A repeated video identification apparatus, comprising:
a separation unit, configured to separate a first multimedia data stream of a video file to be identified;
an extracting unit, configured to extract a first multimedia data feature set in the first multimedia data stream, the first multimedia data feature set comprising a plurality of first multimedia data frames;
a matching unit, configured to match the first multimedia data frames with second multimedia data frames of a comparison video file to obtain a matching sequence pair set, the matching sequence pair set comprising a plurality of matching sequence pairs;
a judging unit, configured to judge whether the proportion occupied by the matching sequence pair set exceeds a preset threshold;
a first determining unit, configured to determine that the video file to be identified is a duplicate of the comparison video file if the proportion exceeds the preset threshold;
a second determining unit, configured to determine that the video file to be identified is not a duplicate of the comparison video file if the proportion does not exceed the preset threshold.
10. A computer device, comprising:
a processor, a memory, a bus, an input/output interface, and a wireless network interface;
the processor is connected to the memory, the input/output interface, and the wireless network interface through the bus;
the memory stores a program;
when executing the program stored in the memory, the processor implements the repeated video identification method according to any one of claims 1 to 8.
CN202111298828.6A (priority date 2021-11-04, filing date 2021-11-04): Repeated video recognition method and related device. Status: Pending. Publication: CN114155456A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202111298828.6A (CN114155456A (en)) | 2021-11-04 | 2021-11-04 | Repeated video recognition method and related device

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202111298828.6A (CN114155456A (en)) | 2021-11-04 | 2021-11-04 | Repeated video recognition method and related device

Publications (1)

Publication Number | Publication Date
CN114155456A (en) | 2022-03-08

Family

ID=80459332

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202111298828.6A (Pending; CN114155456A (en)) | Repeated video recognition method and related device | 2021-11-04 | 2021-11-04

Country Status (1)

Country | Link
CN (1) | CN114155456A (en)

Legal Events

DateCodeTitleDescription
PB01Publication
PB01Publication
SE01Entry into force of request for substantive examination
SE01Entry into force of request for substantive examination
