Scene-based adaptive video coding method
Technical Field
The invention relates to the technical field of video coding, and in particular to a scene-based adaptive video coding method.
Background
In current online video-on-demand services, an OTT distribution platform must deliver a good viewing experience under bandwidth and cost constraints. It therefore typically encodes each source video into several versions, each a combination of resolution and bitrate taken from a general coding configuration table (also called an encoding ladder), and then selects a suitable version according to the user's bandwidth and playback terminal. Such a table considers only network conditions and player limitations, not the characteristics of the video itself, even though content complexity varies greatly between classes of video. For high-complexity content such as sports events, the configured bitrate may be too low; for low-complexity content such as animated films, it may be too high, which wastes bandwidth. The most immediate result is that the video quality seen by users is inconsistent.
Providers such as Netflix have proposed a framework based on perceptual video coding optimization, which improves the video quality delivered to users and saves bandwidth but has extremely high computational complexity. The method comprises the following steps:
(1) dividing a source video into small coding units by subject, segment, or scene;
(2) encoding each coding unit with different combinations of bitrate and resolution to produce several results of distinguishable quality, and recording the actual bitrate and quality score of each;
(3) from the discrete (bitrate, quality score) points, finding the closest-fitting convex hull and taking it as the RD curve of the coding unit;
(4) obtaining the optimal combination of bitrate and resolution from the RD curve, that is, the lowest bitrate that achieves a given quality.
This method therefore obtains the optimal coding parameters by enumeration: for each coding unit, a series of bitrates must be set at each resolution, and the discrete points are obtained only after encoding with every one of them. Implementing it requires extremely high computing power.
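By way of illustration only, step (3) of the prior-art framework might be sketched in Python as follows. The sketch assumes the trial encodes are available as (bitrate, quality score) pairs and uses a standard monotone-chain upper hull; the function name upper_convex_hull and the sample values are hypothetical and are not taken from any published implementation.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (bitrate, quality score)


def _cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); negative means a clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def upper_convex_hull(points: List[Point]) -> List[Point]:
    """Return the rising part of the upper convex hull of the given points."""
    if not points:
        return []
    # Keep only the best score observed at each bitrate.
    best: Dict[float, float] = {}
    for bitrate, score in points:
        best[bitrate] = max(best.get(bitrate, score), score)

    hull: List[Point] = []
    for p in sorted(best.items()):
        # Drop the previous point while it lies on or below the chord to p.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)

    # Points past the highest score cost more bits without adding quality.
    peak = max(range(len(hull)), key=lambda i: hull[i][1])
    return hull[: peak + 1]


# Hypothetical trial encodes of one coding unit: (kbps, score).
trial_encodes = [(500, 60), (1000, 80), (1500, 85),
                 (2000, 95), (1200, 70), (2500, 94)]
print(upper_convex_hull(trial_encodes))
```

Every point passed to such a routine costs one full encode of the coding unit, which is the source of the computational burden noted above.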
Disclosure of Invention
To overcome the above deficiency of the prior art, the present invention provides a scene-based adaptive video coding method with low computational complexity.
To achieve this purpose, the invention adopts the following technical scheme:
A scene-based adaptive video coding method uses an analyzer and a predictor. The analyzer determines the coding frame type and collects the coding information of each frame; the predictor generates an RD curve for each scene from the scene information and the coding statistics, and outputs the actual coding parameters according to the parameters set by the user. The specific operation steps are as follows:
(1) a video encoder divides a source video into a series of scenes, and each scene is taken as a minimum coding unit;
(2) encoding each coding unit with a fixed GOP length, 0 B frames, 1 reference frame, and a fixed quantization parameter (QP) mode, and generating an intermediate file;
(3) recording, for each encoded frame, the number of bits actually consumed, the number of skip blocks, and the actual quality score; let the number of skip blocks of the idx-th P frame be NUM_idx and its quality score be Score_idx;
(4) calculating a theoretical bitrate for each P frame in the GOP, so that each P frame yields one data point;
(5) sorting all data points of the same scene by bitrate in ascending order, dividing them into segments at a fixed bitrate interval, screening the data points in each segment and averaging the bitrate and score of the retained points to obtain the working point of that segment, and fitting an RD curve to the working points;
(6) adaptively generating, by the predictor, the coding bitrate of the coding unit according to the quality score set by the user.
The RD curve calculated by this method agrees closely with the actual RD curve. The prior method must encode each resolution several times, at a different bitrate each time, to obtain the series of working points from which the RD curve is derived; the present method calculates the RD curve from a single encoding pass, which greatly reduces the computational complexity. In practical application, the user only specifies a quality score and an upper limit on the coding bitrate, without considering the video content; the method then sets the coding bitrate adaptively, so that scenes of different complexity are encoded at consistent quality. The invention thus provides a way to obtain the RD curve of a coding unit quickly and to determine the coding parameters adaptively from the user-set quality score, and it can be conveniently applied to existing coding frameworks.
Preferably, the coding information of each frame comprises the fixed quantization parameter (QP) value, the number of bits actually consumed, the actual quality score, and the number of skip blocks; the video encoder is a general-purpose H.264 or H.265 encoder.
Preferably, in step (2), the intermediate file is generated as follows: the fixed quantization parameter QP is set within the range [QP_min, QP_max], and the QP of each frame is a fixed value. The QP of the I frame is set to QP_min, i.e. QP_I = QP_min. The QP of each subsequent P frame is increased by the step size QP_step; if the value exceeds QP_max, the setting starts again from QP_min.
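As a minimal sketch of the QP assignment just described, the following Python function lists the fixed QP of every frame in one GOP. The function name qp_schedule is hypothetical, and the default range [20, 40] with step 1 is merely an example setting; the wrap-around follows the prose rule (restart from QP_min once QP_max is exceeded).

```python
from typing import List


def qp_schedule(num_frames: int, qp_min: int = 20, qp_max: int = 40,
                qp_step: int = 1) -> List[int]:
    """Return the fixed QP of each frame in a GOP (frame 0 is the I frame)."""
    if qp_min > qp_max or qp_step <= 0:
        raise ValueError("need qp_min <= qp_max and qp_step > 0")
    if num_frames <= 0:
        return []

    qps = [qp_min]          # the I frame always uses QP_min
    qp = qp_min
    for _ in range(num_frames - 1):
        qp += qp_step       # each following P frame steps the QP upward
        if qp > qp_max:     # past the upper bound: restart from QP_min
            qp = qp_min
        qps.append(qp)
    return qps


# A short GOP with a deliberately narrow range to show the wrap-around.
print(qp_schedule(5, qp_min=20, qp_max=22, qp_step=1))
```

Because every P frame in the GOP is coded at a different QP, a single encoding pass samples many quality levels of the same scene, which is what later allows the RD curve to be derived without re-encoding.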
Preferably, in step (4), the specific calculation method is as follows: the bits of the I frame are distributed among the P frames of the same GOP according to the proportion of skip blocks in each P frame. Let the number of bits of the I frame be BIT_I, the number of bits consumed by the idx-th P frame be BIT_idx, and the corrected number of bits be BIT'_idx. Combining the corrected number of bits with the frame rate FPS of the video file gives the theoretical bitrate:
Bitrate_idx = BIT'_idx × FPS
Each P frame in the GOP thus yields one data point (Bitrate_idx, Score_idx).
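The calculation of the data points might be sketched as follows. The exact expression for the corrected bit count BIT'_idx is not reproduced in the text, so this sketch assumes that each P frame receives a share of BIT_I equal to its skip-block count divided by the total skip-block count of the GOP; that weighting, the equal split used when no skip blocks exist, and the function name theoretical_data_points are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def theoretical_data_points(bits_i: float,
                            bits_p: Sequence[float],
                            skip_blocks: Sequence[int],
                            scores: Sequence[float],
                            fps: float) -> List[Tuple[float, float]]:
    """Return one (bitrate in bit/s, score) point per P frame of a GOP."""
    if not (len(bits_p) == len(skip_blocks) == len(scores)):
        raise ValueError("per-frame sequences must have equal length")

    total_skip = sum(skip_blocks)
    points = []
    for bits, skips, score in zip(bits_p, skip_blocks, scores):
        # ASSUMPTION: share of the I-frame bits proportional to NUM_idx.
        if total_skip > 0:
            weight = skips / total_skip
        else:
            weight = 1.0 / len(bits_p)      # assumed fallback: equal split
        corrected_bits = bits + bits_i * weight   # BIT'_idx
        points.append((corrected_bits * fps, score))  # Bitrate_idx
    return points


# Hypothetical statistics for a GOP with one I frame and three P frames.
gop_points = theoretical_data_points(
    bits_i=120_000,
    bits_p=[10_000, 20_000, 30_000],
    skip_blocks=[100, 100, 200],
    scores=[92.0, 90.5, 88.0],
    fps=25,
)
print(gop_points)
```

Only the final multiplication by FPS is taken directly from the formula above; a different weighting of BIT_I can be substituted without changing the rest of the flow.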
Preferably, in step (5), the bitrate range is divided into m segments. Within each segment, the data points are screened at a confidence of 80%, and the average bitrate and average score of the retained points give the working point (Bitrate'_j, Score'_j) of that segment, so that m segments yield m working points. Suppose a segment contains n data points; for each data point i, the sum of its Euclidean distances to the other data points is calculated:
D_i = Σ_{k=1, k≠i}^{n} sqrt((Bitrate_i - Bitrate_k)^2 + (Score_i - Score_k)^2)
The 80% of data points with the smallest distance sums are retained, in accordance with the 80% confidence. The m working points are then made monotonic, that is, a higher Bitrate is guaranteed to correspond to a higher Score. The line connecting the m working points can be regarded as the RD curve.
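The segmentation, screening, and monotonic processing might be sketched as follows. Several details are assumptions made for illustration: the segment width is passed in as a parameter, segments with two or fewer points are left unscreened, and monotonicity is enforced with a running maximum of the score, since the text does not specify how the working points are adjusted. The function names are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (bitrate, quality score)


def screen_points(points: List[Point], keep: float = 0.8) -> List[Point]:
    """Keep the fraction of points with the smallest summed distance to the rest."""
    if len(points) <= 2:                 # assumed: too few points to screen
        return list(points)
    sums = [sum(math.dist(p, q) for q in points) for p in points]
    n_keep = max(1, round(keep * len(points)))
    order = sorted(range(len(points)), key=lambda i: sums[i])
    return [points[i] for i in sorted(order[:n_keep])]


def working_points(points: List[Point], interval: float) -> List[Point]:
    """Average the screened points of each bitrate segment."""
    segments: Dict[int, List[Point]] = {}
    for bitrate, score in sorted(points):
        segments.setdefault(int(bitrate // interval), []).append((bitrate, score))
    result = []
    for key in sorted(segments):
        kept = screen_points(segments[key])
        result.append((sum(b for b, _ in kept) / len(kept),
                       sum(s for _, s in kept) / len(kept)))
    return result


def make_monotonic(points: List[Point]) -> List[Point]:
    """ASSUMPTION: raise each score to the running maximum so it never drops."""
    best = float("-inf")
    out = []
    for bitrate, score in points:
        best = max(best, score)
        out.append((bitrate, best))
    return out


def fit_rd_curve(points: List[Point], interval: float) -> List[Point]:
    """Working points joined in bitrate order serve as the RD curve."""
    return make_monotonic(working_points(points, interval))


# Hypothetical data points in kbps; (190, 10) is an obvious outlier.
samples = [(100, 50), (120, 52), (150, 51), (180, 53), (190, 10),
           (250, 60), (300, 58), (450, 57)]
print(fit_rd_curve(samples, interval=200))
```

The distances are computed on raw bitrate and score values, as in the description; if the two axes differ greatly in scale, the bitrate term dominates the sum.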
Preferably, in step (6), the average bitrate Bitrate_avg required by the coding unit is calculated from the RD curve, and Bitrate_avg × 1.5 is used as the maximum bitrate Bitrate_max. If this maximum bitrate does not exceed the maximum bitrate set by the user, the coding unit is encoded at this bitrate; otherwise, it is encoded at the bitrate set by the user.
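A minimal sketch of the predictor's decision, assuming the RD curve is held as a list of monotonic (bitrate, score) working points joined by straight lines. Clamping to the first or last working point when the target score lies outside the curve, and limiting the average bitrate to the user's cap in the "otherwise" branch, are assumptions made for illustration; the function names are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (bitrate, quality score)


def bitrate_for_score(rd_curve: List[Point], target_score: float) -> float:
    """Read Bitrate_avg off the piecewise-linear RD curve."""
    if not rd_curve:
        raise ValueError("RD curve has no working points")
    if target_score <= rd_curve[0][1]:
        return rd_curve[0][0]            # assumed: clamp below the curve
    for (b0, s0), (b1, s1) in zip(rd_curve, rd_curve[1:]):
        if s0 < target_score <= s1:      # s1 > s0 here, so no zero division
            return b0 + (target_score - s0) * (b1 - b0) / (s1 - s0)
    return rd_curve[-1][0]               # assumed: clamp above the curve


def choose_rates(rd_curve: List[Point], target_score: float,
                 user_max_bitrate: float) -> Tuple[float, float]:
    """Return (average bitrate, maximum bitrate) for one coding unit."""
    bitrate_avg = bitrate_for_score(rd_curve, target_score)
    bitrate_max = 1.5 * bitrate_avg
    if bitrate_max <= user_max_bitrate:
        return bitrate_avg, bitrate_max
    # ASSUMPTION: fall back to the user's limit for the peak and the average.
    return min(bitrate_avg, user_max_bitrate), user_max_bitrate


# Hypothetical RD curve in kbps and a target quality score of 85.
curve = [(500, 60), (1000, 80), (2000, 90)]
print(choose_rates(curve, target_score=85, user_max_bitrate=3000))
print(choose_rates(curve, target_score=85, user_max_bitrate=2000))
```

The user therefore supplies only two numbers per stream, the target score and the bitrate ceiling, and the per-scene bitrate follows from the curve.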
The invention has the following beneficial effects: it greatly reduces computational complexity, achieves consistent quality when encoding scenes of different complexity, determines the coding parameters adaptively according to the quality score set by the user, and can be conveniently applied to existing coding frameworks.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
Detailed Description
The invention is further described with reference to the following figures and detailed description.
In the embodiment shown in FIG. 1, a scene-based adaptive video coding method uses an analyzer and a predictor. The analyzer determines the coding frame type and collects the coding information of each frame, which comprises the fixed quantization parameter (QP) value, the number of bits actually consumed, the actual quality score, and the number of skip blocks. The predictor generates an RD curve for each scene from the scene information and the coding statistics, and outputs the actual coding parameters according to the parameters set by the user. The video encoder is a general-purpose H.264 or H.265 encoder. The specific operation steps are as follows:
(1) a video encoder divides a source video into a series of scenes, and each scene is taken as a minimum coding unit;
(2) encoding each coding unit with a fixed GOP length, 0 B frames, 1 reference frame, and a fixed quantization parameter (QP) mode, and generating an intermediate file;
the intermediate file is generated as follows: the fixed quantization parameter QP is set within the range [QP_min, QP_max], for example [20, 40], and the QP of each frame is a fixed value; the QP of the I frame is set to QP_min, i.e. QP_I = QP_min; the QP of each subsequent P frame is increased by the step size QP_step (typically 1), and if the value exceeds QP_max the setting starts again from QP_min, i.e. QP_{P_idx} = QP_min + (idx × QP_step) mod (QP_max - QP_min), where P_idx denotes the idx-th P frame;
(3) recording, for each encoded frame, the number of bits actually consumed, the number of skip blocks, and the actual quality score; let the number of skip blocks of the idx-th P frame be NUM_idx and its quality score be Score_idx;
(4) calculating a theoretical bitrate for each P frame in the GOP, so that each P frame yields one data point;
the specific calculation method is as follows: the bits of the I frame are distributed among the P frames of the same GOP according to the proportion of skip blocks in each P frame; let the number of bits of the I frame be BIT_I, the number of bits consumed by the idx-th P frame be BIT_idx, and the corrected number of bits be BIT'_idx; combining the corrected number of bits with the frame rate FPS of the video file gives the theoretical bitrate:
Bitrate_idx = BIT'_idx × FPS
each P frame in the GOP thus yields one data point (Bitrate_idx, Score_idx);
(5) sorting all data points (Bitrate_idx, Score_idx) of the same scene by bitrate in ascending order, dividing them into segments at a fixed bitrate interval, screening the data points in each segment and averaging the bitrate and score of the retained points to obtain the working point of that segment, and fitting an RD curve to the working points;
for example, with a bitrate interval of 200 kbps the bitrate range is divided into m segments; within each segment, the data points are screened at a confidence of 80%, and the average bitrate and average score of the retained points give the working point (Bitrate'_j, Score'_j) of that segment, so that m segments yield m working points; suppose a segment contains n data points, then for each data point i the sum of its Euclidean distances to the other data points is calculated:
D_i = Σ_{k=1, k≠i}^{n} sqrt((Bitrate_i - Bitrate_k)^2 + (Score_i - Score_k)^2)
the 80% of data points with the smallest distance sums are retained, in accordance with the 80% confidence; the m working points are then made monotonic, that is, a higher Bitrate is guaranteed to correspond to a higher Score; the line connecting the m working points can be used as the RD curve;
(6) adaptively generating, by the predictor, the coding bitrate of the coding unit according to the quality score set by the user;
specifically, the average bitrate Bitrate_avg required by the coding unit is calculated from the RD curve, and Bitrate_avg × 1.5 is used as the maximum bitrate Bitrate_max; if this maximum bitrate does not exceed the maximum bitrate set by the user, the coding unit is encoded at this bitrate; otherwise, it is encoded at the bitrate set by the user.
The RD curve calculated by this method agrees closely with the actual RD curve. The prior method must encode each resolution several times, at a different bitrate each time, to obtain the series of working points from which the RD curve is derived; the present method calculates the RD curve from a single encoding pass, which greatly reduces the computational complexity. In practical application, the user only specifies a quality score and an upper limit on the coding bitrate, without considering the video content; the method then sets the coding bitrate adaptively, so that scenes of different complexity are encoded at consistent quality. The invention thus provides a way to obtain the RD curve of a coding unit quickly and to determine the coding parameters adaptively from the user-set quality score, and it can be conveniently applied to existing coding frameworks.