The present application is a divisional application of an invention patent application having application number 02817995.1, an international filing date of September 11, 2002, and the invention title "Adaptive filtering based on boundary strength".
Detailed Description
Conventional filtering processes consider a single reconstructed image frame at a time. Block-based video coding techniques may use motion vectors to estimate the motion of blocks of pixels. This motion vector information is available at both the encoder and the decoder, but it is not used in the conventional filtering process. For example, if two adjacent blocks share the same motion vector with respect to the same reference image frame, it is possible (in a multiple-reference-frame system) that there is no significant difference between the image residuals of the two blocks and, accordingly, that the boundary between them should not be filtered. In essence, adjacent portions of the image have the same motion with respect to the same reference frame, so no significant difference between their image residuals is expected. In many cases, the block boundary between these two neighboring blocks may already have been filtered in the reference frame and therefore should not be filtered again in the current frame. If a deblocking filter is used without considering this motion vector information, the conventional filtering process may repeatedly filter the same boundary from frame to frame. Such unnecessary filtering not only causes unnecessary blurring but also requires additional filter computations.
Fig. 1 illustrates an image 12 in which blocking artifacts are selectively filtered according to similarities between image blocks. It will be understood that the image may equally use non-square blocks or any other grouping of pixels. The boundaries between some of the blocks 14 include blocking artifacts 18. In general, a blocking artifact is any image discontinuity that may arise between blocks 14 as a result of the encoding and/or decoding process. A low-pass filter or another filter may be used to reduce the blocking artifacts present at the boundaries of adjacent image blocks.
For example, blocking artifacts 24 exist between blocks 20 and 22. A low-pass filter may be applied at the boundary 26 between blocks 20 and 22 to remove or reduce the blocking artifacts 24. For example, the low-pass filter selects a group of pixels 28 from both sides of the boundary 26. An average pixel value, or any other statistical measure, is derived from the group of pixels 28. Each individual pixel is then compared with the average pixel value, and any pixel in the group 28 that falls outside a predetermined range of the average pixel value is replaced by the average pixel value.
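The statistical filter just described can be sketched in a few lines. The following is an illustrative sketch, not the codec's actual filter; the function name and the threshold value are assumptions made for the example.

```python
def filter_boundary_pixels(pixels, threshold):
    """Replace any pixel that deviates from the group average by more than
    `threshold` with that average, as described for the group of pixels 28.
    `pixels` holds sample values straddling a block boundary.
    (Illustrative sketch; the name and threshold are assumptions.)"""
    avg = sum(pixels) / len(pixels)
    return [avg if abs(p - avg) > threshold else p for p in pixels]

# Pixels straddling a block boundary: the outlier 90 is pulled to the mean,
# while pixels already within the predetermined range are left unchanged.
group = [100, 102, 101, 90, 99, 100]
print(filter_boundary_pixels(group, threshold=5))
```

With the values above, only the fourth sample lies more than 5 levels from the group average and is therefore replaced.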
As previously described, if there are few or no blocking artifacts 24 between the neighboring blocks, then filtering the group of pixels 28 may be unnecessary and would needlessly blur the image. The skip mode filtering scheme uses the motion estimation and/or compensation information of neighboring image blocks as the basis for filtering selectively. If the motion estimation and compensation information are sufficiently similar, the filtering may be skipped. This avoids unnecessary image blurring and significantly reduces the required number of filtering operations.
For example, it may be determined during the encoding process that adjacent image blocks 30 and 32 have similar encoding parameters. Accordingly, deblocking filtering may be skipped for the group of pixels 34 that extends across the boundary 31 between the adjacent blocks 30 and 32. Skip mode filtering may be applied to any horizontal boundary, vertical boundary, or any other boundary between adjacent blocks in the image 12.
Fig. 2 shows a reference frame 42, a reference frame 48, and a current frame 40 that is currently being encoded or decoded. The encoding parameters for blocks 44 and 46 are compared to determine whether deblocking filtering should be skipped between the two adjacent blocks 44 and 46. One of the coding parameters that may be compared is the Motion Vector (MV) for blocks 44 and 46.
Motion vector MV1 points from block 44 in the current image frame 40 to an associated block 44' in the reference frame 42. Motion vector MV2 points from block 46 in the current image frame 40 to an associated block 46' in the reference frame 42. Skip mode filtering checks whether the motion vectors MV1 and MV2 point to adjacent blocks in the same reference frame 42. If the two motion vectors point to adjacent blocks in the same reference frame (MV1 = MV2), deblocking filtering may be skipped. This motion vector information may be used, along with other coding information, to decide whether to skip deblocking filtering between the two image blocks 44 and 46.
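The motion-based skip condition for a pair of adjacent blocks can be sketched as follows. The type and function names are assumptions introduced for illustration; only the decision rule comes from the description of Fig. 2.

```python
from typing import NamedTuple, Tuple

class BlockInfo(NamedTuple):
    mv: Tuple[int, int]  # motion vector (x, y) in pixel units
    ref_frame: int       # index of the reference frame the vector points into

def skip_deblock_filtering(a: BlockInfo, b: BlockInfo) -> bool:
    """Skip deblocking between two adjacent blocks when their motion
    vectors are equal AND point into the same reference frame
    (sketch of the decision described for Fig. 2)."""
    return a.mv == b.mv and a.ref_frame == b.ref_frame

# Blocks 44 and 46 with identical motion into reference frame 42: skip.
print(skip_deblock_filtering(BlockInfo((1, 0), 42), BlockInfo((1, 0), 42)))
# Same motion vector but different reference frames (42 vs. 48): filter.
print(skip_deblock_filtering(BlockInfo((1, 0), 42), BlockInfo((1, 0), 48)))
```

The second case corresponds to the multiple-reference-frame example above, where motion vector 49 points to reference frame 48 while MV2 points to reference frame 42, so filtering is not skipped.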
More than one reference frame may be used during the encoding and decoding process. For example, there may be another reference frame 48. Neighboring blocks 44 and 46 may have motion vectors pointing to different reference frames. In one example, the decision to skip deblocking filtering depends on whether the motion vectors for the two neighboring blocks point to the same reference frame. For example, an image block 44 may have a motion vector 49 pointing to the reference frame 48 and an image block 46 may have a motion vector MV2 pointing to the reference frame 42. Deblocking filtering is not skipped in this example because motion vector 49 and MV2 point to different reference frames.
Fig. 3 illustrates another example of encoding parameters that may be used to decide whether to selectively skip deblocking filtering. As previously shown in fig. 2, the image block 44 from the image frame 40 is compared with the reference block 44' in the reference frame 42 pointed to by the motion vector MV1. A residual block 44'' is output from the comparison between the image block 44 and the reference block 44'. A transform 50 is performed on the residual block 44'' to create a transformed block 44''' of transform coefficients. In one example, the transform 50 is a discrete cosine transform. The transformed block 44''' includes a d.c. component 52 and a.c. components 53.
The d.c. component 52 refers to the lowest-frequency transform coefficient of the image block 44; this coefficient represents, for example, the average energy in the image block 44. The a.c. components 53 are the transform coefficients that represent the higher-frequency content of the image block 44; these coefficients represent, for example, large energy differences between pixels within the image block 44.
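The relationship between the d.c. coefficient and the block average can be seen with a small numerical example. The following is a naive floating-point DCT-II, written only to illustrate the point; it is not the integer transform of any particular codec.

```python
import math

def dct2(block):
    """Naive 2-D DCT-II of an N x N block (illustrative only)."""
    n = len(block)
    def c(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = c(u) * c(v) * s
    return out

# For a flat 4x4 residual, every a.c. coefficient is zero and the d.c.
# coefficient is N times the average sample value (here 4 * 10 = 40):
flat = [[10] * 4 for _ in range(4)]
coeffs = dct2(flat)
print(round(coeffs[0][0], 6))   # d.c. component 52
print(round(coeffs[0][1], 6))   # one of the a.c. components 53
```

A block with no pixel-to-pixel variation therefore carries all of its energy in the d.c. component, which is why comparing d.c. components is a cheap proxy for comparing the residual energy of two neighboring blocks.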
Fig. 4 shows the transformed residual blocks 44''' and 46'''. The d.c. components 52 of the two transformed blocks 44''' and 46''' are compared in a processor 54. If the d.c. components are the same, or within a predetermined range of one another, the processor 54 notifies the deblocking filter operation 56 to skip deblocking filtering at the boundary between the two adjacent blocks 44 and 46. If the d.c. components 52 are not similar, no skip notification is issued and the boundary between blocks 44 and 46 is deblock filtered.
In one example, skip mode filtering may be incorporated into the H.26L coding scheme proposed by the International Telecommunication Union, Telecommunication Standardization Sector (ITU-T). The H.26L scheme uses 4 × 4 integer discrete cosine transform (DCT) blocks. If desired, only the d.c. components of the two adjacent blocks may be examined. However, a limited number of low-frequency a.c. coefficients may also be checked, especially when the image blocks are of a larger size (e.g., 9 × 9 or 16 × 16 blocks). For example, the d.c. component 52 and the three lowest-frequency a.c. transform coefficients 53 of block 44''' may be compared with the d.c. component 52 and the three lowest-frequency a.c. transform coefficients 53 of block 46'''. Different combinations of the d.c. and/or a.c. transform coefficients may be used to identify the relative similarity between the two adjacent blocks 44 and 46.
The processor 54 may also receive other encoding parameters 55 generated during the encoding process. As previously described, these encoding parameters include motion vectors and reference frame information for neighboring blocks 44 and 46. The processor 54 may use some or all of these encoding parameters to determine whether to skip deblocking filtering between adjacent image blocks 44 and 46. Other encoding and transformation functions performed on the image may be implemented in the same processor 54 or in different processing circuits. If all or most of this encoding is performed in the same processor, the skip mode is simply enabled by setting the skip parameter in the filtering routine.
Fig. 5 shows how skip mode filtering can be used in a block-based motion-compensated codec 60. The codec 60 is used for inter-frame coding. An input video block from the current frame is supplied from block 62 to a comparator 64. The frame buffering block 80 outputs a reference block 81 according to the estimated motion vector (and possibly a reference frame number). The difference between the input video block and the reference block 81 is transformed in block 66 and then quantized in block 68. The quantized transformed block is encoded by a variable length coder (VLC) in block 70 and then transmitted, stored, etc.
The encoding section of the codec 60 reconstructs the transformed and quantized image by first inverse quantizing (IQ) the transformed image in block 72. The inverse quantized image is then inverse transformed in block 74 to generate a reconstructed residual image. This reconstructed residual block is added to the reference block 81 in block 76 to generate a reconstructed image block. Typically, the reconstructed image is loop filtered in block 78 to reduce blocking artifacts caused by the quantization and transform processes. The filtered image is then buffered in block 80 to form reference frames. The frame buffering block 80 uses these reconstructed reference frames for motion estimation and compensation. The reference block 81 is compared with the input video block in the comparator 64. The encoded image is output from the encoding section at node 71 and then stored or transmitted.
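The local reconstruction loop (blocks 68 through 76) can be sketched with a uniform scalar quantizer. This is a simplified illustration under stated assumptions: the transform step is omitted, the quantizer and step size are stand-ins rather than those of any particular standard, and all names are invented for the example.

```python
def quantize(coeffs, qstep):
    """Uniform scalar quantization, standing in for block 68
    (the step size and rounding rule are assumptions)."""
    return [round(c / qstep) for c in coeffs]

def inverse_quantize(levels, qstep):
    """Block 72: map quantized levels back to approximate amplitudes."""
    return [l * qstep for l in levels]

def reconstruct(residual, reference, qstep):
    """Sketch of the encoder's local reconstruction path: the quantized
    residual is inverse quantized (block 72) and added back to the
    reference block 81 (block 76). The transform/inverse transform pair
    is omitted for brevity."""
    levels = quantize(residual, qstep)
    approx = inverse_quantize(levels, qstep)
    return [r + a for r, a in zip(reference, approx)]

reference = [100, 100, 100, 100]
residual = [7, -3, 0, 12]
print(reconstruct(residual, reference, qstep=4))  # [108, 96, 100, 112]
```

The quantization error visible here (e.g. 107 would be exact, 108 is reconstructed) is the source of the blocking artifacts that the loop filter in block 78 is meant to reduce.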
In the decoding section of the codec 60, a variable length decoder (VLD) decodes the encoded image in block 82. The decoded image is inverse quantized in block 84 and inverse transformed in block 86. The reconstructed residual image from block 86 is added to a reference block 91 in a summing block 88, loop filtered in block 90 to reduce blocking artifacts, and buffered as reference frames in block 92. The reference block 91 is generated from block 92 according to the received motion vector information. The loop-filtered output from block 90 may optionally be post filtered in block 94 to further reduce image artifacts before being displayed as a video image in block 96. The skip mode filtering scheme may be performed in any combination of the filtering functions in blocks 78, 90 and 94.
The motion estimation and compensation information available during video coding is used to decide when to skip deblocking filtering in blocks 78, 90 and/or 94. Since these coding parameters are already generated during the encoding and decoding processes, no additional coding parameters have to be generated or transmitted specifically for skip mode filtering.
Fig. 6 shows in further detail how skip mode filtering may be used in the filters 78, 90 and/or 94 of the encoder and decoder of fig. 5. First, an inter-block boundary between any two adjacent blocks "j" and "k" is identified in block 100. The two blocks may be horizontally or vertically adjacent in the image frame. Decision block 102 compares the motion vector MV(j) of block j with the motion vector MV(k) of block k. It is first determined whether the two adjacent blocks j and k have the same motion vector pointing to the same reference frame; in other words, whether the motion vectors of the adjacent blocks point to adjacent blocks (MV(j) = MV(k)) in the same reference frame (REF(j) = REF(k)).
It is then determined whether the residual coefficients of the two adjacent blocks are similar. If there is no significant difference between the image residuals of the adjacent blocks (e.g., the two blocks j and k have the same or similar d.c. components (DC(j) = DC(k))), then the deblocking filtering process in block 104 is skipped. Skip mode filtering then moves to the next inter-block boundary in block 106, and the next comparison is performed in decision block 102. Skip mode filtering may be performed for both horizontally adjacent blocks and vertically adjacent blocks.
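The two-stage decision of Fig. 6 can be sketched as a single predicate per edge. The function name, parameter layout, and d.c. tolerance are assumptions for illustration; the rule itself (same motion vector, same reference frame, matching d.c. components) follows the description above.

```python
def skip_filtering(mv_j, ref_j, dc_j, mv_k, ref_k, dc_k, dc_tolerance=0):
    """Per-edge decision sketched from Fig. 6: skip deblocking when both
    blocks use the same motion vector into the same reference frame AND
    their d.c. residual coefficients match (within an assumed tolerance)."""
    same_motion = (mv_j == mv_k) and (ref_j == ref_k)
    similar_residual = abs(dc_j - dc_k) <= dc_tolerance
    return same_motion and similar_residual

edges = [
    # (mv_j, ref_j, dc_j,  mv_k, ref_k, dc_k)
    ((2, 1), 0, 5, (2, 1), 0, 5),   # identical parameters: skip
    ((2, 1), 0, 5, (2, 1), 1, 5),   # different reference frame: filter
    ((2, 1), 0, 5, (2, 1), 0, 9),   # d.c. components differ: filter
]
for e in edges:
    print(skip_filtering(*e))
```

Only the first edge satisfies both conditions, so only that boundary is skipped; the loop then proceeds to the next inter-block boundary, as in block 106.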
In one embodiment, only the reference frame and motion vector information of the neighboring image blocks are used to determine whether to skip filtering. In another embodiment, only the d.c. and/or a.c. residual coefficients are used to make this determination. In yet another embodiment, the motion vectors, the reference frames and the residual coefficients are all used to determine whether to skip filtering.
The skip mode filtering scheme may be applied to spatially subsampled chroma channels. For example, with a 4:2:0 color format sequence, skip mode filtering at a block boundary may rely only on the equality of the motion vectors and d.c. components of the luminance component of the image. If the motion vectors and the d.c. components are the same, deblocking filtering is skipped for both the luma and the chroma components of the neighboring image blocks. In another embodiment, the motion vectors and d.c. components are considered separately for each luma and chroma component of the neighboring blocks. In this case, deblocking filtering may be performed on one luma or chroma component of the neighboring blocks while not being performed on another luma or chroma component of the same neighboring blocks.
Looking at fig. 7, a technique recently proposed by others in H.26L defines a "block strength" parameter for the loop filter to control the loop filtering process. Each block of the image has an associated strength value that controls the filtering performed on all four of its block boundaries. The block strength value is derived from the motion vectors and the transform coefficients present in the bitstream. However, after considering the use of one block strength value for all four edges of a block, the present inventors came to the realization that this results in blurring along some edges while removing blocking artifacts at others.
In contrast to this block-by-block filtering approach, the present inventors determined that the filtering decisions should be made on an edge-by-edge basis, together with other information. The other information may include, for example, information related to the intra-coding of a block, information related to the motion estimation of a block with residual information, information related to the motion estimation of a block without sufficiently different residuals, information related to the reference frames, and information related to the motion vectors of neighboring blocks. One, two, three or four of these information characteristics may be used to improve filtering performance on an edge-by-edge basis. The filtering may be modified according to different sets of characteristics, as desired.
With respect to each block boundary, a control parameter, namely the boundary strength Bs, is preferably defined. Looking at FIG. 8, a pair of blocks sharing a common boundary are referred to as "j" and "k". A first block 200 checks whether either of the two blocks is intra coded. If either block is intra coded, the boundary strength is set to 3 at block 202. In effect, block 200 determines whether the two blocks have been motion predicted: if no motion prediction is used, a block is derived from the frame itself and, accordingly, filtering should be performed on the boundary. This is generally appropriate because the block boundaries of intra-coded blocks typically include blocking artifacts.
If blocks j and k are, at least in part, predicted from a previous or future frame, then blocks j and k are examined at block 204 to determine whether any coefficients are coded. The coefficients may be, for example, discrete cosine transform coefficients. If either of blocks j and k includes non-zero coefficients, then at least one of the blocks represents a prediction from a previous or future frame together with a modification of that prediction by the coefficients (commonly referred to as the residual). If either of blocks j and k includes non-zero coefficients (and is motion predicted), the boundary strength is set to 2 at block 206. This represents the case in which the images are predicted but the prediction is corrected with a residual; accordingly, the images are likely to include blocking artifacts.
If blocks j and k are both motion predicted and include no non-zero coefficients (i.e., no residual), then a determination is made at block 208 as to whether the pixels on either side of the boundary are likely to be sufficiently different from one another. This likewise serves to determine whether the residuals are very small; if a significant difference exists, blocking artifacts are likely. It is first determined whether the two blocks use different reference frames, i.e., R(j) ≠ R(k). If blocks j and k are predicted from two different reference frames, the boundary strength is assigned a value of 1 at block 210. Alternatively, the absolute difference of the motion vectors of the two image blocks is examined to determine whether it is greater than or equal to 1 pixel in either the vertical or the horizontal direction, i.e., |V(j,x) − V(k,x)| ≥ 1 pixel or |V(j,y) − V(k,y)| ≥ 1 pixel. Other thresholds may likewise be used, as desired, including "less than" or "greater than" depending on the test employed. If the absolute difference of the motion vectors is greater than or equal to 1, the boundary strength is assigned a value of 1.
If the two blocks j and k are motion predicted, have no residual, are based on the same reference frame and have insignificant motion vector differences, then the boundary strength is assigned a value of 0. When the boundary strength is 0, the boundary is not filtered, or is otherwise filtered in a manner adapted to the value of the boundary strength. It will be understood that, if the boundary strength is zero, the system may still perform a slight amount of filtering if desired.
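The cascade of checks in Fig. 8 can be sketched as a single function returning Bs in the range 0 to 3. The dictionary layout, key names, and the default 1-pixel threshold are assumptions made for the example; the order and meaning of the checks follow the figure as described above.

```python
def boundary_strength(j, k, mv_threshold=1):
    """Bs derivation sketched from Fig. 8. Each block is a dict with keys
    'intra' (bool), 'coeffs' (bool: any non-zero residual coefficients),
    'ref' (reference frame id) and 'mv' ((x, y) motion vector in pixels).
    The dict layout and default threshold are assumptions."""
    if j['intra'] or k['intra']:
        return 3                       # block 202: intra-coded edge
    if j['coeffs'] or k['coeffs']:
        return 2                       # block 206: prediction plus residual
    if j['ref'] != k['ref']:
        return 1                       # block 210: different reference frames
    if (abs(j['mv'][0] - k['mv'][0]) >= mv_threshold
            or abs(j['mv'][1] - k['mv'][1]) >= mv_threshold):
        return 1                       # motion differs by >= 1 pixel
    return 0                           # same motion, no residual: no filtering

a = {'intra': False, 'coeffs': False, 'ref': 0, 'mv': (3, 0)}
b = {'intra': False, 'coeffs': False, 'ref': 0, 'mv': (3, 0)}
print(boundary_strength(a, b))   # 0: filtering may be skipped on this edge
```

Note that the checks are ordered from strongest to weakest evidence of a blocking artifact, so the first condition that fires determines Bs for the edge.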
The boundary strength values (i.e., 1, 2 and 3) are used to control the range of pixel-value adaptation in the loop filter. If desired, each different boundary strength may be the basis for a different filtering operation. For example, in some embodiments three filters may be used, wherein a first filter is used when Bs = 1, a second filter is used when Bs = 2, and a third filter is used when Bs = 3. It will be understood that non-filtering may be realized as minimal filtering relative to the other filtering operations, which produce more significant modifications. In the example shown in fig. 8, the larger the value of Bs, the stronger the filtering. The filtering may be performed using any suitable technique, such as the methods described in the Joint Committee Draft (CD) of the Joint Video Team (JVT) of ISO/IEC MPEG and ITU-T VCEG (JVT-C167), or other known methods of filtering image artifacts.
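The dispatch from Bs to filter strength can be illustrated with toy filters. The moving-average kernels below are purely hypothetical stand-ins for the codec's actual filters; only the idea that a larger Bs selects a stronger filter, and Bs = 0 selects none, comes from the text.

```python
def loop_filter_edge(pixels, bs):
    """Select a filter by boundary strength Bs: larger Bs means stronger
    smoothing. The averaging windows here are illustrative assumptions,
    not the filters of any standard."""
    if bs == 0:
        return list(pixels)            # Bs = 0: no filtering
    window = {1: 1, 2: 2, 3: 3}[bs]    # window half-width grows with Bs
    out = []
    for i in range(len(pixels)):
        lo, hi = max(0, i - window), min(len(pixels), i + window + 1)
        out.append(sum(pixels[lo:hi]) / (hi - lo))
    return out

edge = [100, 100, 100, 20, 20, 20]         # a hard blocking edge
print(loop_filter_edge(edge, 0))           # unchanged when Bs = 0
print(loop_filter_edge(edge, 3)[2] < 100)  # True: Bs = 3 softens the step
```

A wider window smooths the discontinuity more aggressively, mirroring the intent that Bs = 3 (intra-coded edges) receives the strongest filtering.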
Skip mode filtering may be used in any system that encodes or decodes multiple image frames, such as a DVD player, a video recorder, or any system that transmits image data over a channel (e.g., a television channel or the Internet). It will be understood that the system may use the quantization parameter as a coding parameter, either alone or in combination with other coding parameters. It will further be understood that the system may forgo using the quantization parameter, alone or at all, for filtering purposes.
The skip mode filtering described above may be implemented using a dedicated processor system, a microcontroller, a programmable logic device, or a microprocessor that performs some or all of these operations. Some of the operations described above may be performed in software, and other operations may be performed in hardware.
For convenience, the operations have been described as various interconnected functional blocks or distinct software modules. However, this is not essential; there may be cases in which these functional blocks or modules are equivalently aggregated, with unclear boundaries, into a single logic device, program or operation. In any event, the functional blocks and software modules, or the described features, can be implemented by themselves or in combination with other operations, in either hardware or software.
In some embodiments of the present invention, as shown in FIG. 9, image data 902 may be input to an image data encoding apparatus 904 that includes an adaptive filtering portion as described above with respect to some embodiments of the present invention. The output from the image data encoding apparatus 904 is encoded image data, which may then be stored on any computer-readable storage medium 906. The storage medium may include, but is not limited to, magnetic disk media, memory card media, or digital tape media. The storage medium 906 may serve as a short-term buffer or as a long-term storage device. The encoded image data may be read from the storage medium 906 and decoded by an image data decoding apparatus 908 that includes an adaptive filtering portion as described above with respect to some embodiments of the present invention. The decoded image data 910 may be provided for output to a display or other device.
In some embodiments of the present invention, as shown in FIG. 10, image data 1002 may be encoded and the encoded image data may then be stored on a storage medium 1006. The basic procedures of the image data encoding apparatus 1004, the storage medium 1006 and the image data decoding apparatus 1008 are the same as those in fig. 9. In fig. 10, a Bs data encoding section 1012 receives the boundary strength value Bs for each block boundary and encodes it by any data encoding method, including DPCM, multi-value run-length coding, transform coding with lossless characteristics, and the like. The boundary strengths Bs may be generated as described with reference to fig. 8. The encoded boundary strengths may then be stored on the storage medium 1006. In one example, the encoded boundary strengths may be stored separately from the encoded image data. In another example, the encoded boundary strengths and the encoded image data may be multiplexed before being stored on the storage medium 1006.
The encoded boundary strengths may be read from the storage medium 1006 and decoded by a Bs data decoding section 1014 so as to be input to the image data decoding apparatus 1008. When the adaptive filtering of the present invention is performed in the image data decoding apparatus 1008 using the decoded boundary strengths, it may not be necessary to repeat the process described in fig. 8 to generate the boundary strengths, which may save processing power for the adaptive filtering.
In some embodiments of the present invention, as shown in FIG. 11, image data 1102 may be input to an image data encoding apparatus 1104 that includes an adaptive filtering portion as described above with respect to some embodiments of the present invention. The output from the image data encoding apparatus 1104 is encoded image data, which can then be transmitted over a network (e.g., a LAN, a WAN, or the Internet 1106). The encoded image data may be received and decoded by an image data decoding apparatus 1108 that is also in communication with the network 1106. The image data decoding apparatus 1108 includes an adaptive filtering portion as described above with respect to some embodiments of the present invention. The decoded image data 1110 may be provided for output to a display or other device.
In some embodiments of the present invention, as shown in FIG. 12, image data 1202 may be encoded and the encoded image data may then be transmitted over a network (e.g., a LAN, a WAN, or the Internet 1206). The basic procedures of the image data encoding apparatus 1204 and the image data decoding apparatus 1208 are the same as those in fig. 11. In fig. 12, a Bs data encoding section 1212 receives the boundary strength value for each block boundary and encodes it by any data encoding method, including DPCM, multi-value run-length coding, transform coding with lossless characteristics, and the like. The boundary strengths Bs may be generated as described with reference to fig. 8. The encoded boundary strengths may then be transmitted over the network 1206. In another example, the encoded boundary strengths and the encoded image data may be multiplexed before being transmitted over the network 1206.
The encoded boundary strengths may be received from the network 1206 and decoded by a Bs data decoding section 1214 so as to be input to the image data decoding apparatus 1208. When the adaptive filtering of the present invention is performed in the image data decoding apparatus 1208 using the decoded boundary strengths, it may not be necessary to repeat the process described in fig. 8 to generate the boundary strengths, which may save processing power for the adaptive filtering.
Having described and illustrated the principles of the present invention in preferred embodiments thereof, it should be apparent that the invention may be modified in arrangement and detail without departing from such principles. All such modifications and variations coming within the spirit and scope of the following claims are claimed.