USRE40079E1 - Video encoding and decoding apparatus

Video encoding and decoding apparatus

Info

Publication number
USRE40079E1
Authority
US
United States
Prior art keywords
signal
arbitrary shape
motion compensation
alpha
compensation prediction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
US10/986,574
Inventor
Noboru Yamaguchi
Toshiaki Watanabe
Takashi Ida
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Toshiba Corp
Original Assignee
Toshiba Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Toshiba Corp
Priority to US10/986,574
Application granted
Publication of USRE40079E1
Anticipated expiration
Expired - Fee Related (current status)

Abstract

An encoding apparatus includes an encoder for encoding an alpha-map signal for discriminating a background from an object of an input picture in motion compensation prediction (MV) + transform encoding which uses MV in a domain of each of N×N transform coefficients (n), a transform circuit for transforming Pf into n in accordance with the alpha-map signal, an inverse transform circuit for reconstructing Pf by inversely transforming n in accordance with the alpha-map signal, a selector for obtaining a motion compensation prediction value (p) in the mth layer (m=2 to M) by switching p in the mth layer and p in the (m−1)th layer for each n, the selector selecting p in the mth layer for n by which a quantized output (Q) in the (m−1)th layer is 0 and selecting p in the (m−1)th layer for n by which Q=1 or more, an adder for calculating a difference df between a prediction error signal in the mth layer and a dequantized output in the (m−1)th layer, and an encoder for encoding and outputting the quantized signal of df. This encoding apparatus realizes SNR scalability in M layers.

Description

This application is a division of Ser. No. 09/334,769, filed Jun. 16, 1999, now U.S. Pat. No. 6,256,346, which is a division of Ser. No. 09/111,751, filed Jul. 8, 1998, now U.S. Pat. No. 6,028,634, which is a division of Ser. No. 08/738,934, filed Oct. 24, 1996, now U.S. Pat. No. 5,818,531.
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to video encoding and decoding apparatuses for encoding a picture signal at a high efficiency and transmitting or storing the encoded signal and, more particularly, to video encoding and decoding apparatuses with a scalable function, i.e., apparatuses capable of scalable coding by which the resolution and the image quality can be changed over multiple layers.
2. Description of the Related Art
Generally, a picture signal is compression-encoded before being transmitted or stored because the signal has an enormous amount of information. To encode a picture signal at a high efficiency, each frame is divided into a plurality of blocks in units of a predetermined number of pixels. Orthogonal transform is performed for each block to separate the spatial frequency of the picture into frequency components. Each frequency component is obtained as a transform coefficient and encoded.
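For illustration only (this sketch is not part of the patent), the block-transform step can be written in a few lines of Python; the orthonormal DCT-II matrix and the 8×8 block size follow the convention used later in the description, while the function names and the use of NumPy are assumptions of this sketch.

```python
# Minimal sketch of block-based orthogonal transform (assumed 8x8 DCT-II).
import numpy as np

N = 8  # block size assumed throughout this sketch

def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II basis, so coefficients = C @ block @ C.T."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)  # DC row
    return c

C = dct_matrix(N)

def forward_dct(block: np.ndarray) -> np.ndarray:
    return C @ block @ C.T          # N x N spatial-frequency components

def inverse_dct(coeffs: np.ndarray) -> np.ndarray:
    return C.T @ coeffs @ C         # exact inverse: C is orthogonal

# One block of a synthetic picture, transformed and restored.
frame = np.random.randint(0, 256, (64, 64)).astype(float)
block = frame[:N, :N]
coeffs = forward_dct(block)         # DC component at coeffs[0, 0]
assert np.allclose(inverse_dct(coeffs), block)
```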
As one function of video encoding, a scalability function is demanded by which the image quality (SNR: Signal to Noise Ratio), the spatial resolution, and the temporal resolution can be changed step by step by partially decoding a bit stream.
The scalability function is incorporated into the Video Part (IS 13818-2) of MPEG2, which is standardized by ISO/IEC.
This scalability is realized by hierarchical encoding methods; there are encoders and decoders for SNR scalability and for spatial scalability.
In the encoder, layers are divided into a base layer (lower layer) whose image quality is low and an enhancement layer (upper layer) whose image quality is high.
In the base layer, data is encoded by MPEG1 or MPEG2. In the enhancement layer, the data encoded by the base layer is reconstructed and the reconstructed base layer data is subtracted from the enhancement layer data. Only the resulting error is quantized by a quantization step size smaller than the quantization step size in the base layer and encoded. That is, the data is more finely quantized and encoded. The resolution can be increased by adding the enhancement layer information to the base layer information, and this makes the transmission and storage of high-quality pictures feasible.
As described above, pictures are divided into the base layer and the enhancement layer, data encoded by the base layer is reconstructed, the reconstructed data is subtracted from the original data, and only the resulting error is quantized by a quantization step size smaller than the quantization step size in the base layer and encoded. Consequently, pictures can be encoded and decoded at a high resolution. This technique is called SNR scalability.
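A minimal numerical sketch of this two-layer idea, assuming plain uniform quantizers (the step sizes, helper names, and the NumPy setting are illustrative assumptions, not values taken from MPEG2):

```python
import numpy as np

def quantize(x, step):
    return np.round(x / step)

def dequantize(q, step):
    return q * step

coeffs = np.random.randn(8, 8) * 20.0   # stand-in for DCT coefficients

# Base layer: coarse quantization.
base_step = 16.0
base_q = quantize(coeffs, base_step)
base_rec = dequantize(base_q, base_step)

# Enhancement layer: only the base-layer error, with a finer step.
enh_step = 4.0
enh_q = quantize(coeffs - base_rec, enh_step)
enh_rec = base_rec + dequantize(enh_q, enh_step)  # decoder adds both layers

print(np.abs(coeffs - base_rec).mean())  # coarse reconstruction error
print(np.abs(coeffs - enh_rec).mean())   # refined, smaller error
```

A base-layer-only decoder stops at base_rec; a decoder that also receives the enhancement bit stream adds the finer residual, which is the essence of the SNR scalability described above.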
In the encoder, an input picture is supplied to the base layer and the enhancement layer. In the base layer, the input picture is so processed as to obtain an error from a motion compensation prediction value obtained from a picture of the previous frame, and the error is subjected to orthogonal transform (DCT). The transform coefficient is quantized and variable-length-encoded to obtain a base layer output. The quantized output is dequantized, subjected to inverse DCT, and added with the motion compensation prediction value of the previous frame, thereby obtaining a frame picture. Motion compensation prediction is performed on the basis of this frame picture to obtain the motion compensation prediction value of the previous frame.
In the enhancement layer, on the other hand, the input picture is delayed until the prediction value is obtained from the base layer, and processing is performed to obtain an error from a motion compensation prediction value in the enhancement layer obtained from the picture of the previous frame. The error is then subjected to orthogonal transform (DCT), and the transform coefficient is corrected by using the dequantized output from the base layer, quantized, and variable-length-encoded, thereby obtaining an enhancement layer output. The quantized output is dequantized, added with the motion compensation prediction value of the previous frame obtained in the base layer, and subjected to inverse DCT. A frame picture is obtained by adding to the result of the inverse DCT the motion compensation prediction value of the previous frame obtained in the enhancement layer. Motion compensation prediction is performed on the basis of this frame picture to obtain a motion compensation prediction value of the previous frame in the enhancement layer.
In this way, video pictures can be encoded by using the SNR scalability. Note that although this SNR scalability is expressed by two layers, various SNR reconstructed pictures can be obtained by increasing the number of layers.
In the decoder, the variable-length encoded data of the enhancement layer and the variable-length encoded data of the base layer, which are separately supplied, are separately variable-length-decoded and dequantized. The two dequantized data are added, and the result is subjected to inverse DCT. The picture signal is restored by adding the motion compensation prediction value of the previous frame to the result of the inverse DCT. Also, motion compensation prediction is performed on the basis of the picture in the immediately previous frame obtained from the restored picture signal, thereby obtaining a motion compensation prediction value of the previous frame.
The foregoing are examples of encoding and decoding using the SNR scalability.
On the other hand, the spatial scalability is based on the spatial resolution, and encoding is separately performed in a base layer whose spatial resolution is low and an enhancement layer whose spatial resolution is high. In the base layer, encoding is performed by using a normal MPEG2 encoding method. In the enhancement layer, up-sampling (in which a high-resolution picture is formed by adding pixels, such as average values between pixels, to a low-resolution picture) is performed for the picture from the base layer to thereby form a picture having the same size as the enhancement layer. Prediction is adaptively performed on the basis of motion compensation prediction using the picture of the enhancement layer and motion compensation prediction using the up-sampled picture. Consequently, encoding can be performed at a high efficiency.
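As an illustration of the up-sampling mentioned in parentheses above, the following hypothetical 2× up-sampler inserts averages between the pixels of the low-resolution picture; the wrap-around edge handling is an arbitrary simplification of this sketch:

```python
import numpy as np

def upsample_2x(low: np.ndarray) -> np.ndarray:
    """Double both dimensions; inserted pixels are neighbour averages."""
    h, w = low.shape
    high = np.zeros((2 * h, 2 * w))
    high[::2, ::2] = low                                     # original samples
    high[::2, 1::2] = (low + np.roll(low, -1, axis=1)) / 2   # horizontal averages
    high[1::2, ::2] = (low + np.roll(low, -1, axis=0)) / 2   # vertical averages
    high[1::2, 1::2] = (high[::2, 1::2]
                        + np.roll(high[::2, 1::2], -1, axis=0)) / 2
    return high

low = np.arange(16, dtype=float).reshape(4, 4)
print(upsample_2x(low).shape)   # (8, 8): same size as the enhancement layer
```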
The spatial scalability exists in order to achieve backward compatibility by which, for example, a portion of a bit stream of MPEG2 can be extracted and decoded by MPEG1. That is, the spatial scalability is not a function capable of reconstructing pictures with various resolutions (reference: “Special Edition MPEG”, Television Magazine, Vol. 49, No. 4, pp. 458-463, 1995).
More specifically, the video encoding technology of MPEG2 aims to accomplish high-efficiency encoding of high-quality pictures and high-quality reconstruction of the encoded pictures. In this technology, pictures faithful to encoded pictures can be reconstructed.
With the spread of multimedia, however, there is a demand on the reconstruction side for an apparatus capable of fully decoding data of high-quality pictures encoded at a high efficiency. In addition, there are demands for systems, such as portable systems, which are only required to reconstruct pictures regardless of whether the image quality is high, and for simplified systems whose system price is low.
To meet these demands, a picture is divided into, e.g., 8×8 pixel blocks, and DCT is performed in units of blocks, yielding 8×8 transform coefficients per block. Although it is originally necessary to decode the data from the first low-frequency component to the eighth, the data is decoded only from the first low-frequency component to the fourth, or from the first to the sixth. In this manner, decoding is simplified by restoring the picture from the signal of 4×4 resolution or the signal of 6×6 resolution, rather than the signal of 8×8 resolution.
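The following sketch illustrates such simplified decoding under stated assumptions (orthonormal DCT, whole blocks, hypothetical names): the 4×4 low-frequency corner of an 8×8 coefficient block is inverse-transformed with a 4-point DCT, and the K/N factor rescales the coefficients so that, for instance, the block mean is preserved.

```python
import numpy as np

def dct_matrix(n):
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

N, K = 8, 4
C8, C4 = dct_matrix(N), dct_matrix(K)

block = np.random.randint(0, 256, (N, N)).astype(float)
coeffs = C8 @ block @ C8.T            # full 8x8 transform on the encoding side

# Simplified decoding: keep only the KxK low-frequency corner.
low = coeffs[:K, :K] * (K / N)        # rescale for the smaller transform size
small = C4.T @ low @ C4               # quarter-resolution reconstruction

assert np.isclose(small.mean(), block.mean())  # the mean (DC) is preserved
```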
Unfortunately, when a picture which originally has 8×8 information is restored by using 4×4 or 6×6 information, a mismatch occurs between the restored value and the motion compensation prediction value, and errors are accumulated. This significantly degrades the picture. Therefore, it is an important subject to overcome this mismatch between the encoding side and the decoding side.
Note that as a method of converting the spatial resolution in order to control the difference between the spatial resolutions on the encoding side and the decoding side, there is another method, although not standardized, by which the spatial resolution is made variable by inversely transforming some coefficients of an orthogonal transform (e.g., the DCT (Discrete Cosine Transform)) with an order smaller than the original order.
Unfortunately, when motion compensation prediction is performed by using the resolution-converted picture, image quality degradation called drift, resulting from the motion compensation prediction, occurs in the reconstructed picture (reference: Iwahashi et al., “Motion Compensation for Reducing Drift in Scalable Decoder”, Shingaku Giho IE94-97, 1994).
Accordingly, the method has a problem as a technique to overcome the mismatch between the encoding side and the decoding side.
Since hierarchical encoding is performed to realize the scalability function as described above, information is divisionally encoded, and this decreases the coding efficiency.
A video encoding system belonging to a category called mid-level encoding is proposed in J. Y. A. Wang et al., “Applying Mid-level Vision Techniques for Video Data Compression and Manipulation”, M.I.T. Media Lab Tech. Report No. 263, February 1994.
In this system, a background and an object are separately encoded. To separately encode the background and the object, an alpha-map signal which represents the shape of the object and the position of the object in a frame is necessary. An alpha-map signal of the background can be uniquely obtained from the alpha-map signal of the object.
In an encoding system like this, a picture with an arbitrary shape must be encoded. As a method of encoding an arbitrary-shape picture, there is an arbitrary-shape picture signal orthogonal transform method described in previously filed Japanese Patent Application No. 7-97073. In this orthogonal transform method, the values of pixels contained in a specific domain are separated from an input edge block signal by a separation circuit (SEP), and an average value calculation circuit (AVE) calculates an average value a of the separated pixel values.
If the alpha-map indicates a pixel in the specific domain, a selector (SEL) outputs the pixel value in the specific domain stored in a block memory (MEM). If the alpha-map indicates another pixel, the selector outputs the average value a. The block signal thus processed is subjected to two-dimensional DCT to obtain transform coefficients for the pixels in the specific domain.
On the other hand, inverse transform is accomplished by separating the pixel values in the specific domain from the pixel values in the block obtained by performing inverse DCT for the transform coefficients.
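A minimal sketch of this shape-adaptive transform, assuming a boolean alpha map per 8×8 block (the function names are hypothetical): background pixels are replaced by the object average a before the forward DCT, and only object pixels are kept after the inverse DCT.

```python
import numpy as np

def dct_matrix(n):
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

C = dct_matrix(8)

def shape_adaptive_forward(block, alpha):
    """SEP/AVE/SEL steps: object pixels pass, background pixels become a."""
    a = block[alpha].mean()                # average of the object pixels
    padded = np.where(alpha, block, a)     # fill the background with a
    return C @ padded @ C.T

def shape_adaptive_inverse(coeffs, alpha):
    """Inverse DCT, then keep only the pixels inside the object."""
    rec = C.T @ coeffs @ C
    return np.where(alpha, rec, 0.0)

# Toy example: a triangular object inside an 8x8 block.
block = np.random.randint(0, 256, (8, 8)).astype(float)
alpha = np.tri(8, dtype=bool)              # True = object pixel
obj = shape_adaptive_inverse(shape_adaptive_forward(block, alpha), alpha)
assert np.allclose(obj[alpha], block[alpha])   # object pixels are recovered
```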
As described above, in the scalable encoding method capable of dividing pictures into multiple layers, the coding efficiency is sometimes greatly decreased when video pictures are encoded. In addition, scalable encoding by which the resolution and the image quality can be made variable is also required in an arbitrary-shape picture encoding apparatus which separately encodes the background and the object. It is also necessary to improve the efficiency of motion compensation prediction encoding for an arbitrary-shape picture.
On the other hand, the mid-level encoding system has the advantage that a method of evenly arranging the internal average value of the object in the background can be realized with few calculations. However, a step of pixel values is sometimes formed at the boundary between the object and the background. If DCT is performed in a case like this, a large quantity of high-frequency components are generated, and so the amount of codes is not decreased.
SUMMARY OF THE INVENTION
It is an object of the present invention to provide an encoding apparatus and a decoding apparatus capable of improving the coding efficiency when video pictures are encoded by a scalable encoding method by which pictures can be divided into multiple layers.
It is another object of the present invention to provide a scalable encoding apparatus and a scalable decoding apparatus capable of making the resolution and the image quality variable and improving the coding efficiency in an arbitrary-shape picture encoding apparatus which separately encodes a background and an object.
It is still another object of the present invention to improve the efficiency of motion compensation prediction encoding for arbitrary-shape pictures.
It is still another object of the present invention to alleviate the drawback that, when a method of evenly arranging the internal average value of the object in the background is used and a step of pixel values is formed at the boundary between the object and the background, DCT generates a large quantity of high-frequency components and the code amount is not decreased.
According to the present invention, there is provided a video encoding apparatus comprising: an orthogonal transform circuit for orthogonally transforming an input picture signal to obtain a plurality of transform coefficients; a first local decoder for outputting first transform coefficients for a fine motion compensation prediction picture on the basis of a previous picture; a second local decoder for outputting second transform coefficients for a coarse motion compensation prediction picture on the basis of a current picture corresponding to the input picture signal; means for detecting a degree of motion compensation prediction in the second local decoder; a selector for selectively outputting the first and second transform coefficients in accordance with the degree of motion compensation prediction; a first calculator for calculating a difference between the transform coefficients of the orthogonal transform circuit and ones of the first and second transform coefficients which are selected by the selector, and outputting a motion compensation prediction error signal; a first quantizer for quantizing the motion compensation prediction error signal from the first calculator and outputting a first quantized motion compensation prediction error signal; a second calculator for calculating a difference between the second transform coefficients from the second local decoder and the transform coefficients from the orthogonal transform circuit, and outputting a second motion compensation prediction error signal; a second quantizer for quantizing the motion compensation prediction error signal from the second calculator, and outputting a second quantized motion compensation prediction error signal; and an encoder for encoding the first and second quantized motion compensation prediction error signals and outputting encoded signals.
According to the present invention, there is provided a video encoding apparatus comprising: an orthogonal transform circuit for dividing an input video signal into a plurality of blocks each containing N×N pixels and orthogonally transforming the input video signal in units of blocks to obtain a plurality of transform coefficients divided in spatial frequency bands; a first motion prediction processing section for performing motion compensation prediction processing for the plurality of transform coefficients in order to obtain an upper-layer motion compensation prediction signal having a number of data sufficient to obtain a high image quality; a second motion prediction processing section for performing motion compensation prediction processing for the plurality of transform coefficients in order to obtain a lower-layer motion compensation prediction signal upon reducing the number of data; a decision section for deciding in motion compensation on the basis of the lower-layer motion compensation prediction signal whether motion compensation prediction is correct; a selector for selecting the upper-layer motion compensation prediction signal in response to a decision representing a correct motion compensation prediction from the decision section, and the lower-layer motion compensation prediction signal in response to a decision representing an incorrect motion compensation prediction; and an encoder for encoding one of the upper-layer motion compensation prediction signal and the lower-layer motion compensation prediction signal which is selected by the selector.
According to the present invention, there is provided a video encoding apparatus for realizing SNR scalability in M layers, comprising: an orthogonal transform circuit for dividing an input video signal into a plurality of blocks each containing N×N pixels and orthogonally transforming the input video signal in units of blocks to obtain a plurality of transform coefficients divided in spatial frequency bands; a first motion compensation prediction processing section for performing motion compensation prediction processing for the plurality of transform coefficients in order to obtain an mth-layer (m=2 to M) motion compensation prediction signal; a second motion compensation prediction processing section for performing motion compensation prediction processing for the plurality of transform coefficients in order to obtain an (m−1)th-layer motion compensation prediction signal; switching means for selecting the mth-layer motion compensation prediction signal of the first motion compensation prediction processing section in order to obtain an mth-layer prediction value when a quantized output from the second motion compensation prediction processing section is 0, and switching between the mth-layer motion compensation prediction signal and the (m−1)th-layer motion compensation prediction signal in units of transform coefficients in order to select the (m−1)th-layer motion compensation prediction signal when the quantized output is not less than 1; means for calculating a difference signal between an (m−1)th-layer dequantized output from the second motion compensation prediction processing section and an mth-layer motion compensation prediction error signal obtained by a difference between the mth-layer motion compensation prediction signal and the transform coefficient from the orthogonal transform circuit; and encoding means for quantizing and encoding the difference signal to output an encoded bit stream.
According to the present invention, there is provided a video encoding/decoding system comprising: a video encoding apparatus for realizing SNR (Signal to Noise Ratio) scalability in M layers, which includes an orthogonal transform circuit for dividing an input video signal into a plurality of blocks each containing N×N pixels and orthogonally transforming the input video signal in units of blocks to obtain a plurality of transform coefficients divided in spatial frequency bands, a first motion compensation prediction processing section for performing motion compensation prediction processing for the plurality of transform coefficients in order to obtain an mth-layer (m=2 to M) motion compensation prediction signal, a second motion compensation prediction processing section for performing motion compensation prediction processing for the plurality of transform coefficients in order to obtain an (m−1)th-layer motion compensation prediction signal, switching means for selecting the mth-layer motion compensation prediction signal of the first motion compensation prediction processing section in order to obtain an mth-layer prediction value when a quantized output from the second motion compensation prediction processing section is 0, and switching between the mth-layer motion compensation prediction signal and the (m−1)th-layer motion compensation prediction signal in units of transform coefficients in order to select the (m−1)th-layer motion compensation prediction signal when the quantized output is not less than 1, means for calculating a difference signal between an (m−1)th-layer dequantized output from the second motion compensation prediction processing section and an mth-layer motion compensation prediction error signal obtained by a difference between the mth-layer motion compensation prediction signal and the transform coefficient from the orthogonal transform circuit, and encoding means for quantizing and encoding the difference signal to output an encoded bit stream; and a video decoding apparatus which includes means for extracting codes up to a code in the mth (m=2 to M) layer from the encoded bit stream from the video encoding apparatus, decoding means for decoding the codes of respective layers up to the mth layer, dequantization means for dequantizing, in the respective layers, the quantized values decoded by the decoding means, switching means for switching the mth-layer (m=2 to M) motion compensation prediction value and the (m−1)th-layer motion compensation prediction value in units of transform coefficients, and outputting the mth-layer motion compensation prediction value for the quantized output of 0 in the (m−1)th layer and the (m−1)th-layer motion compensation prediction value for the quantized output of not less than 1 in the (m−1)th layer in units of transform coefficients in order to obtain the mth-layer prediction value, and means for adding the mth-layer motion compensation prediction value and the (m−1)th-layer motion compensation prediction value to reconstruct the mth-layer motion compensation prediction error signal.
According to the present invention, there is provided a video encoding apparatus comprising: an orthogonal transform circuit for dividing an input video signal into a plurality of blocks each containing N×N pixels and orthogonally transforming an arbitrary-shape picture in units of blocks to obtain a plurality of transform coefficients; means for encoding and outputting an alpha-map signal for discriminating a background of a picture from an object thereof; means for calculating an average value of pixel values of an object portion using the alpha-map signal in units of blocks; means for assigning the average value to a background portion of the block; means for deciding using the alpha-map signal whether a pixel in the object is close to the background; means for compressing, about the average value, the pixel in the object decided to be close to the background; and means for orthogonally transforming each block to output an orthogonal transform coefficient.
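A hedged sketch of the boundary treatment in the last apparatus above: background pixels of a block are set to the object average, and object pixels judged close to the background are compressed about that average, so the boundary step that would otherwise produce high-frequency DCT components is reduced. The 4-neighbour closeness test and the compression ratio are illustrative assumptions of this sketch, not values fixed by the patent.

```python
import numpy as np

def smooth_boundary(block, alpha, ratio=0.5):
    """Pad the background with the object average and pull boundary
    object pixels toward that average (hypothetical rule)."""
    a = block[alpha].mean()
    out = np.where(alpha, block, a)        # background := average value
    # An object pixel is "close to the background" if any 4-neighbour
    # lies outside the object (illustrative criterion).
    pad = np.pad(alpha, 1, constant_values=False)
    interior = (pad[:-2, 1:-1] & pad[2:, 1:-1]
                & pad[1:-1, :-2] & pad[1:-1, 2:])
    near = alpha & ~interior
    out[near] = a + ratio * (out[near] - a)   # compress about the average
    return out
```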
Additional objects and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objects and advantages of the invention may be realized and obtained by means of the instrumentalities and combinations particularly pointed out in the appended claims.
BRIEF DESCRIPTION OF THE DRAWINGS
The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate presently preferred embodiments of the invention and, together with the general description given above and the detailed description of the preferred embodiments given below, serve to explain the principles of the invention.
FIG. 1 is a block diagram for explaining the present invention, showing the configuration of an encoding apparatus according to the first embodiment of the present invention;
FIG. 2 is a view for explaining the present invention, which explains a prediction value switching method to be applied to the present invention;
FIGS. 3A and 3B are block diagrams for explaining the present invention, showing the configurations of motion compensation prediction sections according to the first embodiment of the present invention;
FIG. 4 is a block diagram for explaining the present invention, showing the configuration of a decoding apparatus according to the first embodiment of the present invention;
FIG. 5 is a block diagram for explaining the present invention, showing the configuration of an encoding apparatus according to the second embodiment of the present invention;
FIGS. 6A and 6B are block diagrams for explaining the present invention, showing the configurations of motion compensation prediction sections according to the second embodiment of the present invention;
FIG. 7 is a block diagram for explaining the present invention, showing the configuration of a decoding apparatus according to the second embodiment of the present invention;
FIGS. 8A and 8B are block diagrams for explaining the present invention, showing the configurations of motion compensation prediction sections according to the third embodiment of the present invention;
FIG. 9 is a view for explaining the present invention, which illustrates an example of a quantization matrix used in the present invention;
FIG. 10 is a view for explaining the present invention, which illustrates an example of a quantization matrix used in the present invention;
FIG. 11 shows an example of a quantization matrix obtained for the example shown in FIG. 2;
FIG. 12 shows an example of a two-dimensional matrix which is divided into eight portions in each of a horizontal direction (h) and a vertical direction (v);
FIG. 13 shows a scan order for the example shown in FIG. 2;
FIG. 14 is a view for explaining the present invention, which explains the fourth embodiment of the present invention;
FIGS. 15A, 15B, and 15C are views for explaining an example of a video transmission system to which the video encoding apparatus and the video decoding apparatus according to the present invention are applied;
FIG. 16 is a view for explaining a modification of the second embodiment of the present invention, which is a graph showing an example in which an average value is arranged in a background;
FIG. 17 is a view for explaining another modification of the second embodiment of the present invention, which is a graph for explaining an example in which a step is decreased;
FIG. 18 is a view for explaining still another modification of the second embodiment of the present invention, which illustrates examples of block pixel values;
FIG. 19 is a view for explaining still another modification of the second embodiment of the present invention, which is a graph for explaining another example in which a step is decreased;
FIG. 20 is a view for explaining still another modification of the second embodiment of the present invention, which illustrates examples of block pixel values;
FIG. 21 is a block diagram showing an example of an encoding apparatus as still another modification of the second embodiment of the present invention; and
FIG. 22 is a block diagram showing an example of a decoding apparatus as still another modification of the second embodiment of the present invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
In the present invention, when motion compensation is to be performed in a transform coefficient domain in units of N×N transform coefficients, encoding in an upper layer (enhancement layer) is performed on the basis of an already decoded and quantized value of a lower layer (base layer). This realizes an encoding system which can perform encoding with little decrease in the coding efficiency.
Also, in the above encoding apparatus of the present invention, orthogonal transform can be performed for a picture domain with an arbitrary shape in accordance with an alpha-map signal indicating the arbitrary-shape picture domain. Consequently, a reconstructed picture with a variable image quality can be obtained for an arbitrary-shape picture.
In the present invention, a frame memory is prepared for each of a background and one or more objects, and motion compensation prediction is performed for each of the background and the objects. This improves the efficiency of prediction for a portion hidden by overlapping of the objects.
Furthermore, the efficiency of motion compensation predictive encoding is improved by decreasing the range of motion vector detection in the boundary of an object.
Embodiments of the present invention will be described below with reference to the accompanying drawings.
The first embodiment of the present invention will be described with reference to FIGS. 1, 2, 3A, 3B, and 4. This embodiment is related to an encoding apparatus and a decoding apparatus which realize SNR scalability of M layers as a whole. The coding efficiency in the mth layer is improved by adaptively switching between a motion compensation prediction signal in the mth layer and a motion compensation prediction signal in the (m−1)th layer. In the accompanying drawings, a base layer corresponds to the (m−1)th layer and an enhancement layer corresponds to the mth layer.
In the encoding apparatus shown in FIG. 1, an input signal is input to an orthogonal transform circuit, e.g., DCT circuit 100. The output terminal of the DCT circuit is connected to the input terminals of adders 110 and 111. The other input terminal of the adder 110 is connected to a selector 300. The output terminal of the adder 110 is connected to a quantizer 130 via an adder 120. The output terminal of the quantizer 130 is connected to an output buffer 160 via a variable-length encoder 140 and a multiplexer 150.
The output terminal of the quantizer 130 is connected to a motion compensation prediction section (MCP) 200 via a dequantizer 170 and adders 180 and 190. The output of the motion compensation prediction section 200 is connected selectively to the adders 110 and 120 by the selector 300. The encoding controller 400 controls the quantizer 130 and the variable-length encoder 140 in accordance with the output signal from the output buffer 160.
The output terminal of the adder 111 is connected to the input terminal of the quantizer 131, the output terminal of which is connected to the output buffer 161 via a variable-length encoder 141 and a multiplexer 151.
The output terminal of the quantizer 131 is connected to a motion compensation prediction section 201 of the base layer via a dequantizer 171 and an adder 191. The output terminal of the motion compensation prediction section 201 is connected to the selector 300 and adders 111 and 191. The encoding controller 410 controls the quantizer 131 and the variable-length encoder 141 in accordance with the output signal from the output buffer 161. A motion vector detector 500 receives the input video signal 10 and is connected to the motion compensation prediction section 200, the motion compensation prediction section 201, and the variable-length encoder 141.
The DCT circuit 100 performs orthogonal transform (DCT) for an input picture signal 10 to obtain transform coefficients of individual frequency components. The adder 110 calculates the difference between the transform coefficient from the DCT circuit 100 and one of an output (EMC) from the enhancement layer motion compensation prediction section 200 and an output (BMC) from the base layer motion compensation prediction section 201, which are selectively supplied via the selector 300. The adder 120 calculates the difference between an output from the adder 110 and an output from the dequantizer 171.
The quantizer 130 quantizes an output from the adder 120 in accordance with a quantization scale supplied from the encoding controller 400. The variable-length encoder 140 performs variable-length encoding for the quantized output from the quantizer 130 and side information such as the quantization scale supplied from the encoding controller 400.
The multiplexer 150 multiplexes the variable-length code of the quantized output and the variable-length code of the side information supplied from the variable-length encoder 140. The output buffer 160 temporarily holds and outputs the data stream multiplexed by the multiplexer 150.
The encoding controller 400 outputs information of an optimum quantization scale Q_scale on the basis of buffer capacity information from the buffer 160. The encoding controller 400 also supplies this information of the quantization scale Q_scale to the variable-length encoder 140 as the side information, thereby causing the quantizer 130 to perform quantization and the variable-length encoder 140 to perform variable-length encoding.
The dequantizer 170 dequantizes the quantized output from the quantizer 130 and outputs the result. The adder 180 adds the output from the dequantizer 170 and the output from the dequantizer 171. The adder 190 adds the output from the adder 180 and a compensation prediction value selectively output from the selector 300.
The motion compensation prediction section 200 calculates a motion compensation prediction value in the enhancement layer on the basis of the output from the adder 180 and a motion vector detected by the motion vector detector 500. When receiving the motion compensation prediction value calculated by the motion compensation prediction section 200 and the motion compensation prediction value calculated by the motion compensation prediction section 201, the selector 300 selectively outputs one of these motion compensation prediction values in accordance with an output from a binarizing circuit 310.
In the above configuration, the adder 110, the adder 120, the quantizer 130, the variable-length encoder 140, the multiplexer 150, the output buffer 160, the dequantizer 170, the adder 180, the adder 190, the motion compensation prediction section 200, the selector 300, and the encoding controller 400 constitute the enhancement layer. The dequantizer 170, the adder 180, the adder 190, and the motion compensation prediction section 200 constitute a local decoder of the enhancement layer.
The motion vector detector 500 described above receives the same picture signal as the input picture signal to the DCT circuit 100 and detects a motion vector from this picture signal. On the basis of the motion vector supplied from the motion vector detector 500 and the sum output from the adder 191, the motion compensation prediction section 201 performs motion compensation prediction and obtains a motion compensation prediction value (BMC) which is converted into a DCT coefficient.
The adder 111 calculates the difference between the output transform coefficient from the DCT circuit 100 and the output motion compensation prediction value (BMC) from the motion compensation prediction section 201. The quantizer 131 quantizes the output from the adder 111 in accordance with the quantization scale designated by the encoding controller 410.
The binarizing circuit 310 checks whether the quantized value output from the quantizer 131 is “0”. If the value is “0”, the binarizing circuit 310 outputs “0”. If the value is not “0”, the binarizing circuit 310 outputs “1”. The dequantizer 171 performs dequantization in accordance with the quantization scale designated by the encoding controller 410. The adder 191 adds the output from the dequantizer 171 and the output from the motion compensation prediction section 201 and supplies the sum to the motion compensation prediction section 201.
The variable-length encoder 141 performs variable-length encoding for the quantized output from the quantizer 131 and the side information such as the quantization scale supplied from the encoding controller 410. The multiplexer 151 multiplexes the variable-length code of the quantized output and the variable-length code of the side information supplied from the variable-length encoder 141. The output buffer 161 temporarily holds and outputs the data stream multiplexed by the multiplexer 151.
The encoding controller 410 outputs the information of the optimum quantization scale Q_scale on the basis of buffer capacity information from the buffer 161. The encoding controller 410 also supplies this information of the quantization scale Q_scale to the variable-length encoder 141 as the side information, thereby causing the quantizer 131 to perform quantization and the variable-length encoder 141 to perform variable-length encoding.
The adder 111, the quantizer 131, the variable-length encoder 141, the multiplexer 151, the output buffer 161, the dequantizer 171, the adder 191, the motion compensation prediction section 201, the binarizing circuit 310, the encoding controller 410, and the motion vector detector 500 constitute the base layer. The dequantizer 171, the adder 191, and the motion compensation prediction section 201 constitute a local decoder.
This apparatus with the above configuration operates as follows.
The input picture signal 10 is supplied to the DCT circuit 100 and the motion vector detector 500. The motion vector detector 500 detects a motion vector from the picture signal 10 and supplies the detected vector to the motion compensation prediction sections 200 and 201 and the variable-length encoder 141.
The picture signal 10 input to the DCT circuit 100 is divided into blocks each having a size of N×N pixels and orthogonally transformed in units of N×N pixels by this DCT circuit 100. Consequently, N×N transform coefficients are obtained for each block. These transform coefficients are N×N transform coefficients obtained by separating the spatial frequency components of the picture into components ranging from a DC component to individual AC components.
These N×N transform coefficients obtained by the DCT circuit 100 are supplied to the adder 110 in the enhancement layer and the adder 111 in the base layer.
In the base layer, the adder 111 calculates the difference between the transform coefficient and the motion compensation prediction value (BMC), which is converted into a DCT coefficient and supplied from the motion compensation prediction section 201, thereby obtaining a prediction error signal. This prediction error signal is supplied to the quantizer 131 to be quantized in accordance with the quantization scale Q_scale input by the encoding controller 410. The quantized prediction error signal is supplied to the variable-length encoder 141 and the dequantizer 171.
The variable-length encoder 141 performs variable-length encoding for the quantized prediction error signal, the side information such as the quantization scale supplied from the encoding controller 410, and the motion vector information supplied from the motion vector detector 500. This variable-length-encoded output is supplied to the multiplexer 151 to be multiplexed thereby, and supplied to the output buffer 161. The output buffer 161 outputs the multiplexed signal, as an encoded bit stream 21, to a transmission line or a storage medium. Also, the output buffer 161 feeds the capacity of the buffer back to the encoding controller 410.
In accordance with the capacity information from the buffer, the encoding controller 410 controls the output from the quantizer 131 and outputs the quantization scale Q_scale to the quantizer 131. This information of the quantization scale Q_scale is also supplied to the variable-length encoder 141 as the side information.
Since the encoding controller 410 controls the output from the quantizer 131 in accordance with the capacity information from the buffer, the encoding controller 410 can advance the quantization while controlling the quantization scale so that the output buffer 161 does not overflow.
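A minimal sketch of such buffer-feedback control; the linear mapping from buffer fullness to quantization scale is entirely hypothetical (the description above only requires that a fuller buffer lead to coarser quantization):

```python
def next_q_scale(buffer_fullness: int, buffer_size: int,
                 q_min: int = 1, q_max: int = 31) -> int:
    """The fuller the output buffer, the coarser the quantization,
    so the buffer does not overflow."""
    r = buffer_fullness / buffer_size      # 0.0 (empty) .. 1.0 (full)
    return max(q_min, min(q_max, round(q_min + r * (q_max - q_min))))

print(next_q_scale(100_000, 400_000))      # lightly loaded -> fine scale
print(next_q_scale(380_000, 400_000))      # nearly full -> coarse scale
```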
The information of the quantization scale Q_scale is variable-length-encoded as the side information by the variable-length encoder 141 and multiplexed by the multiplexer 151. The multiplexed signal is used as the output from the video encoding apparatus. Consequently, the quantization scale used in dequantization when the video decoding apparatus performs decoding can be obtained.
Meanwhile, the quantized value of the prediction error signal supplied to the dequantizer 171 is dequantized and supplied to the adder 191. The adder 191 adds the dequantized value to the motion compensation prediction value BMC and thereby calculates a reconstructed value in the transform coefficient domain. This value is supplied to the motion compensation prediction section 201.
In the enhancement layer, the output EMC from the motion compensation prediction section 200 of the enhancement layer and the output BMC from the motion compensation prediction section 201 of the base layer are adaptively and selectively output for each transform coefficient. That is, on the basis of an output BQ from the quantizer 131 in the base layer, the selector 300 adaptively and selectively outputs the output (EMC) from the motion compensation prediction section 200 of the enhancement layer and the output BMC from the motion compensation prediction section 201 of the base layer for each transform coefficient in accordance with a method to be described later.
The adder 110 calculates a prediction error signal between the transform coefficient of the input picture supplied from the DCT circuit 100 and an output EP from the selector 300 and supplies the signal to the adder 120. The adder 120 calculates the difference between a signal 30 of the dequantized value BQ supplied from the dequantizer 171 and the output from the adder 110 and supplies the difference as a difference value output signal EC to the quantizer 130. This difference value output signal EC is the motion compensation prediction error signal.
The quantizer 130 quantizes the difference value output signal EC in accordance with the quantization scale Q_scale supplied from the encoding controller 400 and supplies the quantized signal to the variable-length encoder 140 and the dequantizer 170.
The variable-length encoder 140 performs variable-length encoding for the quantized motion compensation prediction error signal together with the side information and supplies the encoded signals to the multiplexer 150. The multiplexer 150 multiplexes these signals and supplies the multiplexed signal to the output buffer 160.
The output buffer 160 outputs the multiplexed signal to a transmission line or a storage medium as an encoded bit stream 20 for the enhancement layer. Also, the output buffer 160 feeds the capacity of the buffer back to the encoding controller 400.
The quantized value supplied to the dequantizer 170 is dequantized. The adder 180 adds the dequantized value to the output 30 supplied from the dequantizer 171 of the base layer, thereby reconstructing the prediction error signal.
The adder 190 adds the prediction error signal reconstructed by the adder 180 to the motion compensation prediction value EMC and thereby calculates a reconstructed value in the transform coefficient domain. This reconstructed value is supplied to the motion compensation prediction section 200.
FIG. 2 shows a switching unit described in a reference (T. K. Tan et al., “A Frequency Scalable Coding Scheme Employing Pyramid and Subband Techniques”, IEEE Trans. CAS for Video Technology, Vol. 4, No. 2, April 1994), which is an example of a switching unit optimally applicable to the selector 300.
Referring to FIG. 2, the binarizing circuit 310 decides whether the value of the output BQ from the quantizer 131 in the base layer is “0”. This decision result is supplied to the selector 300. If the value of the output BQ from the quantizer 131 is “0”, the selector 300 selects the transform coefficient output EMC from the enhancement layer motion compensation prediction section 200. If the value is not “0”, the selector 300 selects the transform coefficient output BMC from the base layer motion compensation prediction section 201.
That is, the binarizing circuit 310 outputs “0” when the value of the output BQ from the quantizer 131 in the base layer is “0” and outputs “1” when the value is not “0”. Therefore, the selector 300 is made to select EMC when the output from the binarizing circuit 310 is “0” and BMC when the output is “1”. Consequently, the transform coefficient output EMC from the motion compensation prediction section 200 in the enhancement layer is applied to a transform coefficient in a position where the output BQ from the quantizer 131 is “0”, and the transform coefficient output BMC from the motion compensation prediction section 201 in the base layer is applied to a transform coefficient in a position where the output BQ from the quantizer 131 is not “0”.
The quantizer 131 in the base layer receives the output from the adder 111 and quantizes this output. The adder 111 receives the output from the DCT circuit 100 and the motion compensation prediction value obtained by the motion compensation prediction section 201 from the picture in the immediately previous frame, and calculates the difference between them. Therefore, if the calculated motion compensation prediction value is correct, the difference between the two values output from the adder 111 is “0”.
Accordingly, of the quantized values in the output BQ from the quantizer 131 in the base layer, coefficients having values other than “0” (the values enclosed by circles in FIG. 2) are coefficients whose motion compensation prediction is incorrect.
If the motion compensation prediction section 200 performs motion compensation prediction by using the same motion vector as in the base layer supplied from the motion vector detector 500, it is estimated that motion compensation prediction for the coefficients in the enhancement layer in the same positions as in the base layer (the values enclosed by circles) is incorrect.
Accordingly, the selector 300 selects BMC for these coefficients.
On the other hand, it is estimated that motion compensation for the other coefficients is correct. Therefore, the selector 300 selects the prediction value in the enhancement layer, which has a smaller encoding deviation. Consequently, the signal EC encoded in the enhancement layer is used as the quantized error signal of the base layer when motion compensation prediction is incorrect, and as the motion compensation prediction error signal of the enhancement layer when motion compensation prediction is correct. This improves the coding efficiency for coefficients whose motion compensation prediction is incorrect.
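In code, the binarizing circuit plus selector reduce to a per-coefficient choice. The sketch below assumes NumPy arrays holding the two DCT-domain predictions and the base-layer quantized output BQ; the function name is illustrative:

```python
import numpy as np

def select_prediction(emc: np.ndarray, bmc: np.ndarray,
                      bq: np.ndarray) -> np.ndarray:
    """Where BQ is 0, base-layer prediction was correct, so the finer
    enhancement-layer prediction EMC is used; elsewhere the base-layer
    prediction BMC is used for that coefficient."""
    return np.where(bq == 0, emc, bmc)

emc = np.random.randn(8, 8)                # enhancement-layer prediction
bmc = np.random.randn(8, 8)                # base-layer prediction
bq = np.random.randint(0, 2, (8, 8))       # quantized base-layer error
ep = select_prediction(emc, bmc, bq)       # prediction EP fed to adder 110
```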
Note that the technique disclosed in the reference cited above is based on the assumption that pictures having low resolutions are reconstructed in the base layer, and so low-frequency coefficients which are ¼ of the transform coefficients calculated by the DCT circuit 100 are separated and supplied to the base layer. As a consequence, the reliability of estimation for switching the prediction for each transform coefficient is decreased by an error produced by resolution conversion.
In this embodiment, on the other hand, the resolutions of the base layer and the enhancement layer are equal. Therefore, the embodiment is different from the technique disclosed in the reference cited above in that the accuracy of estimation is improved. A great advantage of the embodiment is a high image quality.
The configuration of the motion compensation prediction sections 200 and 201 used in the apparatus of the present invention will be described below.
FIG. 3A is a block diagram showing the configuration of the motion compensation prediction sections 200 and 201. Each of the motion compensation prediction sections 200 and 201 consists of an IDCT circuit 210, a frame memory 220, a motion compensation circuit 230, and a DCT circuit 240.
The IDCT circuit 210 restores the reconstructed picture signal by performing inverse orthogonal transform (IDCT) for the output from the adder 190 or 191. The frame memory 220 holds the reconstructed picture signal obtained by this inverse orthogonal transform, as a reference picture, in units of frames. The motion compensation circuit 230 extracts a picture in a position indicated by a motion vector in units of blocks from the picture signals (reference pictures) stored in the frame memory 220. The DCT circuit 240 performs orthogonal transform (DCT) for the extracted picture and outputs the result. Note that the motion vector is supplied from the motion vector detector 500. In this configuration, a reconstructed value in a transform coefficient domain is inversely transformed into the reconstructed picture signal by the IDCT circuit 210 and stored in the frame memory 220. The motion compensation circuit 230 extracts a picture in a position indicated by the motion vector in units of blocks from the reference pictures stored in the frame memory 220, and supplies the extracted picture to the DCT circuit 240. The DCT circuit 240 performs DCT for the supplied picture and outputs the result as a motion compensation prediction value in the DCT coefficient domain.
In this manner, the motion compensation prediction value in the DCT coefficient domain can be obtained.
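A compact sketch of the FIG. 3A pipeline under simplifying assumptions (whole-pel motion vectors, blocks aligned inside the frame, hypothetical class and method names):

```python
import numpy as np

def dct_matrix(n):
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

class MCPSection:
    """IDCT (210) -> frame memory (220) -> motion compensation (230)
    -> DCT (240), yielding a prediction in the coefficient domain."""

    def __init__(self, height: int, width: int, n: int = 8):
        self.n = n
        self.C = dct_matrix(n)
        self.frame = np.zeros((height, width))   # frame memory

    def store(self, y: int, x: int, coeffs: np.ndarray) -> None:
        """IDCT the reconstructed coefficients and keep the picture block."""
        self.frame[y:y + self.n, x:x + self.n] = self.C.T @ coeffs @ self.C

    def predict(self, y: int, x: int, mv: tuple) -> np.ndarray:
        """Extract the block displaced by motion vector mv and DCT it."""
        dy, dx = mv
        ref = self.frame[y + dy:y + dy + self.n, x + dx:x + dx + self.n]
        return self.C @ ref @ self.C.T
```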
The foregoing is the explanation of the encoding apparatus. The decoding apparatus will be described below.
FIG. 4 is a block diagram of the decoding apparatus according to the first embodiment of the present invention.
According to the present decoding apparatus, a buffer 162 on the enhancement layer side receives an encoded bit stream sent from the encoding apparatus. The output terminal of the buffer 162 is connected to a variable-length decoder 142 via a demultiplexer 152, and the output terminal of the variable-length decoder 142 is connected to a dequantizer 172. The output terminal of the dequantizer 172 is connected to a motion compensation prediction section 202 via adders 181 and 192. The output terminal of the motion compensation prediction section 202 is connected to the adder 192 via a selector 300.
A buffer 163 on the base layer side receives the encoded bit stream 23 sent from the encoding apparatus. The output terminal of the buffer 163 is connected to a variable-length decoder 143 via a demultiplexer 153. The output terminal of the variable-length decoder 143 is connected to a motion compensation prediction section 203 via a dequantizer 173 and an adder 193, and to the switch control terminal of the selector 300 via a binarizing circuit 310. The output terminal of the motion compensation prediction section 203 is connected to the adder 193 and the selector 300.
The input buffer 162, the demultiplexer 152, the variable-length decoder 142, the dequantizer 172, the adders 181 and 192, the selector 300, and the motion compensation prediction section 202 constitute an enhancement layer. The input buffer 163, the demultiplexer 153, the variable-length decoder 143, the dequantizer 173, the adder 193, the binarizing circuit 310, and the motion compensation prediction section 203 constitute a base layer.
The input buffer 162 in the enhancement layer receives and temporarily holds an encoded multiplexed bit stream 22 in the enhancement layer. The demultiplexer 152 demultiplexes the bit stream 22 obtained via the input buffer 162, i.e., demultiplexes the multiplexed signal into the original signals, thereby restoring encoded information of the side information and encoded information of a difference value output signal EC of a picture.
The variable-length decoder 142 performs variable-length decoding for the encoded signals demultiplexed by the demultiplexer 152 to thereby restore the original side information and the difference value output signal EC of the picture. On the basis of the information of a quantization scale Q_scale in the restored side information, the dequantizer 172 dequantizes the difference value output signal EC of the picture from the variable-length decoder 142 and outputs the dequantized signal. The adder 181 adds the dequantized signal and the dequantized output from the dequantizer 173 for the base layer.
The adder 192 adds the output from the adder 181 and the output EP from the selector 300 and outputs the sum. The motion compensation prediction section 202 receives the output from the adder 192 and the decoded difference value output signal EC of the picture, which is the output from the variable-length decoder 143 for the base layer, and obtains a motion compensation prediction value EMC. The output motion compensation prediction value EMC from the motion compensation prediction section 202 is used as an enhancement layer output 40 and as one input to the selector 300.
The selector 300 receives the output (motion compensation prediction value EMC) from the motion compensation prediction section 202 for the enhancement layer and the output from the motion compensation prediction section 203 for the base layer. In accordance with the output from the binarizing circuit 310, the selector 300 selectively outputs one of these two inputs.
The input buffer 163 receives and temporarily holds an encoded and multiplexed bit stream 23 for the base layer. The demultiplexer 153 demultiplexes the bit stream 23 obtained via the input buffer 163, i.e., demultiplexes the multiplexed signal into the original signals, thereby restoring encoded information of the side information and encoded information of the difference value output signal EC of the picture.
The variable-length decoder 143 performs variable-length decoding for the encoded signals demultiplexed by the demultiplexer 153 to thereby restore the original side information and the difference value output signal EC of the picture. On the basis of the information of the quantization scale Q_scale in the restored side information, the dequantizer 173 dequantizes the difference value output signal EC of the picture from the variable-length decoder 143 and supplies the dequantized signal to the adders 181 and 193. The adder 193 adds the dequantized signal and the motion compensation prediction value BMC supplied from the motion compensation prediction section 203 for the base layer.
The motion compensation prediction section 203 receives the output from the adder 193 and the motion compensation prediction value BMC, which is its output for the immediately previous frame, and obtains the motion compensation prediction value BMC of the current frame. The output motion compensation prediction value BMC from the motion compensation prediction section 203 is used as an output 41 of the base layer and as the other input to the selector 300.
The operation of the decoding apparatus with the above configuration will be described below. In this apparatus, the base layer bit stream 23 is supplied to the input buffer 163 and the enhancement layer bit stream 22 is supplied to the input buffer 162.
The input base layer bit stream 23 is stored in the input buffer 163 and supplied to the demultiplexer 153. The demultiplexer 153 demultiplexes the signal in accordance with the type of the signal. That is, the bit stream 23 is formed by multiplexing signals of the side information such as the quantized value of a transform coefficient, the motion vector, and the quantization scale. Upon receiving the bit stream 23, therefore, the demultiplexer 153 demultiplexes the bit stream into the original codes such as the quantized value of the transform coefficient, the motion vector, and the quantization scale Q_scale in the side information.
The codes demultiplexed by the demultiplexer 153 are supplied to the variable-length decoder 143 and decoded into signals of the quantized value of the transform coefficient, the motion vector, and the quantization scale Q_scale. Of the decoded signals, the motion vector is supplied to the motion compensation prediction section 203, and the quantized value of the transform coefficient and the quantization scale Q_scale are supplied to the dequantizer 173. The dequantizer 173 dequantizes the quantized value of the transform coefficient in accordance with the quantization scale Q_scale and supplies the dequantized transform coefficient to the adder 193.
The adder 193 adds the dequantized transform coefficient and the motion compensation prediction value in the transform coefficient domain supplied from the motion compensation prediction section 203, thereby calculating the reconstructed value in the transform coefficient domain.
This reconstructed value is supplied to the motion compensation prediction section 203. The configuration of the motion compensation prediction section 203 is as shown in FIG. 3B. That is, the reconstructed value supplied from the adder 193 is inversely orthogonally transformed by an IDCT circuit 210 in the motion compensation prediction section 203 and output as the reconstructed picture signal 41. The signal is also stored in a frame memory 220 in the motion compensation prediction section 203.
In the motion compensation prediction section 203, on the basis of the supplied motion vector described above, a motion compensation circuit 230 extracts a picture in a position indicated by the motion vector in units of blocks from the picture signals (reference pictures) stored in the frame memory 220. A DCT circuit 240 performs orthogonal transform (DCT) for the extracted picture and outputs the result as a transform coefficient output BMC to the adder 193 and the selector 300.
Meanwhile, the enhancement layer bit stream 22 is supplied to the enhancement layer. This bit stream 22 is stored in the enhancement layer input buffer 162 and supplied to the demultiplexer 152.
The demultiplexer 152 demultiplexes the bit stream 22. That is, the bit stream 22 is formed by multiplexing signals of the side information such as the quantized value of a transform coefficient, the motion vector, and the quantization scale Q_scale. Upon receiving the bit stream 22, therefore, the demultiplexer 152 demultiplexes the bit stream into the original codes such as the quantized value of the transform coefficient, the motion vector, and the quantization scale Q_scale.
The codes demultiplexed by the demultiplexer 152 are supplied to the variable-length decoder 142 and decoded into signals of the quantized value of the transform coefficient, the motion vector, and the like. Of the decoded signals, the motion vector is supplied to the motion compensation prediction section 202, and the quantized value of the transform coefficient and the quantization scale Q_scale are supplied to the dequantizer 172. The dequantizer 172 dequantizes the quantized value of the transform coefficient in correspondence with the quantization scale Q_scale and supplies the dequantized transform coefficient to the adder 181. The dequantized value is added to a dequantized value 31 of the base layer supplied from the dequantizer 173, and the sum is supplied to the adder 192.
The adder 192 adds the output from the adder 181 and a signal EP supplied from the selector 300 to thereby calculate the reconstructed value in the transform coefficient domain. This reconstructed value is supplied to the motion compensation prediction section 202. The configuration of the motion compensation prediction section 202 is as shown in FIG. 3B. That is, the reconstructed value supplied from the adder 192 is inversely orthogonally transformed by an IDCT circuit 210 in the motion compensation prediction section 202 and output as a reconstructed picture signal 40. The signal is also stored in a frame memory 220 in the motion compensation prediction section 202.
In the motion compensation prediction section 202, on the basis of the supplied motion vector described above, a motion compensation circuit 230 extracts a picture in a position indicated by the motion vector in units of blocks from the picture signals (reference pictures) stored in the frame memory 220. A DCT circuit 240 performs orthogonal transform (DCT) for the extracted picture and outputs the result as a transform coefficient output EMC to the adder 192 and the selector 300.
The selector 300 receives the decision result from the binarizing circuit 310 and selects one of BMC and EMC. That is, the binarizing circuit 310 receives an output BQ from the variable-length decoder 143 and decides whether the value is “0”. This decision result is supplied to the selector 300.
If the value of the output BQ from the variable-length decoder 143 is “0”, the selector 300 selects the transform coefficient output EMC from the motion compensation prediction section 202. If the value is not “0”, the selector 300 selects the transform coefficient output BMC from the motion compensation prediction section 203.
That is, the binarizing circuit 310 outputs “0” when the value of the output BQ from the variable-length decoder 143 in the base layer is “0” and outputs “1” when the value is not “0”. Therefore, the selector 300 is made to select EMC when the output from the binarizing circuit 310 is “0” and BMC when the output is “1”. Consequently, the transform coefficient output EMC from the motion compensation prediction section 202 in the enhancement layer is applied to a transform coefficient in a position where the output BQ from the variable-length decoder 143 is “0”, and the transform coefficient output BMC from the motion compensation prediction section 203 in the base layer is applied to a transform coefficient in a position where the output BQ from the variable-length decoder 143 is not “0”.
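This per-coefficient switching rule is compact enough to be sketched in code. The following Python fragment is a minimal illustration only; the names emc, bmc, and bq are assumptions standing for the two prediction outputs and the base-layer quantized values, not notation from the figures.

import numpy as np

def select_prediction(emc, bmc, bq):
    # Where the base-layer quantized value BQ is 0, the base-layer
    # prediction was correct, so the enhancement-layer prediction EMC
    # (smaller encoding deviation) is chosen; elsewhere BMC is chosen.
    return np.where(np.asarray(bq) == 0, emc, bmc)

# Example on one 8x8 block of transform coefficients.
rng = np.random.default_rng(0)
emc = rng.normal(size=(8, 8))
bmc = rng.normal(size=(8, 8))
bq = (rng.random((8, 8)) < 0.2).astype(int)
ep = select_prediction(emc, bmc, bq)   # the signal EP applied per coefficient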
The output from the variable-length decoder 143 in the base layer contains the motion compensation prediction error signal and the motion vector obtained on the encoding side. When the motion compensation prediction error signal and the motion vector are supplied to the motion compensation prediction section 203, the motion compensation prediction section 203 obtains the motion compensation prediction error between the picture of the immediately previous frame and the current picture.
Meanwhile, the binarizing circuit 310 receives the decoded base layer motion compensation prediction error signal from the variable-length decoder 143. If the signal value is “0”, the binarizing circuit 310 outputs “0” to the selector 300. If the signal value is not “0”, the binarizing circuit 310 outputs “1” to the selector 300.
If the output from the binarizing circuit 310 is “0”, the selector 300 selects the output EMC with a smaller encoding deviation from the enhancement layer motion compensation prediction section 202. If the output from the binarizing circuit 310 is “1”, the selector 300 selects the transform coefficient output BMC with a larger encoding deviation from the base layer motion compensation prediction section 203.
Eventually, if the DCT coefficient error obtained by the base layer motion compensation prediction is “0”, the output from the motion compensation prediction section 202, which is the reconstructed value of the transform coefficient output EMC from the enhancement layer motion compensation prediction section 200, is selected. If the error is not “0”, the output from the motion compensation prediction section 203, which is the reconstructed value of the transform coefficient output BMC from the base layer motion compensation prediction section 201, is selected.
This processing is analogous to the processing in the encoding apparatus. Accordingly, as the transform coefficient output of motion compensation prediction in the enhancement layer, as in the selection done on the encoding side, an output for the base layer is used in a portion where motion compensation prediction is incorrect, and an output for the enhancement layer with a smaller encoding deviation is used in a portion where the prediction is correct. Consequently, by following this switching, the decoding apparatus can smoothly reconstruct pictures.
In the first embodiment described above, each frame of a video picture is divided into matrix blocks each having a predetermined number (N×N) of pixels and orthogonally transformed to obtain transform coefficients of individual spacial frequency bands. For each of the N×N transform coefficients thus obtained, motion compensation is performed in the domain of the transform coefficient in upper and lower layers. When motion compensation is to be performed in this video encoding, whether motion compensation prediction is correct is checked on the basis of an already decoded and quantized value in the lower layer (base layer). If the motion compensation prediction is correct, the upper layer (enhancement layer) is encoded by using a motion compensation prediction value with a smaller encoding deviation obtained for the upper layer. If the motion compensation prediction is incorrect, the upper layer is encoded by using a motion compensation prediction value obtained for the lower layer (base layer) and having a larger encoding deviation than that for the enhancement layer. This improves the coding efficiency of a coefficient the motion compensation prediction for which is incorrect and thereby realizes an encoding system capable of encoding with little decrease in the coding efficiency.
The foregoing is an embodiment in which a whole video picture is efficiently encoded in the scalable encoding method. An embodiment in which the present invention is applied to arbitrary-shape picture encoding, by which a background and an object in a video picture are separately encoded, will be described below. This second embodiment of the present invention will be described with reference to FIGS. 5, 6A, 6B, and 7. In this embodiment, the technique of the first embodiment is applied to pictures having arbitrary shapes represented by alpha-map signals.
FIG. 5 shows an encoding apparatus of the present invention as the second embodiment. The basic configuration of this encoding apparatus is the same as the encoding apparatus explained in the first embodiment. Accordingly, the same reference numerals as in the configurations shown in FIGS. 1 and 4 denote the same parts, and a detailed description thereof will be omitted.
This configuration differs from FIG. 1 in eight points; that is, an arbitrary shape orthogonal transform circuit 101 is provided instead of the DCT circuit 100, inputs are received via a frame memory 700, an encoding controller 420 is provided instead of the encoding controller 400, an encoding controller 430 is provided instead of the encoding controller 410, a motion compensation prediction section 600 is provided instead of the motion compensation prediction section 200 for an enhancement layer, a motion compensation prediction section 601 is provided instead of the motion compensation prediction section 201, a motion vector detector 510 is provided instead of the motion vector detector 500, and a multiplexer 155 is provided instead of the multiplexer 151.
The frame memory 700 temporarily holds an input picture signal in units of frames. The arbitrary shape orthogonal transform circuit 101 extracts an object region from the pictures stored in the frame memory 700 by referring to a separately supplied alpha-map. The circuit 101 divides the rectangle region including the object region into blocks of a predetermined pixel size and performs DCT for each block.
The encoding controller 420 refers to the alpha-map and generates side information and a quantization scale Q_scale, which gives the enhancement layer an optimum quantization scale for the output buffer capacity information from an output buffer 160. The encoding controller 430 likewise refers to the alpha-map and generates side information and a quantization scale Q_scale, which gives the base layer an optimum quantization scale for the output buffer capacity information from an output buffer 161.
The motion compensation prediction section 600 refers to the alpha-map and performs motion compensation prediction for a picture in the interest region part on the basis of a reconstructed value in a transform coefficient domain supplied from an adder 190 and a reconstructed value in an immediately previous frame. The motion compensation prediction section 601 refers to the alpha-map and performs motion compensation prediction for the picture in the interest region part on the basis of a reconstructed value in the transform coefficient domain supplied from an adder 191 and the reconstructed value in the immediately previous frame.
The motion vector detector 510 refers to the alpha-map and detects a motion vector in the picture in the interest region part from the pictures stored in the frame memory 700.
The multiplexer 155 is provided for the base layer. The multiplexer 155 multiplexes a variable-length code of a prediction error signal from a variable-length encoder 141, a variable-length code of side information such as mode information containing quantization scale information, a variable-length code of a motion vector, and a code (alpha-code) of a separately supplied alpha-map, and supplies the multiplexed signal to the output buffer 161.
In this apparatus with the above configuration, an input picture signal 10 is temporarily stored in the frame memory 700 and read out to the arbitrary shape orthogonal transform circuit 101 and the motion vector detector 510. In addition to the picture signal 10, an alpha-map signal 50, which is a map information signal for distinguishing a background portion from an object portion in a picture, is input to the arbitrary shape orthogonal transform circuit 101.
This alpha-map signal can be acquired by applying, e.g., a chromakey technique. For example, in the case of an alpha-map for distinguishing a person (object) from a background, the image of the person is taken by the chromakey technique and binarized to obtain a bit-map picture in which the person image region is “1” and the background region is “0”. This picture can be used as an alpha-map.
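By way of illustration, such a binarized alpha-map could be produced as in the following Python sketch; the key color, distance measure, and threshold are assumptions for the example, not part of the embodiment.

import numpy as np

def alpha_map_from_chromakey(frame_rgb, key_rgb, threshold=60.0):
    # Pixels far from the key color are labeled object ("1"), pixels
    # near it are labeled background ("0"), giving a bit-map alpha-map.
    diff = frame_rgb.astype(float) - np.asarray(key_rgb, dtype=float)
    distance = np.sqrt((diff ** 2).sum(axis=-1))
    return (distance > threshold).astype(np.uint8)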
The arbitrary-shape orthogonal transform circuit 101 refers to this alpha-map signal, checks where the object region of the picture is, divides the rectangle region including the object region into square blocks each consisting of N×N pixels, and orthogonally transforms each block to obtain N×N transform coefficients. As a technique for orthogonally transforming an arbitrary-shape region of a picture by using an alpha-map, it is only necessary to use the technique established by the present inventors and disclosed in the above-mentioned Japanese Patent Application No. 7-97073, which has already been filed.
Although the explanation of the operation of the encoding apparatus according to the second embodiment is not yet finished, processing for decreasing a step will be described below as a modification.
In the method of the second embodiment described above, the average value of the object is arranged in the background. In addition to this processing, if pixel values of the object are compressed around the average value by a predetermined scaling coefficient, the step of a pixel value in the boundary between the object and the background can be decreased. Details of this processing will be described below.
To decrease the step of a pixel value in the boundary between the object and the background, pixel values of the object are compressed around the average value by a predetermined scaling coefficient. Examples of the method are illustrated in FIGS. 16 and 17. Although actual pictures are two-dimensional signals, one-dimensional signals are shown for simplicity. In these drawings, a pixel value is plotted on the ordinate, and a pixel row is plotted on the abscissa. The left-hand side of the position of a pixel row e is an object region, and the right-hand side is a background region. FIG. 16 shows a state in which a pixel value average value a of the object is arranged in the background portion.
FIG. 17 shows the result of compression around the pixel value average value a. Assuming that the luminance before the compression is x and the luminance after the compression is y, the luminance y after the compression can be represented by
y = c(x − a) + a
where c is a constant between “0” and “1”.
Compression can be performed for all pixels x1 to x23 in an object shown in FIG. 18. However, the step can also be decreased by compressing only the pixels x1 to x8 in the object close to the boundary to the background portion. Although an additional arithmetic operation is necessary to check whether a pixel is close to the boundary, the method has the advantage that the pixels x9 to x23 in the object not in contact with the boundary to the background are kept unchanged.
In this modification, it is decided that, of pixels in the object, those in contact with the background in any of the upper, lower, left, and right portions are pixels close to the boundary.
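A minimal Python sketch of this compression, including the four-neighbor boundary test used in the modification, might read as follows; the helper names and the choice c = 0.5 are illustrative assumptions.

import numpy as np

def boundary_pixels(alpha):
    # Object pixels with a background pixel in any of the upper, lower,
    # left, or right neighbor positions (picture edges count as background).
    obj = alpha.astype(bool)
    p = np.pad(obj, 1, constant_values=False)
    touches_bg = (~p[:-2, 1:-1] | ~p[2:, 1:-1] | ~p[1:-1, :-2] | ~p[1:-1, 2:])
    return obj & touches_bg

def compress_object(picture, alpha, c=0.5, boundary_only=True):
    # Compress object pixel values around the object average a by the
    # scaling coefficient c: y = c*(x - a) + a.
    out = picture.astype(float).copy()
    obj = alpha.astype(bool)
    a = out[obj].mean()
    target = boundary_pixels(alpha) if boundary_only else obj
    out[target] = c * (out[target] - a) + a
    return out, a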
The foregoing is the modification by which pixel values in the object are compressed around the average value by a predetermined scaling coefficient in order to decrease the step of a pixel value in the boundary between the object and the background. However, the step of a pixel value in the boundary between the object and the background can also be decreased by processing the background portion. This modification will be described below.
FIG. 19 shows the modification of processing the background portion. In this modification, of pixels in the background, the values of pixels close to the object region are so altered as to decrease the step. A practical example is shown in FIG. 20. Referring to FIG. 20, xn (n = 1 to 23) indicates a pixel value in the object and an (n = 1 to 41) indicates a pixel value in the background. Before the processing, all pixel values an in the background are equal to a pixel average value a.
First, the values of background pixels a1 to a9 in contact with pixels in the object region in any of the upper, lower, left, and right portions are replaced with average values of the values of the contacting object pixels xn and their own pixel values. For example, the background pixel a1 is replaced with “(a1+x1)/2”, and the background pixel a3 is replaced with “(a3+x2+x3)/3”.
Subsequently, the background pixels a10 to a17 in contact with the background pixels a1 to a9 are similarly replaced. As an example, a10 is replaced with “(a10+a1)/2”. As the background pixel a1, the previously replaced value is used.
Likewise, the background pixels a18 to a24 are sequentially replaced. As a consequence, the steps of pixel values in the boundary between the object and the background are decreased.
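The layer-by-layer replacement can be sketched as follows; the number of propagation layers is an illustrative assumption, while the averaging rule (own value averaged with already-settled neighbors) follows the examples above.

import numpy as np

def smooth_background(picture, alpha, n_layers=3):
    out = picture.astype(float).copy()
    settled = alpha.astype(bool)          # object pixels are already settled
    H, W = out.shape
    for _ in range(n_layers):
        newly_settled = settled.copy()
        for i in range(H):
            for j in range(W):
                if settled[i, j]:
                    continue
                vals = [out[i, j]]
                for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ni, nj = i + di, j + dj
                    if 0 <= ni < H and 0 <= nj < W and settled[ni, nj]:
                        vals.append(out[ni, nj])
                if len(vals) > 1:         # touches the already-settled region
                    out[i, j] = sum(vals) / len(vals)
                    newly_settled[i, j] = True
        settled = newly_settled
    return out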
When pixel values are altered as described above, the average value of the block changes. Therefore, pixel values in the background portion can also be corrected so that the original block average value remains unchanged. This correction is to add or subtract a predetermined value to or from all pixels in the background or to alter pixel values in the background far from the boundary in a direction opposite to the luminance direction in which pixel values close to the boundary are altered.
When pixel values in the object are altered, a picture close to an input picture can be obtained by restoring the portion before the alteration after the picture is decoded. For example, in the above method in which compression is performed around the average value, alteration is done as follows, assuming that a decoded value of a pixel compressed in encoding is yd and a pixel value after the alteration is xd:
xd = (yd − ad)/c + ad
where ad is the average value of the object or the background of the decoded picture or the average value of the whole block. Although yd often takes a value somewhat different from y due to an encoding/decoding distortion, xd close to x can be obtained by this alteration.
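A sketch of this decoder-side restoration, under the same assumptions as the compression sketch above (the constant c must match the encoder's, and the function name is hypothetical):

import numpy as np

def restore_object(decoded, target_mask, a_d, c=0.5):
    # Inverse of y = c*(x - a) + a applied to the decoded values yd:
    # xd = (yd - ad)/c + ad, only on the pixels altered at the encoder.
    out = decoded.astype(float).copy()
    m = target_mask.astype(bool)
    out[m] = (out[m] - a_d) / c + a_d
    return out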
According to the encoding apparatus of the above modifications, as shown in FIG. 21, a picture signal is input to the input terminal of a switch 2003, one output terminal of which is connected to a switch 2009, a compressor 2010, and an average circuit 2011. The output terminal of the compressor 2010 is connected to the other input terminal of the switch 2009. The output terminal of the switch 2009 is connected to the switch 2005. The output terminal of the average circuit 2011 is connected to the compressor 2010 and the switch 2005.
A decision circuit 2004 receives an alpha-map signal, and the output terminal of the decision circuit 2004 is connected to the control terminal of the switch 2009. An encoder 2006 receives the alpha-map signal, and a DCT circuit 2017 receives a signal selected by the switch 2005.
In the encoding apparatus constructed as described above, the encoder 2006 encodes an externally input alpha-map signal 2001. The switch 2003 receives the alpha-map signal 2001 and a picture signal 2002. On the basis of the alpha-map signal 2001, the switch 2003 divides the picture signal 2002 into an object picture 2007 and a background picture 2008 and discards the background picture 2008. “Discarding” does not necessarily mean “sending the picture to some other place” but simply means that the picture is left unused after that.
The decision circuit 2004 decides on the basis of the alpha-map signal 2001 whether an interest pixel, which is a pixel currently being processed in the object picture 2007 supplied via the switch 2003, is in contact with the background. The decision circuit 2004 supplies a decision result 2013 to the switch 2009.
The average circuit 2011 calculates an average value 2012 of the object picture 2007 supplied via the switch 2003 and outputs the average value 2012 to the compressor 2010 and the switch 2005. The compressor 2010 compresses the amplitude of the object picture 2007 around the average value 2012 to obtain a compressed picture 2014 and outputs the compressed picture 2014 to the switch 2009.
The switch 2009 receives the compressed picture 2014 and the object picture 2007 from the switch 2003 and refers to the decision result 2013 from the decision circuit 2004. If the interest pixel is a pixel in a portion in contact with the background, the switch 2009 selectively outputs the compressed picture 2014 as an encoded picture 2015 of the object. If the interest pixel is not in contact with the background, the switch 2009 selectively outputs the object picture 2007 as the encoded picture 2015 of the object.
The switch 2005 receives the alpha-map signal 2001, the encoded picture 2015 of the object supplied from the switch 2009, and the average value 2012 calculated by the average circuit 2011. On the basis of the input alpha-map signal 2001, if the interest pixel, which is a pixel currently being processed, is the object, the switch 2005 selectively outputs the encoded picture 2015 of the object as an encoded picture 2016. If the interest pixel is the background, the switch 2005 selectively outputs the average value 2012 as the encoded picture 2016.
The DCT circuit 2017 performs DCT for the output encoded picture 2016 from the switch 2005 and outputs a transform coefficient 2018.
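Putting the pieces of FIG. 21 together, the per-block path from the alpha-map to the transform coefficient 2018 might be sketched as below; dct2 is a plain orthonormal 2-D DCT standing in for the DCT circuit 2017, and boundary_pixels is the hypothetical helper from the earlier sketch.

import numpy as np

def dct2(block):
    # Orthonormal 2-D DCT-II of a square block.
    N = block.shape[0]
    n = np.arange(N)
    C = np.cos(np.pi * (2 * n + 1) * n.reshape(-1, 1) / (2 * N)) * np.sqrt(2.0 / N)
    C[0] /= np.sqrt(2.0)
    return C @ block @ C.T

def encode_block(picture, alpha, c=0.5):
    # Split object/background by the alpha-map, fill the background with
    # the object average, compress boundary object pixels around that
    # average, then transform the whole block.
    obj = alpha.astype(bool)
    a = picture[obj].mean()
    coded = np.full(picture.shape, a, dtype=float)   # background := average
    coded[obj] = picture[obj].astype(float)
    b = boundary_pixels(alpha)
    coded[b] = c * (coded[b] - a) + a
    return dct2(coded)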
In the encoding apparatus with the above configuration, the alpha-map signal 2001 and the picture signal 2002 are externally input. The alpha-map signal 2001 is supplied to the switch 2003, the decision circuit 2004, the switch 2005, and the encoder 2006. The picture signal 2002 is supplied to the switch 2003.
On the basis of the input alpha-map signal 2001, the switch 2003 divides the picture signal 2002 into the object picture 2007 and the background picture 2008 and discards the background picture 2008. As described previously, “discarding” does not necessarily mean “sending the picture to some other place” but simply means that the picture is left unused after that.
The object picture 2007 separated by the switch 2003 is supplied to the switch 2009, the compressor 2010, and the average circuit 2011. The average circuit 2011 calculates the average value 2012 of the object picture and supplies the average value 2012 to the compressor 2010 and the switch 2005. The compressor 2010 compresses the amplitude of the object picture 2007 around the average value 2012 and supplies the compressed picture 2014 obtained by this compression to the switch 2009.
The decision circuit 2004, which has received the alpha-map signal 2001, decides whether an interest pixel, which is a pixel currently being processed, is in contact with the background, and supplies the decision result 2013 to the switch 2009. If the decision result 2013 from the decision circuit 2004 indicates that the interest pixel is in contact with the background, the switch 2009 selectively outputs the compressed picture 2014 as the encoded picture 2015 of the object. If the interest pixel is not in contact with the background, the switch 2009 selectively outputs the object picture 2007 as the encoded picture 2015 of the object.
The output encoded picture 2015 of the object from the switch 2009 is supplied to the switch 2005. The switch 2005 refers to the alpha-map signal 2001 and, if the interest pixel is the object, selectively outputs the encoded picture 2015 of the object as the encoded picture 2016. If the interest pixel is the background, the switch 2005 selectively outputs the average value 2012 as the encoded picture 2016.
The encoded picture 2016 output from the switch 2005 is supplied to the DCT circuit 2017. The DCT circuit 2017 performs DCT for the encoded picture 2016 to obtain the transform coefficient 2018 and outputs the transform coefficient 2018 to the outside. The alpha-map signal 2001 is encoded by the encoder 2006 and output to the outside as an alpha-code 2019.
Note that there is another method in which an alpha-map is encoded before a picture is encoded and the decoded signal is input to the switches 2003 and 2005 and the decision circuit 2004. If a distortion occurs in encoding and decoding of an alpha-map, the alpha-map signals on the encoding and decoding sides can be made equal by this method.
FIG. 22 shows a decoding apparatus as a counterpart of the encoding apparatus in FIG. 21. According to this decoding apparatus, a decoder 2020 receives an encoded alpha-map signal. The output terminal of the decoder 2020 is connected to control terminals of switches 2023 and 2025 and a decision circuit 2024. An inverse DCT circuit 2021 receives the transform coefficient 2018 of the encoded picture. The output terminal of the inverse DCT circuit 2021 is connected to a decompressor 2030 and an average circuit 2031. The output terminal of the decompressor 2030 is connected to a switch 2029 together with the output terminal of the switch 2023. The switch 2029 is connected to a switch 2025.
In the decoding apparatus as described above, the decoder 2020 receives the externally input alpha-code 2019, decodes the alpha-code 2019, and supplies a decoded alpha-map signal 2022 to the switch 2023, the decision circuit 2024, and the switch 2025. The inverse DCT circuit 2021 performs inverse DCT for the externally input transform coefficient 2018 to decode a picture and supplies the picture obtained by the decoding, i.e., a decoded picture 2026, to the switch 2023.
The decision circuit 2024 decides on the basis of the alpha-map signal 2022 decoded by the decoder 2020 whether an interest pixel in an object picture 2027 is in contact with the background. The decision circuit 2024 outputs a decision result 2034 to the switch 2029.
On the basis of the alpha-map signal 2022 decoded by the decoder 2020, the switch 2023 divides the decoded picture 2026 supplied from the inverse DCT circuit 2021 into the object picture 2027 and a background picture 2028. The switch 2023 outputs the object picture 2027 to the switch 2029, the decompressor 2030, and the average circuit 2031 and discards the background picture 2028.
The average circuit 2031 calculates an average value 2032 of the object picture 2027 supplied from the switch 2023 and outputs the average value 2032 to the decompressor 2030. The decompressor 2030 expands the amplitude of the object picture 2027 around the average value 2032 to obtain an expanded picture 2033 and outputs the expanded picture 2033 to the switch 2029.
Of the object picture 2027 and the expanded picture 2033 thus input, the expanded picture 2033 is selectively output as a decoded picture 2035 of the object from the switch 2029 to the switch 2025 if the output decision result 2034 from the decision circuit 2024 indicates that the interest pixel is in contact with the background. If the interest pixel is not in contact with the background, the switch 2029 selectively outputs the object picture 2027 as the decoded picture 2035 of the object to the switch 2025.
The switch 2025 receives the decoded picture 2035 of the object and a signal 2037, which is separately input as the background, and refers to the alpha-map signal 2022. If the interest pixel is the object, the switch 2025 selectively outputs the decoded picture 2035 of the object as a reconstructed picture 2036 to the outside. If the interest pixel is the background, the switch 2025 selectively outputs the signal 2037 as the reconstructed picture 2036 to the outside.
In the decoding apparatus with the above configuration, the alpha-code 2019 and the transform coefficient 2018 are externally input. The alpha-code 2019 is supplied to the decoder 2020, and the transform coefficient 2018 is supplied to the inverse DCT circuit 2021.
The decoder 2020 decodes the alpha-code into the alpha-map signal 2022 and outputs the decoded signal to the switch 2023, the decision circuit 2024, and the switch 2025. The inverse DCT circuit 2021 decodes the picture and supplies the decoded picture 2026 to the switch 2023.
On the basis of the alpha-map signal 2022 decoded by the decoder 2020, the switch 2023 divides the decoded picture 2026 into the object picture 2027 and the background picture 2028 and discards the background picture 2028. The object picture 2027 separated by the switch 2023 is supplied to the switch 2029, the decompressor 2030, and the average circuit 2031.
The average circuit 2031 calculates the average value 2032 of the object picture 2027 and supplies the average value 2032 to the decompressor 2030.
The decompressor 2030 expands the amplitude of the object picture 2027 around the average value 2032 and supplies the expanded picture 2033 thus obtained to the switch 2029.
The decision circuit 2024 decides whether an interest pixel in the object picture 2027 is in contact with the background and supplies the decision result 2034 to the switch 2029. If the decision result 2034 indicates that the interest pixel is in contact with the background, the switch 2029 selectively outputs the expanded picture 2033 as the decoded picture 2035 of the object. If the interest pixel is not in contact with the background, the switch 2029 selectively outputs the object picture 2027 as the decoded picture 2035 of the object.
The output decoded picture 2035 of the object from the switch 2029 is supplied to the switch 2025. The switch 2025 refers to the alpha-map signal 2022 and, if the interest pixel, which is a pixel currently being processed, is the object, outputs the decoded picture 2035 of the object as the reconstructed picture 2036 to the outside. If the interest pixel is the background, the switch 2025 selectively outputs the signal 2037, which is separately input as the background, as the reconstructed picture 2036. Note that a reconstructed signal of a background picture which is separately encoded or a predetermined pattern is used as the background signal 2037.
The foregoing are examples of the processing of decreasing the step. The description will now return to the subject of the second embodiment.
As already described above, the arbitrary shape orthogonal transform circuit 101 refers to the alpha-map signal, checks the interest region of the picture, divides the interest region of the picture into square blocks each consisting of N×N pixels, and orthogonally transforms each block to obtain N×N transform coefficients.
For a block containing the boundary between the object and the background, a transform coefficient for the object and a transform coefficient for the background are calculated. These transform coefficients are supplied to adders 110 and 111 for the enhancement layer and the base layer, respectively.
Upon receiving the transform coefficient, the adder 111 of the base layer calculates a prediction error signal between this transform coefficient and a motion compensation prediction value (BMC) which is converted into an orthogonal transform coefficient and supplied from the motion compensation prediction section 601. The adder 111 supplies the result to a quantizer 131. The quantizer 131 quantizes the prediction error signal in accordance with the quantization scale Q_scale supplied from the encoding controller 430 and supplies the quantized signal to the variable-length encoder 141 and a dequantizer 171.
The variable-length encoder 141 performs variable-length encoding for the quantized prediction error signal. The variable-length encoder 141 also performs variable-length encoding for the side information such as the mode information containing the quantization scale information supplied from the encoding controller 430 and the motion vector supplied from the motion vector detector 510.
These variable-length codes obtained by the variable-length encoder 141 are supplied to the multiplexer 155. The multiplexer 155 multiplexes these variable-length codes together with an alpha-code 55 which is encoded and supplied to the multiplexer 155. The multiplexer 155 outputs the multiplexed signal to the output buffer 161.
The output buffer 161 outputs the multiplexed signal as an encoded bit stream 21 to a transmission line or a storage medium and also feeds the capacity of the buffer back to the encoding controller 430. In accordance with this buffer capacity, the encoding controller 430 generates the optimum quantization scale Q_scale.
The quantized value of the prediction error signal supplied to the dequantizer 171 is dequantized by the dequantizer 171. The adder 191 adds the dequantized value to the motion compensation prediction value (BMC), thereby calculating a reconstructed value in the transform coefficient domain. The reconstructed value is supplied to the motion compensation prediction section 601.
In the enhancement layer, a selector 300 performs selection on the basis of the value of an output (BQ) from the quantizer 131 in the base layer. That is, the selector 300 adaptively switches the output (EMC) from the motion compensation prediction section 600 in the enhancement layer and the output (BMC) from the motion compensation prediction section 601 in the base layer for each transform coefficient by using a method to be described later and outputs the selected input as EP.
More specifically, the output (BQ) from the quantizer 131 in the base layer is supplied to a binarizing circuit 310. If the value of BQ is “0”, the binarizing circuit 310 outputs “0” to the selector 300. If the value is not “0”, the binarizing circuit 310 outputs “1” to the selector 300.
If the output from the binarizing circuit 310 is “0”, the selector 300 selectively outputs EMC as EP. If the output is “1”, the selector 300 selectively outputs BMC as EP. Consequently, the transform coefficient output EMC from the motion compensation prediction section 600 in the enhancement layer is applied to a transform coefficient in a position where the output BQ from the quantizer 131 is “0”, and the transform coefficient output BMC from the motion compensation prediction section 601 in the base layer is applied to a transform coefficient in a position where the output BQ from the quantizer 131 is not “0”.
The quantizer 131 in the base layer receives and quantizes the output from the adder 111. The adder 111 receives the output from the arbitrary-shape orthogonal transform circuit 101 and the motion compensation prediction value obtained from a picture in an immediately previous frame by the motion compensation prediction section 601 and calculates the difference between them. Therefore, if the motion compensation prediction value is correct, the difference between the two values output from the adder 111 is “0”.
Accordingly, of the quantized values as the output BQ from the quantizer 131 in the base layer, coefficients whose values are not “0” are coefficients representing that the motion compensation prediction is incorrect.
If the motion compensation prediction section 600 performs motion compensation prediction by using the same motion vector as in the base layer supplied from the motion vector detector 510, it is estimated that motion compensation prediction for coefficients in the enhancement layer in the same positions as in the base layer is incorrect.
For these coefficients, therefore, the selector 300 selects BMC as the output from the motion compensation prediction section 601 for the base layer.
On the other hand, since it is estimated that motion compensation for other coefficients is correct, the selector 300 selects the prediction value in the enhancement layer with a smaller encoding distortion. Consequently, the signal EC encoded in the enhancement layer is the quantized error signal of the base layer if motion compensation prediction is incorrect, and is the motion compensation prediction error signal of the enhancement layer if motion compensation prediction is correct. This improves the efficiency of encoding for coefficients the motion compensation prediction for which is incorrect.
The adder 110 in the enhancement layer calculates a prediction error signal between the transform coefficient of the input picture supplied from the arbitrary-shape orthogonal transform circuit 101 and the output (EP) from the selector 300 and supplies the result to an adder 121.
The adder 121 receives a dequantized value 30 of BQ supplied from the dequantizer 171. Accordingly, the adder 121 calculates the difference EC between the value 30 and the output from the adder 110 and supplies the difference EC as a prediction error signal to a quantizer 130.
The quantizer 130 quantizes the signal EC by using the quantization scale Q_scale supplied from the encoding controller 420 in accordance with the buffer capacity. The quantizer 130 supplies the quantized output to a variable-length encoder 140 and a dequantizer 170. The variable-length encoder 140 separately performs variable-length encoding for the quantized prediction error signal and the side information such as the mode information supplied from the encoding controller 420 and supplies the variable-length codes to a multiplexer 150.
The multiplexer 150 multiplexes these variable-length codes and supplies the multiplexed signal to the output buffer 160. The output buffer 160 temporarily holds the signal and outputs the signal as an encoded bit stream 20 to a transmission line or a storage medium. Also, the output buffer 160 feeds the capacity of the buffer back to the encoding controller 420. Upon receiving the buffer capacity, the encoding controller 420 generates the optimum quantization scale Q_scale corresponding to the capacity and supplies the quantization scale Q_scale to the quantizer 130 and the variable-length encoder 140.
The quantized value supplied from the quantizer 130 to the dequantizer 170 is dequantized. An adder 180 adds the dequantized value to the output 30 supplied from the dequantizer 171 in the base layer, thereby reconstructing the prediction error signal.
The adder 190 adds the prediction error signal reconstructed by the adder 180 to the motion compensation prediction value (EP) supplied from the selector 300, calculating a reconstructed value in the transform coefficient domain. The reconstructed value is supplied to the motion compensation prediction section 600.
FIG. 6A is a block diagram of the motion compensation prediction sections 600 and 601. Each of the motion compensation prediction sections 600 and 601 comprises an arbitrary-shape inverse orthogonal transform circuit 610, a frame memory 620, a motion compensation circuit 630, and an arbitrary shape orthogonal transform circuit 640. The arbitrary shape inverse orthogonal transform circuit 610 inversely orthogonally transforms a reconstructed picture signal as an input signal in accordance with an alpha-map signal. The frame memory 620 temporarily holds the inversely orthogonally transformed signal in units of frames. The motion compensation circuit 630 receives the information of a motion vector, extracts a picture in a position indicated by the motion vector in units of blocks, and supplies the extracted picture to the arbitrary-shape orthogonal transform circuit 640. The arbitrary-shape orthogonal transform circuit 640 orthogonally transforms the supplied picture in accordance with the alpha-map signal. In other words, the arbitrary-shape orthogonal transform circuit 640 orthogonally transforms the motion compensation prediction value of an arbitrary shape, thereby obtaining a motion compensation prediction value in a transform coefficient domain.
In this configuration, a reconstructed value in a transform coefficient domain is supplied to the motion compensation prediction sections 600 and 601. The arbitrary-shape inverse orthogonal transform circuit 610 in this motion compensation prediction section inversely transforms the reconstructed value into a reconstructed picture signal in accordance with the alpha-map signal 50 which is separately supplied. The reconstructed picture signal is stored in the frame memory 620.
Of the reference pictures stored in the frame memory 620, a picture in a position indicated by the motion vector is extracted by the motion compensation circuit 630 in the motion compensation prediction section in units of blocks, and the extracted picture is supplied to the arbitrary shape orthogonal transform circuit 640 in the motion compensation prediction section. Upon receiving the blocks of the picture, the arbitrary shape orthogonal transform circuit 640 orthogonally transforms the picture blocks in accordance with the externally supplied alpha-map signal 50, thereby orthogonally transforming the motion compensation prediction value of an arbitrary shape. Consequently, the arbitrary shape orthogonal transform circuit 640 can calculate and output the motion compensation prediction value in the transform coefficient domain. In a block containing the boundary between the object and the background, transform coefficients of both the object and the background are calculated.
From the reconstructed value in the transform coefficient domain, the motion compensation prediction sections 600 and 601 calculate the motion compensation prediction values EMC and BMC in the transform coefficient domain and supply these values to the selector 300.
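In outline, the data path of each of these sections can be sketched as below; the transform callable, the block addressing, and the names are placeholders standing in for the circuits, not a real implementation (reference_frame is assumed to be a 2-D numpy array).

def mc_predict_in_coef_domain(reference_frame, block_pos, motion_vector,
                              alpha_block, fwd_transform):
    # Extract the block indicated by the motion vector from the stored
    # reference picture, then apply the arbitrary-shape orthogonal
    # transform (a stand-in for the circuit 640) per the alpha-map to get
    # the motion compensation prediction value in the coefficient domain.
    (i, j), (dy, dx) = block_pos, motion_vector
    N = alpha_block.shape[0]
    ref_block = reference_frame[i + dy:i + dy + N, j + dx:j + dx + N]
    return fwd_transform(ref_block, alpha_block)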
The foregoing is the explanation of the encoding apparatus of the second embodiment. The decoding apparatus of the second embodiment will be described below.
FIG. 7 is a block diagram of the decoding apparatus of the present invention.
This configuration differs from FIG. 4 in three points; that is, a motion compensation prediction section 602 is provided instead of the motion compensation prediction section 202, a motion compensation prediction section 603 is provided instead of the motion compensation prediction section 203, and a demultiplexer 157 is provided instead of the demultiplexer 153.
Each of the motion compensation prediction sections 602 and 603 performs motion compensation prediction by referring to an alpha-map signal. The demultiplexer 153 demultiplexes a quantized value of a transform coefficient, a motion vector, and side information such as a quantization scale and transfers the demultiplexed signals to the variable-length decoder 143. The demultiplexer 157 additionally has a function of demultiplexing an alpha-code and transferring the demultiplexed codes to an alpha-map decoding apparatus (not shown).
In this configuration, a base layer bit stream 23, which is formed by encoding and multiplexing a quantized value of a transform coefficient, a motion vector, side information such as a quantization scale, and an alpha-code, is input to the input stage of the base layer. This bit stream 23 is stored in an input buffer 167 and supplied to the demultiplexer 157.
The demultiplexer 157 demultiplexes the bit stream into the quantized value of the transform coefficient, the motion vector, the side information, and the alpha-code. Of these demultiplexed codes, the quantized value of the transform coefficient, the motion vector, and the side information are supplied to a variable-length decoder 143 and decoded into signals of the quantized value of the transform coefficient, the motion vector, and the quantization scale. Note that a code (alpha-code) 56 of an alpha-map signal is supplied to an alpha-map decoding apparatus (not shown) where the code is converted into the alpha-map signal, and the signal is supplied to the motion compensation prediction sections 602 and 603.
Of the signals decoded by the variable-length decoder 143, the quantized value of the transform coefficient is dequantized by a dequantizer 173 and supplied to an adder 193. The adder 193 adds the dequantized transform coefficient and a motion compensation prediction value in a transform coefficient domain supplied from the motion compensation prediction section 603, thereby calculating a reconstructed value in the transform coefficient domain.
This reconstructed value is supplied to the motion compensation prediction section 603 and inversely transformed into a reconstructed picture signal by an arbitrary-shape inverse orthogonal transform circuit 610. The signal is output as an output reconstructed picture signal 41 and stored in a frame memory 620 (FIG. 6B).
An enhancement layer bit stream 22 formed by encoding and multiplexing signals of a quantized value of a transform coefficient and side information such as a quantization scale is input to the input stage of the enhancement layer. The bit stream 22 is stored in an input buffer 162 and supplied to a demultiplexer 152. The demultiplexer 152 demultiplexes the bit stream into a code of the quantized value of the transform coefficient and a code of the side information.
The codes demultiplexed by the demultiplexer 152 are supplied to a variable-length decoder 142 and decoded into signals of the quantized value of the transform coefficient and the quantization scale. The quantized value of the transform coefficient is dequantized by a dequantizer 172 and supplied to an adder 181. The adder 181 adds this dequantized value to a dequantized value 31 supplied from the dequantizer 173 of the base layer and supplies the sum to an adder 192.
The adder 192 adds the output from the adder 181 and the signal EP supplied from the selector 300 and thereby calculates a reconstructed value in a transform coefficient domain. This reconstructed value is supplied to the motion compensation prediction section 602 and inversely transformed into a reconstructed picture signal by an arbitrary-shape inverse orthogonal transform circuit 610 (FIG. 6B) provided in the motion compensation prediction section 602. The reconstructed picture signal is output as an output reconstructed picture signal 40 and stored in a frame memory 620 provided in the motion compensation prediction section 602.
Of the reference pictures stored in the frame memory 620, a picture in a position indicated by the motion vector is extracted by a motion compensation circuit 630 (FIG. 6B) in the motion compensation prediction section in units of blocks, and the extracted picture is supplied to an arbitrary shape orthogonal transform circuit 640 in the motion compensation prediction section. The arbitrary-shape orthogonal transform circuit 640 orthogonally transforms the motion compensation prediction value of an arbitrary shape in accordance with an externally supplied alpha-map signal 50, thereby calculating and outputting a motion compensation prediction value in a transform coefficient domain. For a block containing the boundary between the object and the background, transform coefficients for both the object and the background are calculated.
In this way, the motion compensation prediction sections 602 and 603 calculate the motion compensation prediction values EMC and BMC in the transform coefficient domain from the reconstructed value in the transform coefficient domain and supply these values to the selector 300.
One modification of the motion compensation prediction sections 600, 601, 602, and 603 of the second embodiment will be described below with reference to FIGS. 8A and 8B. This modification is accomplished by expanding a background prediction system (e.g., Miyamoto et al.: “Adaptive Predictive Coding System Using Background Prediction”, PCSJ88, 7-4, pp. 93-94; Watanabe et al.: “Adaptive Four-Difference DCT Coding System”, PCSJ88, 8-2, pp. 117-118), which is conventionally used to improve the efficiency of encoding of a background region hidden by the movement of an object, so that the system is also usable for overlapping of objects.
As shown in FIGS. 8A and 8B, the motion compensation prediction section comprises an arbitrary-shape inverse orthogonal transform circuit 610, an SW circuit 650, frame memories 621, 622, and 623, an SW circuit 651, a motion compensation circuit 630, and an arbitrary shape orthogonal transform circuit 640.
The arbitrary-shape inverse orthogonal transform circuit 610 inversely transforms a reconstructed picture signal in accordance with an alpha-map signal. When a reconstructed value in a transform coefficient domain is input to the motion compensation prediction section, this reconstructed value is supplied to the arbitrary-shape inverse orthogonal transform circuit 610 as one component of the motion compensation prediction section. The circuit 610 inversely transforms the reconstructed value into a reconstructed picture signal in accordance with an alpha-map signal. This alpha-map signal is supplied from an alpha-map decoding apparatus (not shown) provided in the decoding system.
The reconstructed picture signal inversely transformed by the arbitrary-shape inverse orthogonal transform circuit 610 is stored in one of the frame memories 621, 622, and 623. The SW circuit 650 selects one of the frame memories 621, 622, and 623 into which the signal is stored. For example, of the frame memories 621, 622, and 623, the frame memories 621 and 622 are used to store object pictures, and the frame memory 623 is used to store background pictures. The object frame memories are prepared for two frames to separately hold pictures of two different objects appearing in a frame. If three or more different objects exist, it is only necessary to prepare the number of object frame memories corresponding to the number of objects and allow the SW circuit 650 to select a corresponding frame memory.
The SW circuit 650 can store the reconstructed picture signal from the arbitrary-shape inverse orthogonal transform circuit 610 in one of the frame memories 621, 622, and 623 in accordance with an alpha-map signal by opening or closing the switch in accordance with the alpha-map signal.
The SW circuit 651 opens or closes the switch in accordance with the alpha-map signal, thereby selecting one of the frame memories 621, 622, and 623 in accordance with the alpha-map signal and reading out the reconstructed picture signal stored in that memory. Of the reconstructed picture signal (reference picture) read out from the frame memory 621, 622, or 623 via the SW circuit 651, the motion compensation circuit 630 extracts a picture in a position indicated by a motion vector in units of blocks and supplies the extracted picture to the arbitrary-shape orthogonal transform circuit 640.
The arbitrary-shape orthogonal transform circuit 640 orthogonally transforms the reconstructed picture signal of the picture in the position indicated by the motion vector, which is read out from the frame memory 621, 622, or 623 via the SW circuit 651, on the basis of the alpha-map signal, thereby orthogonally transforming a motion compensation prediction value of a picture of an arbitrary shape indicated by the alpha-map signal. That is, the circuit 640 calculates and outputs the motion compensation prediction value in the transform coefficient domain.
In the configuration shown in FIGS. 8A and 8B, it is assumed that the alpha-map supplied to the arbitrary-shape inverse orthogonal transform circuit 610 can specify one of a plurality of objects and a background to which a pixel belongs.
In this configuration, a reconstructed value in a transform coefficient domain is supplied to the motion compensation prediction section. A portion of this reconstructed value, i.e., a picture of an arbitrary shape indicated by an alpha-map, is inversely orthogonally transformed into a reconstructed picture signal by the arbitrary-shape inverse orthogonal transform circuit 610. The reconstructed picture signal is stored in one of the frame memories 621, 622, and 623 for the objects and the background by the SW circuit 650 in accordance with the alpha-map signal.
These stored signals are sequentially selected and read out by the SW circuit 651 in accordance with the alpha-map signal and supplied to the motion compensation circuit 630 where a motion compensation prediction value is calculated.
As described above, to calculate the motion compensation prediction value in the motion compensation circuit 630, the SW circuit 651 forms the motion compensation prediction value from the reference pictures stored in the frame memories. This improves the efficiency of encoding of a region which is hidden by overlapping of objects.
The motion compensation prediction value calculated by the motion compensation circuit 630 is supplied to the arbitrary-shape orthogonal transform circuit 640 and orthogonally transformed on the basis of the alpha-map signal. The result is an orthogonal transform coefficient of the motion compensation prediction value of the picture of the arbitrary shape indicated by the alpha-map signal.
In this modification as described above, to obtain the motion compensation prediction value in the motion compensation circuit 630, the motion compensation prediction value is formed for each of the pictures stored in the frame memories which separately store pictures of the objects and the background in accordance with an alpha-map signal. This improves the efficiency of encoding of a region hidden by overlapping of the objects.
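A short sketch of the frame-memory routing performed by the SW circuits, assuming the multi-valued alpha-map is represented as an integer label picture (0 for the background, 1, 2, ... for the objects) and the memories as a dict of label to reference frame; this representation is an assumption, not the patent's own.

import numpy as np

def store_by_region(picture, alpha, memories):
    # SW circuit 650: route each reconstructed pixel to the frame memory
    # of the region (object or background) its alpha-map label names.
    for label, mem in memories.items():
        mask = (alpha == label)
        mem[mask] = picture[mask]

def read_by_region(alpha, memories):
    # SW circuit 651: assemble a reference picture by reading each pixel
    # from the frame memory its alpha-map label selects.
    ref = np.zeros(alpha.shape, dtype=float)
    for label, mem in memories.items():
        mask = (alpha == label)
        ref[mask] = mem[mask]
    return ref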
Other configurations of the quantizers 130 and 131 and the dequantizers 170, 171, 172, and 173 in the first and second embodiments will be described below with reference to FIGS. 2 and 9 to 11.
Quantization matrices shown in FIGS. 9 and 10 are described in TM5 as a test model of MPEG2. In each of FIGS. 9 and 10, the matrix is represented by a two-dimensional matrix in a horizontal direction (h) and a vertical direction (v) with respect to 8×8 transform coefficients.
The following equations show examples of quantization and dequantization using the quantization matrices in FIGS. 9 and 10.
Quantization:
level(v,h) = sign(coef(v,h)) × (16 × |coef(v,h)| / w(v,h)) / (2 × Q_scale)  (1)
Inverse quantization:
coef(v,h) = sign(level(v,h)) × (2 × |level(v,h)| + 1) × w(v,h) × Q_scale / 16  (2)
where
    • coef(v,h): transform coefficient
    • level(v,h): quantized value
    • coef(v,h): transform coefficient (reconstructed value)
    • w(v,h): quantization matrix
    • Q_scale: quantization scale
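Read as code, equations (1) and (2) might look as follows; the truncation toward zero follows TM5-style quantizers and is an assumption where the printed formulas are ambiguous.

import numpy as np

def quantize(coef, w, q_scale):
    # Equation (1): level = sign(coef) * (16*|coef|/w) / (2*Q_scale),
    # truncated toward zero.
    return (np.sign(coef) * ((16.0 * np.abs(coef) / w) // (2 * q_scale))).astype(int)

def dequantize(level, w, q_scale):
    # Equation (2): coef = sign(level) * (2*|level| + 1) * w * Q_scale / 16.
    # sign(0) = 0 already reconstructs a zero level as 0 (the dead zone).
    return np.sign(level) * (2 * np.abs(level) + 1) * w * q_scale / 16.0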
The modification is related to a quantization matrix for changing the weight of a quantization step size for each transform coefficient. In the SNR scalability, quantization in the enhancement layer is performed more finely than in the base layer.
In the base layer, therefore, the quantization matrices as shown in FIGS. 9 and 10 are used to finely quantize low-frequency transform coefficients and roughly quantize high-frequency coefficients. In this case, the subjective evaluation often improves over encoding performed at the same encoding rate with a fixed quantization step size. Also, the coding efficiency is increased by enlarging the center dead zone in a quantizer, which raises the occurrence probability of “0”. This improves the quality of reconstructed pictures at low rates.
In the enhancement layer, on the other hand, if high-frequency transform coefficients are roughly quantized, no fine textures are reconstructed, resulting in visual degradation. This also increases the influence of feedback quantization noise in high-frequency transform coefficients.
Accordingly, in the enhancement layer a quantization matrix is used only for a transform coefficient whose quantized value BQ in the base layer is not “0”. FIG. 11 shows an example of a quantization matrix obtained for the example shown in FIG. 2. When this matrix is used, the quantization error of a transform coefficient whose motion compensation prediction error is large is increased in the enhancement layer. However, a quantization error in a largely changing portion is inconspicuous due to the masking effect of visual characteristics, and the resulting visual degradation is little.
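One way to realize this rule is to blend the matrix with a flat weight according to BQ, as in the following sketch; the neutral weight 16 and the function name are assumptions.

import numpy as np

def enhancement_weights(w, bq):
    # Apply the quantization matrix w only at positions where the
    # base-layer quantized value BQ is nonzero; use a flat weight of 16
    # elsewhere so fine high-frequency textures are not coarsened.
    return np.where(np.asarray(bq) != 0, w, np.full_like(w, 16))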
An example of transform to a one-dimensional sequence performed when a quantized transform coefficient is variable-length-encoded in the first and second embodiments will be described below with reference to FIGS. 2, 12, and 13. This transform to a one-dimensional sequence is generally done by using a transform method called zigzag scan shown in FIG. 12.
FIG. 12 shows a two-dimensional matrix divided into eight portions in each of a horizontal direction (h) and a vertical direction (v). In FIG. 12, the 8×8 transform coefficients are arranged in increasing order of the numbers given in the cells. Consequently, low-frequency transform coefficients are arranged first and high-frequency transform coefficients next. Therefore, the larger the ordinal number, the higher the probability that the quantized value is “0”, and this increases the coding efficiency when a combined event of a run of zeros and the quantized value following the run is variable-length-encoded. This exploits the property that lower-frequency coefficients have higher signal power.
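The combined run/level event described above can be sketched as follows; the function name and the end-of-block remark are illustrative.

```python
def run_level_events(scanned_levels):
    """Pair each run of zeros with the non-zero quantized value that
    ends it; these (run, level) events are variable-length-encoded."""
    events, run = [], 0
    for level in scanned_levels:
        if level == 0:
            run += 1
        else:
            events.append((run, level))
            run = 0
    return events  # trailing zeros would be signalled separately (e.g. EOB)
```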
In this example, therefore, the scan order in the enhancement layer is such that transform coefficients in positions where the quantized value BQ of the base layer is not “0” are arranged before transform coefficients in positions where BQ is “0”, each group being arranged in increasing order of zigzag scan numbers.
That is, a transform coefficient in a position where BQ is “0” is a motion compensation prediction error signal in the enhancement layer, whereas a transform coefficient in a position where BQ is not “0” is a quantization error of the base layer. The above method is therefore based on the assumption that the statistical properties of the two kinds of transform coefficients differ. FIG. 13 shows a scan order corresponding to the example shown in FIG. 2. In FIG. 13, the order of the numbers given in the cells is the scan order.
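A minimal sketch of this reordering, assuming the zigzag order is given as a list of (v, h) positions and BQ as a two-dimensional array; the names are illustrative.

```python
def enhancement_scan_order(zigzag_order, bq):
    # Positions whose base-layer quantized value BQ is non-zero come
    # first, then the positions where BQ is zero, each group kept in
    # increasing order of zigzag scan numbers.
    nonzero_first = [p for p in zigzag_order if bq[p] != 0]
    zero_after = [p for p in zigzag_order if bq[p] == 0]
    return nonzero_first + zero_after
```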
In the above example, it is assumed that transform bases do not overlap between blocks. On the other hand, the reference “Ikuzawa et al., Video Encoding Using Motion Compensation Filter Bank Structure, PCSJ92, 8-5, 1992” proposes an encoding method using a motion compensation filter bank structure in which the decrease in coding efficiency is small even when bases overlap, because a transformed difference arrangement is used.
The concept of the above reference is applicable to a prediction encoding apparatus (transformed difference arrangement) in an orthogonal transform coefficient domain as in the present invention. Therefore, the motion compensation filter bank structure can be applied to the above example.
In the second embodiment described above, a frame of a video picture is orthogonally transformed by dividing the frame into blocks each having a predetermined number (N×N) of pixels to obtain transform coefficients for individual bands of the spacial frequency, and motion compensation is performed in the transform coefficient domain for each of the N×N transform coefficients in each of the upper and lower layers. In this video encoding, motion compensation is performed for a picture in a region of interest by using alpha-map information. When this motion compensation is performed, whether the motion compensation prediction is correct is checked on the basis of the already decoded quantized value in the lower layer (base layer). If the motion compensation prediction is correct, encoding in the upper layer (enhancement layer) is performed by using the motion compensation prediction value obtained for the upper layer, which has a smaller encoding distortion. If the motion compensation prediction is incorrect, encoding in the upper layer is done by using the motion compensation prediction value obtained for the lower layer (base layer), which has a larger encoding distortion than that of the enhancement layer. This improves the coding efficiency for coefficients whose motion compensation prediction is incorrect and realizes an encoding system capable of encoding with little decrease in the coding efficiency.
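Expressed as a per-coefficient switch, the selection rule amounts to the following sketch; the names are illustrative assumptions.

```python
import numpy as np

def select_prediction(bq, enh_pred, base_pred):
    # Where the base-layer quantized value is 0, the motion compensation
    # prediction is judged correct and the enhancement-layer prediction
    # (smaller encoding distortion) is used; where it is non-zero, the
    # base-layer prediction is used instead.
    return np.where(bq == 0, enh_pred, base_pred)
```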
Accordingly, the resolution and the image quality can be varied in an arbitrary-shape picture encoding apparatus which separately encodes the background and the object. In addition, it is possible to provide scalable encoding and decoding apparatuses having a high coding efficiency.
The third embodiment will be described below with reference to FIG. 14.
As shown in FIG. 14, in blocks (enclosed by the solid lines) containing the boundary between the object and the background, motion vectors are detected separately for the object and the background. Since the number of pixels belonging to either the object or the background decreases in such blocks, the influence of noise increases, and this decreases the reliability of the motion vector.
In blocks containing the boundary, therefore, the motion vector detection range is made narrower than in other blocks (indicated by the broken lines).
Also, the object in the current frame has moved from its position in the reference frame. Therefore, erroneous detection of a motion vector can be reduced by limiting the motion vector search range for the object to the inside of the object in the reference frame. Limiting the search range also has the effect of decreasing the amount of motion vector search calculation. Likewise, a motion vector for the background is calculated only from background portions of the reference frame.
As described above, a large error can be prevented by making the motion vector detection range in blocks containing the boundary narrower than in other blocks (indicated by the broken lines).
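A rough sketch of this boundary-block search, assuming a binary alpha map; the range values 15 and 4 are illustrative, not from the patent.

```python
import numpy as np

def mv_search_candidates(block_alpha, ref_alpha, y0, x0, n,
                         full_range=15, narrow_range=4):
    """Yield candidate displacements (dy, dx) for one n x n block at
    position (y0, x0); block_alpha and ref_alpha are alpha maps of the
    current block and the reference frame (1 = object)."""
    # A block containing the object/background boundary gets a narrower
    # search range, since its motion vector is less reliable.
    on_boundary = 0 < np.count_nonzero(block_alpha) < block_alpha.size
    r = narrow_range if on_boundary else full_range
    predicting_object = bool(np.any(block_alpha == 1))
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            yy, xx = y0 + dy, x0 + dx
            if yy < 0 or xx < 0:
                continue                     # outside the reference frame
            patch = ref_alpha[yy:yy + n, xx:xx + n]
            if patch.shape != (n, n):
                continue                     # outside on the far side
            # Search the object's vector only inside the object of the
            # reference frame (the background is handled likewise).
            if predicting_object and not np.any(patch == 1):
                continue
            yield dy, dx
```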
Finally, as an application of the present invention, an embodiment of a video transmission system to which the video encoding and decoding apparatuses of the present invention are applied will be described below with reference to FIGS. 15A to 15C.
In this system, as shown in FIG. 15A, an input video signal from a camera 1002 of a personal computer (PC) 1001 is encoded by a video encoding apparatus incorporated into the PC 1001. The encoded data output from the video encoding apparatus is multiplexed with other data such as audio or other information. The multiplexed data is transmitted by radio from a radio transceiver 1003 and received by another radio transceiver 1004.
The signal received by the radio transceiver 1004 is demultiplexed into the encoded data of the video signal and the audio or information data. The encoded data of the video signal is decoded by a video decoding apparatus incorporated into a workstation (EWS) 1005 and displayed on the display of the EWS 1005.
An input video signal from a camera 1006 of the EWS 1005 is encoded in the same fashion by a video encoding apparatus incorporated into the EWS 1005. The encoded data of the video signal is multiplexed with other data such as audio or other information. The multiplexed data is transmitted by radio from the radio transceiver 1004 and received by the radio transceiver 1003. The signal received by the radio transceiver 1003 is demultiplexed into the encoded data of the video signal and the audio or information data. The encoded data of the video signal is decoded by a video decoding apparatus incorporated into the PC 1001 and displayed on the display of the PC 1001.
FIG. 15B is a block diagram schematically showing the video encoding apparatus incorporated into the PC 1001 and the EWS 1005 in FIG. 15A. FIG. 15C is a block diagram schematically showing the video decoding apparatus incorporated into the PC 1001 and the EWS 1005 in FIG. 15A.
The video encoding apparatus in FIG. 15B comprises an information source encoding section 1102, which receives a picture signal from a video input section 1101 such as a camera and has an error resilience processor 1103, and a transmission line encoding section 1104. The information source encoding section 1102 performs discrete cosine transform (DCT) for a prediction error signal and quantizes the resulting DCT coefficients. The transmission line encoding section 1104 performs variable-length encoding, error detection for encoded data, and error correcting coding. The encoded data output from the transmission line encoding section 1104 is supplied to a radio transceiver 1105 and transmitted. The processing in the information source encoding section 1102 and the variable-length encoding in the transmission line encoding section 1104 are done by applying processing methods such as those explained in the embodiments of the present invention.
The video decoding apparatus shown in FIG. 15C comprises a transmission line decoding section 1202 and a data source decoding section 1203 having an error resilience processor 1204. The transmission line decoding section 1202 receives encoded data from a radio transceiver 1201 and performs processing which is the reverse of that done by the transmission line encoding section 1104. The data source decoding section 1203 receives the output signal from the transmission line decoding section 1202 and performs processing which is the reverse of that done by the information source encoding section 1102. The picture decoded by the data source decoding section 1203 is output by a video output section 1205 such as a display.
The decoding processing in these sections is performed by applying processing methods such as those explained in the embodiments of the present invention. As has been described above, the present invention accomplishes scalable encoding in which the quality of an arbitrary-shape picture can be varied step by step without greatly decreasing the coding efficiency. Also, in the present invention, it is possible to decrease the amount of generated code when DCT is performed on a picture of an arbitrary shape.
Additional advantages and modifications will readily occur to those skilled in the art. Therefore, the invention in its broader aspects is not limited to the specific details, representative devices, and illustrated examples shown and described herein. Accordingly, various modifications may be made without departing from the spirit or scope of the general inventive concept as defined by the appended claims and their equivalents.

Claims (51)

28. A computer system for video encoding, comprising:
an encoder configured to encode an alpha-map signal for discriminating a background of an input signal and at least one object thereof;
a motion compensation prediction encoder configured to encode an arbitrary shape picture of the object in accordance with the alpha map signal to obtain a coded arbitrary shape picture;
a local decoder configured to decode the coded arbitrary shape picture to reconstruct the arbitrary shape picture of the object and output a reconstructed picture signal representing the object;
first memory means for storing a background signal representing the background of the input signal; and
second memory means for storing the reconstructed picture signal representing the object, the reconstructed picture signal being read out from the second memory means to be used for motion compensation prediction encoding of the arbitrary shape picture of the object.
31. A computer system for video encoding, comprising:
an encoder configured to encode an alpha-map signal for discriminating a background of an input signal and a plurality of objects thereof;
a motion compensation prediction encoder configured to encode arbitrary shape pictures of the objects in accordance with the alpha-map signal to obtain a plurality of coded arbitrary shape pictures;
a local decoder configured to decode the coded arbitrary shape pictures to reconstruct the arbitrary shape pictures of the objects and to output a plurality of reconstructed picture signals representing the objects; and
a plurality of second memory means for storing the reconstructed picture signals, respectively, the reconstructed picture signals being read out from the second memory means to be used for motion compensation prediction encoding of each of the arbitrary shape pictures of the objects.
36. A computer system for video encoding, comprising:
an encoder configured to encode an alpha-map signal for discriminating a background of an input signal and at least one object thereof;
a motion compensation prediction encoder configured to encode an arbitrary shape picture of the object in accordance with the alpha-map signal to obtain a coded arbitrary shape picture;
a local decoder configured to decode the coded arbitrary shape picture to reconstruct the arbitrary shape picture of the object and output a reconstructed picture signal representing the object;
a first memory configured to store a background signal representing the background of the input signal; and
a second memory configured to store the reconstructed picture signal representing the object, the reconstructed picture signal being read out from the second memory to be used for motion compensation prediction encoding of the arbitrary shape picture of the object.
39. A computer system for video encoding, comprising:
an encoder configured to encode an alpha-map signal for discriminating a background of an input signal and a plurality of objects thereof;
a motion compensation prediction encoder configured to encode arbitrary shape pictures of the objects in accordance with the alpha-map signal to obtain a plurality of coded arbitrary shape pictures;
a local decoder configured to decode the coded arbitrary shape pictures to reconstruct the arbitrary shape pictures of the objects and to output a plurality of reconstructed picture signals representing the objects;
a first memory configured to store a background signal representing the background of the input signal; and
a plurality of second memories configured to store the reconstructed picture signals representing the objects, respectively, the reconstructed picture signals being read out from the second memories to be used for motion compensation prediction encoding of each of the arbitrary shape pictures of the objects.
44. A method of encoding video, comprising the steps of:
encoding an alpha-map signal for discriminating a background of an input signal and at least one object thereof;
encoding an arbitrary shape picture of the object in accordance with the alpha-map signal to obtain a coded arbitrary shape picture;
decoding the coded arbitrary shape picture to reconstruct the arbitrary shape picture of the object and output a reconstructed picture signal representing the object;
storing a background signal representing the background of the input signal in a first memory;
storing the reconstructed picture signal representing the object in a second memory;
selectively switching the first memory and the second memory in accordance with the alpha-map signal to selectively store the reconstructed picture signal and the background signal therein; and
reading the reconstructed picture signal out from the second memory to be used for motion compensation prediction encoding of the arbitrary shape picture of the object.
45. A method of encoding video, comprising the steps of:
encoding an alpha-map signal for discriminating a background of an input signal and at least one object thereof;
encoding an arbitrary shape picture of the object in accordance with the alpha-map signal to obtain a coded arbitrary shape picture;
decoding the coded arbitrary shape picture to reconstruct the arbitrary shape picture of the object and output a reconstructed picture signal representing the object;
storing a background signal representing the background of the input signal in a first memory;
storing the reconstructed picture signal representing the object in a second memory;
selectively switching the first memory and the second memory in accordance with the alpha-map signal to read out selectively the reconstructed picture signal and the background signal therefrom; and
reading the reconstructed picture signal out from the second memory to be used for motion compensation prediction encoding of the arbitrary shape picture of the object.
46. A method of encoding video, comprising the steps of:
encoding an alpha-map signal for discriminating a background of an input signal and a plurality of objects thereof;
encoding arbitrary shape pictures of the objects in accordance with the alpha-map signal to obtain a plurality of coded arbitrary shape pictures;
decoding the coded arbitrary shape pictures to reconstruct the arbitrary shape pictures of the objects and to output a plurality of reconstructed picture signals representing the objects;
storing a background signal in a first memory;
storing the reconstructed picture signals representing the objects in at least one of second memories;
selectively switching the second memories in accordance with the alpha-map signal to selectively store the reconstructed picture signals corresponding to the objects therein; and
reading the reconstructed picture signals out from the at least one of second memories to be used for motion compensation prediction encoding of each of the arbitrary shape pictures of the objects.
47. A method of encoding video, comprising the steps of:
encoding an alpha-map signal for discriminating a background of an input signal and a plurality of objects thereof;
encoding arbitrary shape pictures of the objects in accordance with the alpha-map signal to obtain a plurality of coded arbitrary shape pictures;
decoding the coded arbitrary shape pictures to reconstruct the arbitrary shape pictures of the objects and to output a plurality of reconstructed picture signals representing the objects;
storing a background signal in a first memory;
storing the reconstructed picture signals representing the objects in at least one of second memories;
selectively switching the second memories in accordance with the alpha-map signal to read out the reconstructed picture signals corresponding to the objects therefrom; and
reading the reconstructed picture signals out from the at least one of second memories to be used for motion compensation prediction encoding of each of the arbitrary shape pictures of the objects.
48. A method of encoding video, comprising the steps of:
encoding an alpha-map signal for discriminating a background of an input signal and a plurality of objects thereof;
encoding arbitrary shape pictures of the objects in accordance with the alpha-map signal to obtain a plurality of coded arbitrary shape pictures;
decoding the coded arbitrary shape pictures to reconstruct the arbitrary shape pictures of the objects and to output a plurality of reconstructed picture signals representing the objects;
storing a background signal in a first memory;
storing the reconstructed picture signals representing the objects in at least one of second memories;
reading the reconstructed picture signals out from the at least one of second memories to be used for motion compensation prediction encoding of each of the arbitrary shape pictures of the objects;
calculating a motion compensation prediction value based on the reading step and motion vector information; and
orthogonally transforming the motion compensation prediction value on the basis of the alpha-map signal to obtain an orthogonal transform coefficient of the motion compensation prediction value of a picture of the arbitrary shape indicated by the alpha-map signal.
50. A video encoding method comprising:
encoding an alpha-map signal for discriminating a background of an input signal and at least one object thereof;
encoding an arbitrary shape picture of the object in accordance with the alpha-map signal to obtain a coded arbitrary shape picture;
decoding the coded arbitrary shape picture to reconstruct the arbitrary shape picture of the object and output a reconstructed picture signal representing the object;
storing a background signal representing the background of the input signal in a first memory;
storing the reconstructed picture signal representing the object in a second memory; and
reading the reconstructed picture signal out from the second memory to be used for motion compensation prediction encoding of the arbitrary shape picture of the object,
wherein the motion compensation prediction encoding includes calculating a motion compensation prediction value based on the reading step and motion vector information.
51. A video encoding method comprising:
encoding an alpha-map signal for discriminating a background of an input signal and a plurality of objects thereof;
encoding arbitrary shape pictures of the objects in accordance with the alpha-map signal to obtain a plurality of coded arbitrary shape pictures;
decoding the coded arbitrary shape pictures to reconstruct the arbitrary shape pictures of the objects and to output a plurality of reconstructed picture signals representing the objects;
storing a background signal in a first memory;
storing the reconstructed picture signals representing the objects in at least one second memory; and
reading the reconstructed picture signals out from the at least one second memory to be used for motion compensation prediction encoding of each of the arbitrary shape pictures of the objects,
wherein the motion compensation prediction encoding includes calculating a motion compensation prediction value based on the reading step and motion vector information.
US10/986,5741995-10-272004-11-12Video encoding and decoding apparatusExpired - Fee RelatedUSRE40079E1 (en)

Priority Applications (1)

Application NumberPriority DateFiling DateTitle
US10/986,574USRE40079E1 (en)1995-10-272004-11-12Video encoding and decoding apparatus

Applications Claiming Priority (7)

Application NumberPriority DateFiling DateTitle
JP281029951995-10-27
JP15429696AJP3788823B2 (en)1995-10-271996-06-14 Moving picture encoding apparatus and moving picture decoding apparatus
US08/738,934US5818531A (en)1995-10-271996-10-24Video encoding and decoding apparatus
US09/111,751US6028634A (en)1995-10-271998-07-08Video encoding and decoding apparatus
US09/334,769US6256346B1 (en)1995-10-271999-06-16Video encoding and decoding apparatus
US09/783,549US6519285B2 (en)1995-10-272001-02-15Video encoding and decoding apparatus
US10/986,574USRE40079E1 (en)1995-10-272004-11-12Video encoding and decoding apparatus

Related Parent Applications (1)

Application NumberTitlePriority DateFiling Date
US09/783,549ReissueUS6519285B2 (en)1995-10-272001-02-15Video encoding and decoding apparatus

Publications (1)

Publication NumberPublication Date
USRE40079E1true USRE40079E1 (en)2008-02-19

Family

ID=26482618

Family Applications (5)

Application NumberTitlePriority DateFiling Date
US08/738,934Expired - LifetimeUS5818531A (en)1995-10-271996-10-24Video encoding and decoding apparatus
US09/111,751Expired - Fee RelatedUS6028634A (en)1995-10-271998-07-08Video encoding and decoding apparatus
US09/334,769Expired - Fee RelatedUS6256346B1 (en)1995-10-271999-06-16Video encoding and decoding apparatus
US09/783,549CeasedUS6519285B2 (en)1995-10-272001-02-15Video encoding and decoding apparatus
US10/986,574Expired - Fee RelatedUSRE40079E1 (en)1995-10-272004-11-12Video encoding and decoding apparatus

Family Applications Before (4)

Application NumberTitlePriority DateFiling Date
US08/738,934Expired - LifetimeUS5818531A (en)1995-10-271996-10-24Video encoding and decoding apparatus
US09/111,751Expired - Fee RelatedUS6028634A (en)1995-10-271998-07-08Video encoding and decoding apparatus
US09/334,769Expired - Fee RelatedUS6256346B1 (en)1995-10-271999-06-16Video encoding and decoding apparatus
US09/783,549CeasedUS6519285B2 (en)1995-10-272001-02-15Video encoding and decoding apparatus

Country Status (3)

CountryLink
US (5)US5818531A (en)
EP (3)EP1215911A3 (en)
JP (1)JP3788823B2 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
US20050244070A1 (en)*2002-02-192005-11-03Eisaburo ItakuraMoving picture distribution system, moving picture distribution device and method, recording medium, and program
US20120243610A1 (en)*2005-09-202012-09-27Rahul SaxenaDynamically configuring a video decoder cache for motion compensation

Families Citing this family (146)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
US6005623A (en)*1994-06-081999-12-21Matsushita Electric Industrial Co., Ltd.Image conversion apparatus for transforming compressed image data of different resolutions wherein side information is scaled
WO1997016025A1 (en)*1995-10-201997-05-01Nokia Mobile Phones Ltd.Motion vector field coding
JP3788823B2 (en)1995-10-272006-06-21株式会社東芝 Moving picture encoding apparatus and moving picture decoding apparatus
US6957350B1 (en)1996-01-302005-10-18Dolby Laboratories Licensing CorporationEncrypted and watermarked temporal and resolution layering in advanced television
US6031575A (en)*1996-03-222000-02-29Sony CorporationMethod and apparatus for encoding an image signal, method and apparatus for decoding an image signal, and recording medium
JP3263807B2 (en)*1996-09-092002-03-11ソニー株式会社 Image encoding apparatus and image encoding method
IL119523A0 (en)*1996-10-301997-01-10Algotec Systems LtdData distribution system
TW358296B (en)1996-11-121999-05-11Matsushita Electric Industrial Co LtdDigital picture encoding method and digital picture encoding apparatus, digital picture decoding method and digital picture decoding apparatus, and data storage medium
DE69835388T2 (en)*1997-03-172007-07-19Sony Corp. Image encoder and image decoder
IL127274A (en)*1997-04-012006-06-11Sony CorpPicture coding device, picture coding method,picture decoding device, picture decoding method, and providing medium
JP3844844B2 (en)*1997-06-062006-11-15富士通株式会社 Moving picture coding apparatus and moving picture coding method
KR100373331B1 (en)*1997-07-022003-04-21주식회사 팬택앤큐리텔Scalable shape encoding/decoding system and method using scan interleaving method
KR100240770B1 (en)*1997-07-112000-01-15이형도Scalable coding apparatus and method for improving function of energy compensation/inverse-compensation
US5978048A (en)*1997-09-251999-11-02Daewoo Electronics Co., Inc.Method and apparatus for encoding a motion vector based on the number of valid reference motion vectors
JP3667105B2 (en)*1997-10-012005-07-06松下電器産業株式会社 Motion vector detection method and apparatus for implementing the method
US6795501B1 (en)*1997-11-052004-09-21Intel CorporationMulti-layer coder/decoder for producing quantization error signal samples
US5973743A (en)*1997-12-021999-10-26Daewoo Electronics Co., Ltd.Mode coding method and apparatus for use in an interlaced shape coder
US6731811B1 (en)1997-12-192004-05-04Voicecraft, Inc.Scalable predictive coding method and apparatus
AU1928999A (en)*1997-12-191999-07-12Kenneth RoseScalable predictive coding method and apparatus
US5995150A (en)*1998-02-201999-11-30Winbond Electronics Corporation AmericaDual compressed video bitstream camera for universal serial bus connection
EP0940774A3 (en)*1998-03-052000-07-05Matsushita Electric Industrial Co., Ltd.Motion vector coding and decoding apparatus and method
JP3898347B2 (en)*1998-06-302007-03-28富士通株式会社 Movie data control apparatus, movie data control method, and computer-readable recording medium on which movie data control program is recorded
US6275531B1 (en)*1998-07-232001-08-14Optivision, Inc.Scalable video coding method and apparatus
US6603883B1 (en)*1998-09-082003-08-05Canon Kabushiki KaishaImage processing apparatus including an image data encoder having at least two scalability modes and method therefor
US6546049B1 (en)*1998-10-052003-04-08Sarnoff CorporationParameterized quantization matrix adaptation for video encoding
KR100480751B1 (en)*1998-10-102005-05-16삼성전자주식회사Digital video encoding/decoding method and apparatus
US6499060B1 (en)1999-03-122002-12-24Microsoft CorporationMedia coding for loss recovery with remotely predicted data units
US6879634B1 (en)*1999-05-262005-04-12Bigband Networks Inc.Method and system for transmitting media streams over a variable bandwidth network
US6501797B1 (en)*1999-07-062002-12-31Koninklijke Phillips Electronics N.V.System and method for improved fine granular scalable video using base layer coding information
US6480547B1 (en)*1999-10-152002-11-12Koninklijke Philips Electronics N.V.System and method for encoding and decoding the residual signal for fine granular scalable video
US6639943B1 (en)1999-11-232003-10-28Koninklijke Philips Electronics N.V.Hybrid temporal-SNR fine granular scalability video coding
US6931060B1 (en)*1999-12-072005-08-16Intel CorporationVideo processing of a quantized base layer and one or more enhancement layers
US6826232B2 (en)*1999-12-202004-11-30Koninklijke Philips Electronics N.V.Fine granular scalable video with embedded DCT coding of the enhancement layer
KR100359821B1 (en)*2000-01-202002-11-07엘지전자 주식회사Method, Apparatus And Decoder For Motion Compensation Adaptive Image Re-compression
FR2806570B1 (en)2000-03-152002-05-17Thomson Multimedia Sa METHOD AND DEVICE FOR CODING VIDEO IMAGES
EP1279111A4 (en)*2000-04-072005-03-23Dolby Lab Licensing Corp IMPROVED TIME AND RESOLUTION STRUCTURE FOR ADVANCED TELEVISION
US6493387B1 (en)*2000-04-102002-12-10Samsung Electronics Co., Ltd.Moving picture coding/decoding method and apparatus having spatially scalable architecture and signal-to-noise ratio scalable architecture together
GB2362532B (en)*2000-05-152004-05-05Nokia Mobile Phones LtdVideo coding
US7751473B2 (en)*2000-05-152010-07-06Nokia CorporationVideo coding
JP2002010251A (en)*2000-06-192002-01-11Matsushita Electric Ind Co Ltd Video signal encoding device and video signal decoding device
FI109393B (en)2000-07-142002-07-15Nokia Corp Method for encoding media stream, a scalable and a terminal
US6621865B1 (en)2000-09-182003-09-16Powerlayer Microsystems, Inc.Method and system for encoding and decoding moving and still pictures
CN1254115C (en)2000-09-222006-04-26皇家菲利浦电子有限公司 Scalability of dual-loop motion compensation for fine particles
US6940905B2 (en)*2000-09-222005-09-06Koninklijke Philips Electronics N.V.Double-loop motion-compensation fine granular scalability
US20020080878A1 (en)*2000-10-122002-06-27Webcast Technologies, Inc.Video apparatus and method for digital video enhancement
US6961383B1 (en)*2000-11-222005-11-01At&T Corp.Scalable video encoder/decoder with drift control
KR100386639B1 (en)*2000-12-042003-06-02주식회사 오픈비주얼Method for decompression of images and video using regularized dequantizer
KR100952892B1 (en)*2000-12-062010-04-16리얼네트웍스 인코포레이티드 Intracoding method and apparatus of video data
DE10121259C2 (en)*2001-01-082003-07-24Siemens Ag Optimal SNR scalable video coding
JP2002315004A (en)*2001-04-092002-10-25Ntt Docomo Inc Image encoding method and apparatus, image decoding method and apparatus, and image processing system
US7209524B2 (en)2001-04-272007-04-24The Directv Group, Inc.Layered modulation for digital signals
US7502430B2 (en)2001-04-272009-03-10The Directv Group, Inc.Coherent averaging for measuring traveling wave tube amplifier nonlinearity
US7639759B2 (en)2001-04-272009-12-29The Directv Group, Inc.Carrier to noise ratio estimations from a received signal
US7423987B2 (en)*2001-04-272008-09-09The Directv Group, Inc.Feeder link configurations to support layered modulation for digital signals
US7471735B2 (en)*2001-04-272008-12-30The Directv Group, Inc.Maximizing power and spectral efficiencies for layered and conventional modulations
US7778365B2 (en)2001-04-272010-08-17The Directv Group, Inc.Satellite TWTA on-line non-linearity measurement
US7583728B2 (en)2002-10-252009-09-01The Directv Group, Inc.Equalizers for layered modulated and other signals
US8005035B2 (en)*2001-04-272011-08-23The Directv Group, Inc.Online output multiplexer filter measurement
US7822154B2 (en)2001-04-272010-10-26The Directv Group, Inc.Signal, interference and noise power measurement
US7184489B2 (en)2001-04-272007-02-27The Directv Group, Inc.Optimization technique for layered modulation
US7245671B1 (en)2001-04-272007-07-17The Directv Group, Inc.Preprocessing signal layers in a layered modulation digital signal system to use legacy receivers
US7483505B2 (en)2001-04-272009-01-27The Directv Group, Inc.Unblind equalizer architecture for digital communication systems
US7173981B1 (en)2001-04-272007-02-06The Directv Group, Inc.Dual layer signal processing in a layered modulation digital signal system
US7136418B2 (en)*2001-05-032006-11-14University Of WashingtonScalable and perceptually ranked signal coding and decoding
US7184548B2 (en)*2001-05-042007-02-27Hewlett-Packard Development Company, L.P.Encoding and decoding methods for secure scalable streaming and related systems
US7010043B2 (en)*2001-07-052006-03-07Sharp Laboratories Of America, Inc.Resolution scalable video coder for low latency
US7266150B2 (en)2001-07-112007-09-04Dolby Laboratories, Inc.Interpolation of video compression frames
WO2003010972A1 (en)*2001-07-262003-02-06Koninklijke Philips Electronics N.V.Generating a scalable coded video signal fr om a non-scalable coded video signal
EP2096871B1 (en)2001-09-142014-11-12NTT DoCoMo, Inc.Coding method, decoding method, coding apparatus, decoding apparatus, image processing system, coding program, and decoding program
JP2003116004A (en)*2001-10-042003-04-18Seiko Epson Corp Image file with transparency
EP1320216A1 (en)*2001-12-112003-06-18BRITISH TELECOMMUNICATIONS public limited companyMethod and device for multicast transmission
US20030118099A1 (en)*2001-12-202003-06-26Comer Mary LafuzeFine-grain scalable video encoder with conditional replacement
FR2834178A1 (en)*2001-12-202003-06-27Koninkl Philips Electronics NvVideo signal decoding process having base signal decoding/compensation reference image movement with second step selecting reference image decoded base/output signal.
US20030118113A1 (en)*2001-12-202003-06-26Comer Mary LafuzeFine-grain scalable video decoder with conditional replacement
US7391807B2 (en)*2002-04-242008-06-24Mitsubishi Electric Research Laboratories, Inc.Video transcoding of scalable multi-layer videos to single layer video
US6917713B2 (en)*2002-05-292005-07-12Koninklijke Philips Electronics N.V.System and method for enhancing videos from drift-free scalable bitstream
AU2003280499A1 (en)*2002-07-012004-01-19The Directv Group, Inc.Improving hierarchical 8psk performance
ES2604453T3 (en)2002-07-032017-03-07The Directv Group, Inc. Method and apparatus for layered modulation
US7529312B2 (en)*2002-10-252009-05-05The Directv Group, Inc.Layered modulation for terrestrial ATSC applications
ES2398213T3 (en)*2002-10-252013-03-14The Directv Group, Inc. Low complexity layer modulation signal processor
US7474710B2 (en)2002-10-252009-01-06The Directv Group, Inc.Amplitude and phase matching for layered modulation reception
US7463676B2 (en)2002-10-252008-12-09The Directv Group, Inc.On-line phase noise measurement for layered modulation
US6954501B2 (en)*2003-02-172005-10-11Xvd CorporationMethod and apparatus for object based motion compensation
JP4113044B2 (en)*2003-05-232008-07-02松下電器産業株式会社 Image encoding device
US7979273B2 (en)*2003-07-252011-07-12Sennheiser Electronic Gmbh & Co. KgMethod and apparatus for the digitization of and for the data compression of analog signals
JP4409897B2 (en)*2003-09-192010-02-03株式会社リコー Image processing apparatus, image processing method, program, and information recording medium
US7502429B2 (en)2003-10-102009-03-10The Directv Group, Inc.Equalization for traveling wave tube amplifier nonlinearity measurements
US7627184B2 (en)*2003-11-212009-12-01Nec CorporationContent distribution/reception device, content transmission/reception method, and content distribution/reception program
US7305031B2 (en)*2004-01-292007-12-04International Business Machines CorporationMethod and system for the error resilient transmission of predictively encoded signals
JP4613909B2 (en)*2004-02-202011-01-19日本電気株式会社 Image encoding method, apparatus thereof, and control program thereof
US20080144716A1 (en)*2004-03-112008-06-19Gerard De HaanMethod For Motion Vector Determination
JP2005286472A (en)*2004-03-292005-10-13Sanyo Electric Co LtdImage processing apparatus and image processing method
US20060012719A1 (en)*2004-07-122006-01-19Nokia CorporationSystem and method for motion prediction in scalable video coding
CN1985520A (en)*2004-07-152007-06-20三星电子株式会社Motion information encoding/decoding and scalable video encoding/decoding apparatus and method
KR100679018B1 (en)*2004-09-072007-02-05삼성전자주식회사 Multilayer video coding and decoding method, video encoder and decoder
US20060062308A1 (en)*2004-09-222006-03-23Carl StaelinProcessing video frames
JP4835855B2 (en)*2004-10-072011-12-14日本電気株式会社 Apparatus, method and program for moving picture encoding, and apparatus method and program for moving picture decoding
US20060078049A1 (en)*2004-10-132006-04-13Nokia CorporationMethod and system for entropy coding/decoding of a video bit stream for fine granularity scalability
DE102004059993B4 (en)*2004-10-152006-08-31Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus and method for generating a coded video sequence using interlayer motion data prediction, and computer program and computer readable medium
KR100664929B1 (en)2004-10-212007-01-04삼성전자주식회사 Method and apparatus for efficiently compressing motion vectors in multi-layered video coder
JP2008526077A (en)*2004-12-222008-07-17エヌエックスピー ビー ヴィ Video stream changing device
US8634413B2 (en)2004-12-302014-01-21Microsoft CorporationUse of frame caching to improve packet loss recovery
KR100913088B1 (en)*2005-01-212009-08-21엘지전자 주식회사Method and apparatus for encoding/decoding video signal using prediction information of intra-mode macro blocks of base layer
US7929606B2 (en)*2005-01-212011-04-19Lg Electronics Inc.Method and apparatus for encoding/decoding video signal using block prediction information
KR101233854B1 (en)*2005-02-182013-02-15톰슨 라이센싱Method for deriving coding information for high resolution pictures from low resolution pictures and coding and decoding devices implementing said method
US7995656B2 (en)*2005-03-102011-08-09Qualcomm IncorporatedScalable video coding with two layer encoding and single layer decoding
US7725799B2 (en)*2005-03-312010-05-25Qualcomm IncorporatedPower savings in hierarchically coded modulation
EP1711018A1 (en)*2005-04-082006-10-11Thomson LicensingMethod and apparatus for encoding video pictures, and method and apparatus for decoding video pictures
JP4618676B2 (en)2005-04-282011-01-26株式会社リコー Structured document code transfer method, image processing system, server device, program, and information recording medium
US8619860B2 (en)*2005-05-032013-12-31Qualcomm IncorporatedSystem and method for scalable encoding and decoding of multimedia data using multiple layers
WO2007014216A2 (en)2005-07-222007-02-01Cernium CorporationDirected attention digital video recordation
KR100891662B1 (en)*2005-10-052009-04-02엘지전자 주식회사Method for decoding and encoding a video signal
KR20070096751A (en)*2006-03-242007-10-02엘지전자 주식회사 Method and apparatus for coding / decoding image data
KR20070038396A (en)*2005-10-052007-04-10엘지전자 주식회사 Method of encoding and decoding video signal
EP1932363B1 (en)*2005-10-052016-05-18LG Electronics Inc.Method and apparatus for reconstructing image blocks
EP1977391A2 (en)2006-01-032008-10-08ATI Technologies Inc.Image analyser and adaptive image scaling circuit and methods
WO2007077116A1 (en)*2006-01-052007-07-12Thomson LicensingInter-layer motion prediction method
WO2007114641A1 (en)*2006-04-032007-10-11Lg Electronics, Inc.Method and apparatus for decoding/encoding of a scalable video signal
EP1879399A1 (en)2006-07-122008-01-16THOMSON LicensingMethod for deriving motion data for high resolution pictures from motion data of low resolution pictures and coding and decoding devices implementing said method
CN101366213A (en)*2006-07-212009-02-11维德约股份有限公司System and method for jitter buffer reduction in scalable coding
JP4787100B2 (en)*2006-07-272011-10-05パナソニック株式会社 Image encoding device
US20080095228A1 (en)*2006-10-202008-04-24Nokia CorporationSystem and method for providing picture output indications in video coding
JP4525692B2 (en)*2007-03-272010-08-18株式会社日立製作所 Image processing apparatus, image processing method, and image display apparatus
KR101365596B1 (en)*2007-09-142014-03-12삼성전자주식회사Video encoding apparatus and method and video decoding apparatus and method
US20090096927A1 (en)*2007-10-152009-04-16Camp Jr William OSystem and method for video coding using variable compression and object motion tracking
KR101375663B1 (en)*2007-12-062014-04-03삼성전자주식회사Method and apparatus for encoding/decoding image hierarchically
KR101446771B1 (en)*2008-01-302014-10-06삼성전자주식회사 Image coding apparatus and image decoding apparatus
KR101426271B1 (en)*2008-03-042014-08-06삼성전자주식회사Method and apparatus for Video encoding and decoding
WO2010057170A1 (en)2008-11-172010-05-20Cernium CorporationAnalytics-modulated coding of surveillance video
US20100278232A1 (en)*2009-05-042010-11-04Sehoon YeaMethod Coding Multi-Layered Depth Images
KR101885382B1 (en)*2009-07-062018-08-03톰슨 라이센싱Methods and apparatus for spatially varying residue coding
MX2012011846A (en)2010-04-132012-11-30Samsung Electronics Co LtdVideo-encoding method and video-encodi.
JP5707745B2 (en)*2010-06-082015-04-30ソニー株式会社 Image stabilization apparatus, image stabilization method, and program
JP2011259205A (en)*2010-06-092011-12-22Sony CorpImage decoding device, image encoding device, and method and program thereof
GB2481856A (en)*2010-07-092012-01-11British Broadcasting CorpPicture coding using weighted predictions in the transform domain
GB2501517A (en)*2012-04-272013-10-30Canon KkScalable Encoding and Decoding of a Digital Image
US20140086346A1 (en)*2012-09-212014-03-27Samsung Electronics Co., Ltd.Method and system for removal of baseline wander and power-line interference
EP2982117B1 (en)*2013-04-052018-07-25VID SCALE, Inc.Inter-layer reference picture enhancement for multiple layer video coding
CA2908853C (en)2013-04-082019-01-15Arris Technology, Inc.Signaling for addition or removal of layers in video coding
US9667988B2 (en)*2014-04-232017-05-30Samsung Electronics Co., Ltd.Method and apparatus for reducing redundancy in residue signel in video data compression
EP3146724A1 (en)2014-05-212017-03-29ARRIS Enterprises LLCSignaling and selection for the enhancement of layers in scalable video
CA3083172C (en)2014-05-212022-01-25Arris Enterprises LlcIndividual buffer management in transport of scalable video
US10136194B2 (en)*2016-07-062018-11-20Cisco Technology, Inc.Streaming piracy detection method and system
US11399180B1 (en)*2019-04-092022-07-26Apple Inc.Video encoder with quantization control
US11915432B2 (en)2020-01-162024-02-27Samsung Electronics Co., Ltd.Method and apparatus for tracking target
GB2626828A (en)*2023-08-292024-08-07V Nova Int LtdProcessing of residuals in video coding

Citations (27)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
US4833535A (en)1987-02-041989-05-23Kabushiki Kaisha ToshibaImage transmission apparatus
US4969039A (en)1987-07-011990-11-06Nec CorporationImage processing system operable in cooperation with a recording medium
US4982285A (en)1989-04-271991-01-01Victor Company Of Japan, Ltd.Apparatus for adaptive inter-frame predictive encoding of video signal
US5001561A (en)1990-05-021991-03-19At&T Bell LaboratoriesEmbedded coding system for video signals
WO1992006563A2 (en)1990-10-091992-04-16N.V. Philips' GloeilampenfabriekenCoding system for digital signals corresponding to television pictures and corresponding decoding system
US5274453A (en)*1990-09-031993-12-28Canon Kabushiki KaishaImage processing system
EP0589504A1 (en)1992-09-221994-03-30Koninklijke KPN N.V.System comprising at least one encoder for coding a digital signal and at least one decoder for decoding a digital signal
EP0595403A1 (en)1992-10-281994-05-04Laboratoires D'electronique Philips S.A.S.Device for coding digital signals representative of images and corresponding decoding device
US5337049A (en)1991-10-011994-08-09Kabushiki Kaisha ToshibaEfficient coding signal processor
EP0634871A2 (en)1993-07-131995-01-18AT&T Corp.Scalable encoding and decoding of high-resolution progressive video
EP0634872A2 (en)1993-07-121995-01-18Sony CorporationProcessing digital video data
JPH0759095A (en)1993-08-101995-03-03Sony CorpTransmitter and receiver for digital picture signal
JPH0779440A (en)1993-07-121995-03-20Sony CorpTransmitter and receiver for digital picture signal
EP0644695A2 (en)1993-09-211995-03-22AT&T Corp.Spatially scalable video encoding and decoding
US5420638A (en)1992-04-141995-05-30U.S. Philips CorporationSubassembly for coding images with refresh correction of the data to be coded, and subassembly for decording signals representing these images and previously coded by means of a subassembly of the former kind
US5528299A (en)1990-10-091996-06-18U.S. Philips CorporationCoding system for digital signals corresponding to television pictures and corresponding decoding system
US5537440A (en)1994-01-071996-07-16Motorola, Inc.Efficient transcoding device and method
US5592228A (en)1993-03-041997-01-07Kabushiki Kaisha ToshibaVideo encoder using global motion estimation and polygonal patch motion estimation
JPH09182084A (en)1995-10-271997-07-11Toshiba Corp Moving picture coding apparatus and moving picture decoding apparatus
US5686956A (en)1994-12-281997-11-11Hyundai Electronics Industries Co., Ltd.Object-by background information coding apparatus and method
US5706367A (en)*1993-07-121998-01-06Sony CorporationTransmitter and receiver for separating a digital video signal into a background plane and a plurality of motion planes
US5767911A (en)1994-12-201998-06-16Matsushita Electric Industrial Co., Ltd.Object-based digital image predictive coding transfer method and apparatus, and decoding apparatus
US5805221A (en)1994-10-311998-09-08Daewoo Electronics Co., Ltd.Video signal coding system employing segmentation technique
US5812787A (en)1995-06-301998-09-22Intel CorporationVideo coding scheme with foreground/background separation
US5870146A (en)1997-01-211999-02-09Multilink, IncorporatedDevice and method for digital video transcoding
US5978514A (en)1994-11-101999-11-02Kabushiki Kaisha ToshibaImage data coding and decoding system for efficiently compressing information using the shape and position of the image content
US6154495A (en)1995-09-292000-11-28Kabushiki Kaisha ToshibaVideo coding and video decoding apparatus for changing a resolution conversion according to a reduction ratio setting information signal

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
US5521628A (en)1993-08-301996-05-28Lumonics CorporationLaser system for simultaneously marking multiple parts

Patent Citations (32)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
US4833535A (en)1987-02-041989-05-23Kabushiki Kaisha ToshibaImage transmission apparatus
US4969039A (en)1987-07-011990-11-06Nec CorporationImage processing system operable in cooperation with a recording medium
US4982285A (en)1989-04-271991-01-01Victor Company Of Japan, Ltd.Apparatus for adaptive inter-frame predictive encoding of video signal
US5001561A (en)1990-05-021991-03-19At&T Bell LaboratoriesEmbedded coding system for video signals
US5274453A (en)*1990-09-031993-12-28Canon Kabushiki KaishaImage processing system
WO1992006563A2 (en)1990-10-091992-04-16N.V. Philips' GloeilampenfabriekenCoding system for digital signals corresponding to television pictures and corresponding decoding system
US5528299A (en)1990-10-091996-06-18U.S. Philips CorporationCoding system for digital signals corresponding to television pictures and corresponding decoding system
US5337049A (en)1991-10-011994-08-09Kabushiki Kaisha ToshibaEfficient coding signal processor
US5420638A (en)1992-04-141995-05-30U.S. Philips CorporationSubassembly for coding images with refresh correction of the data to be coded, and subassembly for decording signals representing these images and previously coded by means of a subassembly of the former kind
US5510787A (en)1992-09-221996-04-23Koninklijke Ptt Nederland N.V.System comprising at least one encoder for coding a digital signal and at least one decoder for decoding a digital signal, and encoder and decoder for use in the system according to the invention
EP0589504A1 (en)1992-09-221994-03-30Koninklijke KPN N.V.System comprising at least one encoder for coding a digital signal and at least one decoder for decoding a digital signal
US5500677A (en)1992-10-281996-03-19U.S. Philips CorporationDevice for encoding digital signals representing images, and corresponding decoding device
EP0595403A1 (en)1992-10-281994-05-04Laboratoires D'electronique Philips S.A.S.Device for coding digital signals representative of images and corresponding decoding device
US5592228A (en)1993-03-041997-01-07Kabushiki Kaisha ToshibaVideo encoder using global motion estimation and polygonal patch motion estimation
JPH0779440A (en)1993-07-121995-03-20Sony CorpTransmitter and receiver for digital picture signal
EP0634872A2 (en)1993-07-121995-01-18Sony CorporationProcessing digital video data
US5706367A (en)*1993-07-121998-01-06Sony CorporationTransmitter and receiver for separating a digital video signal into a background plane and a plurality of motion planes
EP0634871A2 (en)1993-07-131995-01-18AT&T Corp.Scalable encoding and decoding of high-resolution progressive video
JPH0759095A (en)1993-08-101995-03-03Sony CorpTransmitter and receiver for digital picture signal
EP0644695A2 (en)1993-09-211995-03-22AT&T Corp.Spatially scalable video encoding and decoding
US5537440A (en)1994-01-071996-07-16Motorola, Inc.Efficient transcoding device and method
US5805221A (en)1994-10-311998-09-08Daewoo Electronics Co., Ltd.Video signal coding system employing segmentation technique
US5978514A (en)1994-11-101999-11-02Kabushiki Kaisha ToshibaImage data coding and decoding system for efficiently compressing information using the shape and position of the image content
US5767911A (en)1994-12-201998-06-16Matsushita Electric Industrial Co., Ltd.Object-based digital image predictive coding transfer method and apparatus, and decoding apparatus
US5686956A (en)1994-12-281997-11-11Hyundai Electronics Industries Co., Ltd.Object-by background information coding apparatus and method
US5812787A (en)1995-06-301998-09-22Intel CorporationVideo coding scheme with foreground/background separation
US6154495A (en)1995-09-292000-11-28Kabushiki Kaisha ToshibaVideo coding and video decoding apparatus for changing a resolution conversion according to a reduction ratio setting information signal
US5818531A (en)1995-10-271998-10-06Kabushiki Kaisha ToshibaVideo encoding and decoding apparatus
JPH09182084A (en)1995-10-271997-07-11Toshiba Corp Moving picture coding apparatus and moving picture decoding apparatus
US6028634A (en)1995-10-272000-02-22Kabushiki Kaisha ToshibaVideo encoding and decoding apparatus
US6256346B1 (en)1995-10-272001-07-03Kabushiki Kaisha ToshibaVideo encoding and decoding apparatus
US5870146A (en)1997-01-211999-02-09Multilink, IncorporatedDevice and method for digital video transcoding

Non-Patent Citations (18)

* Cited by examiner, † Cited by third party
Title
Andre Kaup, et al., "Efficient Prediction of Uncovered Background in Interframe Coding Using Spatial Extrapolation", Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP), vol. 5, XP-000533767, Apr. 19, 1994, pp. (V-501)-(V-504).
Edward H. Adelson, et al., "Layered Representations for Vision and Video", Representation of Visual Scenes, XP-010150995, Jun. 1995, pp. 3-9.
Fukuhara Takahiro, et al., "A study of H.263 video coding using object-oriented schemes", Collection of Lectures and Papers in Conventions of the Institute of Electronics, Information Communication Engineers, Sep. 1995, p. 98.
H. G. Musmann et al., European Transactions on Telecommunications and Related Technologies, "Coding Algorithms and VLSI Implementations for Digital TV and HDTV Satellite Broadcasting," vol. 4, No. 1, Jan. 1, 1993, pp. 11-21.
Hideyoshi Tominaga, "On the Studies of Multimedia Document Architecture for B-ISDN", Research Report of Information Processing Society, vol. 93, No. 82, 93-AVM-2, Jun. 18, 1993, pp. 33-40.
L. Vandendorpe, Signal Processing: Image Communication, "Hierarchical Transform and Subband Coding of Video Signals," vol. 4, No. 3, Jun. 1, 1992, pp. 245-262.
Laura Teodosio, et al., "Salient Video Stills: Content and Context Preserved", Proceedings of First ACM International Conference on Multimedia, XP-000670793, Aug. 1, 1993, pp. 39-46.
Minoru Etoh, et al., "An Image Coding Scheme Using Layered Representation and Multiple Templates", Technical Report of IEICE., Mar. 1995, pp. 99-106.
P. J. Tourtier et al., Signal Processing: Image Communication, "Motion Compensated Subband Coding Schemes for Compatible High Definition TV Coding," vol. 4, No. 4/5, Aug. 1, 1992, pp. 325-344.
Peter Gerken, "Object-Based Analysis-Synthesis Coding of Image Sequences at Very Low Bit Rates", IEEE Transactions on Circuits and Systems for Video Technology, vol. 4, No. 3, XP-000460755, Jun. 1, 1994 pp. 228-235.
Shinya Kadono, et al., "An Image Coding Scheme Using Opacity Information", Research Report of Information Processing Society, Jul. 14, 1995, pp. 1-8.
T. K. Tan et al., "A Frequency Scalable Coding Scheme Employing Pyramide and Subband Techniques," IEEE Transactions On Circuits and Systems for Video Technology, vol. 4, No. 2, Apr. 1994, pp. 203-207.
T. K. Tan et al., IEEE Transactions on Circuits and Systems for Video Technology, "A Frequency Scalable Coding Scheme Employing Pyramid and Subband Techniques," vol. 4, No. 2, Apr. 1, 1994, pp. 203-207.
T. Watanabe et al., Proceedings of PCSJ88, 8-2, "DCT Coding with 4 Types of Adaptive Prediction," 1988, pp. 117-118.
Wolfgang Guse, et al., "Effective Exploitation of Background Memory for Coding of Moving Video using Object Mask Generation", Proceedings of the SPIE, Visual Communications and Image Processing '90, vol. 1360, XP-000374164, Oct. 1, 1990, pp. 512-523.
Y. Miyamoto et al., Proceedings of PCSJ88, 7-4, "An Adaptive Prediction Coding Using Background Prediction," 1988, pp. 93-94.
Yasuyuki Nakajima, et al., "3-2 MPEG Video Coding", Television Society Journal, vol. 49, No. 4, Apr. 20, 1995, pp. 435 and 458-463.
Yoshiyuki Yashima, "Compression Technology of Motion Video-MPEG2-", ITE Technical Report of Television Society, vol. 18, No. 47 AIT94-12, Sep. 14, 1994, pp. 7-12 and (with English Abstract).

Cited By (4)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
US20050244070A1 (en)*2002-02-192005-11-03Eisaburo ItakuraMoving picture distribution system, moving picture distribution device and method, recording medium, and program
US7639882B2 (en)*2002-02-192009-12-29Sony CorporationMoving picture distribution system, moving picture distribution device and method, recording medium, and program
US20120243610A1 (en)*2005-09-202012-09-27Rahul SaxenaDynamically configuring a video decoder cache for motion compensation
US8867609B2 (en)*2005-09-202014-10-21Intel CorporationDynamically configuring a video decoder cache for motion compensation

Also Published As

Publication numberPublication date
US5818531A (en)1998-10-06
EP1821545A2 (en)2007-08-22
EP0771119A3 (en)1999-07-14
US6256346B1 (en)2001-07-03
US20020009141A1 (en)2002-01-24
EP1215911A3 (en)2006-05-10
EP0771119A2 (en)1997-05-02
JPH09182084A (en)1997-07-11
EP1215911A2 (en)2002-06-19
US6519285B2 (en)2003-02-11
US6028634A (en)2000-02-22
JP3788823B2 (en)2006-06-21

Similar Documents

PublicationPublication DateTitle
USRE40079E1 (en)Video encoding and decoding apparatus
US10250885B2 (en)System and method for intracoding video data
US9420279B2 (en)Rate control method for multi-layered video coding, and video encoding apparatus and video signal processing apparatus using the rate control method
US8532187B2 (en)Method and apparatus for scalably encoding/decoding video signal
KR100253931B1 (en) Method and apparatus for decoding digital image sequence
JP3888597B2 (en) Motion compensation coding apparatus and motion compensation coding / decoding method
KR100781629B1 (en) A method for reducing the memory required for decompression by storing compressed information using DCT base technology and a decoder for implementing the method
US20040136457A1 (en)Method and system for supercompression of compressed digital video
EP0847204A2 (en)A multi-standard video decompression device
JPH07288474A (en) Vector quantization coding device and decoding device
US7177356B2 (en)Spatially transcoding a video stream
JP2002223443A (en)Transcoding method and transcoder
US20030231796A1 (en)Method and system for optimizing image sharpness during coding and image enhancement
JP2025511537A (en) Method, apparatus and system for encoding and decoding tensors - Patents.com
JP2025511538A (en) Method, apparatus and system for encoding and decoding tensors - Patents.com
JP3576660B2 (en) Image encoding device and image decoding device
US20050129319A1 (en)Fast discrete wavelet encoding apparatus and method for encoding a still image at a high speed based on energy of each block
JP2025512264A (en) Method, apparatus and system for encoding and decoding tensors - Patents.com
US8428116B2 (en)Moving picture encoding device, method, program, and moving picture decoding device, method, and program
JP3914214B2 (en) Image coding apparatus and image decoding apparatus
JP2001238220A (en) Moving picture coding apparatus and moving picture coding method
JP2001231049A (en) Moving picture decoding apparatus and moving picture decoding method
JP3869303B2 (en) Image decoding method and apparatus
KR100293369B1 (en)Digital video compression coding and decoding system using shape adaptive selection and thereof method
EP0859519A2 (en)Video coder using implicit or explicit prediction for image coding

Legal Events

DateCodeTitleDescription
FPAYFee payment

Year of fee payment:8

REMIMaintenance fee reminder mailed
LAPSLapse for failure to pay maintenance fees
