CN1137212A - reconfigurable processing stage - Google Patents

reconfigurable processing stage

Info

Publication number
CN1137212A
Authority
CN
China
Prior art keywords
data
token
level
signal
output
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN95103246A
Other languages
Chinese (zh)
Inventor
Adrian P. Wise
Anthony M. Jones
Martin W. Sotheran
William P. Robbins
Anthony P. J. Claydon
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Discovision Associates
Original Assignee
Discovision Associates
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from GB9405914A
Application filed by Discovision Associates
Publication of CN1137212A
Legal status: Pending


Abstract

A multi-standard video decompression apparatus has a plurality of stages interconnected by a two-wire interface and arranged as a pipeline processor. Control tokens and data tokens pass over the single two-wire interface in token format. A token decode circuit in certain stages recognizes some tokens as control tokens pertinent to that stage and passes unrecognized control tokens along the pipeline. Reconfigurable processing circuits in selected stages respond to a recognized control token by reconfiguring the stage to handle an identified data token. A variety of unique supporting subsystem circuits and processing techniques are also disclosed.

Description

Reconfigurable processing stage
The invention belongs to the field of image processing. It relates to image decompression at the receiving end of an image transmission, and in particular to a reconfigurable processing stage apparatus and a method for it.
The present invention is directed to improvements in methods and apparatus for decompression, operating to decompress and/or decode input signals that have been encoded in a number of different ways. The embodiment chosen for illustration below relates to the decoding of several picture coding standards. In particular, it decodes any of the well-known standards JPEG (a still-picture coding standard), MPEG (a moving-picture coding standard) and H.261 (a videophone coding standard).
The serial pipeline processing system of the present invention comprises a single two-wire bus that carries unique, specialized interactive interface tokens, in the form of control tokens and data tokens, to a plurality of adaptive decompression circuits and the like, forming a reconfigurable pipeline processor.
One prior art system is described in U.S. Patent No. 5,216,724. The apparatus comprises a plurality of compute modules, in a preferred embodiment a total of four compute modules coupled in parallel. Each compute module has a processor, dual-port memory, scratch-pad memory and an arbitration mechanism. A first bus couples the compute modules to a host processor. The device also comprises a shared memory, which is coupled to the host processor and to the compute modules by a second bus.
U.S. Patent No. 4,785,349 discloses a full-motion digital video signal that is compressed, formatted for transmission, recorded on compact disc (CD) media and decoded at conventional video frame rates. During compression, the regions of a frame are analyzed individually to select the optimum fill coding method for each region, and region decoding times are estimated to optimize the compression thresholds. Region descriptive codes conveying the size and location of the regions are grouped together in a first segment of the data stream. Region fill codes conveying the pixel amplitudes of the regions are grouped by fill code type and placed in other segments of the data stream. The data stream segments are individually variable-length coded according to their respective statistical distributions and formatted into data frames. The number of bytes per frame is altered by the addition of auxiliary data, determined by a reverse frame sequence analysis, so as to provide an average number chosen to minimize pauses of the CD during playback, thereby avoiding the unpredictable seek-mode latency that is characteristic of compact discs. The decoder includes a variable-length decoder, responsive to statistical information in the code stream, that separately variable-length decodes the individual segments of the data stream. Region location data is derived from the region descriptive data and applied, together with the region fill codes, to a plurality of region-specific decoders selected by detecting the fill code type (for example relative, absolute, dyad and DPCM). The decoded region pixels are stored in a bitmap for subsequent display.
U.S. Patent No. 4,922,341 discloses a method for scene-model-assisted reduction of image data for digital television signals. A picture signal supplied at time t is to be coded, a predecessor frame from a scene already coded at time t-1 is present in an image store as a reference frame, and the frame-to-frame information is composed of an amplification factor, a shift factor and an adaptively acquired quadtree division structure. When the system is initialized, a uniform, prescribed gray-scale value, or a picture half-tone expressed as a defined luminance value, is written into the image store of the coder at the transmitter and into the image store of the decoder at the receiver, in the same way for all pixels. The image store in the coder and the image store in the decoder are each operated with feedback to themselves, so that the content of the image store can be read out in blocks of variable size, amplified by a luminance factor greater than or less than 1, and written back into the image store at shifted addresses. The blocks of variable size are organized according to a quadtree data structure.
U.S. Patent No. 5,122,875 discloses an apparatus for encoding and decoding an HDTV (high-definition television) signal. The apparatus includes a compression circuit, responsive to high-definition video source signals, that provides hierarchically layered codewords CW representing the compressed video data and associated codewords T defining the type of data represented by the codewords CW. A priority selection circuit, responsive to the codewords CW and T, parses the codewords CW into high-priority and low-priority codeword sequences, which correspond to compressed video data of relatively greater and lesser importance to image reproduction, respectively. A transport processor, responsive to the high-priority and low-priority codeword sequences, forms high-priority and low-priority transport blocks of the respective codewords. Each transport block includes a header, codewords CW and error detection check bits. The respective transport blocks are applied to a forward error check circuit, which adds further error check data. The high-priority and low-priority data are then applied to a modem, where they quadrature-amplitude modulate their respective carriers for transmission.
U.S. Patent No. 5,146,325 discloses a video decompression system for decoding compressed image data in which the odd and even fields of the video signal are independently compressed in sequences of intraframe and interframe compression modes and then interleaved for transmission. The odd and even fields are decompressed independently. During intervals when valid decompressed odd (or even) field data is not available, even (or odd) field data is substituted for it. Decompressing the even and odd fields independently, and substituting the opposite field for unavailable data, can reduce image display latency during system start-up and channel changes.
U.S. Patent No. 5,168,356 discloses a video signal encoding system that includes apparatus for segmenting encoded video data into transport blocks for signal transmission. The transport block format enhances signal recovery at the receiver by providing header data from which the receiver can determine re-entry points into the data stream when transmitted data is lost or corrupted. The number of re-entry points is maximized by embedding secondary transport headers within the encoded video data of each transport block.
U.S. Patent No. 5,168,375 discloses a method for processing a field of image data samples to provide one or more of the functions of decimation, interpolation and sharpening. This is accomplished with an array transform processor such as that employed in a JPEG compression system. In both decimation and interpolation, blocks of data samples are transformed by the discrete even cosine transform (DECT), after which the number of frequency terms is altered. In decimation, the number of frequency terms is reduced, and the inverse transformation then produces a reduced-size matrix of sample points representing the original block of data. In interpolation, additional zero-valued frequency components are inserted into the array of frequency components, and the inverse transformation then produces an enlarged set of data samples without any increase in spectral bandwidth. Sharpening is accomplished by a convolution or filtering operation in which the transforms of the data and of the filter kernel are multiplied in the frequency domain. The inverse transformation then yields a set of blocks of processed data samples. The blocks are overlapped, after which designated samples are saved and excess samples from the regions of overlap are discarded. For a linear-phase filter, the spatial representation of the kernel is modified by reducing the number of components and zero-padding it to equal the number of samples in a data block, after which the discrete odd cosine transform (DOCT) of the padded kernel matrix is formed.
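The decimation and interpolation steps described above can be sketched in a few lines. This is a one-dimensional illustration only, assuming an orthonormal DCT-II in place of the DECT named in the cited patent; the function names `dct`, `idct` and `resample`, and the gain factor that keeps sample amplitudes unchanged, are choices made for this sketch and do not come from the cited patent.

```python
import math

def dct(x):
    """Orthonormal DCT-II of a list of samples (assumed stand-in for the DECT)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def idct(c):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
                 for k in range(1, n))
        out.append(s)
    return out

def resample(x, m):
    """Decimate (m < len(x)) or interpolate (m > len(x)) a block of samples
    by changing the number of frequency terms before the inverse transform."""
    n = len(x)
    c = dct(x)
    # Decimation drops the highest terms; interpolation appends zero-valued terms.
    c = c[:m] + [0.0] * max(0, m - n)
    gain = math.sqrt(m / n)  # keeps amplitudes unchanged for the orthonormal transform
    return idct([gain * v for v in c])
```

For example, `resample(block, 16)` applied to an 8-sample block returns 16 samples with no added high-frequency content, and `resample(block, 4)` returns a 4-sample version built from the four lowest frequency terms.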
U.S. Patent No. 5,175,617 discloses a system and method for transmitting logmap video images over band-limited analog telephone channels. The pixel organization of the logmap image is designed to match the sensor geometry of the human eye, with a greater concentration of pixels at the center. The transmitter divides the frequency band into channels and assigns one or two pixels to each channel. For example, a 3 kHz voice-quality telephone line is divided into 768 channels spaced about 3.9 Hz apart. Each channel consists of two carriers in quadrature, so each channel can carry two pixels. Some channels are reserved for special calibration signals that enable the receiver to detect the phase and amplitude of the received signal. If the sensor and pixels are connected directly to a bank of oscillators and the receiver can receive every channel continuously, the receiver need not be synchronized with the transmitter. An FFT (fast Fourier transform) algorithm implements a fast discrete approximation to the continuous case, in which the receiver synchronizes to the first frame and then acquires a subsequent frame every frame period. The frame rate is low compared with the sampling rate, so the receiver is unlikely to lose frame synchronization once the first frame has been detected. An experimental video telephone transmitted 4 frames per second, applied quadrature coding to 1440-pixel logmap images, and obtained an effective data rate in excess of 40,000 bits per second.
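The figures quoted for this system can be checked with a short calculation. The sketch below uses only the numbers stated above; the variable names are illustrative, and the count of channels left over for calibration signals is an inference from those numbers rather than something the cited patent is known to state.

```python
# Figures taken from the description of U.S. Patent No. 5,175,617 above.
bandwidth_hz = 3000          # 3 kHz voice-quality line
channels = 768
pixels_per_channel = 2       # two carriers in quadrature per channel
image_pixels = 1440
frames_per_second = 4
effective_bits_per_second = 40000   # stated as a lower bound ("in excess of")

channel_spacing_hz = bandwidth_hz / channels            # about 3.9 Hz
pixel_rate = image_pixels * frames_per_second           # pixels sent per second
bits_per_pixel = effective_bits_per_second / pixel_rate # lower bound on bits per pixel

# Inference (not stated in the source): channels not needed for pixel data.
channels_for_pixels = image_pixels // pixels_per_channel
spare_channels = channels - channels_for_pixels

print(f"channel spacing: {channel_spacing_hz:.2f} Hz")
print(f"pixel rate: {pixel_rate} pixels/s, at least {bits_per_pixel:.2f} bits/pixel")
print(f"channels carrying pixels: {channels_for_pixels}, spare: {spare_channels}")
```

The spacing works out to 3.90625 Hz, which agrees with the "about 3.9 Hz" quoted, and the stated rate corresponds to roughly 7 bits per transmitted pixel.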
U.S. Patent No. 5,185,819 discloses a video compression system in which the odd and even fields of a video signal are independently compressed in sequences of intraframe and interframe compression modes. The independently compressed odd and even fields are interleaved for transmission so that the intraframe-compressed even field data occurs midway between successive fields of intraframe-compressed odd field data. The interleaved sequence gives the receiver twice as many entry points into the signal for decoding, without increasing the total amount of data transmitted.
U.S. Patent No. 5,212,742 discloses an apparatus and method for real-time compression and decompression of video data. The apparatus comprises a plurality of compute modules, in a preferred embodiment a total of four compute modules coupled in parallel. Each compute module has a processor, dual-port memory, scratch-pad memory and an arbitration mechanism. A first bus couples the compute modules to a host processor. Lastly, the apparatus comprises a shared memory, which is coupled to the host processor and to the compute modules by a second bus. The method handles the assignment of portions of the image to each processor.
U.S. Patent No. 5,231,484 discloses a system and method for implementing an encoder suitable for use with the proposed ISO/IEC MPEG standards. It comprises three cooperating components or subsystems that adaptively pre-process the incoming digital motion video sequence, allocate bits to the pictures in the sequence, and adaptively quantize the transform coefficients in different regions of a picture, so as to obtain the best visual quality for the number of bits allocated to that picture.
U.S. Patent No. 5,267,334 discloses a method of removing frame redundancy in a computer system for a sequence of moving images. The method comprises detecting a first scene change in the sequence and generating a first keyframe containing complete scene information for a first image. In a preferred embodiment this first keyframe is known as a "forward-facing" keyframe or intraframe, and it is normally present in CCITT compressed video data. The process then generates at least one intermediate compressed frame, which contains difference information from the first image for at least one image that follows the first image in time in the sequence. Such a frame is known as an interframe. Finally, the method detects a second scene change in the sequence and generates a second keyframe containing complete scene information for the image displayed just before the second scene change. This frame is known as a "backward-facing" keyframe. The first keyframe and the intermediate compressed frames are linked for forward play, and the second keyframe and the intermediate compressed frames are linked in reverse for reverse play. The intraframe may also be used to generate complete scene information when the images are played forward. When the sequence is played in reverse, the backward-facing keyframe is used to generate complete scene information.
U.S. Patent No. 5,267,513 discloses a first circuit apparatus, comprising a given number of prior-art image-pyramid stages, together with a second circuit apparatus, comprising the same given number of novel motion-vector stages, that performs cost-effective hierarchical motion analysis (HMA) in real time with minimum system processing delay and/or minimum hardware. Specifically, the first and second circuit apparatus respond to relatively high-resolution image data from an ongoing input series of successive image-data frames of a given pixel density that occur at a relatively high frame rate (for example, 30 frames per second). After a certain processing-system delay, they derive an ongoing output series of successive vector-data frames of the given pixel density that occur at the same frame rate. Each vector-data frame indicates the image motion occurring between a pair of successive image frames.
U.S. Patent No. 5,283,646 discloses a method and apparatus that enable a real-time video encoding system to deliver the desired number of bits per frame accurately while coding the image only once. The system updates the quantization step size used to quantize the coefficients that describe, for example, an image to be transmitted over a communications channel. The data is divided into sectors, each containing a plurality of blocks. The blocks are encoded, for example using DCT coding, to generate a sequence of coefficients for each block. The coefficients can be quantized, and the number of bits required to describe the data varies significantly with the quantization step. At the end of the transmission of each sector, the accumulated actual number of bits expended is compared with the accumulated desired number of bits expended, for a selected number of sectors associated with the particular group of data. The system then readjusts the quantization step size to target a final desired number of bits for a plurality of sectors, for example those describing an image. Various methods are described for updating the quantization step size and for determining the desired bit allocations.
The article by Chong, Yong M., "A Data-Flow Architecture for Digital Image Processing", Wescon Technical Papers: No. 2, October/November 1984, discloses a real-time signal processing system designed specifically for image processing. More particularly, it discloses a token-based data-flow architecture in which the tokens have a fixed width of one word and a fixed-width address field. The system contains a plurality of identical flow processors connected in a ring. Each token contains a data field, a control field and a tag. The tag field is further divided into a processor address field and an identifier field. The processor address field directs the token to the correct data-flow processor, and the identifier field labels the data so that the data-flow processor knows what to do with it. In this way the identifier field acts as an instruction to the data-flow processor. The system directs each token to a specific data-flow processor by means of a module number (MN). If the MN matches the MN of a particular stage, the appropriate operations are performed on the data. If the token is not recognized, it is directed to an output data bus.
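The module-number matching described for this prior-art ring can be sketched as follows. This is a minimal illustration under stated assumptions: the class `RingToken`, the function `route` and the two entries in `OPERATIONS` are hypothetical names invented for the sketch, and the control field is omitted because the text above does not say how it is used.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

@dataclass
class RingToken:
    data: int
    processor: int    # processor address field of the tag (module number, MN)
    identifier: str   # identifier field of the tag, acting as an instruction

# Hypothetical instruction set, for illustration only.
OPERATIONS: Dict[str, Callable[[int], int]] = {
    "double": lambda v: v * 2,
    "negate": lambda v: -v,
}

def route(token: RingToken, module_numbers: List[int]) -> Tuple[Optional[int], int]:
    """Pass the token around the ring of processors.

    Returns (module number that handled the token, resulting data), or
    (None, unchanged data) when no processor recognizes the token and it
    is sent to the output data bus.
    """
    for mn in module_numbers:
        if token.processor == mn:
            return mn, OPERATIONS[token.identifier](token.data)
    return None, token.data
```

With a ring `[1, 2, 3]`, a token addressed to module 2 with identifier `"double"` is processed by module 2, while a token addressed to module 9 is passed through unchanged to the output bus.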
The article by Kimori, S. et al., "An Elastic Pipeline Mechanism by Self-Timed Circuits", IEEE J. of Solid-State Circuits, Vol. 23, No. 1, February 1988, discloses an elastic pipeline built from self-timed circuits. The asynchronous pipeline comprises a plurality of pipeline stages. Each pipeline stage consists of a group of input data latches followed by a combinational logic circuit that carries out the logic operations specific to that stage. The data latches are supplied simultaneously with a triggering signal generated by the data-transfer control circuit associated with the stage. The data-transfer control circuits are interconnected to form a chain, through which send and acknowledge signal lines control handshake-mode data transfer between successive pipeline stages. In addition, each stage generally contains a decoder that selects the operations to be performed on the operands in that stage. The decoder may also be placed in the preceding stage, in order to pre-decode complex decoding and ease critical-path problems in the logic circuit. The elastic nature of the pipeline eliminates any centralized control, because all interworking between the submodules is determined by entirely local decisions and, in addition, each submodule can autonomously perform data buffering and self-timed data-transfer control at the same time. Finally, to increase the elasticity of the pipeline, empty stages are interleaved between the occupied stages to ensure reliable data transfer between stages.
The present invention relates to an improved pipeline system having an input, an output and a plurality of processing stages between the input and the output. The processing stages are interconnected by a two-wire interface that conveys tokens along the pipeline. The control tokens and/or data tokens take the form of universal adaptation units that interface with all of the stages in the pipeline and interact with selected stages, providing control, data and/or combined control-data functions among the processing stages, so that the stages in the pipeline gain flexibility in configuration and processing. According to the invention, some of the processing stages can be reconfigured in response to the recognition of at least one token. One of the processing stages may be a start code detector that receives the input and generates and/or converts tokens.
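The token-driven reconfiguration described above can be illustrated with a small software model. It is a sketch under stated assumptions, not the circuit of the invention: the classes `Token` and `Stage`, the function `run_pipeline` and the token name `CODING_STANDARD` are hypothetical, the two-wire handshake is not modelled, and passing a recognized control token on to later stages is an assumption made here for simplicity.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple

@dataclass
class Token:
    name: str                               # hypothetical token identifier
    payload: List[int] = field(default_factory=list)
    is_control: bool = False

class Stage:
    """One pipeline stage that reconfigures itself on control tokens it recognizes."""

    def __init__(self, name: str, recognized: Set[str]):
        self.name = name
        self.recognized = recognized
        self.config: Optional[int] = None   # current configuration of the stage
        self.handled: List[Tuple[Optional[int], Tuple[int, ...]]] = []

    def step(self, token: Token) -> Token:
        if token.is_control:
            if token.name in self.recognized:
                # Recognized control token: reconfigure this stage.
                self.config = token.payload[0]
            # Unrecognized control tokens are simply passed along.
        else:
            # Data token: record that it was handled under the current configuration.
            self.handled.append((self.config, tuple(token.payload)))
        return token

def run_pipeline(stages: List[Stage], tokens: List[Token]) -> List[Token]:
    """Feed each token through every stage in order and collect the output."""
    output = []
    for token in tokens:
        for stage in stages:
            token = stage.step(token)
        output.append(token)
    return output
```

A stage constructed as `Stage("decoder", {"CODING_STANDARD"})` changes its configuration when a `CODING_STANDARD` control token arrives, while a stage that does not list that name leaves the token untouched and forwards it.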
The present invention also relates to an improved pipeline system having a spatial decoder system for video data. The spatial decoder system includes a Huffman decoder, an index to data and an arithmetic logic unit (ALU), together with a microcode ROM that holds a separate stored program for each of a plurality of different picture compression/decompression standards. These programs are selected by a token, which makes it easy to process a plurality of different picture standards.
The invention will now be described, by way of example and for further explanation, with reference to the drawings.
Description of the drawings
Figure 1 shows six cycles of a six-stage pipeline for different combinations of two internal control signals;
Figures 2a and 2b illustrate a pipeline in which each stage includes auxiliary data storage. They also show the manner in which the pipeline stages can "compress" and "expand" in response to delays in the pipeline;
Figures 3a(1), 3a(2), 3b(1) and 3b(2) illustrate the control of data transfer between the stages of a preferred embodiment of a pipeline that uses a two-wire interface and a multi-phase clock;
Figure 4 is a block diagram of a basic embodiment of a pipeline stage incorporating two-wire transfer control, and also shows two consecutive pipeline processing stages with two-wire transfer control;
Figures 5a and 5b together depict one example of a timing diagram showing the relationship between the timing signals, the input and output data, and the internal control signals used in the pipeline stage shown in Figure 4;
Figure 6 is a block diagram of one example of a pipeline stage that holds its state under the control of an extension bit;
Figure 7 is a block diagram of a pipeline stage that decodes stage activation data words;
Figures 8a and 8b together form a block diagram showing the use of two-wire transfer control in an example "data duplication" pipeline stage;
Figures 9a and 9b together depict one example of a timing diagram showing the two-phase clock, the two-wire transfer control signals and the other internal data and control signals used in the embodiment of Figures 8a and 8b;
Figure 10 is a block diagram of a reconfigurable processing stage;
Figure 11 is a block diagram of the spatial decoder;
Figure 12 is a block diagram of the temporal decoder;
Figure 13 is a block diagram of the video formatter;
Figures 14a to 14c show various arrangements of memory blocks used in the present invention:
Figure 14a is a memory map showing a first arrangement of macroblocks;
Figure 14b is a memory map showing a second arrangement of macroblocks;
Figure 14c is a memory map showing a further arrangement of macroblocks;
Figure 15 shows a Venn diagram of the possible table selection values;
Figure 16 shows the variable length of the picture data used in the present invention;
Figure 17 is a block diagram of the temporal decoder including the prediction filters;
Figure 18 is a pictorial representation of the prediction filtering process;
Figure 19 is a generalized representation of the macroblock structure;
Figure 20 shows a general block diagram of the start code detector;
Figure 21 illustrates an example of numeric data codes in a data stream;
Figure 22 is a block diagram illustrating the relationship between the flag generator, the decode pointer, the header generator, the extra word generator and the output latches;
Figure 23 is a block diagram of the spatial decoder DRAM interface;
Figure 24 is a block diagram of a write swing buffer;
Figure 25 is a schematic diagram illustrating prediction data offset from the block being processed;
Figure 26 is a schematic diagram illustrating prediction data offset by (1,1);
Figure 27 is a block diagram illustrating the Huffman decoder and the parser state machine of the spatial decoder;
Figure 28 is a block diagram illustrating the prediction filter;
Figure 29 shows typical decode system;
Figure 30 has shown JPEG still image decoder;
Figure 31 has shown the JPEG Video Decoder;
Figure 32 has shown multi-standard video decoder;
Figure 33 has shown the starting and ending of token;
Figure 34 has shown token address and data field;
Figure 35 has shown the token that surpasses 8 bit wides at interface;
Figure 36 has shown macroblock structure;
Figure 37 has shown the two-wire interface agreement;
Figure 38 has shown the position of outside two-wire interface;
Figure 39 has shown clock analysis figure;
Figure 40 has shown the two-wire interface sequential;
Figure 41 has shown the access structure example;
Figure 42 shows a read transfer cycle;
Figure 43 shows the access start timing;
Figure 44 shows an example access with two write transfers;
Figure 45 shows a read transfer cycle;
Figure 46 shows a write transfer cycle;
Figure 47 shows a refresh cycle;
Figure 48 shows a 32-bit data bus and 256k-deep DRAMs (9-bit row address);
Figure 49 has shown the timing parameters of any gating signal;
Figure 50 has shown the timing parameters between any two gating signals;
Figure 51 has shown the timing parameters between bus and the gating;
Figure 52 has shown the timing parameters between bus and the gating;
Figure 53 has shown that MPI reads sequential;
Figure 54 has shown that MPI writes sequential;
Figure 55 shows the organization of large integers in the memory map;
Figure 56 shows a typical decoder clocking scheme;
Figure 57 has shown the input clock requirement;
Figure 58 has shown spatial decoder;
Figure 59 has shown the input and output of input circuit;
Figure 60 shows the coded data port protocol;
Figure 61 shows the start code detector;
Figure 62 shows start codes being detected and converted to tokens;
Figure 63 shows the start code detector passing tokens;
Figure 64 shows overlapping MPEG start codes (byte aligned);
Figure 65 shows overlapping MPEG start codes (not byte aligned);
Figure 66 has shown the redirect between two video sequences;
Figure 67 has shown the sequence that additional token inserts;
Figure 68 shows the decoder start-up control;
Figure 69 shows streams being allowed to queue before the output;
Figure 70 shows the spatial decoder buffer;
Figure 71 has shown buffer pointer;
Figure 72 has shown the video decomposer;
Figure 73 has shown the structure of an image;
Figure 74 shows the structure of a 4:2:2 macroblock;
Figure 75 shows the calculation of macroblock dimensions from pel dimensions;
Figure 76 has shown the space decoding;
Figure 77 has shown the H.261 general survey of re-quantization;
Figure 78 has shown the general survey of JPEG re-quantization;
Figure 79 has shown the general survey of MPEG re-quantization;
Figure 80 shows the memory map of the quantization tables;
Figure 81 shows an overview of the JPEG baseline sequential structure;
Figure 82 shows a tokenized JPEG picture;
Figure 83 has shown temporal decoder;
Figure 84 has shown the image buffer explanation;
Figure 85 has shown MPEG image sequence (m=3);
Figure 86 has shown how " I " image stores and export;
Figure 87 has shown how " P " image forms, storage and output;
Figure 88 has shown how " B " image forms and export;
Figure 89 has shown " P " pixel format;
Figure 90 shows H.261 prediction formation;
Figure 91 shows an H.261 "sequence";
Figure 92 shows the H.261 syntax hierarchy;
Figure 93 shows the H.261 picture layer;
Figure 94 shows the H.261 arrangement of groups of blocks;
Figure 95 shows the H.261 "slice" layer;
Figure 96 shows the H.261 arrangement of macroblocks;
Figure 97 shows the H.261 sequence of blocks;
Figure 98 shows the H.261 macroblock layer;
Figure 99 shows the H.261 arrangement of pels in a block;
Figure 100 shows the hierarchy of the MPEG syntax;
Figure 101 shows the MPEG sequence layer;
Figure 102 shows the MPEG group of pictures layer;
Figure 103 shows the MPEG picture layer;
Figure 104 shows the MPEG "slice" layer;
Figure 105 shows the MPEG sequence of blocks;
Figure 106 shows the MPEG macroblock layer;
Figure 107 shows an "open GOP";
Figure 108 has shown the example of access structure;
Figure 109 has shown the initial sequential of access;
Figure 110 shows a fast page read cycle;
Figure 111 shows a fast page write cycle;
Figure 112 shows a refresh cycle;
Figure 113 shows the extraction of row and column addresses from a chip address;
Figure 114 has shown the timing parameters of any gating signal;
Figure 115 has shown the timing parameters between any two gating signals;
Figure 116 has shown the timing parameters between bus and gating;
Figure 117 has shown the timing parameters between a bus and the gating;
Figure 118 shows the Huffman decoder and parser;
Figure 119 shows the H.261 and MPEG AC coefficient decoding flow chart;
Figure 120 shows the block diagram for JPEG (AC and DC) coefficient decoding;
Figure 121 shows the flow chart for JPEG (AC and DC) coefficient decoding;
Figure 122 shows the Huffman token formatter;
Figure 123 shows the token formatter block diagram;
Figure 124 shows H.261 and MPEG AC coefficient decoding;
Figure 125 shows the interface to the Huffman ALU;
Figure 126 shows the basic structure of the Huffman ALU;
Figure 127 has shown buffer-manager;
Figure 128 has shown the block diagram of imodel and hsppk;
Figure 129 has shown the imex state diagram;
Figure 130 illustrates buffer start-up;
Figure 131 shows a DRAM interface;
Figure 132 shows a write swing buffer;
Figure 133 shows an arithmetic block;
Figure 134 shows the iq block diagram;
Figure 135 shows the iqca state machine;
Figure 136 shows the IDCT one-dimensional transform algorithm;
Figure 137 shows an IDCT one-dimensional transform structure;
Figure 138 shows a token stream block diagram;
Figure 139 shows the structure of a standard block;
Figure 140 is a block diagram of microprocessor test access;
Figure 141 shows the one-dimensional transform macrostructure;
Figure 142 shows a temporal decoder block diagram;
Figure 143 shows the structure of a two-wire interface stage;
Figure 144 shows the address generator block diagram;
Figure 145 shows the block and pixel offsets;
Figure 146 shows multiple prediction filters;
Figure 147 shows a single prediction filter;
Figure 148 shows the one-dimensional prediction filter;
Figure 149 shows a block of pixels;
Figure 150 shows the structure of the read pointer;
Figure 151 shows the block and pixel offsets;
Figure 152 shows a prediction example;
Figure 153 shows the read cycle;
Figure 154 shows the write cycle;
Figure 155 shows a block diagram of the top-level registers with timing references;
Figure 156 shows the control for incrementing the presentation number;
Figure 157 shows the buffer manager state machine (complete);
Figure 158 shows the main loop of the state machine;
Figure 159 shows buffer 0 containing a SIF picture (22 by 18 macroblocks);
Figure 160 shows SIF component 0 with a display window;
Figure 161 shows an example pixel format with storage block addresses;
Figure 162 shows buffer 0 containing a SIF picture (22 by 18 macroblocks);
Figure 163 shows an example address calculation;
Figure 164 shows a write address generator state machine;
Figure 165 shows a slice of the data path;
Figure 166 shows a two-cycle operation of the data path;
Figure 167 shows mode 1 filtering;
Figure 168 shows the horizontal up-sampler data path;
Figure 169 shows the structure of the color space converter.
Description of the invention
Briefly, and in general terms, the present invention provides an input, an output, and a plurality of processing stages between the input and the output. The processing stages are interconnected by a two-wire interface that conveys tokens along the pipeline. Control and/or data tokens, in the form of universal adaptation units, interface with all of the stages in the pipeline and interact with selected stages for control, data and/or combined control-data functions among the processing stages. The processing stages in the pipeline thereby gain flexibility in configuration and processing.
Each processing stage in the pipeline may include both primary and secondary storage. The stages in the pipeline are reconfigurable in response to recognition of selected tokens. The tokens in the pipeline are dynamically adaptive and, in performing their functions, may be either position dependent on or position independent of the processing stages.
In a pipeline machine according to the invention, the tokens may be altered by interfacing with the stages, and the tokens may interact with all of the processing stages in the pipeline or with only some of them. The tokens in the pipeline may interact with adjacent or with non-adjacent processing stages, and the tokens may reconfigure the processing stages. Such tokens may be position dependent for some functions in the pipeline and position independent for others.
The tokens, in combination with the reconfigurable processing stages, provide a basic building block for the pipeline system. The interaction of a token with a processing stage in the pipeline may be conditioned by the previous processing history of that stage. The tokens may have address fields that characterize them, and the interaction with a processing stage may be determined by these address fields.
In an improved pipeline machine according to the invention, each token may include an extension bit that indicates the presence of additional words in the token and identifies the last word of the token. The address fields may be of variable length and may also be Huffman coded.
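The role of the extension bit can be illustrated with a minimal sketch. The word layout (a pair of extension bit and payload) and the function name `split_tokens` are assumptions made for illustration only; the text states just that the extension bit signals further words and marks the last word of a token.

```python
from typing import Iterable, Iterator, List, Tuple

Word = Tuple[int, int]  # (extension_bit, payload); this layout is hypothetical


def split_tokens(words: Iterable[Word]) -> Iterator[List[int]]:
    """Group a flat word stream into tokens using the extension bit.

    extension_bit == 1: more words of the same token follow.
    extension_bit == 0: this is the last word of the token.
    """
    current: List[int] = []
    for ext, payload in words:
        current.append(payload)
        if ext == 0:
            yield current
            current = []
    if current:
        raise ValueError("stream ended in the middle of a token")


# Three tokens of different lengths; no length field is needed.
stream = [(1, 0x12), (1, 0xAA), (0, 0xBB), (0, 0x15), (1, 0x20), (0, 0x01)]
tokens = list(split_tokens(stream))
```

Because the end of a token is marked in the stream itself, a receiver of this kind does not need to know the token length in advance.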
In the improved pipeline machine, the tokens may be generated by a processing stage. A token may carry data for transfer to the processing stages or may carry no data at all. Some tokens may be identified as data tokens, which provide data to the processing stages in the pipeline, while others are identified as control tokens, which only condition the processing stages, such conditioning including reconfiguration of the stages. Still other tokens may provide both data and conditioning to the processing stages. Some tokens may identify coding standards to the processing stages in the pipeline, whereas others may operate independently of any coding standard. The tokens may be successively altered by the processing stages in the pipeline.
According to the invention, the interactive flexibility of the tokens, in cooperation with the processing stages, facilitates greater functional diversity of the processing stages for a given structure in the pipeline, and the flexibility of the tokens facilitates system expansion and/or alteration. The tokens may facilitate a plurality of functions within any processing stage in the pipeline. The pipeline tokens may be either hardware based or software based. The tokens thus facilitate more efficient use of system bandwidth in the pipeline. The tokens may provide data and control simultaneously to the processing stages in the pipeline.
The invention may include a pipeline processor for handling a plurality of separately encoded bit streams arranged as a single serial bit stream of digital bits, in which separately encoded pairs of control codes and corresponding data are carried in the serial bit stream, and which employs a plurality of stages interconnected by a two-wire interface. The processor is further characterized by a start code detector, responsive to the single serial bit stream, that generates control tokens and data tokens for application to the two-wire interface; by a token decode circuit, positioned in certain of the stages, that recognizes certain tokens as control tokens pertinent to that stage and passes unrecognized control tokens along the pipeline; and by a reconfigurable decode and parser means that responds to a recognized control token by reconfiguring a particular stage to handle an identified data token.
The pipeline machine may also include first and second registers, the first positioned as an input of the decode and parser means and the second positioned as an output of the decode and parser means. One of the processing stages may be a spatial decoder, and a second stage may be a token generator that generates control tokens and data tokens for passage along the two-wire interface. A token decode means is positioned in the spatial decoder to recognize certain tokens as control tokens pertinent to the spatial decoder and to configure the spatial decoder to spatially decode the data tokens that follow a control token into a first decoded format.
A further stage may be a temporal decoder positioned downstream of the spatial decoder in the pipeline. A second token decode means, positioned in the temporal decoder, recognizes certain tokens as control tokens pertinent to the temporal decoder and configures the temporal decoder to temporally decode the data tokens that follow the control token into a first decoded format. The temporal decoder may use a reconfigurable prediction filter that is reconfigured by a prediction token.
Data may be moved along the two-wire interface within the temporal decoder in 8 × 8 pel data blocks, and address means may be provided for storing and retrieving such data blocks along block boundaries. The address means may also store and retrieve data blocks across block boundaries. The address means reorders the blocks as picture data for display. The data blocks stored and retrieved may be larger and/or smaller than 8 × 8 pel data blocks. Circuit means may also be provided either to display the output of the temporal decoder or to write it back to a picture memory location. The decoded format may be either a still picture format or a moving picture format.
According to the invention, a processing stage may also include a token decoder for decoding the address of a token and an action identifier, responsive to the token decoder, that implements configuration of the processing stage. The processing stages reside in a pipeline processor having a plurality of processing stages interconnected by a two-wire interface bus, with control tokens and data tokens passing over the two-wire interface. A token decode circuit is positioned in certain of the processing stages to recognize certain tokens as control tokens pertinent to that stage and to pass unrecognized control tokens along the pipeline. A first, input latch circuit may be positioned on the two-wire interface preceding the processing stage, and a second, output latch circuit may be positioned on the two-wire interface following the processing stage. The token decode circuit is connected to the two-wire interface through the first input latch. Predetermined processing stages may include a decoding circuit connected to a predetermined data storage device, whereby each processing stage assumes the active state only when the stage contains a predetermined stage activation signal pattern, and remains in the active mode until the stage contains a predetermined stage deactivation pattern.
The invention also provides, in a digital picture information processing system, means for selectively configuring the system to process data in accordance with a plurality of different picture compression/decompression standards. The picture standards may include JPEG, MPEG and/or H.261, or any other standards and any combination of such picture standards, without departing from the spirit and scope of the invention. According to the invention, the system may include a spatial decoder for video data having a Huffman decoder, an index to data and an arithmetic logic unit, together with a microcode ROM that holds a separate stored program for each of a plurality of different picture compression/decompression standards. These programs are selected by an interfacing adaptation unit in the form of a token, which facilitates processing for a plurality of picture standards. A multi-standard system according to the invention may use tokens for its operation regardless of the selected picture standard, and the tokens may serve as a generic communication protocol in the system for all of the various picture standards. The system may be further characterized by a multi-standard token for mapping differently encoded data streams, arranged on a single serial data stream, onto a single decoder that uses a mixture of standard-dependent and standard-independent hardware and control tokens. The system may also include address generation means for arranging macroblocks of data associated with different picture standards into a common addressing scheme.
The above and other objects and advantages of the invention will become apparent from the following more detailed description.
In the ensuing description of the practice of the invention, the following terms are used frequently and are generally defined by the following glossary:
Vocabulary
Block: an 8-row by 8-column matrix of pels, or 64 DCT coefficients (source, quantized or dequantized).
Chrominance (component): a matrix, block or single pel representing one of the two color difference signals related to the primary colors in the manner defined in the bit stream. The symbols used for the color difference signals are Cr and Cb.
Coded representation: a data element as represented in its encoded form.
Coded video bit stream: a coded representation of a series of one or more pictures as defined in this specification.
Coded order: the order in which the pictures are transmitted and decoded. This order is not necessarily the same as the display order.
Component: a matrix, block or single pel from one of the three matrices (luminance and two chrominance) that make up a picture.
Compression: a reduction in the number of bits used to represent an item of data.
Decoder: an embodiment of a decoding process.
Decoding (process): the process defined in this specification that reads an input coded bit stream and produces decoded pictures or audio samples.
Display order: the order in which the decoded pictures are displayed. Typically, this is the same order in which they were presented at the input of the encoder.
Encoding (process): a process, not specified in this specification, that reads a stream of input pictures or audio samples and produces a valid coded bit stream as defined in this specification.
Intra coding: coding of a macroblock or picture that uses information only from that macroblock or picture.
Luminance (component): a matrix, block or single pel representing a monochrome representation of the signal and related to the primary colors in the manner defined in the bit stream. The symbol used for luminance is Y.
Macroblock: the four 8 × 8 blocks of luminance data and the two (for 4:2:0 chroma format), four (for 4:2:2 chroma format) or eight (for 4:4:4 chroma format) corresponding 8 × 8 blocks of chrominance data that come from a 16 × 16 section of the luminance component of the picture. "Macroblock" sometimes refers to the pel data and sometimes to the coded representation of the pel values and of the other data elements defined in the macroblock header of the syntax defined in this part of the specification. The usage is clear from the context to one of ordinary skill in the art.
Motion compensation: the use of motion vectors to improve the efficiency of the prediction of pel values. The prediction uses motion vectors to provide offsets into the past and/or future reference pictures, which contain previously decoded pel values that are used to form the prediction error signal.
Motion vector: a two-dimensional vector used for motion compensation that provides an offset from the coordinate position in the current picture to the coordinates in a reference picture.
Non-intra coding: coding of a macroblock or picture that uses information both from itself and from macroblocks and pictures occurring at other times.
Pel: a picture element (pixel).
Picture: source, coded or reconstructed image data. A source or reconstructed picture consists of three rectangular matrices of 8-bit numbers representing the luminance and the two chrominance signals. For progressive video, a picture is identical to a frame; for interlaced video, a picture can refer to a frame, or to the top field or the bottom field of a frame, depending on the context.
Prediction: the use of a predictor to provide an estimate of the pel value or data element currently being decoded.
Reconfigurable processing stage (RPS): a stage that, in response to a recognized token, reconfigures itself to perform various operations.
Slice: a series of macroblocks.
Token: a universal adaptation unit, in the form of an interactive interfacing messenger package, for control and/or data functions.
Start codes (system and video): unique 32-bit codes embedded in the coded bit stream. They are used for several purposes, including identifying some of the structures in the coding syntax.
Variable length coding (VLC): a reversible coding procedure that assigns shorter code words to frequent events and longer code words to less frequent events.
Video sequence: a series of one or more pictures.
Detailed description of the embodiments
As an introduction to the most general features of the pipeline system used in the preferred embodiments of the invention, Fig. 1 is a greatly simplified illustration of six cycles of a six-stage pipeline. (As explained in greater detail below, the preferred embodiment of the pipeline includes several advantageous features not shown in Fig. 1.)
Referring now to the drawings, in which like reference numerals denote like or corresponding elements throughout the figures, and more particularly to Fig. 1, there is shown a block diagram of six cycles in an example of the invention. Each row of boxes illustrates one cycle, and the different stages are labelled A to F. Each shaded box indicates that the corresponding stage holds valid data, i.e., data that is to be processed in one of the pipeline stages. After processing (which may involve nothing more than a simple transfer without manipulation of the data), valid data is transferred out of the pipeline as valid output data.
Note that an actual pipeline application may include more or fewer than six pipeline stages. The invention may be used with any number of pipeline stages. Furthermore, data may be processed in more than one stage, and the processing time may differ from stage to stage.
In addition to clock and data signals (described below), the pipeline includes two transfer control signals: a "VALID" signal and an "ACCEPT" signal. These signals control the transfer of data within the pipeline. The VALID signal, illustrated as the upper of the two lines connecting neighboring stages, passes in the forward or downstream direction from each pipeline stage to the nearest neighboring device. This device may be another pipeline stage or some other system. For example, the last pipeline stage may pass its data on to subsequent processing circuitry. The ACCEPT signal, illustrated as the lower of the two lines connecting neighboring stages, passes in the other direction, upstream, to the preceding device.
A data pipeline system of the type used in the practice of the invention has, in preferred embodiments, one or more of the following characteristics:
1. The pipeline is "elastic", so that a delay at a particular pipeline stage causes the least possible disturbance to the other pipeline stages. Succeeding pipeline stages are allowed to continue processing, which means that gaps open up in the data stream following the delayed stage. Similarly, preceding pipeline stages may also continue where possible. In this case, any gaps in the data stream may, wherever possible, be removed from the stream.
2. The control signals that arbitrate the pipeline are organized so that they only ever propagate to the nearest neighboring pipeline stages. For signals flowing in the same direction as the data, this is the immediately succeeding stage. For signals flowing in the direction opposite to the data, this is the immediately preceding stage.
3. The data in the pipeline is encoded so that many different types of data can be processed in the pipeline. This encoding accommodates data packets of variable size, and the size of a packet need not be known in advance.
4. The overhead associated with describing the type of data is as small as possible.
5. Each pipeline stage can recognize only the minimum number of data types needed for its required function. It should, however, still be able to pass all data types on to the succeeding stage, even those it does not recognize. This enables communication between non-adjacent pipeline stages.
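Characteristic 5 can be sketched in a few lines. The sketch below is illustrative only: the `Stage` class, the token layout (first word is the address) and the addresses `DOUBLE` and `NEGATE` are hypothetical names, not taken from the specification. It shows a stage acting on the token types it knows and forwarding everything else unchanged, so that a token can reach a non-adjacent stage.

```python
from typing import Callable, Dict, List

Token = List[int]  # first word carries the address; remaining words are data


class Stage:
    """A stage that acts on the addresses it knows and forwards the rest."""

    def __init__(self, name: str, handlers: Dict[int, Callable[[Token], Token]]):
        self.name = name
        self.handlers = handlers

    def process(self, token: Token) -> Token:
        handler = self.handlers.get(token[0])
        if handler is None:
            return token  # unrecognized token: pass it along unchanged
        return handler(token)


def run_pipeline(stages: List[Stage], tokens: List[Token]) -> List[Token]:
    out = []
    for token in tokens:
        for stage in stages:
            token = stage.process(token)
        out.append(token)
    return out


DOUBLE = 0x01  # hypothetical address recognized only by the first stage
NEGATE = 0x02  # hypothetical address recognized only by the third stage

stages = [
    Stage("first", {DOUBLE: lambda t: [t[0]] + [2 * x for x in t[1:]]}),
    Stage("second", {}),  # recognizes nothing, forwards everything
    Stage("third", {NEGATE: lambda t: [t[0]] + [-x for x in t[1:]]}),
]
result = run_pipeline(stages, [[DOUBLE, 3, 4], [NEGATE, 5], [0x7F, 9]])
```

The `NEGATE` token passes through the first two stages untouched and is acted on only by the third, and the token with the unknown address `0x7F` leaves the pipeline unchanged.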
Although not shown in Fig. 1, there are data lines, either a single line or several parallel lines, which form a data bus that also leads into and out of each pipeline stage. As explained and illustrated in greater detail below, data is transferred into, out of, and between the stages of the pipeline over these data lines.
Note that the first pipeline stage may receive data and control signals from any form of preceding device, for example the reception circuitry of a digital image transmission system, another pipeline, or the like. Alternatively, it may itself generate all or part of the data to be processed in the pipeline. Indeed, as explained below, a "stage" may contain arbitrary processing circuitry, including none at all (for simply passing data on) or entire systems (for example, another pipeline or even multiple systems or pipelines), and it may generate, change and delete data as desired.
When a pipeline stage contains valid data that is to be transferred down the pipeline, the VALID signal, which indicates data validity, need not be transferred further than the immediately following pipeline stage. A two-wire interface is therefore included between every pair of pipeline stages in the system. This includes a two-wire interface between a preceding device and the first stage, and between the last stage and a subsequent device, if such devices are present and data is to be transferred between them and the pipeline.
Each of the signals, ACCEPT and VALID, has a HIGH and a LOW value, abbreviated "H" and "L" respectively. The most common applications of the pipeline in practicing the invention will typically be digital. In such digital implementations, the HIGH value may, for example, be a logical "1" and the LOW value a logical "0". The system is not restricted to digital implementations, however. In analog implementations, the HIGH value may be a voltage or other similar quantity above (or below) a set threshold, with the LOW value indicated by the corresponding signal being below (or above) the same or some other threshold. For digital applications, the invention may be implemented using any known technology, such as CMOS or bipolar.
It is not necessary to use a distinct storage device and wires to store the VALID signal. This is true even in a digital embodiment. All that is required is that the indication of the "validity" of the data be stored along with the data. By way of example only, in digital television pictures represented by digital values as specified in the international standard CCIR 601, certain specific values are not allowed. In that system, eight-bit binary numbers represent the samples of the picture, and the values zero and 255 may not be used.
If such a picture were processed in a pipeline built in the practice of the invention, one of these values (zero, for example) could be used to indicate that the data in a specific stage of the pipeline is not valid. Any non-zero data would accordingly be deemed valid. In this example, there is no specific latch that can be identified as storing the "validity" of the associated data. Nonetheless, the validity of the data is stored along with the data.
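A minimal sketch of this idea, assuming zero is chosen as the marker (the text offers zero only as an example, and the names `INVALID`, `is_valid` and `load` are hypothetical): validity is derived from the stored sample itself, so no separate VALID latch appears.

```python
INVALID = 0  # reserved sample value reused as the "no valid data" marker (assumption)


def is_valid(sample: int) -> bool:
    """A stored sample is valid when it is an 8-bit value other than the marker."""
    if not 0 <= sample <= 255:
        raise ValueError("sample must fit in eight bits")
    return sample != INVALID


def load(current: int, incoming: int) -> int:
    """Keep valid data in place; let anything overwrite invalid data."""
    return current if is_valid(current) else incoming


# An empty stage takes the new sample, an occupied stage keeps its own,
# and an empty stage offered invalid data stays empty.
held = [load(0, 128), load(64, 128), load(0, 0)]
```

The sketch uses only the zero value; the value 255, which the text also lists as disallowed, is treated here as ordinary data.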
As shown in Fig. 1, the state of the VALID signal into each stage is indicated by an "H" or an "L" on the upper, right-pointing arrow. Thus, the VALID signal from Stage A into Stage B is LOW, and the VALID signal from Stage D into Stage E is HIGH. The state of the ACCEPT signal into each stage is indicated by an "H" or an "L" on the lower, left-pointing arrow. Thus, the ACCEPT signal from Stage E into Stage D is HIGH, whereas the ACCEPT signal from the device connected downstream of the pipeline into Stage F is LOW.
Data is transferred from one stage to the next during a cycle (explained below) whenever the ACCEPT signal from the downstream stage into its upstream neighbor is HIGH. If the ACCEPT signal between two stages is LOW, no data is transferred between them.
Referring again to Fig. 1, a shaded box indicates, by way of example, that the corresponding pipeline stage contains valid output data. Likewise, the VALID signal passed from that stage to the following stage is HIGH. Fig. 1 illustrates the pipeline when Stages B, D and E contain valid data and Stages A, C and F do not. At the beginning, the VALID signal into Stage A is HIGH, meaning that the data on the transmission line into the pipeline is valid.
Also at this time, the ACCEPT signal into Stage F is LOW, so that no data, whether valid or not, is transferred out of Stage F. Note that both valid and invalid data are transferred between pipeline stages. Invalid data, which is data not worth saving, may be overwritten, thereby eliminating it from the pipeline. Valid data, however, must not be overwritten, since it must be saved for processing or for use in a downstream device, e.g., a pipeline stage, a device, or a system connected to the pipeline that receives data from the pipeline.
In the pipeline illustrated in Fig. 1, Stage E contains valid data D1, Stage D contains valid data D2, Stage B contains valid data D3, and a device (not shown) connected upstream of the pipeline contains data D4 that is to be transferred into and processed in the pipeline. Stages B, D and E, as well as the upstream device, contain valid data, and the VALID signal from each of these stages or devices into its following device is therefore HIGH. The VALID signal from Stages A, C and F is LOW, since these stages do not contain valid data.
Assume now that the device connected downstream of the pipeline is not ready to accept data from the pipeline. The device signals this by setting the corresponding ACCEPT signal into Stage F LOW. Stage F itself, however, does not contain valid data and is therefore able to accept data from the preceding Stage E. The ACCEPT signal from Stage F into Stage E is therefore set HIGH.
Similarly, Stage E contains valid data and Stage F is ready to accept it. Stage E can therefore accept new data as long as the valid data D1 is first transferred to Stage F. In other words, although Stage F cannot transfer data downstream, all the other stages can do so without any valid data being overwritten or lost. At the end of Cycle 1, data can therefore be "shifted" one step to the right. This condition is shown in Cycle 2.
In the illustrated example, the downstream device is still not ready to accept new data in Cycle 2, so the ACCEPT signal into Stage F is still LOW. Stage F cannot accept new data, since doing so would cause the valid data D1 to be overwritten and lost. The ACCEPT signal from Stage F into Stage E therefore goes LOW, as does the ACCEPT signal from Stage E into Stage D, since Stage E also contains valid data, D2. Stages A to D, however, are all able to accept new data (either because they do not contain valid data or because they can shift their valid data downstream and accept new data), and they signal this condition to their immediately preceding neighbors by setting their corresponding ACCEPT signals HIGH.
The state of the pipeline after Cycle 2 is illustrated in Fig. 1 in the row labelled Cycle 3. By way of example, it is assumed that the downstream device is still not ready to accept new data from Stage F (the ACCEPT signal into Stage F is LOW). Stages E and F are therefore still "blocked", but in Cycle 3 Stage D has received the valid data D3, which has overwritten the invalid data previously in that stage. Since Stage D cannot pass on data D3 in Cycle 3, it cannot accept new data and therefore sets the ACCEPT signal into Stage C LOW. Stages A to C, however, are ready to accept new data and signal this by setting their corresponding ACCEPT signals HIGH. Note that data D4 has been shifted from Stage A to Stage B.
Assume now that the downstream device becomes ready to accept new data in Cycle 4. It signals this to the pipeline by setting the ACCEPT signal into Stage F HIGH. Although Stages C to F contain valid data, they can now shift the data downstream and are thus able to accept new data. Since every stage is able to shift its data one step downstream, each sets its outgoing ACCEPT signal HIGH.
As long as the ACCEPT signal into the final pipeline stage (Stage F in this example) is HIGH, the pipeline shown in Fig. 1 acts as a rigid pipeline and simply shifts data one step downstream on each cycle. Accordingly, in Cycle 5, data D1, which was held in Stage F in Cycle 4, is shifted out of the pipeline to the subsequent device, and all other data is shifted one step downstream.
Assume now that the ACCEPT signal into Stage F goes LOW in Cycle 5. Once again, this means that Stages D to F cannot accept new data, and the ACCEPT signals out of these stages into their immediately preceding neighbors go LOW. The data D2, D3 and D4 therefore cannot shift downstream, but the data D5 can. The corresponding state of the pipeline after Cycle 5 is shown in Fig. 1 as Cycle 6.
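The cycle-by-cycle behavior described above can be reproduced with a small simulation. This is a sketch under stated assumptions, not the circuit of the invention: `None` stands for a stage without valid data, the function name `step` is hypothetical, and the cycle in which the upstream device offers D5 (Cycle 4 here) is assumed, since the text does not state it.

```python
from typing import List, Optional, Tuple

Data = Optional[str]  # None models a stage that holds no valid data


def step(stages: List[Data], upstream: Data, downstream_accept: bool
         ) -> Tuple[List[Data], Data, List[bool]]:
    """Advance the pipeline by one cycle.

    accept[i] is the ACCEPT signal that stage i sends to its upstream
    neighbor; accept[n] is the ACCEPT signal from the downstream device.
    """
    n = len(stages)
    accept = [False] * (n + 1)
    accept[n] = downstream_accept
    # ACCEPT ripples upstream: a stage accepts if it is empty
    # or if it can pass its own data on in the same cycle.
    for i in range(n - 1, -1, -1):
        accept[i] = stages[i] is None or accept[i + 1]
    # A stage that accepts loads whatever its upstream neighbor holds
    # (possibly invalid data); a stage that does not accept keeps its data.
    new = [
        (stages[i - 1] if i > 0 else upstream) if accept[i] else stages[i]
        for i in range(n)
    ]
    output = stages[n - 1] if accept[n] else None
    return new, output, accept


stages = [None, "D3", None, "D2", "D1", None]     # Stages A..F in Cycle 1
offered = ["D4", None, None, "D5", None]          # upstream data per cycle (assumed)
downstream = [False, False, False, True, False]   # ACCEPT into Stage F per cycle

history = [stages]
outputs = []
for item, ok in zip(offered, downstream):
    # In this scenario Stage A always accepts, so no offered item is dropped.
    stages, out, _ = step(stages, item, ok)
    history.append(stages)
    outputs.append(out)

for cycle, state in enumerate(history, start=1):
    print(cycle, state)
```

In this run, D1 leaves the pipeline in Cycle 4, and the state after Cycle 5 has D2, D3 and D4 held in Stages F, E and D while D5 has moved from Stage A to Stage B, matching the description of Cycle 6.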
According to the preferred embodiments of the present invention, the mutually isolation because the processing level in the streamline becomes, the unappropriated ability of processing level of " filling " of streamline is very useful. In other words, although pipeline stages receive data immediately not, whole streamline also needn't stop and the latency delays level. On the contrary, when one-level can not receive valid data, it formed one interim " wall " simply in streamline. However, the lower of " wall " connects continuation secured transmission of payload data at different levels, even to the circuit that is connected with streamline, " wall " left side is at different levels still to be received and downward secured transmission of payload data. Even when some pipeline stages temporarily can not receive new data, other level still can continue normal operation. Especially as long as the A level not yet comprises the valid data that can not advance owing to next stage does not receive new data immediately, streamline just can continue receive data and enter its initial A level. So example is described, even when one or more processing grade obstruction, data still can be conveyed into streamline and at different levels between.
In the embodiment shown in Fig. 1, it is assumed that the pipeline stages do not store the accept signals they receive from their immediately following stages. Instead, whenever the accept signal into a downstream stage goes low, this low signal propagates upstream as far as the nearest pipeline stage that does not contain valid data. For example, referring to Fig. 1, it was assumed that the accept signal into stage F goes low in Cycle 1. In Cycle 2, this low signal propagates back from stage F to stage D.
In Cycle 3, when data D3 is latched into stage D, the accept signal propagates upstream four stages, to stage C. When the accept signal into stage F goes high in Cycle 4, it must propagate upstream all the way to stage C. In other words, the change in the accept signal must propagate back four stages. In the embodiment illustrated in Fig. 1, however, the accept signal need not propagate all the way back to the start of the pipeline if some intermediate stage is able to accept new data.
In the embodiment illustrated in Fig. 1, each pipeline stage still needs separate input and output data latches so that data can be transferred between stages without being overwritten unintentionally. Moreover, although the pipeline illustrated in Fig. 1 can "compress" when downstream stages are blocked, that is, when they cannot pass on the data they contain, it cannot "expand" so as to provide stages without valid data between stages that do contain valid data. Rather, the ability to compress depends on there being cycles during which no valid data is presented to the first pipeline stage.
For example, if in Cycle 4 the accept signal into stage F had remained low and valid data had filled pipeline stages A and B, then as long as valid data continued to be presented at the input of stage A the pipeline could not compress any further, and valid input data could be lost. Nevertheless, the pipeline illustrated in Fig. 1 reduces the risk of data loss, because it can compress as long as there is a pipeline stage that does not contain valid data.
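The compression behaviour described for Fig. 1 can be sketched in a few lines of Python. This is a minimal behavioural model under stated assumptions, not circuitry from the specification: each stage holds a single data item (`None` marks a stage without valid data), the accept signal ripples upstream within one cycle, and the names `step`, `can_take` and `accept_into_last` are invented for illustration. The sketch does not try to reproduce the exact cycle timing of Fig. 1.

```python
def step(stages, new_data, accept_into_last):
    """Advance a Fig. 1 style pipeline by one cycle (illustrative model).

    stages[0] is the first stage, stages[-1] the last; None = no valid data.
    Returns (next_stages, shifted_out, input_taken).
    """
    n = len(stages)
    # A stage can take new data if it is empty or if its own data can move on.
    # The accept signal is not stored: it ripples upstream in the same cycle.
    can_take = [False] * n
    downstream_ok = accept_into_last
    for i in reversed(range(n)):
        can_take[i] = stages[i] is None or downstream_ok
        downstream_ok = can_take[i]

    nxt = list(stages)
    for i in range(n):
        if can_take[i]:
            # Load whatever the upstream neighbour (or the input) presents.
            nxt[i] = new_data if i == 0 else stages[i - 1]
    shifted_out = stages[-1] if accept_into_last else None
    return nxt, shifted_out, can_take[0]


# Stages A..F with D3, D2 and D1 in stages B, D and E; stage F is blocked.
state = [None, "D3", None, "D2", "D1", None]
state, out, taken = step(state, "D4", accept_into_last=False)
```

Repeating `step` with the accept signal held low packs the valid data against the blocked end of the pipeline, which is the "compression" described above. Once every stage holds valid data, `input_taken` becomes `False`, and a source that keeps presenting data would lose it.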
Fig. 2 illustrates another embodiment of the pipeline, which can both compress and expand in a logical manner and which includes circuitry that limits propagation of the accept signal to the nearest preceding stage. The circuitry for implementing this embodiment is explained and illustrated in greater detail below; Fig. 2 serves only to illustrate the principle by which it operates.
Purely for ease of comparison, the input data and accept signals into the pipeline embodiment shown in Fig. 2 are the same as those into the embodiment shown in Fig. 1. Accordingly, stages E, D and B contain valid data D1, D2 and D3, respectively. The accept signal into stage F is low, and data D4 is presented to the first pipeline stage A. In Fig. 2, three lines connect each adjacent pair of pipeline stages. The uppermost line, which may be a bus, is the data line. The middle line carries the valid signal, and the bottom line carries the accept signal. As before, the accept signal into stage F remains low except in Cycle 4. In addition, further data D5 is presented to the pipeline in Cycle 4.
In Fig. 2, each pipeline stage is drawn as a block divided into two halves, to show that each stage in this embodiment includes primary and secondary data storage elements. The right half of each stage in Fig. 2 represents the primary data storage element. It should be understood that this depiction is for illustration only and is not intended as a limitation.
As Fig. 2 shows, as long as the accept signal into a stage is high, data is transferred from the primary storage element of that stage to the secondary storage element of the next stage during any given cycle. Accordingly, although the accept signal into stage F is low, the accept signals into all other stages are high, so that in Cycle 2 data D1, D2 and D3 are shifted forward one step and data D4 is shifted into the first stage A.
Up to this point, the pipeline embodiment of Fig. 2 behaves like the embodiment of Fig. 1. However, although the accept signal into stage F is low, the accept signal from stage F into stage E is high. As explained below, because of the secondary storage elements, the low accept signal does not need to propagate upstream beyond stage F. Moreover, by leaving the accept signal into stage E high, stage F signals that it is ready to accept new data. Because stage F cannot pass the data D1 in its primary storage element downstream in Cycle 3 (the accept signal into stage F is low), stage E transfers data D2 into the secondary storage element of stage F. Since both the primary and the secondary storage elements of stage F now hold valid data that cannot be passed on, the accept signal from stage F into stage E is set low. Relative to Cycle 2, the low accept signal has therefore propagated back by only one stage, whereas in the embodiment of Fig. 1 it had to propagate all the way back to stage C.
Because stages A to E are able to pass on their data, the accept signals from these stages into their immediately preceding stages are set high. Data D3 and D4 are therefore shifted one stage to the right, so that in Cycle 4 they are loaded into the primary data storage elements of stage E and stage C, respectively. Although stage E now holds valid data D3 in its primary storage element, its secondary storage element can still be used to store other data without any risk of overwriting valid data.
Assume now, as before, that the accept signal into stage F goes high in Cycle 4. This indicates that the downstream device to which the pipeline passes data is ready to accept data from the pipeline. Stage F, however, has set its own accept signal low, indicating to stage E that stage F is not ready to accept new data. Note also that the accept signals in each cycle indicate what will "happen" in the next cycle, that is, whether data will be passed on (accept signal high) or must remain in place (accept signal low). Therefore, from Cycle 4 to Cycle 5, data D1 is passed from stage F to the following device and data D2 is shifted within stage F from the secondary to the primary storage element, but data D3 in stage E is not transferred to stage F. Data D4 and D5 can be transferred into the next pipeline stages as usual, because the stages that follow them have high accept signals.
Comparing the states of the pipeline in Cycle 4 and Cycle 5 shows that the secondary storage elements allow the pipeline embodiment of Fig. 2 to expand, that is, to free up storage elements into which valid data can advance. In Cycle 4, for example, data blocks D1, D2 and D3 form a "wall", because their data cannot be transferred until the accept signal into stage F goes high. Once that signal goes high, data D1 is shifted out of the pipeline and data D2 is shifted into the primary storage element of stage F; the secondary storage element of stage F then becomes free to accept new data if the downstream device cannot accept D2 and the pipeline has to "compress" again. This is shown in Cycle 6, in which data D3 has been shifted into the secondary storage element of stage F and data D4 has been passed from stage D to stage E as usual.
Figs. 3a(1), 3a(2), 3b(1) and 3b(2) (referred to collectively as Fig. 3) illustrate a preferred embodiment of the pipeline. This preferred embodiment implements the structure shown in Fig. 2 using a two-phase, non-overlapping clock with phases φ0 and φ1. Although a two-phase clock is preferred, it will be appreciated that the various embodiments of the invention can also be driven by a clock with more than two phases.
As shown in Fig. 3, each pipeline stage is drawn as two separate boxes representing the primary and secondary storage elements. Although the valid signal and the data lines connect the pipeline stages as before, only the accept signal is shown in Fig. 3 for ease of illustration. A change of state of certain accept signals during a clock phase is indicated in Fig. 3 by an upward-pointing arrow for a change from low to high and by a downward-pointing arrow for a change from high to low. A transfer of data from one storage element to another is indicated by a large open arrow. It is assumed that the valid signal output from the primary or secondary storage element of any given stage is high whenever that storage element contains valid data.
In Fig. 3, each cycle is shown as one full period of the non-overlapping clock phases φ0 and φ1. As explained in more detail below, data is transferred from the secondary storage element (the left box in each stage) to the primary storage element (the right box in each stage) during clock phase φ1, whereas data is transferred from the primary storage element of one stage to the secondary storage element of the next stage during clock phase φ0. Fig. 3 also shows that the primary and secondary storage elements within each stage are connected by an internal accept line, which passes an accept signal in the same way that the accept signal is passed from stage to stage. In this way, the secondary storage element knows when it can pass its data to the primary storage element.
Fig. 3 shows the φ1 phase of Cycle 1, in which data D1, D2 and D3, previously shifted into the secondary storage elements of stages E, D and B respectively, are shifted into the primary storage elements of those stages. During the φ1 phase of Cycle 1 the pipeline therefore has the same configuration as in Cycle 1 of Fig. 2. As before, the accept signal into stage F is assumed to be low. As Fig. 3 shows, however, this means that the accept signal into the primary storage element of stage F is low; because that storage element does not contain valid data, it sets the accept signal into its secondary storage element high.
Because the secondary storage element of stage F does not contain valid data, the accept signal from the secondary storage element of stage F into the primary storage element of stage E is also set high. As before, because the primary storage element of stage F is able to accept data, the data in all upstream primary and secondary storage elements can be shifted downstream without any valid data being overwritten. The shift of data from one stage to the next takes place during the next φ0 phase, in Cycle 2. For example, the valid data D1 held in the primary storage element of stage E is shifted into the secondary storage element of stage F, data D4 is shifted into the pipeline, that is, into the secondary storage element of stage A, and so on.
The primary storage element of stage F still does not contain valid data during the φ0 phase of Cycle 2, so the accept signal from the primary storage element into the secondary storage element of stage F remains high. During the φ1 phase of Cycle 2, data can therefore be shifted one more step to the right, that is, from the secondary to the primary storage element within each stage.
However, once valid data has been loaded into the primary storage element of stage F, and if the accept signal into stage F from the downstream device is still low, data cannot be shifted out of the secondary storage element of stage F without overwriting and destroying the valid data D1. The accept signal from the primary storage element of stage F into its secondary storage element therefore goes low. Data D2 can nevertheless still be shifted into the secondary storage element of stage F, because that element did not contain valid data and its output accept signal was high.
During the φ1 phase of Cycle 3, data D2 cannot be shifted into the primary storage element of stage F, although data can be shifted within all the preceding stages. Once valid data has been loaded into the secondary storage element of stage F, stage F cannot pass that data on. It signals this by setting its output accept signal low.
Assuming that the accept signal into stage F remains low, data upstream of stage F can continue to be shifted between stages and within stages on the respective clock phases until the next valid data block D3 reaches the primary storage element of stage E. As illustrated, this condition is reached during the φ1 phase of Cycle 4.
During the φ0 phase of Cycle 5, data D3 has been loaded into the primary storage element of stage E. Because this data cannot be shifted any further, the accept signal output by the primary storage element of stage E is set low. Data further upstream can be shifted as usual.
Assume now, as in Cycle 5 of Fig. 2, that the device connected downstream of the pipeline is able to accept pipeline data. It signals this by setting the accept signal into pipeline stage F high during the φ1 phase of Cycle 4. The primary storage element of stage F can now shift data to the right and can also accept new data. Data D1 is then shifted out during the φ0 phase of Cycle 5, so that the primary storage element of stage F no longer holds data that must be preserved. During the φ1 phase of Cycle 5, data D2 is shifted within stage F from the secondary storage element to the primary storage element. The secondary storage element of stage F can then also accept new data, and it signals this by setting the accept signal into the primary storage element of stage E high. During a transfer of data within a stage, that is, from its secondary to its primary storage element, both storage elements contain the same data; but because the data is also held in the primary storage element, the copy in the secondary storage element can be overwritten with no loss of data. The same holds for a transfer of data from the primary storage element of one stage into the secondary storage element of the next stage.
Assume now that the accept signal into the primary storage element of stage F goes low during the φ1 phase of Cycle 5. This means that stage F cannot transfer data D2 out of the pipeline. Stage F therefore sets the accept signal from its primary storage element to its secondary storage element low, to prevent the valid data D2 from being overwritten. The copy of D2 held in the secondary storage element of stage F, however, can be overwritten without loss, and data D3 is therefore transferred into the secondary storage element of stage F during the φ0 phase of Cycle 6. Data D4 and D5 can be shifted downstream as usual. Once valid data D3 is stored in stage F together with data D2, and as long as the accept signal into the primary storage element of stage F is low, neither storage element of stage F can accept new data, and the stage signals this by setting the accept signal into stage E low.
When the accept signal into the pipeline from the downstream device changes from low to high or vice versa, the change does not have to propagate upstream within the pipeline any further than the immediately preceding storage element (within the same stage or in the preceding stage). Instead, the change propagates upstream within the pipeline by one storage element per clock phase.
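The phase-by-phase behaviour walked through above for Fig. 3 can be approximated with a short Python model. It is only a sketch under stated assumptions: every storage element is given a stored accept flag that is updated on the phase opposite to the one on which the element loads, stage processing is omitted, and the class and attribute names (`TwoPhasePipeline`, `data`, `acc`) are invented for illustration rather than taken from the specification.

```python
class TwoPhasePipeline:
    """Chain of storage elements, two per stage (illustrative model).

    Even index = secondary (input) element, odd index = primary (output)
    element. data[i] is None when element i holds no valid data; acc[i] is
    the stored accept flag that element i presents to its upstream neighbour.
    """

    def __init__(self, n_stages):
        self.data = [None] * (2 * n_stages)
        self.acc = [True] * (2 * n_stages)

    def phase0(self, in_data, ext_accept):
        """Inter-stage transfers; returns (data_out, input_taken)."""
        d, a, n = self.data, self.acc, len(self.data)
        out = d[-1] if ext_accept else None   # downstream device takes data
        taken = a[0]
        new_d, new_a = list(d), list(a)
        for e in range(0, n, 2):              # secondary elements may load
            if a[e]:
                new_d[e] = in_data if e == 0 else d[e - 1]
        for o in range(1, n, 2):              # primary elements update accept
            down = ext_accept if o == n - 1 else a[o + 1]
            new_a[o] = d[o] is None or down
        self.data, self.acc = new_d, new_a
        return out, taken

    def phase1(self):
        """Transfers within each stage (secondary to primary)."""
        d, a, n = self.data, self.acc, len(self.data)
        new_d, new_a = list(d), list(a)
        for o in range(1, n, 2):              # primary elements may load
            if a[o]:
                new_d[o] = d[o - 1]
        for e in range(0, n, 2):              # secondary elements update accept
            new_a[e] = d[e] is None or a[e + 1]
        self.data, self.acc = new_d, new_a
```

In this model a two-stage pipeline with the downstream accept held low takes four items before refusing input, and a change in the downstream accept signal reaches one further storage element on each clock phase, which matches the propagation described above. An element whose accept flag is true may still show an old value in `data`; that copy has already been passed on and is simply overwritten on the element's next load phase.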
As this example shows, the concept of a "stage" in the pipeline structure illustrated in Fig. 3 is to some extent a matter of perception. Because data is transferred within a stage (from the secondary to the primary storage element) in the same way as it is transferred between stages (from the primary storage element of the upstream stage into the secondary storage element of the adjacent downstream stage), one could equally well regard a stage as consisting of a "primary" storage element followed by a "secondary" storage element, rather than as drawn in Fig. 3. The distinction between "primary" and "secondary" storage elements is therefore mainly a matter of labelling. In Fig. 3, the "primary" storage elements can also be called "output" storage elements, because they are the elements from which data passes out of a stage into the next stage or device, and the "secondary" storage elements can be called the "input" storage elements of the same stage.
In explaining the embodiments shown in Figs. 1 to 3, only the transfer of data under the control of the accept and valid signals has been described. It should further be understood that each pipeline stage may also process the data it receives in an arbitrary manner before passing it between its internal storage elements or before passing it to the next pipeline stage. Referring again to Fig. 3, a pipeline stage can therefore be defined as the portion of the pipeline that contains input and output storage elements and that optionally processes the data stored in those elements.
Furthermore, the "device" downstream of pipeline stage F need not be some other type of hardware structure; it can be another section of the same pipeline or part of another pipeline. As described below, a pipeline stage can set its accept (ACCEPT) signal low not only when all the downstream storage elements hold valid data, but also when the stage needs more than one clock phase to finish processing its data. The same can apply when a stage creates valid data in one or both of its storage elements. In other words, a stage need not simply pass on the accept signal according to whether the immediately downstream storage element contains valid data that cannot be passed on. Rather, the accept signal itself may be altered within the stage, or even by circuitry outside the stage, in order to control the transfer of data between adjacent storage elements. The valid (VALID) signal can be processed in a similar manner.
A great advantage of the two-wire interface (one wire each for the valid and accept signals) is that it can control the pipeline without control signals having to propagate back up the pipeline all the way to its first stage. Referring again to Fig. 1, in Cycle 3 for example, stage F "tells" stage E that it cannot accept data, stage E tells stage D, and stage D in turn tells stage C. Indeed, if more stages had contained valid data, the signal would have propagated back even further along the pipeline. In the embodiment shown in Fig. 3, in Cycle 3, the low accept signal propagates back no further than stage E, and then only to its primary storage element.
This embodiment achieves this flexibility without greatly increasing the silicon area needed to implement the design. Typically, each latch used for data storage in the pipeline needs only a single extra transistor (which lays out very efficiently in silicon). In addition, two extra latches and a small number of gates are preferably added to each half-stage to process the accept and valid signals associated with its data latches.
Fig. 4 illustrates a hardware structure that implements a stage as shown in Fig. 3.
By way of example only, it is assumed that 8-bit data is transferred in parallel through the pipeline (with or without further manipulation in optional combinational logic circuits). It will be appreciated, however, that data wider or narrower than 8 bits can also be used in practicing the invention. Furthermore, the two-wire interface according to this embodiment is suitable for any data bus width, and the data bus width may even change from one stage to the next if a particular application requires it. The interface according to this embodiment can also be used to process analog signals.
As mentioned previously, although other conventional timing arrangements may be used, the interface is preferably controlled by a two-phase, non-overlapping clock. In Figs. 4 to 9, these clock phase signals are designated PH0 and PH1. In Fig. 4, one line is shown for each clock phase signal.
Input data enters a pipeline stage over a multi-bit data bus IN_DATA and is transferred to the following pipeline stage, or to subsequent receiving circuitry, over an output data bus OUT_DATA. The input data is first loaded, in the manner described below, into a set of input latches (one for each input data signal) referred to collectively as LDIN, which constitute the secondary storage element described above.
In the illustrated example of this embodiment, it is assumed that the Q outputs of all latches follow their D inputs, that is, the latches are "loaded", when the clock input is high, that is, at the logic "1" level. Otherwise the Q outputs hold their last values. In other words, the Q outputs are "latched" on the falling edges of their respective clock signals. Each latch is clocked either by one of the two non-overlapping clock signals PH0 and PH1 (shown in Fig. 5) or by the logical AND of one of these clock signals and a logic signal. The invention works equally well with latches that latch on the rising edges of the clock signals, or with any other known latching arrangement, as long as conventional methods are used to ensure correct timing of the latching operations.
The output data from the input data latch LDIN passes through an arbitrary and optional combinational logic circuit B1, which may be provided to convert the output data of the input latch LDIN into intermediate data. The intermediate data is subsequently loaded into an output data latch LDOUT, which constitutes the primary storage element described above. The output of the output data latch LDOUT may likewise pass through an arbitrary and optional combinational logic circuit B2 before being passed on as OUT_DATA to the next device downstream. That device may be another pipeline stage or any other device connected to the pipeline.
In the practice of the invention, each stage of the pipeline also includes a valid input latch LVIN, a valid output latch LVOUT, an accept input latch LAIN and an accept output latch LAOUT. Each of these four latches is preferably a simple single-stage latch. The outputs of latches LVIN, LVOUT, LAIN and LAOUT are QVIN, QVOUT, QAIN and QAOUT, respectively. The output signal QVIN of the valid input latch is connected as an input to the valid output latch LVOUT either directly or through intermediate logic devices or circuits that may alter the signal.
Similarly, the output valid signal QVOUT of a given stage may be connected to the input of the valid input latch LVIN of the following stage either directly or through intermediate devices or logic circuits that may alter the valid signal. The output QVIN is also connected to a logic gate (described below) whose output is connected to the input of the accept input latch LAIN. The output QAOUT of the accept output latch LAOUT is connected to the same logic gate (described below), optionally through another logic gate.
As shown in Fig. 4, the output valid signal QVOUT forms an OUT_VALID signal, which can be received by the following stage as its IN_VALID signal or can simply indicate valid data to subsequent circuitry connected to the pipeline. The readiness of the following circuit or stage to accept data is indicated to each stage by the signal OUT_ACCEPT, which is connected as the input to the accept output latch LAOUT, preferably through the logic circuitry described below. Likewise, the output QAOUT of the accept output latch LAOUT is connected as the input to the accept input latch LAIN, preferably through the logic circuitry described below.
In practicing the invention, the output signals QVIN and QVOUT of the valid latches LVIN and LVOUT are combined with the accept signals QAOUT and OUT_ACCEPT, respectively, to form the inputs to the accept latches LAIN and LAOUT, respectively. In the embodiment illustrated in Fig. 4, these inputs are formed as the logical NAND of the respective valid signal QVIN or QVOUT with the logical inverse of the respective accept signal QAOUT or OUT_ACCEPT. Conventional logic gates NAND1 and NAND2 perform the NAND operation, and inverters INV1 and INV2 form the logical inverses of the respective accept signals.
As is well known in digital design, the output of a NAND gate is a logical "1" when any or all of its input signals are in the logical "0" state. The output of a NAND gate is therefore a logical "0" only when all of its inputs are in the logical "1" state. It is likewise well known that the output of a digital inverter such as INV1 is a logical "1" when its input signal is "0" and a logical "0" when its input signal is "1".
The inputs to the NAND gate NAND1 are therefore QVIN and NOT(QAOUT), where "NOT" denotes binary inversion. Using known techniques, the input to the accept latch LAIN can be resolved as follows:
NAND(QVIN, NOT(QAOUT)) = NOT(QVIN) OR QAOUT
In other words, the output of the combination of inverter INV1 and NAND gate NAND1 is a logical "1" when the signal QVIN is "0", when the signal QAOUT is "1", or both. The gate NAND1 and the inverter INV1 can therefore be implemented as a single OR gate, with one input tied directly to the QAOUT output of the accept latch LAOUT and the other input tied to the inverse of the output signal QVIN of the valid input latch LVIN.
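The equivalence between the NAND-plus-inverter form and the single OR gate can be checked exhaustively. The short Python sketch below enumerates all four input combinations; the function names are chosen for illustration only and do not appear in the specification.

```python
from itertools import product


def nand(a, b):
    """Two-input NAND gate."""
    return not (a and b)


def lain_input_nand_form(qvin, qaout):
    """NAND1 fed by QVIN and by INV1(QAOUT), as described for Fig. 4."""
    return nand(qvin, not qaout)


def lain_input_or_form(qvin, qaout):
    """Equivalent single OR gate: NOT(QVIN) OR QAOUT."""
    return (not qvin) or qaout


# Truth table rows: (QVIN, QAOUT, input presented to accept latch LAIN)
truth_table = [
    (qvin, qaout, lain_input_nand_form(qvin, qaout))
    for qvin, qaout in product([False, True], repeat=2)
]
```

The only combination that yields "0" is QVIN = 1 with QAOUT = 0, that is, the input side holds valid data and the output side cannot take it.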
As is well known in digital design, many latches suitable for use as the valid and accept latches have two outputs, Q and NOT(Q), that is, Q and its logical inverse. If such latches are chosen, one input of the OR gate can be tied directly to the NOT(Q) output of the valid latch LVIN. The gate NAND1 and the inverter INV1 can be implemented using well-known conventional techniques. Depending on the latch architecture used, however, it may be more efficient to use latches without an inverting output and to provide the gate NAND1 and the inverter INV1 instead, both of which can also be implemented efficiently in silicon. Accordingly, any known arrangement may be used to generate the Q signal and/or its logical inverse.
The data and valid latches LDIN, LDOUT, LVIN and LVOUT load their respective data inputs when both the clock signal (PH0 on the input side, PH1 on the output side) and the output of the accept latch on the same side are logical "1". Thus the clock signal (PH0 for the input latches LDIN and LVIN) and the output of the respective accept latch (in this case LAIN) are combined in a logical AND manner, and data is loaded only when both are logical "1".
In particular applications, such as CMOS implementations of the latches, the logical AND operation that controls loading of the latches (through the illustrated CK or enable "input") can easily be implemented in a conventional manner by connecting the respective enable signals (for example, PH0 and QAIN for latches LVIN and LDIN) to the gates of MOS transistors connected in series in the input lines of the latches. It is therefore not necessary to provide an actual logical AND gate, which could cause timing problems in high-speed applications because of its propagation delay. The AND gates shown in the figures thus indicate only the logical function performed in generating the enable signals of the various latches.
Thus the data latch LDIN loads input data only when PH0 and QAIN are both logical "1". It latches that data when either of these two signals goes to "0".
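The gated loading rule for LDIN can be illustrated with a minimal Python model of a level-sensitive latch. This is a behavioural sketch only: the class name `GatedLatch` and its `update` method are hypothetical, and the model ignores all timing and propagation delay.

```python
class GatedLatch:
    """Level-sensitive latch that is transparent only while clock AND enable."""

    def __init__(self, q=0):
        self.q = q  # stored output value

    def update(self, d, clock, enable):
        """Follow d while clock and enable are both 1; otherwise hold q."""
        if clock and enable:
            self.q = d
        return self.q


# LDIN modelled as one latch: clock = PH0, enable = QAIN.
ldin = GatedLatch()
ldin.update(d=0xA5, clock=1, enable=1)   # PH0 = 1, QAIN = 1: loads 0xA5
ldin.update(d=0x3C, clock=1, enable=0)   # QAIN = 0: holds 0xA5
ldin.update(d=0x3C, clock=0, enable=1)   # PH0 = 0: holds 0xA5
```

The same model applies to LVIN, and to LDOUT and LVOUT with PH1 and QAOUT in place of PH0 and QAIN.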
Although only one of the clock phase signals PH0 and PH1 clocks the data and valid latches on the input (or output) side of a pipeline stage, the other clock phase signal directly clocks the accept latch on the same side. In other words, the accept latch on either side (input or output) of a stage is preferably clocked out of phase with the data and valid latches on that side. For example, PH1 clocks the accept input latch LAIN, whereas PH0 is used to generate the clock signal CK for the data latch LDIN and the valid latch LVIN.
As an example of the operation of a pipeline extended with the two-wire valid and accept circuitry, assume first that no valid data is presented at the input of the circuit, either from a preceding pipeline stage or from a transmitting device. In other words, assume that the valid input signal IN_VALID to the illustrated stage has not gone to "1" since the system was most recently reset. Assume also that several clock cycles have elapsed since the last reset and that the circuit has accordingly reached a steady state. During the next positive period of clock PH0, the valid signal QVIN of the valid latch LVIN is therefore loaded as "0". During the next positive period of clock signal PH1, the input to the accept input latch LAIN (through the gate NAND1 or an equivalent gate) is loaded as "1". In other words, because the data in the data input latch LDIN is not valid, the stage signals that it is ready to accept input data (since it holds no data worth saving).
Note that in this example the signal IN_ACCEPT is used to enable the data and valid latches LDIN and LVIN. Because IN_ACCEPT is "1" at this time, these latches effectively operate as conventional transparent latches, so that whatever data is on the IN_DATA bus is simply loaded into the data latch LDIN as soon as the clock signal PH0 goes to "1". Of course, this invalid data will also be loaded into the next data latch LDOUT of the pipeline stage as long as the output QAOUT of its accept latch is "1".
Hence, as long as a data latch does not contain valid data, it accepts or "loads" any data presented to it during the next positive period of its clock signal. On the other hand, such invalid data is not loaded into any stage whose accept signal from the corresponding accept latch is low (that is, "0"). Furthermore, the output signal of a valid latch (which forms the valid input signal to the next valid latch) remains "0" as long as the corresponding IN_VALID (or QVIN) signal into that valid latch is low.
When the input data to a data latch is valid, the valid signal IN_VALID indicates this by rising to "1". The output of the corresponding valid latch then rises to "1" on the next rising edge of its clock phase signal. For example, the valid signal QVIN of latch LVIN rises to "1" when its IN_VALID signal is high (that is, has risen to "1") at the next rising edge of the clock phase signal PH0.
Now assume instead that the data input latch LDIN contains valid data. If the data output latch LDOUT is ready to accept new data, its acceptance signal QAOUT will be "1". In this case, during the next positive period of the clock signal PH1, the data latch LDOUT and the validation latch LVOUT are enabled, and the data latch LDOUT loads the data present at its input. Because the clock signals do not overlap, this happens before the next rising edge of the other clock signal PH0. Consequently, at the next rising edge of PH0, the preceding data latch (LDIN) does not latch new input data from the preceding stage until the data output latch LDOUT has safely latched the data transferred from LDIN.
The same sequence is followed by every adjacent pair of data latches (within one stage or between adjacent stages) that is able to accept data, because the latches operate on alternate phases of the clock. Any data latch that is not ready to accept new data, because it holds valid data that cannot yet be passed on, has a low output acceptance signal (the QA output of its acceptance latch LA), and its data latch LDIN or LDOUT is not loaded. Hence, as long as the acceptance signal (the output of the acceptance latch) of a given stage, or of one side (input or output) of a stage, is low, the corresponding data latch is not loaded.
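By way of illustration only, the valid/accept behavior described above can be modelled in software. The following Python sketch is a simplification and is not the circuit of Fig. 4: each stage is reduced to a single register holding a (data, valid) pair, and the acceptance signal is propagated backwards combinationally within one cycle rather than being latched at each stage. The function name `step` and its arguments are assumptions made for this example.

```python
def step(stages, in_word, in_valid, out_accept):
    """Advance a chain of single-register stages by one clock cycle.

    stages: list of (data, valid) tuples, index 0 nearest the input.
    Returns (new_stages, in_accept, out_word, out_valid).
    """
    n = len(stages)
    # A stage accepts new data when it holds nothing valid, or when the
    # stage after it is accepting the data it currently holds.
    accept = [False] * (n + 1)
    accept[n] = out_accept
    for i in range(n - 1, -1, -1):
        _, valid = stages[i]
        accept[i] = (not valid) or accept[i + 1]

    out_word, out_valid = stages[-1]
    new_stages = list(stages)
    for i in range(n):
        if accept[i]:
            # Load from the input (stage 0) or from the preceding stage.
            new_stages[i] = (in_word, in_valid) if i == 0 else stages[i - 1]
        # Otherwise the stage is stalled and keeps its contents.
    return new_stages, accept[0], out_word, out_valid
```

When the downstream acceptance input is held low, the stages fill with valid data one after another and the returned `in_accept` then goes low, which mirrors the stalling behavior described in the text. Invalid entries are overwritten freely, as they are in the latches.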
Fig. 4 also illustrates a reset feature included in a preferred embodiment. In the illustrated example, a reset signal NOTRESET0 is connected to an inverting reset input R of the validation output latch LVOUT (the inversion is indicated, as is conventional, by a small circle). As is well known, this means that the validation latch LVOUT is forced to output "0" whenever the reset signal NOTRESET0 becomes "0". One advantage of resetting the latch when the reset signal goes low (becomes "0") is that a break in transmission resets the latches. They are then in their "null" or reset state whenever a valid transmission begins and the reset signal goes high. The reset signal NOTRESET0 therefore acts as a digital "ON/OFF" switch: it must be high for the pipeline to be active.
Note that it is not necessary to reset every latch in the pipeline that holds validation data. As depicted in Fig. 4, the validation input latch LVIN is not reset directly by the reset signal NOTRESET0; it is reset indirectly. Assume that the reset signal NOTRESET0 drops to "0". The validation output signal QVOUT then also drops to "0", regardless of its previous state, and the input to the acceptance output latch LAOUT (through the gate NAND1) goes high. The acceptance output signal QAOUT also rises to "1". This QAOUT value of "1" is then passed as a "1" to the input of the acceptance input latch LAIN, regardless of the state of the validation input signal QVIN. At the next rising edge of the clock signal PH1, the acceptance input signal QAIN rises to "1". Provided the validation signal IN_VALID has been correctly reset to "0", the output of the validation latch LVIN becomes "0" at the next rising edge of the clock signal PH0, just as if it had been reset directly.
As this example shows, it is only necessary to reset the validation latch on one side of each stage (including the final stage) in order to reset all of the validation latches. In fact, in many applications it is not necessary to reset even every other validation latch: if the reset signal NOTRESET0 can be guaranteed to stay low for more than one complete cycle of both clock phases PH0 and PH1, an "automatic reset" (a backward propagation of the reset) occurs for the validation latches in the preceding pipeline stages. Indeed, if the reset signal is held low for at least as many full cycles of both clock phases as there are pipeline stages, only the validation output latch in the final pipeline stage needs to be reset directly.
Figs. 5a and 5b (referred to collectively as Fig. 5) show a timing diagram illustrating the relationship between the non-overlapping clock signals PH0 and PH1, the effect of the reset signal, and the holding and transfer of data for the different combinations of validation and acceptance signals on, and between, the two sides of a pipeline stage configured as in the embodiment of Fig. 4. In the example shown in the timing diagram of Fig. 5, it is assumed that the outputs of the data latches LDIN and LDOUT are passed on without further processing by intervening logic blocks B1, B2. This is by way of example and not of limitation. It is to be understood that any combinatorial logic structure may be included between the data latches of consecutive pipeline stages, or between the input and output sides of a single pipeline stage. The actual input data values shown (for example, the hexadecimal data words "aa" or "04") are likewise merely illustrative. As mentioned above, the input data bus may have any width (and may even be analog), provided that the data latches or other storage devices can accept and latch, or store, each bit or value of the input word.
Preferred data structure-" token "
In the example application shown in Fig. 4, every stage processes all of the input data, because no stage includes control circuitry that prevents the input data from passing through its combinatorial logic block B1, B2, and so on. To provide greater flexibility, the present invention includes a data structure in which "tokens" are used to distribute data and control information throughout the system. Each token consists of a series of binary bits divided into one or more token words. The bits are of three types: address bits (A), data bits (D), and an extension bit (E). By way of example and not of limitation, assume that data is transferred as words over an 8-bit bus with a 1-bit extension bit line. An example of a four-word token, in order of transmission, is:
First word     E A A A D D D D D
Second word    E D D D D D D D D
Third word     E D D D D D D D D
Fourth word    E D D D D D D D D
Note that the extension bit E is (preferably) used as an addition to each data word. Furthermore, the address field may be of variable length and is preferably transmitted immediately after the extension bit of the first word.
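The word layout above can be sketched in Python as follows. This is an illustrative model only: the function name `build_token`, the representation of each word as an (extension bit, 8-bit value) pair, and the example address "101" are assumptions, not values taken from the text. The extension bit is set to 1 on every word except the last, following the preferred scheme described in the text.

```python
def build_token(address, first_data, extra_words=()):
    """Return one token as a list of (extension_bit, 8-bit value) pairs.

    address:     bit string such as "101" (hypothetical), placed in the
                 most significant bits of the first word
    first_data:  integer filling the remaining low bits of the first word
    extra_words: further 8-bit data words
    """
    if not 1 <= len(address) <= 8:
        raise ValueError("this sketch keeps the address within the first word")
    free_bits = 8 - len(address)
    if not 0 <= first_data < (1 << free_bits):
        raise ValueError("first_data does not fit beside the address")
    for w in extra_words:
        if not 0 <= w <= 0xFF:
            raise ValueError("data words must be 8 bits wide")

    first = (int(address, 2) << free_bits) | first_data
    words = [first] + list(extra_words)
    # Extension bit is 1 while more words follow and 0 on the last word.
    return [(1 if i < len(words) - 1 else 0, w) for i, w in enumerate(words)]
```

For instance, a three-bit address followed by three further data words yields a four-word token with the E A A A D D D D D layout of the first word shown above.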
In the present invention, a token therefore consists of one or more words of (binary) digital data. The words are transferred in sequence and preferably in parallel, although this method of transfer is not essential: serial data transfer using known techniques is also possible. For example, in a video parser, control information is transmitted in parallel while data is transmitted serially.
As the example illustrates, each token preferably has, at its start, an address field (the initial string of A bits) that identifies the type of data contained in the token. In most applications, a single word or part of a word is enough to carry the whole address field. This is not required by the invention, however, provided that the corresponding pipeline stages include logic circuitry able to store a representation of a partial address field for long enough that the stages can receive and decode the entire address field.
Note that no dedicated wires or registers are needed to transfer the address field; it is transmitted on the data bits. As explained below, a pipeline stage is not slowed down if it is not intended to be activated by a particular address field; that is, the stage can pass the token on without delay.
The remainder of the data in the token, following the address field, is not constrained by the use of tokens. These D bits may take any value, and the meaning attached to them is of no importance here. That is, the meaning of the data can vary, for example according to where in the system the data is located at a particular time. The number of data bits D appended after the address field can be as long or as short as required, and the number of data words can differ greatly between tokens. The address field and the extension bit are used to convey control signals to the pipeline stages. Because the number of words in the data field (the string of D bits) can be arbitrary, the information carried in the data field can vary accordingly. The following explanation therefore concerns the use of the address bits and the extension bit.
In the present invention, tokens are particularly useful when a number of circuit blocks are connected together in a relatively simple configuration. The simplest configuration is a pipeline of processing steps, such as the one shown in Fig. 1. The use of tokens is not, however, restricted to pipeline structures.
Assume once again that each box represents a complete pipeline stage. In the pipeline of Fig. 1, data flows from left to right in the diagram. Data enters the machine and passes into processing stage A. This stage may or may not modify the data, and then passes it on to stage B. The modification, if any, may be arbitrarily complicated, and in general the number of data items flowing into a stage differs from the number flowing out. Stage B modifies the data again and passes it to stage C, and so on. In a scheme such as this, data cannot flow in the opposite direction, so that, for example, stage C cannot pass data to stage A. This restriction is often perfectly acceptable.
On the other hand, it is highly desirable for stage A to be able to communicate with stage C even though there is no direct connection between the two. Stage A and stage C can communicate only through stage B. One advantage of tokens is that they make this kind of communication possible, because any processing stage that does not recognize a token simply passes it on unchanged to the next stage.
In this example, an extension bit is transmitted in each token together with the address and data fields, so that a processing stage can pass on a token (which may be of arbitrary length) without having to decode its address at all. In this example, any token word whose extension bit is HIGH (a "1") is followed by a further word that is part of the same token. That word also has an extension bit, which indicates whether the token contains yet another word. When a stage encounters a token word whose extension bit is LOW (a "0"), it knows that this is the last word of the token. The next word is then taken to be the first word of a new token.
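The framing rule just described can be illustrated with a short Python sketch. It assumes, for this example only, that each word is represented as an (extension bit, value) pair; the function name `split_tokens` is likewise hypothetical. No address is decoded: the token boundaries are found from the extension bits alone.

```python
def split_tokens(words):
    """Group a stream of (extension_bit, value) words into tokens.

    A word with extension bit 1 is followed by another word of the same
    token; a word with extension bit 0 is the last word of its token.
    A trailing, unterminated token is not returned.
    """
    tokens = []
    current = []
    for ext, value in words:
        current.append(value)
        if ext == 0:
            tokens.append(current)
            current = []
    return tokens
```

A stage that does not recognize a token needs nothing more than this rule to find where the token ends and the next one begins.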
Note that although tokens are particularly useful in a simple pipeline of processing stages, it should be understood that they can also be applied to more complicated arrangements of processing elements. An example of a more complicated processing element is described below.
According to the present invention, it is not essential that the state of the extension bit, namely an extension bit set to "0", be used to signal the last word of a given token. One alternative to this preferred scheme is to move the extension bit so that it marks the first word of a token rather than the last. This can be achieved by making corresponding changes to the decoding hardware.
The advantage of using the extension bit to signal the last word of a token rather than the first is that it is often useful to modify the behavior of a circuit block according to whether or not a token has extension words. One example is a token that activates a stage which processes video quantization values. These values are stored in a quantization table (typically a memory device), for example a table containing 64 arbitrary 8-bit binary integers.
To load a new quantization table into the quantizer stage of the pipeline, a "QUANT_TABLE" token is sent to the quantizer. In this case, assume that the token consists of 65 token words. The first word contains the "QUANT_TABLE" code, meaning "build a quantization table". It is followed by 64 words, which are the integers of the quantization table.
When video data is being encoded, such a quantization table must occasionally be transmitted. To achieve this, a QUANT_TABLE token with no extension words can be sent to the quantizer stage. On seeing this token and noting that the extension bit of its first word is LOW, the quantizer stage can read out its quantization table and construct a QUANT_TABLE token containing the 64 quantization table values. The extension bit of the first word (which was LOW) is changed to HIGH, and the token continues with HIGH extension bits until its new end, which is indicated by a LOW extension bit on the 64th quantization table value. The token then passes through the system in the usual way and is encoded into the bit stream.
Continuing this example, the quantizer either loads a new quantization table into its memory device or reads its table out of the memory device, depending on whether the extension bit of the first word of the QUANT_TABLE token is set. The choice of whether the extension bit signals the first or the last word of a token therefore depends on the system in which the pipeline is to be used. Both alternatives are possible according to the present invention.
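The two uses of the QUANT_TABLE token can be sketched in Python as below. This is a behavioral illustration under stated assumptions and not the hardware described: the header value 0xF0 is hypothetical (the text does not give the code), tokens are modelled as lists of (extension bit, value) pairs, and the text does not say whether a load token is forwarded downstream, so this sketch simply forwards it.

```python
QUANT_TABLE = 0xF0  # hypothetical header value; the text does not give the code
TABLE_SIZE = 64


class Quantizer:
    def __init__(self):
        self.table = [0] * TABLE_SIZE

    def handle(self, token):
        """Process one token, given as a list of (extension_bit, value) pairs."""
        ext, head = token[0]
        if head != QUANT_TABLE:
            return token  # not recognized: pass on unchanged
        if ext == 1:
            # Extension words follow: load them as the new table.
            values = [v for _, v in token[1:]]
            if len(values) != TABLE_SIZE:
                raise ValueError("expected 64 table entries")
            self.table = values
            return token  # assumption: the load token is forwarded as is
        # No extension words: read the table out into a full-length token.
        # The header's extension bit becomes 1; only the last entry carries 0.
        body = [(1, v) for v in self.table[:-1]] + [(0, self.table[-1])]
        return [(1, QUANT_TABLE)] + body
```

The single-word form of the token thus acts as a request, and the 65-word form acts as a load, with the extension bit of the first word selecting between them.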
Another alternative to the preferred extension bit scheme is to include a length count at the start of the token. Such an arrangement can be efficient when, for example, tokens are very long. Suppose that in a given application a typical token is 1000 words long. With the extension bit scheme described above (one bit attached to each token word), the token needs 1000 additional bits to hold all of the extension bits. Encoding the token length in binary form, however, needs only 10 bits.
Although long tokens therefore have their uses, experience shows that short tokens also have many uses. Here the preferred extension bit scheme is advantageous. If a token is only one word long, only one bit is needed to signal this. A counting scheme, by contrast, would typically still require the same 10 bits as before.
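The comparison in the two preceding paragraphs is simple arithmetic, shown here as a small Python sketch. The function name `overhead_bits` and the default 10-bit count field are taken from the example in the text for illustration only.

```python
def overhead_bits(n_words, count_bits=10):
    """Return (extension_scheme_bits, count_scheme_bits) for one token.

    The extension bit scheme spends one bit per token word. The length
    count scheme spends a fixed field at the start of the token, which
    also caps the token length at 2**count_bits - 1 words.
    """
    if n_words < 1:
        raise ValueError("a token has at least one word")
    if n_words > (1 << count_bits) - 1:
        raise ValueError("token too long for the count field")
    return n_words, count_bits
```

With a 10-bit count, the extension bit scheme costs less for any token shorter than 10 words and more for any token longer than 10 words.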
The disadvantages of a length count scheme include the following: 1) it is inefficient for short tokens; 2) it places a maximum length restriction on tokens (with only 10 bits, no more than 1023 words can be counted); 3) the length of a token must be known before the count is generated (presumably at the start of the token); 4) every circuit block that handles tokens may need hardware to count words; and 5) if the count is corrupted (by a data transmission error), it is not clear whether recovery is possible.
The advantages of the extension bit scheme according to the present invention include: 1) pipeline stages need not include circuitry that decodes every token, because unrecognized tokens can be passed on correctly by considering only the extension bit; 2) the coding of the extension bit is the same for all tokens; 3) there is no limit on token length; 4) the scheme is efficient for short tokens (in terms of the overhead needed to represent token length); and 5) error recovery occurs naturally. If an extension bit is corrupted, either one random token is generated (when the bit is corrupted from "1" to "0") or one token is lost (when the bit is corrupted from "0" to "1"). Moreover, the problem is confined to the tokens concerned. After those tokens, correct operation resumes automatically.
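The error recovery property in item 5) can be demonstrated with a minimal Python sketch. The word representation as (extension bit, value) pairs and the helper names `frame` and `flip_extension` are assumptions made for this illustration.

```python
def frame(words):
    """Group (extension_bit, value) words into tokens, ending on ext == 0."""
    tokens, current = [], []
    for ext, value in words:
        current.append(value)
        if ext == 0:
            tokens.append(current)
            current = []
    return tokens


def flip_extension(words, index):
    """Return a copy of the stream with one extension bit inverted."""
    damaged = list(words)
    ext, value = damaged[index]
    damaged[index] = (1 - ext, value)
    return damaged


stream = [(1, 1), (1, 2), (0, 3), (1, 4), (0, 5), (0, 6)]
clean = frame(stream)                           # three tokens
split = frame(flip_extension(stream, 1))        # "1" -> "0": an extra token appears
merged = frame(flip_extension(stream, 2))       # "0" -> "1": two tokens run together
```

In both damaged cases the final token of the stream is framed exactly as in the clean case, which illustrates that the damage stays local to the tokens around the corrupted bit.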
In addition, the length of the address field can vary. This is highly advantageous because it allows the most frequently used tokens to be squeezed into the minimum number of words. This in turn is very important in video data pipeline systems, because it ensures that all processing stages can run continuously at full bandwidth.
According to the present invention, to allow address fields of variable length, the addresses are chosen so that a short address followed by random data can never be confused with a longer address. The preferred technique for encoding the address field (which also serves as the "code" that activates an intended pipeline stage) is the well-known technique first described by Huffman, hence the common name "Huffman code". Those of ordinary skill in the art will nevertheless appreciate that other coding schemes can also be used successfully.
Although Huffman encoding is well understood in the field of digital design, the following example provides some general background:
A Huffman code consists of words made up of strings of symbols (in digital systems such as the present invention, the symbols are usually binary digits). The code words may vary in length. The special property of Huffman code words is that they are chosen so that no longer code word begins with the symbols that form a shorter code word. According to the present invention, token address fields are preferably (although not necessarily) chosen using known Huffman encoding techniques.
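The property described above, that no code word begins with another code word, can be checked mechanically. The Python sketch below tests a set of address codes for this prefix-free property and shows how the first word of a token then matches at most one address. The function names and the example address set are hypothetical and are not the addresses used in the described system.

```python
from itertools import permutations


def is_prefix_free(codes):
    """True if no code in the collection is a prefix of another code."""
    return not any(b.startswith(a) for a, b in permutations(codes, 2))


def match_address(word_bits, codes):
    """Return the single address that word_bits begins with, or None.

    word_bits is the first token word as a bit string, most significant
    bit first. With a prefix-free set, at most one address can match.
    """
    hits = [c for c in codes if word_bits.startswith(c)]
    return hits[0] if len(hits) == 1 else None


# Hypothetical address set, for illustration only.
ADDRESSES = ["1", "01", "001", "0001"]
```

Because the set is prefix-free, whatever data bits follow a short address in the first word cannot make it look like one of the longer addresses.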
In this invention, the address field also preferably starts at the most significant bit (MSB) of the first token word. (Note that the designation of the MSB is arbitrary, and the scheme can be modified to suit a different designation.) The address field continues through the adjacent bits of lower significance. If, in a given application, a token address requires more than one token word, that is, if it runs past the least significant bit of a given word, the address field continues in the most significant bit of the next word. The minimum length of the address field is one bit.
In the present invention, any of several known hardware structures can be used to generate tokens. One such structure is a microprogrammed state machine. Known microprocessors or other devices can also be used.
The principal advantage of the token scheme according to the present invention is its adaptability to unanticipated needs. For example, if a new token is introduced, it will most probably affect only a small number of pipeline stages. The most likely case is that only two stages or circuit blocks are affected: the one that generates the token in the first place, and the block or stage that has been newly designed or modified to handle the new token. Note that no other pipeline stage needs to be modified. These stages can handle the new token without any change to their design, because they do not recognize it and therefore pass it on unmodified.
This ability of the present invention to leave a large number of existing, already designed devices unaffected has clear advantages. It may be possible to leave some semiconductor chips in a chip set completely unaffected by a design improvement to other chips in the set. This is advantageous from the point of view of both the customer and the chip manufacturer. Even if a modification means that every chip is affected by the design change (a situation that becomes more likely as levels of integration rise and the number of chips in a system falls), there is still a considerable advantage in time to market over what could otherwise be achieved, because the same design can be reused.
Note in particular what happens when the token set is extended to include two-word addresses. Even in this case, existing designs need not be modified. The token decoder in a pipeline stage will attempt to decode the first word of such a token and will conclude that it does not recognize the token. It will then pass the token on unmodified, using the extension bit to do so correctly. It will not attempt to decode the second word of the token (even though that word contains address bits), because it will "assume" that the second word is part of the data field of a token it does not recognize.
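A minimal Python sketch of a stage that follows this pass-through rule is given below. It is a simplification made for illustration: the stage recognizes tokens by comparing the whole first word against a set of known values rather than decoding a variable-length address, and the class name, the `known` set and the `action` callback are all hypothetical.

```python
class PassThroughStage:
    """Acts on the data words of tokens it recognizes; forwards the rest."""

    def __init__(self, known, action):
        self.known = known      # first-word values this stage recognizes
        self.action = action    # applied to data words of recognized tokens
        self.at_start = True    # True when the next word begins a new token
        self.active = False     # True while inside a recognized token

    def process(self, ext, value):
        """Consume one word and return the (extension_bit, value) to emit."""
        if self.at_start:
            # Only the first word of a token is ever examined as an address.
            self.active = value in self.known
            out = value
        elif self.active:
            out = self.action(value)
        else:
            out = value         # unrecognized token: word passes unchanged
        self.at_start = (ext == 0)
        return ext, out
```

Because only the first word of each token is examined, a later word of an unrecognized token that happens to look like a known address is still forwarded unchanged, which is the behavior the paragraph above describes for two-word addresses.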
In many cases, a pipeline stage or an associated circuit block will modify a token. This usually, but not necessarily, takes the form of modifying the data field of the token. It is also common for the number of data words in a token to be changed, either by removing certain data words or by adding new ones. In some cases, tokens are removed from the token stream altogether.
In most applications, a pipeline stage will typically decode (that is, be activated by) only a few tokens; the stage does not recognize other tokens and passes them on unchanged. In many cases only one token is decoded, namely the DATA token itself.
In many applications, the operation of a particular stage depends on the results of its own earlier operations. The "state" of the stage thus depends on its previous states. In other words, the stage depends on stored state information, which is another way of saying that it must retain some information about its own history from one or more clock cycles earlier. The present invention is well suited to pipelines that include such "state machine" stages, as well as to applications in which the latches in the data path are simple pipeline latches.
The suitability of the two-wire interface for such "state machine" circuits is a significant advantage of the present invention. This is especially true where a data path is controlled by a state machine. In this case, the two-wire interface technique described above can be used to ensure that the "current state" of the state machine stays in step with the data it is controlling in the pipeline.
Fig. 6 shows a simplified block diagram of one example of circuitry, included in a pipeline stage, for decoding a token address field. It illustrates a pipeline stage that has the characteristics of a "state machine". Each word of a token includes an "extension bit", which is HIGH if there are more words in the token and LOW if this is the last word of the token. If a word is the last word of a token, the next valid data word is the start of a new token, and its address must therefore be decoded. The decision whether to decode the token address in any given word thus depends on knowing the value of the previous extension bit.
For simplicity only, the two-wire interface (with its acceptance and validation signals and latches) is not shown in the figure, and all details of resetting the circuit are omitted. As before, an 8-bit data word is assumed by way of example only and not of limitation.
This example pipeline stage delays the data bits and the extension bit by one pipeline stage. It also decodes the DATA token. When the first word of a DATA token appears at the output of the circuit, the signal "DATA_ADDR" is generated and set HIGH. The data bits are delayed by the latches LDIN and LDOUT. Each of these two latches is repeated eight times for the eight data bits used in this example (corresponding to an 8-input, 8-output latch). Similarly, the extension bit is delayed by the extension bit latches LEIN and LEOUT.
In this example, the latch LEPREV is provided to store the most recent state of the extension bit. The value of the extension bit is loaded into LEIN and then loaded into LEOUT on the next rising edge of the non-overlapping clock phase signal PH1. The latch LEOUT therefore contains the value of the current extension bit, but only during the second half of the non-overlapping two-phase clock. The latch LEPREV, however, loads this extension bit value on the next rising edge of the clock signal PH0, which is the same signal that enables the extension bit input latch LEIN. The output QEPREV of the latch LEPREV therefore holds the value of the extension bit from the previous PH0 clock phase.
The five bits of the data word taken from the inverting Q outputs of the latch LDIN, together with the non-inverted bit MD[2], are combined with the previous extension bit value QEPREV in a series of logic gates NAND1, NAND2 and NOR1, whose operation is well known in the art of digital design. The notation "N_MD[m]" denotes the logical inverse of bit m of the intermediate data word MD[7:0]. Using known techniques of Boolean algebra, it can be shown that the output signal SA of this logic block (the output of NOR1) is HIGH (a "1") only when the previous extension bit is "0" (QEPREV = "0") and the data word at the non-inverting Q outputs of the latch LDIN (the original input word) has the structure "000001xx". This notation means that the five high-order bits MD[7] to MD[3] are all "0", the bit MD[2] is "1", and the bits in positions 1 and 0 have any arbitrary value. There are therefore four possible data words (the four permutations of "xx") that cause SA to go HIGH. SA is connected to the input of the address signal latch LADDR, so the output of the latch LADDR goes HIGH as well.
In other words, this stage provides an activation signal (DATA_ADDR = "1") only when one of the four possible matching words is presented and only when the previous extension bit was zero. A previous extension bit of zero means that the previous data word was the last word of the previous token, and hence that the current token word is the first word of the current token.
When the signal QEPREV from the latch LEPREV is LOW, the value at the output of the latch LDIN is the first word of a new token. The gates NAND1, NAND2 and NOR1 decode the DATA token (000001xx). This address decoding signal SA is, however, delayed in the latch LADDR, so that the signal DATA_ADDR has the same timing as the output data OUT_DATA and OUT_EXTN.
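The decoding condition of Fig. 6 can be expressed as a short behavioral model in Python. The sketch ignores the two-phase clocking and the latch delays and only reproduces the logical condition: the activation signal is high when the previous extension bit was 0 and the word matches "000001xx". The function name is hypothetical, and the initial value of the stored extension bit is an assumption, since the text omits the reset details.

```python
DATA_MASK = 0b11111100     # examine bits 7..2 only
DATA_PATTERN = 0b00000100  # "000001xx"


def decode_data_addr(words):
    """Return (value, extension_bit, data_addr) for each word in a stream.

    words is an iterable of (extension_bit, value) pairs. data_addr is True
    only for the first word of a token whose bits match "000001xx".
    """
    prev_ext = 0  # assumption: after reset, the first word starts a token
    out = []
    for ext, value in words:
        data_addr = (prev_ext == 0) and (value & DATA_MASK) == DATA_PATTERN
        out.append((value, ext, data_addr))
        prev_ext = ext  # stored state, as held by the latch LEPREV
    return out
```

A word such as 0x04 that appears in the middle of a token does not raise the signal, because the stored previous extension bit is then 1.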
Fig. 7 shows another example of a state-dependent pipeline stage according to the present invention. It generates the signal LAST_OUT_EXTN to indicate the value of the previous output extension bit OUT_EXTN. One of the two enabling signals (at the CK inputs) of the present and previous extension bit latches (LEOUT and LEPREV respectively) is derived from the gate AND1, so that these latches load new values only when the data is valid and is being accepted (the Q outputs of the output validation latch LVOUT and the output acceptance latch LAOUT are both HIGH). In this way they hold only valid extension bits and are never loaded with spurious values associated with invalid data. In the embodiment shown in Fig. 7, the two-wire valid/accept logic includes the gates OR1 and OR2, whose input signals are the downstream acceptance signals and the inverting outputs of the validation latches LVIN and LVOUT respectively. This illustrates one way in which the gates NAND1/2 and INV1/2 of Fig. 4 can be replaced if the latches have inverting outputs.
Although this is an extremely simple example of a "state-dependent" pipeline stage, in that it depends on the state of only a single bit, it is generally true that all latches holding state information are updated only when data is actually transferred between pipeline stages. In other words, the state held in the latches is updated only when the data is both valid and being accepted by the next stage. Accordingly, care must be taken to ensure that such latches are properly reset.
The generation and use of tokens according to the present invention thus provides several advantages over known encoding techniques for transferring data through a pipeline.
First, as described above, tokens allow address fields of variable length (which can use Huffman coding, for example) to provide an efficient representation of commonly used tokens.
Second, consistent encoding of token length allows the end of a token (and hence the start of the next token) to be handled correctly, including simple transfer without processing, even when the token is not recognized by the token decoder circuitry of a given pipeline stage.
Third, the rules and hardware structures for handling unrecognized tokens (namely, passing them on unmodified) allow a stage to communicate with a downstream stage that is not its nearest neighbor in the pipeline. This also increases the expandability and efficient adaptability of the pipeline, because the token set can be changed in the future without large-scale redesign of existing pipeline stages. The tokens of the present invention are particularly useful when used together with the two-wire interface described above and below.
As an example of the above, Figs. 8a and 8b, taken together (and referred to collectively below as Fig. 8), show a block diagram of a pipeline stage whose function is as follows. If the stage is processing a predetermined token (known in this example as the DATA token), it duplicates every word in the token except the first, which contains the address field of the DATA token. If, on the other hand, the stage is processing any other type of token, it deletes every word. The overall effect is that only DATA tokens appear at the output, and each word within these tokens is repeated twice.
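The function of this stage, as distinct from its circuitry, can be summarized in a brief Python sketch. The model works on a list of (extension bit, value) pairs and uses the "000001xx" pattern for the DATA token given elsewhere in the text. One detail is an assumption of this sketch rather than something stated in the text: the first copy of each duplicated word is emitted with its extension bit set to 1, so that the output remains a correctly framed token.

```python
def duplicate_stage(words):
    """Keep only DATA tokens and emit each of their data words twice.

    words is a list of (extension_bit, value) pairs. The first word of a
    DATA token (bits "000001xx") is emitted once; all other tokens are
    deleted in full.
    """
    out = []
    at_start = True   # next word begins a new token
    in_data = False   # currently inside a DATA token
    for ext, value in words:
        if at_start:
            in_data = (value & 0b11111100) == 0b00000100
            if in_data:
                out.append((ext, value))   # address word, not duplicated
        elif in_data:
            out.append((1, value))         # first copy (assumed ext = 1)
            out.append((ext, value))       # second copy keeps the original bit
        at_start = (ext == 0)
    return out
```

Because a DATA token leaves the stage with nearly twice as many words as it had on entry, a stage of this kind produces more data than it receives.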
Many of the components in this example system may be the same as those described for the much simpler structures shown in Figs. 4, 6 and 7. This illustrates a significant advantage: more complicated pipeline stages still enjoy the same benefits of flexibility and elasticity, because the same two-wire interface can be used with little or no adaptation. The data duplication stage shown in Fig. 8 is only one example of the countless types of operation that a pipeline stage could perform in a given application. This "duplication stage" does, however, illustrate a stage that can form a "bottleneck", so that the pipeline according to this embodiment will "pack together".
A "bottleneck" can be any stage that either takes a relatively long time to complete its operation or generates more data in the pipeline than it receives. This example also shows that the two-wire accept/valid interface according to this embodiment can be adapted very easily to different applications.
The duplication stage shown in Fig. 8 also has two latches LEIN and LEOUT which, as in the example of Fig. 6, latch the state of the extension bit at the input and at the output of the stage respectively. As Fig. 8a shows, the input extension bit latch LEIN is clocked synchronously with the input data latch LDIN and the validation signal IN_VALID.
For ease of reference, the various latches included in the duplication stage are paired with their respective output signals as follows:
In the duplication stage, the output of the data latch LDIN forms intermediate data referred to as MID_DATA. This intermediate data word is loaded into the data output latch LDOUT only when an intermediate acceptance signal (labeled "MID_ACCEPT" in Fig. 8a) is set HIGH.
The circuitry shown in Fig. 8 below the acceptance latches LAIN and LAOUT is added to the basic pipeline structure to generate various internal control signals. These signals are used to duplicate data. They include a "DATA_TOKEN" signal, which indicates that the circuit is currently processing a valid DATA token, and a NOT_DUPLICATE signal, which controls the duplication of data. While the circuit is processing a DATA token, the NOT_DUPLICATE signal toggles between a HIGH and a LOW state, which causes each word in the token to be duplicated once (but no more than once). When the circuit is not processing a valid DATA token, the NOT_DUPLICATE signal is held HIGH. This means that the token words being processed are not duplicated.
As Fig. 8a illustrates, the upper six bits of the 8-bit intermediate data word and the output signal QI1 from latch LI1 form the inputs to a group of logic gates NOR1, NOR2 and NAND18. The output of gate NAND18 is labelled S1. Using well-known Boolean algebra, it can be shown that S1 is "0" only when QI1 is "1" and the MID_DATA word has the structure "000001xx". That is, the upper five bits are all "0", bit MID_DATA[2] is "1", and bits MID_DATA[1] and MID_DATA[0] may have any value. S1 therefore acts as a "token identification signal": it is LOW only when MID_DATA has the predetermined structure and the output of latch LI1 is "1". The nature of latch LI1 and its output QI1 is explained further below.
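The decoding condition just described can be written as a small truth function. The following is only a behavioural sketch in Python, not a gate-level model of NOR1, NOR2 and NAND18; the function name `token_id_signal` is hypothetical.

```python
def token_id_signal(mid_data: int, qi1: int) -> int:
    """Return S1 (active LOW) for an 8-bit MID_DATA word and latch output QI1.

    Behavioural sketch only: S1 is 0 when QI1 is 1 and the upper six bits of
    MID_DATA are 000001 (pattern 000001xx); otherwise S1 is 1.
    """
    upper_six = (mid_data >> 2) & 0x3F      # MID_DATA[7:2]
    matches = upper_six == 0b000001         # upper five bits 0, bit 2 set
    return 0 if (qi1 == 1 and matches) else 1
```

Under this sketch, the two low-order bits never influence the result, which mirrors the "xx" in the pattern.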
Latch LO1 latches the last value of the intermediate extension bit (labelled "MID_EXTN" and also carried as signal S4), and this value is loaded into latch LI1 on the next rising edge of clock phase PH0. The output of LI1 is the bit QI1, which is also one of the inputs to the token decoding logic that forms signal S1. As explained above, S1 can drop to "0" only when QI1 is "1" (and MID_DATA has the predetermined structure). S1 can therefore drop to "0" only when the previous extension bit was "0", indicating that the previous token has ended, so that the MID_DATA word is the first data word of a new token.
Latches LO2 and LI2, together with NAND gates NAND20 and NAND22, form storage for the DATA_TOKEN signal. In the normal situation, the signal QI1 at the input of NAND20 and the signal S1 at the input of NAND22 are both at logic "1". Boolean algebra again shows that in this case the NAND gates operate as inverters: the output QI2 of latch LI2 is inverted by NAND20, and the result is inverted again by NAND22 to form signal S2. Because there are two logic inversions in the path, S2 has the same value as QI2.
It can also be seen that the DATA_TOKEN signal at the output of latch LO2 forms the input to latch LI2. As a result, as long as QI1 and S1 both remain HIGH, DATA_TOKEN keeps its state (whether "0" or "1"). This is true even though the clock phases PH0 and PH1 are clocking the latches (LI2 and LO2, respectively). The value of DATA_TOKEN can change only when one or both of QI1 and S1 are "0".
As explained earlier, QI1 is always "0" when the previous extension bit was "0". QI1 is therefore "0" whenever the MID_DATA value is the first word of a token (and so contains the token's address field). In this situation, S1 may be either "0" or "1". As explained earlier, S1 is "0" if the MID_DATA word has the predetermined structure that, in this example, indicates a "data" token. If the MID_DATA word has any other structure (indicating that the token is some other token, not a data token), S1 is "1".
If QI1 is "0" and S1 is "1", the token is something other than a data token. As is well known in digital electronics, the output of NAND20 must then be "1". NAND gate NAND22 inverts this (as explained above), so S2 is "0". As a result, this "0" is loaded into latch LO2 at the start of the next PH1 clock phase, and DATA_TOKEN becomes "0", indicating that the circuitry is not processing a data token. If QI1 is "0" and S1 is "0", indicating a data token, then S2 is "1" (regardless of the other input to NAND22, which comes from the output of NAND20). As a result, this "1" is loaded into latch LO2 at the start of the next PH1 clock phase, and DATA_TOKEN becomes "1", indicating that the circuitry is processing a data token.
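The hold, clear and set cases described for the DATA_TOKEN storage can be checked with a short combinational sketch. It assumes only the connections stated in the text (QI1 and QI2 into NAND20; S1 and the NAND20 output into NAND22); the function names are hypothetical and clocking is ignored.

```python
def nand(a: int, b: int) -> int:
    """Two-input NAND on 0/1 values."""
    return 0 if (a and b) else 1


def next_data_token(qi1: int, s1: int, qi2: int) -> int:
    """Return S2, the value loaded into LO2 on the next PH1 phase.

    qi2 is the currently stored DATA_TOKEN value fed back through LI2.
    """
    nand20_out = nand(qi1, qi2)
    return nand(s1, nand20_out)
```

With QI1 = S1 = 1 the function returns QI2 unchanged (hold); with QI1 = 0 and S1 = 1 it returns 0 (not a data token); with S1 = 0 it returns 1 (data token).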
The NOT_DUPLICATE signal (output signal QO3) is similarly loaded into latch LI3 on the next rising edge of clock PH0. The output QI3 of latch LI3 is combined with the output QI2 in gate NAND24 to form signal S3. As before, Boolean algebra shows that S3 is "0" only when both QI2 and QI3 are "1". If QI2 becomes "0", that is, if the DATA_TOKEN signal is "0", S3 becomes "1". In other words, S3 goes HIGH if there is no valid data token (QI2 = 0) or if the data word is a duplicate (QI3 = 0).
Now assume that the DATA_TOKEN signal remains HIGH for more than one clock cycle. Because the NOT_DUPLICATE signal (QO3) is fed back to latch LI3 and is inverted by gate NAND24 (since its other input QI2 is held HIGH), the output QO3 toggles between "0" and "1". If there is no valid data token, however, QI2 is "0", and S3 and the output QO3 are forced HIGH until DATA_TOKEN becomes "1" again.
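The toggling behaviour can be illustrated with a cycle-by-cycle sketch. It assumes that S3 is loaded into LO3 on every cycle (the enable from S6 is ignored) and that QO3 starts HIGH; both are simplifications for illustration, and the function names are hypothetical.

```python
def nand(a: int, b: int) -> int:
    """Two-input NAND on 0/1 values."""
    return 0 if (a and b) else 1


def not_duplicate_sequence(data_token_per_cycle):
    """Return the NOT_DUPLICATE (QO3) value after each clock cycle.

    data_token_per_cycle: iterable of QI2 values (1 while a data token
    is being processed, 0 otherwise).
    """
    qo3 = 1                      # assumed initial state: HIGH
    history = []
    for qi2 in data_token_per_cycle:
        qi3 = qo3                # QO3 fed back into LI3 on PH0
        s3 = nand(qi2, qi3)      # NAND24
        qo3 = s3                 # S3 loaded into LO3 on PH1 (assumed always enabled)
        history.append(qo3)
    return history
```

For example, `not_duplicate_sequence([0, 1, 1, 1, 1, 0, 0])` alternates while the data token is present and then stays HIGH.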
The output QO3 (the NOT_DUPLICATE signal) is also fed back and combined with the output QA1 of the accept latch LAIN in a pair of logic gates (NAND16 and INV16, which together form an AND gate). The output of this AND gate is "1" only when QA1 and QO3 are both "1". As Fig. 8a shows, the output of the AND gate (gate NAND16 followed by gate INV16) also forms the accept signal IN_ACCEPT, which is used in the two-wire interface structure as described above.
The accept signal IN_ACCEPT is also used as the enable signal for latches LDIN, LEIN and LVIN. As a result, if NOT_DUPLICATE is LOW, IN_ACCEPT is also LOW, and all three latches are disabled and hold their stored values at their outputs. The stage will not accept new data until NOT_DUPLICATE goes HIGH. This is in addition to the requirements described above for forcing the output of the accept latch LAIN HIGH.
As long as there is a valid data token (the DATA_TOKEN signal QO2 is "1"), QO3 toggles between HIGH and LOW, so that the input latches are enabled, and can accept data, at most every other full cycle of the two clock phases PH0 and PH1. The additional condition that the following stage be ready to accept data, as indicated by a HIGH OUT_ACCEPT signal, must of course still be satisfied. The output latch LDOUT therefore places the same data word on the output bus OUT_DATA for at least two full clock cycles. The OUT_VALID signal is "1" only when there is a valid data token (QO2 HIGH) and the validation signal QVOUT is HIGH.
Signal QEIN is the extension bit corresponding to MID_DATA. It is combined with signal S3 in a pair of logic gates (INV10 and NAND10) to form signal S4. During a data token, each data word MID_DATA is repeated by loading it into the output latch LDOUT twice. On the first of these loads, S4 is forced to "1" by the action of NAND10. S4 is loaded into latch LEOUT to form OUTEXTN at the same time as MID_DATA is loaded into LDOUT to form OUT_DATA[7:0].
Thus, the first time a given MID_DATA word is loaded into LDOUT, the associated OUTEXTN is forced HIGH, whereas the second time OUTEXTN is the same as signal QEIN. Consider now the last word of a token, during which QEIN is known to be LOW. The first time MID_DATA is loaded into LDOUT, OUTEXTN is "1"; the second time, OUTEXTN is "0", indicating the true end of the token.
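The effect on the extension bit can be seen in a purely behavioural sketch that operates on (data, extension) pairs rather than on latches. The names `duplicate_word` and `duplicate_words` are hypothetical, and the sketch deliberately says nothing about which words of a token the hardware actually duplicates.

```python
def duplicate_word(mid_data: int, qein: int):
    """Emit one word twice: first copy with extension forced to 1,
    second copy with the word's own extension bit (QEIN)."""
    return [(mid_data, 1), (mid_data, qein)]


def duplicate_words(words):
    """Apply duplicate_word to a sequence of (data, extension) pairs."""
    out = []
    for data, ext in words:
        out.extend(duplicate_word(data, ext))
    return out
```

Because only the second copy of the last word carries extension "0", a downstream stage still sees exactly one end-of-token marker.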
The output signal QVIN of the validation latch LVIN is combined with signal QI3 in a similar gate combination (INV12 and NAND12) to form signal S5. Using well-known Boolean techniques, it can be shown that S5 is HIGH either when the validation signal QVIN is HIGH or when QI3 is LOW (indicating that the data is a duplicate). S5 is loaded into the validation output latch LVOUT at the same time as MID_DATA is loaded into LDOUT and the intermediate extension bit (signal S4) is loaded into LEOUT. Signal S5 is also combined with signal QO2 (the DATA_TOKEN signal) in gates NAND30 and INV30 to form the output validation signal OUT_VALID. As mentioned earlier, OUT_VALID is HIGH only when there is a valid data token and the validation signal QVOUT is HIGH.
In the present invention, the MID_ACCEPT signal is combined with signal S5 in a pair of logic gates (NAND26 and INV26), which perform the well-known AND function, to form signal S6. Signal S6 is used as one of the two enable signals for latches LO1, LO2 and LO3. S6 rises to "1" whenever MID_ACCEPT is HIGH and either the validation signal QVIN is HIGH or the token is a duplicate (QI3 is "0"). Thus, if MID_ACCEPT is HIGH, latches LO1 to LO3 are enabled when clock signal PH1 is HIGH, whenever valid input data is loaded at the input of the stage or the latched data is a duplicate.
From the above discussion it can be seen that the stage shown in Figs. 8a and 8b receives and transfers data under the control of the validation and accept signals, as in the previous embodiments. The one exception is that the output signal of the accept latch LAIN on the input side is combined with the toggling duplication signal, so that a data word is output twice before a new word is accepted.
Of course, various logic gates, such as NAND16 and INV16, can be replaced by equivalent logic circuitry (in this case, a single AND gate). Similarly, if latches LEIN and LVIN have inverting outputs, for example, the inverters INV10 and INV12 are unnecessary; the inverting outputs of these latches can instead be connected directly to the corresponding inputs of gates NAND10 and NAND12. As long as the proper logical operation is performed, the stage operates in the same manner, and data words and extension bits are still duplicated.
It should be noted that the example stage performs the duplication function only if the first data word of the token has a "1" in the third bit position and "0" in all five higher-order bits. (Of course, the required pattern can easily be changed by choosing logic gates and interconnections other than the NOR1, NOR2 and NAND18 gates shown.)
In addition, as Fig. 8 shows, the OUT_VALID signal is forced LOW for the whole of a token unless the first data word has the structure described above. The effect is that all tokens other than those that undergo duplication are removed from the token stream, because a device connected to the output terminals (OUTDATA, OUTEXTN and OUTVALID) will not recognize these token words as valid data.
As before, the two validation latches LVIN and LVOUT in this stage can be reset by a single conductor NOT_RESET0 and a single reset input R on the downstream latch LVOUT, with the reset signal propagating backwards so that the upstream validation latch is forced LOW on the next clock cycle.
It should be noted that, in the example of Fig. 8, the duplication of the data contained in data tokens serves only as an example of how circuitry can manipulate the ACCEPT and VALID signals so that more data leaves the pipeline stage than arrives at its input. Likewise, the removal of all non-data tokens in the example of Fig. 8 serves purely to illustrate how circuitry can manipulate the VALID signal to remove data from the stream. In most typical applications, however, a pipeline stage simply passes on unchanged any token it does not recognize, so that other stages further down the pipeline can act on it if required.
Figs. 9a and 9b, taken together, illustrate an example of a timing diagram for the data duplication circuit shown in Figs. 8a and 8b. As before, the timing diagram shows the relationship between the two-phase clock signals, the various internal and external control signals, and the manner in which data is clocked between the input and output sides of the stage and is duplicated.
Referring now in more detail to Fig. 10, a reconfigurable processing stage according to one aspect of the present invention is shown.
An input latch 34 receives input on a first bus 31. A first output of input latch 34 is passed to a token decode subsystem 33 over line 32. A second output of input latch 34 is passed as a first input to a processing unit 36 over line 35. A first output of token decode subsystem 33 is passed as a second input to processing unit 36 over line 37. A second output of token decode subsystem 33 is passed to an action identification unit 39 over line 40. Action identification unit 39 also receives input from registers 43 and 44 over line 46. Registers 43 and 44 together hold the state of the machine, which is determined by the history of the tokens previously received. The output of action identification unit 39 is passed as a third input to processing unit 36 over line 38. The output of processing unit 36 is passed to an output latch 41, whose output is passed onto a second bus 42.
Referring now to Fig. 11, a start code detector (SCD) 51 receives input over a two-wire interface 52. This input can be either in the form of data tokens or as data bits in a data stream. A first output of start code detector 51 is passed to a first logical first-in first-out buffer (FIFO) 54 over line 53. The output of first FIFO 54 is logically passed as a first input to a Huffman decoder 56 over line 55. A second output of start code detector 51 is passed as a first input to a DRAM interface 58 over line 57. DRAM interface 58 also receives input from a buffer manager 59 over line 60. Signals are transmitted to and received from an external DRAM (not shown) by DRAM interface 58 over line 61. A first output of DRAM interface 58 is passed as a first physical input to Huffman decoder 56 over line 62.
The output of Huffman decoder 56 is passed as an input to an index-to-data unit (ITOD) 64 over line 63. Huffman decoder 56 and ITOD 64 work together as a single logical unit. The output of ITOD 64 is passed to an arithmetic logic unit (ALU) 66 over line 65. A first output of ALU 66 is passed to a read-only memory (ROM) state machine 68 over line 67. The output of ROM state machine 68 is passed as a second physical input to Huffman decoder 56 over line 69. A second output of ALU 66 is passed to a token formatter (T/F) 71 over line 70.
A first output of T/F 71 is passed to a second FIFO 73 over line 72. The output of second FIFO 73 is passed as a first input to an inverse modeller 75 over line 74. A second output of T/F 71 is passed as a third input to DRAM interface 58 over line 76. A third output of DRAM interface 58 is passed as a second input to inverse modeller 75 over line 77. The output of inverse modeller 75 is passed as an input to an inverse quantizer 79 over line 78. The output of inverse quantizer 79 is passed as an input to an inverse zig-zag unit (IZZ) 81 over line 80. The output of IZZ 81 is passed as an input to an inverse discrete cosine transform unit (IDCT) 83 over line 82. The output of IDCT 83 is passed to a temporal decoder (not shown) over line 84.
Referring now in more detail to Fig. 12, a temporal decoder according to the present invention is shown. A fork 91 receives as its input, over line 92, the output of IDCT 83 (shown in Fig. 11). As a first output of fork 91, control tokens, such as motion vectors, are passed to an address generator 94 over line 93. Data tokens are also passed to address generator 94 for counting purposes. As a second output of fork 91, data is passed to a FIFO 96 over line 95. The output of FIFO 96 is then passed as a first input to an adder 98 over line 97. The output of address generator 94 is passed as a first input to a DRAM interface 100 over line 99. Signals are transmitted to and received from an external DRAM (not shown) by DRAM interface 100 over line 101. A first output of DRAM interface 100 is passed to a prediction filter 103 over line 102. The output of prediction filter 103 is passed as a second input to adder 98 over line 104. A first output of adder 98 is passed to an output selector 106 over line 105. A second output of adder 98 is passed as a second input to DRAM interface 100 over line 107. A second output of DRAM interface 100 is passed as a second input to output selector 106 over line 108. The output of output selector 106 is passed to a video formatter (not shown in Fig. 12) over line 109.
Referring now to Fig. 13, a fork 111 receives input from output selector 106 (shown in Fig. 12) over line 112. As a first output of fork 111, control tokens are passed to an address generator 114 over line 113. The output of address generator 114 is passed as a first input to a DRAM interface 116 over line 115. As a second output of fork 111, data is passed as a second input to DRAM interface 116 over line 117. Data is sent to and received from an external DRAM (not shown) by DRAM interface 116 over line 118. The output of DRAM interface 116 is passed to a display pipe 120 over line 119.
It will be apparent from the above description that each line may comprise multiple lines where necessary.
Referring now to Fig. 14a, in the MPEG standard a picture 131 is encoded as one or more slices 132. Each slice 132 in turn comprises a number of blocks 133 and is encoded row by row, from left to right within each row. As shown, each slice 132 may span exactly one full row of blocks 133, less than one row of blocks 133 (B or D), or multiple rows of blocks 133 (C).
Referring to Fig. 14b, in the JPEG and H.261 standards the Common Intermediate Format (CIF) is used, in which a picture 141 is encoded as six rows, each containing two groups of blocks (GOBs) 142. Each GOB 142 is in turn composed of either three or six rows of an indeterminate number of blocks 143. Each GOB 142 is encoded in a zig-zag direction, as indicated by arrow 144. The GOBs 142 are processed in turn row by row, from left to right within each row.
Referring now to Fig. 14c, it can be seen that, for both MPEG and CIF, the output of the encoder is in the form of a data stream 151. The decoder receives this data stream 151 and can then reconstruct the picture according to the format used to encode it. To allow the decoder to recognize the start and end points of each standard, the data stream 151 is divided into segments of length 33.
Referring to the Venn diagram of Fig. 15, the range of values possible for the table selection made by Huffman decoder 56 (shown in Fig. 11) of the present invention is shown. The values possible for an MPEG decoder and for an H.261 decoder overlap to some extent, indicating that a single table selection will decode both certain MPEG and certain H.261 formats. Likewise, some of the values possible for an MPEG decoder overlap some of those possible for a JPEG decoder, indicating that a single table selection will decode both certain JPEG and certain MPEG formats. The figure also shows that the H.261 values and the JPEG values do not overlap, indicating that no single table selection exists that will decode both formats.
Referring now in more detail to Fig. 16, a schematic representation of variable-length picture data in accordance with the practice of the present invention is shown. A first picture 161 to be processed comprises a first PICTURE_START token 162, first picture information 163 of indeterminate length, and a first PICTURE_END token 164. A second picture 165 to be processed comprises a second PICTURE_START token 166, second picture information 167 of indeterminate length, and a second PICTURE_END token 168. The PICTURE_START tokens 162 and 166 indicate the start of pictures 161 and 165 to the processor. Likewise, the PICTURE_END tokens 164 and 168 indicate the end of pictures 161 and 165 to the processor. This allows the processor to process picture information 163 and 167 of variable length.
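The framing idea can be illustrated with a small sketch that groups a token stream into pictures using only the PICTURE_START and PICTURE_END markers. Representing tokens as Python strings and the function name `split_pictures` are assumptions made for illustration; the actual token encoding is not modelled here.

```python
def split_pictures(tokens):
    """Group the items between each PICTURE_START and PICTURE_END.

    tokens: iterable of token names (strings). Anything outside a
    START/END pair is ignored in this sketch.
    """
    pictures = []
    current = None                       # None means "not inside a picture"
    for token in tokens:
        if token == "PICTURE_START":
            current = []
        elif token == "PICTURE_END":
            if current is not None:
                pictures.append(current)
                current = None
        elif current is not None:
            current.append(token)
    return pictures
```

No length field is needed: each picture may contain any number of items, because its extent is defined entirely by the two delimiting tokens.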
Referring to Fig. 17, a split 171 receives input over line 172. A first output of split 171 is passed to an address generator 174 over line 173. The address generated by address generator 174 is passed to a DRAM interface 176 over line 175. Signals are transmitted to and received from an external DRAM (not shown) by DRAM interface 176 over line 177. A first output of DRAM interface 176 is passed to a prediction filter 179 over line 178. The output of prediction filter 179 is passed as a first input to an adder 181 over line 180. A second output of split 171 is passed as an input to a first-in first-out buffer (FIFO) 183 over line 182. The output of FIFO 183 is passed as a second input to adder 181 over line 184. The output of adder 181 is passed to a write signal generator 186 over line 185. A first output of write signal generator 186 is passed to DRAM interface 176 over line 187. A second output of write signal generator 186 is passed as a first input to a read signal generator 189 over line 188. A second output of DRAM interface 176 is passed as a second input to read signal generator 189 over line 190. The output of read signal generator 189 is passed to a video formatter (not shown in Fig. 17) over line 191.
Referring now to Fig. 18, the prediction filtering process is illustrated. A forward picture 201 is passed as a first input to an adder 203 over line 202. A backward picture 204 is passed as a second input to adder 203 over line 205. The output of adder 203 is passed over line 206.
Referring to Fig. 19, a slice 211 comprises one or more macroblocks 212. Each macroblock 212 in turn comprises four luminance blocks 213 and two chrominance blocks 214, and contains the information for an original 16 × 16 block of pixels. Each of the four luminance blocks 213 and two chrominance blocks 214 is 8 × 8 pixels in size. The four luminance blocks 213 contain a pixel-by-pixel mapping of the luminance (Y) information from the original 16 × 16 block of pixels. One chrominance block 214 contains information representing the chrominance level of the blue colour signal (Cu/b), and the other chrominance block 214 contains information representing the chrominance level of the red colour signal (Cv/r). Each chrominance level is subsampled so that each 8 × 8 chrominance block 214 contains the chrominance level of its colour signal for the entire original 16 × 16 block of pixels.
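As a rough illustration of this layout, the sketch below splits a 16 × 16 luminance array into four 8 × 8 blocks and reduces a 16 × 16 chrominance array to one 8 × 8 block. The block ordering (top-left, top-right, bottom-left, bottom-right) and the use of 2 × 2 averaging for subsampling are assumptions; the text only states that the chrominance is subsampled. Function names are hypothetical.

```python
def split_luma(y16):
    """Split a 16x16 luminance array (list of 16 rows) into four 8x8 blocks.

    Assumed order: top-left, top-right, bottom-left, bottom-right.
    """
    blocks = []
    for r0 in (0, 8):
        for c0 in (0, 8):
            blocks.append([row[c0:c0 + 8] for row in y16[r0:r0 + 8]])
    return blocks


def subsample_chroma(c16):
    """Reduce a 16x16 chrominance array to 8x8 by averaging each 2x2 group
    (integer average; the averaging filter itself is an assumption)."""
    return [
        [
            (c16[2 * r][2 * c] + c16[2 * r][2 * c + 1]
             + c16[2 * r + 1][2 * c] + c16[2 * r + 1][2 * c + 1]) // 4
            for c in range(8)
        ]
        for r in range(8)
    ]
```

Under these assumptions a macroblock carries 4 × 64 luminance samples plus 2 × 64 chrominance samples, that is, 384 samples for 256 original pixels.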
Referring now to Fig. 20, the structure and function of the start code detector will become apparent. A value register 221 receives picture data over line 222. Line 222 is eight bits wide, allowing parallel transmission of eight bits at a time. The output of value register 221 is passed serially to a decode register 224 over line 223. A first output of decode register 224 is passed to a detector 225 over line 226. Line 226 is 24 bits wide, allowing parallel transmission of 24 bits at a time. Detector 225 detects the presence or absence of an image that corresponds to a standard-independent start code: 23 "0" values followed by a single "1" value. An 8-bit data value image follows a valid start code image. On detecting a start code image, detector 225 transmits a start image to a value decoder 228 over line 227.
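The detection step can be sketched in software as a 24-bit shift register that is compared against the start code image after every bit. This is a minimal illustration, not the circuit itself: preloading the register with ones (so that a partially filled register cannot match) and the function names are assumptions, and overlapping start codes are not handled in this sketch.

```python
def bits_of(data: bytes):
    """Yield the bits of data, most significant bit of each byte first."""
    for byte in data:
        for shift in range(7, -1, -1):
            yield (byte >> shift) & 1


def detect_start_codes(data: bytes):
    """Return the 8-bit values that follow each start code image.

    A start code image is 23 zero bits followed by a single one bit,
    i.e. the 24-bit register holds the value 1.
    """
    bits = list(bits_of(data))
    reg = 0xFFFFFF                       # assumed preload: all ones
    values = []
    for i, bit in enumerate(bits):
        reg = ((reg << 1) | bit) & 0xFFFFFF
        if reg == 1 and i + 8 < len(bits):
            value = 0
            for b in bits[i + 1:i + 9]:  # the 8 bits after the start code
                value = (value << 1) | b
            values.append(value)
    return values
```

For instance, `detect_start_codes(b"\x00\x00\x01\xb3")` finds one start code image and reports the value that follows it.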
A second output of decode register 224 is passed serially to a value decode shift register 230 over line 229. Value decode shift register 230 can hold a data value image 15 bits long. The 8-bit data value following the start code image is shifted to the right of value decode shift register 230, as indicated by area 231. This process eliminates overlapping start code images, as discussed below. A first output of value decode shift register 230 is passed to value decoder 228 over line 232. Line 232 is 15 bits wide, allowing parallel transmission of 15 bits at a time. Value decoder 228 decodes the value image using a first look-up table (not shown). A second output of value decode shift register 230 is passed to value decoder 228, which passes a flag to an index-to-tokens converter 234 over line 235. Value decoder 228 also transmits information to index-to-tokens converter 234 over line 236. The information is either a data value image or a start code index image obtained from the first look-up table, and the flag indicates which form of information is being transmitted. Line 236 is 15 bits wide, allowing parallel transmission of 15 bits at a time. Although a width of 15 bits has been chosen here, other widths may also be used. Index-to-tokens converter 234 converts the information to token images using a second look-up table (not shown), which is similar to that given in Table 12-3 of the user's manual. The token images generated by index-to-tokens converter 234 are then output over line 237. Line 237 is 15 bits wide, allowing parallel transmission of 15 bits at a time.
Referring to Fig. 21, a data stream 241 consisting of individual bits 242 is input to a start code detector (not shown in Fig. 21). The start code detector detects a first start code image and then receives a first data value image 244. Before processing the first data value image 244, the start code detector may detect a second start code image 245 that overlaps the first data value image 244 over a length 246. If this occurs, the start code detector does not process the first data value image 244, and instead receives and processes a second data value image 247.
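The overlap rule can be sketched on a bit string: a data value is discarded whenever the next start code image begins before that value's 8 bits have ended. Working on a string of "0" and "1" characters and the function name `decode_values` are conveniences chosen for illustration, not features of the described circuit.

```python
START_IMAGE = "0" * 23 + "1"     # 24-bit start code image
VALUE_BITS = 8                   # data value image that follows it


def decode_values(bits: str):
    """Return the data values that are NOT overlapped by a later start code."""
    # Locate every start code image in the bit string.
    positions = []
    pos = bits.find(START_IMAGE)
    while pos != -1:
        positions.append(pos)
        pos = bits.find(START_IMAGE, pos + 1)

    values = []
    for k, pos in enumerate(positions):
        value_start = pos + len(START_IMAGE)
        value_end = value_start + VALUE_BITS
        if value_end > len(bits):
            break                # stream ended before the value was complete
        overlapped = k + 1 < len(positions) and positions[k + 1] < value_end
        if not overlapped:
            values.append(int(bits[value_start:value_end], 2))
    return values
```

In `decode_values(START_IMAGE + "1000" + START_IMAGE + "11111111")` the second start code image begins four bits into the first value, so only the second value is reported.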
Referring now to Fig. 22, a flag generator 251 receives data as a first input over line 252. Line 252 is 15 bits wide, allowing parallel transmission of 15 bits. Flag generator 251 also receives a flag as a second input over line 253, and receives an input valid image over a first two-wire interface 254. A first output of flag generator 251 is passed to an input valid register (not shown) over line 255. A second output of flag generator 251 is passed to a decode index 257 over line 256. Decode index 257 generates four outputs: a picture start image transmitted over line 258, a picture number image transmitted over line 259, an insert image transmitted over line 260, and a replace image transmitted over line 261. A header generator 263 uses a look-up table to generate the replace image, which it transmits over line 262b. An extra word generator 264 uses the MPU to generate the insert image, which it transmits over line 262c. Lines 262a and 262c merge to form line 262, which is the first input of an output latch 265. Output latch 265 transmits data over line 266. Line 266 is 15 bits wide, allowing parallel transmission of 15 bits.
The input valid register (not shown) transmits an image as a first input to a first OR gate 267 over line 268. The insert image is passed as a second input to first OR gate 267 over line 269. The output of first OR gate 267 is passed as a first input to a first AND gate 270 over line 271. The logical negation of a remove image is passed as a second input to first AND gate 270 over line 272, and the output of first AND gate 270 is passed as a second input to output latch 265 over line 273. Output latch 265 transmits an output valid image over a second two-wire interface 274. An output accept signal is received over second two-wire interface 274 by an output accept latch 275. The output of output accept latch 275 is passed to an output accept register (not shown) over line 276.
The output accept register (not shown) transmits an image as a first input to a second OR gate 277 over line 278. The logical negation of the output of the input valid register is passed as a second input to second OR gate 277 over line 279. The remove image is passed as a third input to second OR gate 277 over line 280. The output of second OR gate 277 is passed as a first input to a second AND gate 281 over line 282. The logical negation of the insert image is passed as a second input to second AND gate 281 over line 283. The output of second AND gate 281 is passed over line 284 to an input accept latch 285. The output of input accept latch 285 is transmitted over first two-wire interface 254.
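The two gate networks just described reduce to two Boolean expressions, sketched below. Latching and clocking are ignored, and the function and parameter names are hypothetical; only the gate connections stated in the text are modelled.

```python
def output_valid(in_valid: bool, insert: bool, remove: bool) -> bool:
    """First OR gate 267 followed by first AND gate 270:
    (in_valid OR insert) AND NOT remove."""
    return bool((in_valid or insert) and not remove)


def input_accept(out_accept: bool, in_valid: bool,
                 insert: bool, remove: bool) -> bool:
    """Second OR gate 277 followed by second AND gate 281:
    (out_accept OR NOT in_valid OR remove) AND NOT insert."""
    return bool((out_accept or not in_valid or remove) and not insert)
```

Read this way, an inserted word makes the output valid while stalling the input, and a removed word is accepted at the input without appearing as valid at the output.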
Table 600

    Format    Image received     Tokens generated
1.  H.261     SEQUENCE START     SEQUENCE START
    MPEG      PICTURE START      GROUP START
    JPEG      (none)             PICTURE START
                                 PICTURE DATA
2.  H.261     (none)             PICTURE END
    MPEG      (none)             PADDING
    JPEG      (none)             FLUSH
                                 STOP AFTER PICTURE
Table 600 shows the relationship between the absence or presence of standard signals and certain machine-independent control tokens. As the table shows, the detection of an image by start code detector 51 generates a sequence of machine-independent control tokens. Each image listed in the "Image received" column triggers generation of all the machine-independent control tokens listed in the "Tokens generated" column. Thus, as the first row of Table 600 shows, whenever a "sequence start" image is received during H.261 processing, or a "picture start" image is received during MPEG processing, the entire group of four control tokens is generated, each followed by its corresponding data value or values. In addition, as the second row of Table 600 shows, the second group of four control tokens is generated at the appropriate time irrespective of the images received by start code detector 51.
Table 601
Display order: I1 B2 B3 P4 B5 B6 P7 B8 B9 I10
Transmit order: I1 P4 B2 B3 P7 B5 B6 I10 B8 B9
As the first row of Table 601 shows, which illustrates the timing relationship between transmitted pictures and displayed pictures, the picture frames are displayed in numerical order. However, to reduce the number of frames that must be held in memory, the frames are transmitted in a different order. It is useful to begin the analysis with an intra frame (I frame). The I1 frame is transmitted in the order in which it is to be displayed. The next predicted frame (P frame), P4, is then transmitted. Then any bidirectionally interpolated frames (B frames) to be displayed between the I1 and P4 frames are transmitted, represented by frames B2 and B3. This allows the transmitted B frames to reference a previous frame (forward prediction) or a future frame (backward prediction). After all the B frames to be displayed between I1 and P4 have been transmitted, the next P frame, P7, is transmitted. Next, all the B frames to be displayed between P4 and P7, corresponding to B5 and B6, are transmitted. Then the next I frame, I10, is transmitted. Finally, all the B frames to be displayed between P7 and I10, corresponding to B8 and B9, are transmitted. This ordering of transmitted frames requires only two frames to be kept in memory at any one time, and does not require the decoder to wait for transmission of the next P frame or I frame in order to display an intervening B frame.
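The reordering rule described above (each I or P frame is sent ahead of the B frames that precede it in display order) can be sketched as follows. Frame labels as strings and the function name `transmission_order` are illustrative assumptions.

```python
def transmission_order(display_order):
    """Reorder frames so that every I or P frame is sent before the
    B frames displayed immediately ahead of it."""
    out = []
    pending_b = []                       # B frames waiting for their next anchor
    for frame in display_order:
        if frame.startswith("B"):
            pending_b.append(frame)
        else:                            # I or P frame: send it, then the held B frames
            out.append(frame)
            out.extend(pending_b)
            pending_b = []
    out.extend(pending_b)                # trailing B frames with no later anchor
    return out
```

Applied to the display order of Table 601, the sketch reproduces the transmission order listed there.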
Further information on the structure and operation of the present invention, as well as its features, objects and advantages, will become more apparent to those of ordinary skill in the art from the more detailed description of the embodiments that follows. For clarity and convenience, the description is organized under the following headings:
1. Multi-standard configurations
2. JPEG still picture decoding
3. Motion picture decompression
4. RAM memory map
5. Bitstream characteristics
6. Reconfigurable processing stage
7. Multi-standard coding
8. Multi-standard processing circuit: second mode of operation
9. Start code detector
10. Tokens
11. DRAM interface
12. Prediction filter
13. Accessing registers
14. Microprocessor interface (MPI)
15. MPI read timing
16. MPI write timing
17. Keyhole address locations
18. Picture end
19. Flushing operation
20. Flush function
21. Stop after picture
22. Multi-standard search mode
23. Inverse modeller
24. Inverse quantizer
25. Huffman decoder and parser
26. Diverse discrete cosine transformer
27. Buffer manager

1. Multi-standard configurations
Because the various compression standards, i.e. JPEG, MPEG and H.261, are well known (see, for example, the aforementioned U.S. Patent No. 5,212,742), the details of those standards are not repeated here.
As previously mentioned, the present invention can decompress a variety of differently encoded data bitstreams. Under each of the different coding standards, some form of output formatter is required. It takes the data presented at the output of the Spatial Decoder operating alone, or at the serial output of a Spatial Decoder and Temporal Decoder operating in combination (as described in more detail later), and reformats this output for use and for display on a computer or other display system, including a video display system. The implementation of this formatting varies considerably with the coding standard and/or the type of display selected.
In a first embodiment of the present invention, as previously described with reference to Figures 10 to 12, an address generator is used to store a block of formatted data. This block is output either by the first decoder (the Spatial Decoder) alone or by the combination of the first decoder (the Spatial Decoder) and the second decoder (the Temporal Decoder). The address generator is also used to write the decoded information to memory and/or read it from memory in raster order. The Video Formatter described below provides a wide range of output signal combinations.
In the preferred multi-standard video decoder embodiment of the present invention, the Spatial Decoder and the Temporal Decoder are both required to implement an MPEG or an H.261 video decoding system. The DRAM interfaces on both devices are configurable, so that the amount of DRAM required can be reduced when working with small picture formats and low coded data rates. The reconfiguration of these DRAMs is described further in the DRAM interface section below. Typically, each of the Temporal Decoder and Spatial Decoder circuits requires a single 4-Mbyte DRAM.
The Spatial Decoder of the present invention performs all the processing required within a single picture. This reduces the redundancy within one picture.
The Temporal Decoder reduces the redundancy between the subject picture and a picture that arrives before it, as well as the redundancy between the subject picture and a picture that arrives after it. One aspect of the Temporal Decoder is to provide an address decode network that handles the complex addressing needed to read out the data associated with all of these pictures, using the smallest number of circuits and with high speed and improved accuracy.
As previously described with reference to Figure 11, the data arrives through the Start Code Detector, passes through a FIFO register that precedes the Huffman decoder and parser, and then passes through a second FIFO register, the inverse modeller, the inverse quantizer, the inverse zigzag and the inverse DCT. The two FIFOs need not be on the chip. In one embodiment, the data does not flow through an on-chip FIFO: it is applied to the DRAM interface, and the FIFO-IN storage register and the FIFO-OUT register are both off the chip. These registers, whose operation is entirely independent of the standard, are described in more detail later.
Most of the subsystems and stages shown in Figure 11 are in fact independent of the particular standard used. They include the DRAM interface 58, the buffer manager 59 that generates addresses for the DRAM interface, the inverse modeller 75, the inverse zig-zag 81 and the inverse DCT 83. Within the Huffman decoder and parser, the standard-independent units include the ALU 66 and the token formatter 71.
Referring now to Figure 12, the standard-independent units include the DRAM interface 100, the fork 91, the FIFO register 96, the adder 98 and the output selector 106. The standard-dependent units are the address generator 94, which differs between H.261 and MPEG, and the prediction filter 103, which is reconfigurable so that it can work with both H.261 and MPEG. JPEG data flows through the entire machine completely unaltered.
Figure 13 shows a high-level block diagram of the Video Formatter chip. The vast majority of this chip is independent of the standard. The only items affected by the standard are the way data is written into the DRAM in the case of H.261, which differs from MPEG or JPEG, and the fact that in H.261 it is not necessary to code every single picture. Some timing information, called the temporal reference, gives information about when the pictures are intended to be displayed. This is also handled by the address-generation type of logic in the Video Formatter.
The remainder of the circuitry in the Video Formatter, including all of the color space conversion, the up-sampling filters and all of the gamma correction RAMs, is entirely independent of the particular compression standard used. The Start Code Detector of the present invention depends on the compression standard in that it must recognize the different start code patterns in the bitstream for each standard. For example, H.261 has a 16-bit start code, MPEG has a 24-bit start code, and JPEG uses marker codes, which are quite different from the other start codes. Once the Start Code Detector has recognized those different start codes, its operation is essentially independent of the compression standard. For instance, during searching, apart from the circuitry that recognizes the different categories of markers, most of the operation is very similar across the three compression standards.
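As a rough software illustration of start code recognition, the sketch below scans a byte string for the 24-bit MPEG start code prefix (0x000001) and reports the byte that follows it. It is not the patent's circuit: it assumes byte-aligned input, it covers only the MPEG case, and the function name and sample bytes are chosen here for illustration.

```python
MPEG_START_CODE_PREFIX = b"\x00\x00\x01"  # 24-bit prefix used by MPEG


def find_mpeg_start_codes(data: bytes):
    """Return (offset, code_value) for each byte-aligned start code found."""
    found = []
    pos = data.find(MPEG_START_CODE_PREFIX)
    # Stop if the prefix sits at the very end with no value byte after it.
    while pos != -1 and pos + 3 < len(data):
        found.append((pos, data[pos + 3]))
        pos = data.find(MPEG_START_CODE_PREFIX, pos + 3)
    return found


# Made-up sample: two start codes separated by two payload bytes.
sample = b"\x00\x00\x01\xb3" + b"\x12\x34" + b"\x00\x00\x01\x00" + b"\xff"
print(find_mpeg_start_codes(sample))
```

Once the start codes have been located in this way, later processing can treat them uniformly, which mirrors the point made in the text that only the recognition step depends on the standard.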
The next unit is the state machine 68 (Figure 11) located within the Huffman decoder and parser. Here, the actual circuitry is almost identical for each of the three compression standards. In fact, the only element affected by the standard is the reset address of the machine. If just the parser is reset, it jumps to a different address for each standard. Four standards are in fact recognized: H.261, JPEG, MPEG and one other. For this fourth standard, the parser enters a piece of code that is used for testing. This illustrates that the circuitry is almost identical in every respect and that the difference lies in the microcode program for each standard. Thus, the program that runs when operating in H.261 and the program that runs for a different standard do not overlap. The same holds for JPEG, which is a third, completely independent program.
The next unit is the Huffman decoder 56, which works together with the index-to-data unit 64. These two units cooperate to perform the Huffman decoding. The algorithm used for Huffman decoding is the same regardless of the compression standard. The differences are in which tables are used and in whether or not the data entering the Huffman decoder is inverted. The Huffman decoder itself also contains a state machine that understands some aspects of the coding standards. These different operations are selected in response to an instruction from the parser state machine. The parser state machine runs a different program for each of the three compression standards and issues the correct command to the Huffman decoder at the times appropriate to the standard in operation.
The last unit on the chip that depends on the compression standard is the inverse quantizer 79, where the arithmetic performed is different for each standard. Here, a CODING_STANDARD token is decoded and the inverse quantizer 79 remembers which standard it is operating in. Any DATA tokens that arrive after that event, but before another CODING_STANDARD token comes along, are processed in the way indicated by the CODING_STANDARD that the inverse quantizer has remembered. The detailed description includes a table showing the different parameters in the different standards and which circuitry or arithmetic responds to those parameters.
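The "remember the standard, then apply it to later DATA tokens" behaviour can be sketched as follows. This is a minimal model under stated assumptions: the `Token` and `StandardDependentStage` classes are hypothetical names introduced here, and the standard-specific arithmetic is replaced by a placeholder that merely tags the data with the remembered standard.

```python
from dataclasses import dataclass


@dataclass
class Token:
    name: str            # e.g. "CODING_STANDARD" or "DATA"
    payload: object = None


class StandardDependentStage:
    """A stage that latches the coding standard and applies it to later DATA tokens."""

    def __init__(self):
        self.standard = None  # nothing remembered until a CODING_STANDARD token arrives

    def process(self, token):
        if token.name == "CODING_STANDARD":
            self.standard = token.payload  # remember it, then pass the token on unchanged
            return token
        if token.name == "DATA":
            # Placeholder for the standard-specific arithmetic: only tag the data.
            return Token("DATA", (self.standard, token.payload))
        return token  # any other token passes through untouched


stage = StandardDependentStage()
stream = [Token("CODING_STANDARD", "MPEG"), Token("DATA", 7),
          Token("CODING_STANDARD", "JPEG"), Token("DATA", 9)]
print([stage.process(t).payload for t in stream])
```

The stage needs no outside control signal: the token stream itself carries the change of standard to the point where it matters.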
For H.261, address generation is different in each of the subsystems shown in Figure 12 and Figure 13. In Figure 11, the address generation for the two FIFOs before and after the Huffman decoder does not change with the coding standard. Even in H.261, the address generation performed on that chip is unaltered. Essentially, the difference between the standards is that in MPEG and JPEG the macroblocks are organized in linear rows running horizontally from one side of the picture to the other. As best seen in Figure 14a, a first slice A covers one full row, a slice B covers less than a row, and a slice C covers multiple rows. In MPEG the picture is divided into slices 132, and a slice may be one horizontal row (A), part of a horizontal row (B), or may extend from one row into the next (C). Each of these slices 132 is made up of a row of macroblocks.
In H.261 the organization is rather different, because the picture is divided into groups of blocks (GOBs). A group of blocks is three rows of macroblocks high by eleven macroblocks wide. A CIF picture contains twelve such groups of blocks. However, they are not simply stacked one above the other. Instead, two groups of blocks sit side by side and the arrangement is six high, i.e. there are six GOBs vertically and two GOBs horizontally.
In all the other standards, the macroblocks are addressed in the order described above. More specifically, addressing proceeds along a row, and when the end of the row is reached the next row is started. In H.261, the order of the macroblocks within a group of blocks is the same as described, but on moving to the next group of blocks the order is almost a zig-zag.
The present invention provides circuitry to deal with this latter effect. This is the way in which address generation in the Spatial Decoder and the Video Formatter varies for H.261. The variation is applied whenever information is written into the DRAM. The information is written with knowledge of the address generation sequence described above, so that its physical location in the RAM is exactly the same as if it had been an MPEG picture of the same size. Hence, all the address generation circuitry that reads from the DRAM, for instance when forming predictions, does not need to know that the standard is H.261, because the physical placement of the information in memory is the same as it would have been for an MPEG sequence. Thus, in all cases, only the writing of data is affected.
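The write-side address translation can be illustrated with a small calculation for a CIF picture (22 by 18 macroblocks, made of twelve groups of blocks arranged two across and six down). The sketch below is an assumption-laden illustration, not the patent's address generator: the function name is hypothetical, indices are 0-based, and the groups of blocks are assumed to be numbered left to right and then top to bottom.

```python
MB_PER_GOB_ROW = 11   # a group of blocks is 11 macroblocks wide
GOB_ROWS = 3          # ... and 3 macroblock rows high
GOBS_ACROSS = 2       # CIF: 2 groups of blocks side by side
PICTURE_MB_WIDTH = MB_PER_GOB_ROW * GOBS_ACROSS  # 22 macroblocks per picture row


def raster_mb_address(gob_index, mb_index):
    """Map (GOB, macroblock-in-GOB), both 0-based, to a raster macroblock address.

    Assumes GOBs are numbered left to right, then top to bottom.
    """
    gob_col = gob_index % GOBS_ACROSS
    gob_row = gob_index // GOBS_ACROSS
    mb_x = gob_col * MB_PER_GOB_ROW + mb_index % MB_PER_GOB_ROW
    mb_y = gob_row * GOB_ROWS + mb_index // MB_PER_GOB_ROW
    return mb_y * PICTURE_MB_WIDTH + mb_x


# The first macroblock of the second GOB lands halfway along the top row.
print(raster_mb_address(1, 0))
```

Because every H.261 macroblock is written to the raster position it would occupy in a same-sized MPEG picture, readers of the memory can use plain row-by-row addressing.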
In the Temporal Decoder there is an abstraction for H.261 in which the circuitry pretends that something different from what is actually occurring is taking place. That is, each group of blocks is conceptually stretched out, so that instead of being a rectangle of 11 × 3 macroblocks it becomes a group 33 macroblocks long (see Figure 14c) and one macroblock high. Having done that, exactly the same counting mechanism that the Temporal Decoder uses to count through groups of blocks is also used for MPEG.
In the way the circuitry is designed, an H.261 group of blocks corresponds to an MPEG slice. When H.261 data has been processed by the Start Code Detector, each group of blocks is preceded by a slice_start_code, and the next group of blocks is preceded by the next slice_start_code. The counting that goes on inside the Temporal Decoder to step through this structure pretends that the group is one macroblock high and 33 macroblocks long. This is sufficient, although the circuitry also counts every eleventh macroblock. When it reaches the 11th or the 22nd macroblock, it resets some counters. This is done with simple circuitry: another counter counts each macroblock and resets to zero when it reaches 11. The microcode interrogates that counter and does the work. All the circuitry in the Temporal Decoder of the present invention is essentially independent of the compression standard as far as the physical placement of macroblocks is concerned.
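The counting scheme just described can be modelled with two counters. This is only a sketch: the class and attribute names are hypothetical, and the "reset some counters" action is represented by a returned flag rather than by any real hardware behaviour.

```python
class GobCounter:
    """Counts macroblocks in a group of blocks treated as a 33-long, 1-high strip."""

    def __init__(self):
        self.mb_in_gob = 0  # 0..32, position within the stretched-out group
        self.mb_in_row = 0  # 0..10, resets every 11 macroblocks

    def advance(self):
        """Step to the next macroblock; return True when a run of 11 has just completed."""
        self.mb_in_gob = (self.mb_in_gob + 1) % 33
        self.mb_in_row += 1
        if self.mb_in_row == 11:
            self.mb_in_row = 0
            return True
        return False


counter = GobCounter()
boundaries = [step for step in range(1, 34) if counter.advance()]
print(boundaries)
```

The flag plays the role of the signal that the microcode would interrogate at the 11th and 22nd macroblocks (and again at the end of the group).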
In terms of multi-standard adaptability, there are a number of different tables, and the circuitry selects the appropriate table for the appropriate standard at the appropriate time. Each standard has multiple tables, and at any given time the circuitry selects from that set. Within any one standard, the circuitry selects one table at one time and another table at another time. In a different standard, the circuitry selects from a different set of tables. There is some intersection between those sets, as indicated in the earlier discussion of Figure 15. For example, one of the tables used in MPEG is also used in JPEG, so the sets are not completely isolated. Figure 15 shows an H.261 set, an MPEG set and a JPEG set. Note that the overlap between the H.261 set and the MPEG set is much greater; the two have many tables in common. There is a small overlap between MPEG and JPEG, and no overlap at all between H.261 and JPEG, so those two standards have totally different sets of tables.
As previously indicated, most of the system units are independent of the compression standard. A unit that is standard-independent need not remember which CODING_STANDARD is being processed. All the units that are standard-dependent remember the compression standard as the CODING_STANDARD token flows through them. When information encoded or decoded in a first coding standard is distributed through the machine and the machine changes standards, a prior machine under microprocessor control would normally be selected to operate according to the H.261 compression standard. The MPU in such prior machines generates signals at many different places inside the machine to indicate that the compression standard is changing. The MPU makes the changes at different times and, in addition, it may flush the whole pipeline.
According to the present invention, the Start Code Detector, which is positioned as the first unit in the pipeline, issues the change of CODING_STANDARD as a token, so a change of compression standard is readily handled. The token announces that a certain coding standard is about to begin; this control information flows down the machine and configures all the other registers at the appropriate time. The MPU need not program each register.
The prediction token signals how predictions are to be formed, using bits from the bitstream. Depending on which compression standard is in operation, the circuitry translates the information found in the standard, i.e. in the bitstream, into a prediction mode token. This processing is performed by the Huffman decoder and parser state machine, where it is easy to manipulate bits according to given conditions. The Start Code Detector generates this prediction mode token. The token then flows down the machine to the Temporal Decoder circuitry, which is responsible for forming predictions. The Spatial Decoder circuitry can interpret the token without knowing which standard is in operation, because the bits in the token are the same in all three standards. The Spatial Decoder simply does what the token tells it to do. Having these tokens and using them appropriately simplifies the design of the other units in the machine. Although the program may become somewhat more complicated, there is a benefit in that some hard-wired logic that would otherwise be difficult to design for multiple standards can be used here.
2. JPEG still picture decoding
As previously indicated, the present invention relates to signal decompression and, more particularly, to the decompression of an encoded video signal regardless of the compression standard used.
One aspect of the present invention is to provide, in a pipeline processing system, a first decoder circuit (the Spatial Decoder) to decode a first encoded signal (a JPEG-encoded video signal), together with a second decoder circuit (the Temporal Decoder) to decode a second encoded signal (an MPEG- or H.261-encoded video signal). The Temporal Decoder is not needed for JPEG decoding.
In this regard, the present invention facilitates the decompression of many differently encoded signals by means of a single pipeline decoder and decompression system. The decoding and decompression pipeline processor is organized in a unique configuration that allows multi-standard encoded video signals to be handled using techniques that are all compatible with the single pipeline decoder and processing system. The Spatial Decoder is combined with the Temporal Decoder, and the Video Formatter is used to drive a video display.
Another aspect of the present invention is the use of the Spatial Decoder in combination with the Video Formatter for still pictures only. The Spatial Decoder, which is independent of the compression standard, performs all the data processing within the boundaries of a single picture. Such a decoder handles the spatial decompression of the data internal to a picture; this data passes through the pipeline and is distributed within the associated RAMs. Standard-independent address generation circuits handle the storage and retrieval of information in the memories. Still picture data is decoded at the output of the Spatial Decoder, and this output is used as the input to the multi-standard, reconfigurable Video Formatter, which in turn provides an output to the display terminal. In a first sequence of similar pictures, every decompressed picture has the same length in bits by the time it reaches the output of the Spatial Decoder. A second sequence of pictures may have a totally different picture size and hence a different length from the first. Again, all the pictures of this second sequence of similar pictures have the same length in bits by the time they reach the output of the Spatial Decoder.
Another aspect of the invention is to organize the incoming standard-dependent bitstream internally into a sequence of control tokens and DATA tokens. This is combined with a plurality of sequentially positioned, reconfigurable processing stages that are selected and organized to act together as a standard-independent, reconfigurable pipeline processor.
With regard to JPEG decoding, a single Spatial Decoder with no off-chip DRAM can rapidly decode baseline JPEG pictures. The Spatial Decoder supports all features of the baseline JPEG coding standard. However, the picture size that can be decoded may be limited by the size of the output buffer provided. The Spatial Decoder circuit also includes a RAM circuit, together with machine-dependent but standard-independent address generation circuits that handle the storage of information in the memory.
As previously indicated, the Temporal Decoder is not required for decoding JPEG-encoded video. Accordingly, when the Temporal Decoder is configured for JPEG operation, the signals carried by DATA tokens pass straight through the Temporal Decoder without further processing.
Another aspect of the present invention is to provide, in the Spatial Decoder, a pair of memory circuits, such as buffer memory circuits, that operate in combination with the Huffman decoder/video demultiplexer circuit (HD & VDM). A first buffer memory is positioned before the HD & VDM, and a second buffer memory is positioned after it. The HD & VDM decodes the binary ones and zeros of the standard-encoded bitstream and turns that stream into numbers for use downstream. The two-buffer system serves to implement a multi-standard decompression system. These two buffers, in combination with the identified implementation of the Huffman decoder, are described in more detail below.
A further aspect of this multi-standard decompression circuit is the combination of the Start Code Detector and the Huffman decoder. The Start Code Detector is positioned upstream of the first forward buffer. One advantage of this combination is greater flexibility in handling the input bitstream, particularly the padding that has to be added to the bitstream. The arrangement of these identified components, the Start Code Detector, the memory buffers and the Huffman decoder, improves the handling of certain sequences in the input bitstream.
In addition, off-chip DRAMs are used for decoding JPEG-encoded video pictures in real time. The size and speed of the buffers used with the DRAMs depend on the encoded video data rate.
The coding standards identify all the standard-dependent types of information that need to be stored in the DRAMs associated with the Spatial Decoder. The Spatial Decoder does this with standard-independent circuitry.
3. Motion picture decompression
In the present invention, if motion pictures are to be decompressed through the decoding steps, a further decoder, the Temporal Decoder, is necessary. The Temporal Decoder combines the data decoded in the Spatial Decoder with previously decoded pictures that are intended for display either before or after the picture currently being decoded. The Temporal Decoder receives, in the coded picture data stream, information that identifies this temporally displaced information. The Temporal Decoder is organized to address information that is displaced in time and space, to retrieve it, and to combine it in such a way that the information located in one picture is decoded together with the picture currently being decoded, ending with a resultant picture. This picture is complete and suitable for transmission to the Video Formatter to drive the display screen. Alternatively, the resultant picture can be stored for later use in the temporal decoding of subsequent pictures.
In general, the Temporal Decoder performs the processing between pictures, these being earlier and/or later in time than the picture currently being decoded. The Temporal Decoder reintroduces information that is not encoded within the coded representation of the picture because it is redundant and already available at the decoder. More specifically, it is likely that any given picture contains information similar to that of the pictures temporally surrounding it, both before and after. This similarity can be made greater if motion compensation is applied. The Temporal Decoder and decompression circuit also reduce the redundancy between related pictures.
In another aspect of this invention, the Temporal Decoder is used to handle the standard-dependent output information from the Spatial Decoder. The standard-dependent information for a single picture is distributed among several areas of DRAM, in the sense that the decompressed output information processed by the Spatial Decoder is stored in other DRAM registers by other RAMs. These RAMs have further machine-dependent but standard-independent address generation circuits. The address generation circuits are used to combine one picture of spatially decoded information with a packet of spatially decoded picture information that is temporally displaced relative to the first picture.
In multi-standard circuits capable of decoding MPEG-encoded signals, larger logical DRAM buffers may be required to support the larger picture formats that are possible with MPEG.
The picture information moves through the serial pipeline in blocks of 8 pels by 8 pels. In one form of the invention, the address decoding circuitry handles (stores and retrieves) these pel blocks along the block boundaries. The address decoding circuitry also handles the storage and retrieval of such 8 × 8 pel blocks across those boundaries. This versatility is described more fully below.
A second Temporal Decoder may also be provided, which passes the output of the first decoder circuit (the Spatial Decoder) directly to the Video Formatter so that it is handled without signal-processing delay.
The Temporal Decoder also reorders the blocks of picture data for display by a display circuit. The address decoding circuitry described below handles this reordering.
As previously mentioned, one important feature of the Temporal Decoder is that it adds together picture information selected from pictures that arrived earlier or later than the picture being processed. When a picture is described in this context, it may mean any one of the following:
1. The coded data representation of the picture;
2. The result, i.e. the final decoded picture formed by adding together the results of the processing steps performed by the decoder;
3. Previously decoded pictures read from the DRAM;
4. The result of the spatial decoding, i.e. the extent of data between a PICTURE_START token and the subsequent PICTURE_END token.
After the picture data has been processed by the Temporal Decoder, it is either displayed or written back into a picture memory location. This information is then kept for further reference when another, different coded picture is processed.
Reordering the MPEG-encoded pictures for visual display also offers the possibility of obtaining a desired coded picture by varying the reordering feature of the Temporal Decoder.
4. RAM memory map
The Spatial Decoder, the Temporal Decoder and the Video Formatter all use external DRAM. Preferably, the same type of DRAM is used for all three devices. Although all three devices use DRAM, and all three use a DRAM interface in conjunction with an address generator, what each one implements in the DRAM is different. In other words, each chip, e.g. the Spatial Decoder and the Temporal Decoder, has a different DRAM interface and different address generation circuitry, even though they use physically similar external DRAM.
In brief, the Spatial Decoder implements two FIFOs in the common DRAM. Referring again to Figure 11, one FIFO 54 is positioned before the Huffman decoder 56 and parser, and the other is positioned after the Huffman decoder and parser. The FIFOs are implemented in a relatively straightforward manner. For each FIFO, a particular portion of the DRAM is set aside as the physical memory in which the FIFO is implemented.
The address generator associated with the Spatial Decoder DRAM interface 58 keeps track of the FIFO addresses using two pointers. One pointer points to the first word stored in the FIFO and the other points to the last word stored in the FIFO, so that read and write operations can be carried out on the appropriate word. When the end of the physical memory is reached in the course of a read or write operation, the address generator "wraps around" to the start of the physical memory.
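A software model of such a FIFO may help to fix ideas. The sketch below is not the patent's address generator: the class name, the word-addressed list standing in for DRAM, and the explicit word count are all assumptions made here. It also differs slightly from the text in that the write pointer marks the next free word rather than the last stored word.

```python
class DramFifo:
    """FIFO held in a fixed region [base, base + size) of a word-addressed memory."""

    def __init__(self, memory, base, size):
        self.memory, self.base, self.size = memory, base, size
        self.read_ptr = base   # address of the first (oldest) word
        self.write_ptr = base  # address where the next word will be written
        self.count = 0         # number of words currently stored

    def _advance(self, ptr):
        ptr += 1
        # Wrap around to the start of the region at the end of physical memory.
        return self.base if ptr == self.base + self.size else ptr

    def push(self, word):
        if self.count == self.size:
            raise OverflowError("FIFO full")
        self.memory[self.write_ptr] = word
        self.write_ptr = self._advance(self.write_ptr)
        self.count += 1

    def pop(self):
        if self.count == 0:
            raise IndexError("FIFO empty")
        word = self.memory[self.read_ptr]
        self.read_ptr = self._advance(self.read_ptr)
        self.count -= 1
        return word


dram = [None] * 16
fifo = DramFifo(dram, base=4, size=3)
for w in (1, 2, 3):
    fifo.push(w)
print(fifo.pop())  # oldest word comes out first
fifo.push(4)       # this write wraps around to address 4
```

Two such objects sharing one memory list, with different `base` values, would correspond to the two FIFOs placed before and after the Huffman decoder and parser.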
In brief, the Temporal Decoder of the present invention must be able to store two full pictures or frames, whatever coding standard (MPEG or H.261) is specified. For simplicity, the physical memory in the DRAM that stores the two frames is split into two halves, with each half dedicated (using appropriate pointers) to a particular one of the two pictures.
MPEG uses three different picture types: intra (I), predicted (P) and bidirectionally interpolated (B). As previously mentioned, B pictures are based on predictions from two pictures, one from the future and one from the past. I pictures require no further decoding by the Temporal Decoder, but they must be stored in one of the two picture buffers for later use in decoding P and B pictures. Decoding a P picture requires forming predictions from a previously decoded P or I picture. The decoded P picture is stored in a picture buffer for use in decoding P and B pictures. B pictures can require predictions from both picture buffers. However, B pictures are not stored in the external DRAM.
Note that I and P pictures are not output from the Temporal Decoder as they are decoded. Instead, I and P pictures are written into one of the picture buffers and are read out only when a subsequent I or P picture arrives for decoding. In other words, the Temporal Decoder relies on subsequent P or I pictures to flush the previous pictures out of the two picture buffers, as discussed further in the section on flushing below. In brief, the Spatial Decoder can provide a fake I or P picture at the end of a video sequence to flush out the last P or I picture. This fake picture is in turn flushed when a subsequent video sequence starts.
The peak memory bandwidth load occurs when decoding B pictures. The worst case is a B frame formed from predictions from both picture buffers, with all predictions made to half-pixel accuracy.
As previously described, the Temporal Decoder can be configured to provide MPEG picture reordering. With this reordering, the output of P and I pictures is delayed until the Temporal Decoder starts to decode the next P or I picture in the data stream.
As the P or I pictures are reordered, certain tokens are stored temporarily on chip while the picture is written into the picture buffers. When the picture is read out for display, these stored tokens are retrieved. At the output of the Temporal Decoder, the DATA tokens of the newly decoded P or I picture are replaced with the DATA tokens of the older P or I picture.
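The reordering rule (B pictures pass straight through, while each I or P picture is held until the next I or P picture arrives) can be sketched as follows. The function name, the string labels, and the `flush` flag that stands in for the fake picture at the end of a sequence are all assumptions of this illustration.

```python
def display_order(decode_order, flush=True):
    """Reorder pictures from decode (transmission) order to display order."""
    held = None  # the I or P picture waiting in a picture buffer
    shown = []
    for picture in decode_order:
        if picture[0] == "B":
            shown.append(picture)   # B pictures are output as they are decoded
        else:
            if held is not None:
                shown.append(held)  # a new I/P picture releases the older one
            held = picture
    if flush and held is not None:
        shown.append(held)          # stands in for the fake picture that flushes the last one
    return shown


decoded = ["I1", "P4", "B2", "B3", "P7", "B5", "B6", "I10", "B8", "B9"]
print(display_order(decoded))
```

Applied to the transmission order discussed earlier for Table 601, this restores the numerical display order I1, B2, B3, P4, and so on up to I10.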
In contrast, H.261 makes predictions only from the picture just decoded. As each picture is decoded, it is written into one of the two picture buffers so that it can be used in decoding the next picture. The only DRAM operations required are writing 8 × 8 blocks and forming predictions with integer-accuracy motion vectors.
In brief, the Video Formatter stores three frames or pictures. Three pictures need to be stored in order to accommodate features such as repeating or skipping pictures.
5. Bitstream characteristics
Turning now to the Spatial Decoder of the present invention in particular, it is helpful to review the bitstream characteristics of the encoded data stream, because the circuitry of the Spatial Decoder and the Temporal Decoder must deal with these characteristics. For example, under one or more of the compression standards, the compression ratio is achieved by varying the number of bits used to code each picture. The number of bits can vary over a wide range. Specifically, this means that if the length of the bitstream used to encode one reference picture is taken as one unit, another picture might be several units long, while a third picture might be only a fraction of a unit.
None of the existing standards (MPEG 1.2, JPEG, H.261) defines a way of ending a picture; the implication is that when the next picture starts, the current one has finished. In addition, the standards (H.261 in particular) allow the encoder to produce incomplete pictures.
According to the present invention, a way of indicating the end of a picture is provided by one of its tokens: PICTURE_END. The coded picture data leaving the start code detector still consists of pictures that begin with a PICTURE_START token and end with a PICTURE_END token, but their lengths still vary widely. Other information may also be sent between the first and second pictures, but it is known that the first picture has ended.
The data stream at the output of the spatial decoder consists of pictures that still carry picture-start and picture-end markers, but for a given sequence they all have the same length (number of bits). The time between the start of a picture and its end is variable.
The video formatter takes these pictures of varying duration and displays them on a screen at a fixed picture rate, which depends on the type of display being driven. Different display rates are used around the world, for example under television standards such as PAL and NTSC. This variation is accommodated in a unique way: pictures are selectively dropped or repeated. Ordinary "frame rate converters", such as 2-3 pulldown, operate with a fixed input picture rate, whereas the video formatter can handle a variable input picture rate.
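One way to picture the drop-or-repeat behaviour is the sketch below. It is not the video formatter's actual algorithm, which this section does not spell out; it simply assumes that at each display tick the most recently completed picture is shown, so a picture is repeated when nothing new has arrived and skipped when more than one has arrived since the last tick. The function and parameter names are hypothetical.

```python
def schedule_display(arrival_times, display_period, n_ticks):
    """Return, for each display tick, the index of the picture shown.

    arrival_times: ascending times at which decoded pictures become ready.
    display_period: fixed time between display ticks (same units).
    None is returned for ticks that occur before any picture is ready.
    """
    shown = []
    latest = None   # index of the newest picture that is ready
    i = 0
    for tick in range(n_ticks):
        now = tick * display_period
        # Consume every picture that became ready by this tick; only the
        # newest survives, so any earlier unshown ones are dropped.
        while i < len(arrival_times) and arrival_times[i] <= now:
            latest = i
            i += 1
        shown.append(latest)  # unchanged 'latest' means a repeat
    return shown
```

With arrival times 0, 10, 12 and 50 and a display period of 20, four ticks show pictures 0, 2, 2 and 3: picture 1 is skipped and picture 2 is repeated.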
6, reconfigurable processing stage
Referring again to Figure 10, the reconfigurable processing stage (RPS) comprises a token decode circuit 33, which receives tokens from the two-wire interface 37 and the input latch 34. The output of the token decode circuit 33 is applied to the processing unit 36 over the two-wire interface 37, and is also applied to an action identification circuit. When processing is complete, the processing unit 36 passes the resulting signals through the output latch 41 to the output two-wire interface bus 42.
The action identification decode circuit 39 receives its input over two-wire interface bus 40 from the token decode circuit 33 and/or over two-wire interface bus 46 from memory circuits 43 and 44. Tokens from the token decode circuit 33 are applied simultaneously to the action identification circuit 39 and the processing unit 36. The functions of the RPS and of action identification are further illustrated with diagrams later in this specification.
The functional block diagram of Figure 10 illustrates the operation of those stages in Figures 11, 12 and 13 that are not standard-independent circuits. Data flows in turn through the token decode circuit 33, the processing unit 36 and the output latch 41 to the two-wire interface circuit 42. If a control token is recognized by the RPS, it is decoded in the token decode circuit 33 and the appropriate action is then taken. If it is not recognized, it is passed unchanged through the output latch 41 to the output two-wire interface 42. The present invention operates as a pipeline processor with a two-wire interface that controls the movement of control tokens through the pipeline. These features are described in more detail in the earlier-filed European Patent Office (EPO) patent application No. 92306038.8.
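The token flow through a single stage can be modelled in software as follows. This is a minimal sketch, not the circuit of Figure 10: tokens are represented as (name, payload) pairs, the stage state is a plain dictionary, and the function names and the toy scaling rule are assumptions made only for illustration.

```python
def run_stage(tokens, recognised, state, process_data):
    """Model one reconfigurable stage acting on a list of tokens.

    recognised: names of the control tokens this stage decodes.
    state: dict playing the role of the stage's configuration registers.
    process_data: function (payload, state) -> new payload for DATA tokens.
    """
    out = []
    for name, payload in tokens:
        if name in recognised:
            # Recognised control token: reconfigure, then pass it on.
            state[name] = payload
            out.append((name, payload))
        elif name == "DATA":
            # Data token: processed according to the current configuration.
            out.append((name, process_data(payload, state)))
        else:
            # Unrecognised token: passed down the pipeline unchanged.
            out.append((name, payload))
    return out


def toy_process(payload, state):
    # Toy rule, purely illustrative: scale by 2 under "MPEG", else by 1.
    scale = 2 if state.get("CODING_STANDARD") == "MPEG" else 1
    return [scale * x for x in payload]
```

Because unrecognised tokens are forwarded untouched, stages of this kind can be chained by feeding the output list of one call into the next.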
In the present invention, the token decode circuit 33 is used to identify whether the token currently entering through the two-wire interface 42 is a data token or a control token. If the token under examination is recognized by the token decode circuit, it is passed to the action identification circuit 39 together with an appropriate index signal or flag signal indicating the action to be taken. At the same time, the token decode circuit 33 provides an appropriate flag or index signal to the processing unit 36 to alert it that a token is being handled by the action identification circuit 39. Control tokens may also be processed.
The various token types that can be used with the present invention are described in more detail below. For the purposes of this part of the description, it is sufficient to note that the address carried by a control token is decoded in the decoder 33 and is used to access registers within the action identification circuit 39. When the token under examination is a recognized control token, the action identification circuit 39 uses its reconfiguration state circuit to distribute control signals throughout the state machine. As noted above, this activates the state machine of the action identification decoder 39, which then reconfigures itself. For example, it may change coding standards. The action identification circuit 39 thus decodes the action required to handle the particular standard now passing through the state machine; see Figure 10.
Similarly, the processing unit 36, under the control of the action identification circuit 39, is now ready to process the information contained in the data fields of data tokens when appropriate. In many cases a control token arrives first and reconfigures the action identification circuit 39; a data token follows immediately behind it and is processed in the processing unit 36. The control token leaves through the output latch circuit 41 to the output two-wire interface 42 just ahead of the data token processed by the processing unit 36.
In the present invention, the action identification circuit 39 is a state machine that holds history state. Registers 43 and 44 hold information that has been decoded by the token decoder and stored in them. Such registers may be on chip or off chip as required. These state registers contain action information relating to the action currently being identified in the action identification circuit 39. Connection 40 runs directly from the token decoder 33 to the action identification block 39. Its purpose is to show that the action can also be affected by the token currently being processed by the token decode circuit 33.
The token decoding and data processing performed in accordance with the present invention have now been described in outline. Data is processed in the manner configured by the action identification circuit 39. The action is affected by a number of conditions: in general, by information derived from previously decoded tokens; more specifically, by information stored in registers 43 and 44 from previously decoded tokens; by the token currently under examination; and by the state and history information held in the action identification unit 39 itself. This brings out the distinction between control tokens and data tokens.
In any RPS (reconfigurable processing stage), some tokens are regarded by that RPS as control tokens, because they may affect the operation of the RPS at some later time. Other tokens are regarded by the RPS as data tokens. Such data tokens contain information that is processed by the RPS in a way determined by the design of the particular circuitry, by previously decoded tokens and by the state of the action identification circuit 39. Although a particular RPS identifies certain tokens as control tokens for that RPS and other tokens as data tokens, this is the view of that particular RPS only. Another RPS may take a different view of the same token: a token that one RPS regards as a data token may be regarded by another RPS as a control token. For example, as far as the Huffman decoder and its state machine are concerned, quantization table information is data. It is formatted into a series of 8-bit words, and these words are formed into a token called the quantization table token (QUANT_TABLE), which is passed down the processing pipeline. As far as that machine is concerned, all of this is data; converting one kind of data into another is simply the processing function of that part of the machine. When the information reaches the inverse quantizer, however, the inverse quantizer stores the information from the token in a set of registers. Because there are 64 8-bit numbers, there are many registers, and in general there may be many registers. This information is regarded as control information. It then affects the processing of subsequent data tokens, because it affects the multiplier applied to each data word. This is an example of one stage regarding a token as data while another stage regards it as control.
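The two views of the same token can be illustrated with a toy inverse quantizer. The sketch below is an assumption-laden simplification: tokens are (name, payload) pairs, a data token is taken to carry 64 coefficients, and "inverse quantization" is reduced to an element-by-element multiply, which is far simpler than what any of the standards actually specifies. Whether the stage forwards QUANT_TABLE after storing it is also an assumption.

```python
class ToyInverseQuantizer:
    """Treats QUANT_TABLE as control information; an upstream stage that
    merely formats and forwards the same token would see it as data."""

    def __init__(self):
        self.table = [1] * 64   # register bank holding 64 multipliers

    def accept(self, token):
        name, payload = token
        if name == "QUANT_TABLE":
            # Control view: load the registers; later data is affected.
            self.table = list(payload)
            return token        # forwarded unchanged (assumption)
        if name == "DATA":
            # Each data word is multiplied by the stored table entry.
            scaled = [c * q for c, q in zip(payload, self.table)]
            return (name, scaled)
        return token            # anything else passes through untouched
```

A stage upstream of this one could forward the QUANT_TABLE token with no change to its own state, which is exactly the "data" view described in the text.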
According to the present invention, token data is almost universally treated as data throughout the machine. One important aspect is that, in general, every stage of circuitry that has a token decoder looks for certain tokens, and any token it does not recognize must be passed through that stage unchanged and on down the pipeline, so that each subsequent stage downstream of the current stage has the opportunity to see these tokens and respond to them. This is an important feature: the use of tokens allows communication between blocks that are not adjacent to one another.
Another important feature of the present invention is that each stage of circuitry has within it the processing capability to perform the operations needed under each of the standards, and is controlled by tokens that determine which operations are to be performed at a given time. To provide this capability there is a processing element, which differs from stage to stage. In the state machine ROM of the Parser there are three separate and entirely different programs, one for each of the standards handled. Which program is executed depends on the CODING_STANDARD token. In other words, each of the three programs contains the ability to decode and handle the CODING_STANDARD token. When any of the three programs sees which coding standard is to be decoded next, it jumps to the start address in the microcode ROM of the program for that standard. This is how the stages deal with multiple standards.
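The program selection in the Parser amounts to a jump table keyed by the coding standard. The sketch below assumes hypothetical ROM start addresses and a hypothetical class name; the real addresses and microcode are not given in this section.

```python
# Hypothetical start addresses of the three programs in the microcode ROM.
ENTRY_POINT = {"H261": 0x000, "JPEG": 0x100, "MPEG": 0x200}


class ParserSequencer:
    """Tracks which microcode program the Parser should be running."""

    def __init__(self):
        self.standard = None
        self.pc = None          # microcode program counter

    def on_token(self, name, payload=None):
        if name == "CODING_STANDARD":
            # Any of the programs may receive this token; all react the
            # same way, by jumping to the start of the selected program.
            self.standard = payload
            self.pc = ENTRY_POINT[payload]
        return (name, payload)  # the token continues down the pipeline
```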
Two things are affected by the different standards. The first is the pattern of bits in the bit stream that is recognized as a start code or marker code, which is used to reconfigure the shift register to detect the length of the start or marker code. The second is a piece of information in the microcode that denotes what that start or marker code means. Recall that the coding of bits differs among the three standards. The microcode therefore looks up, in a table specific to that compression standard, something that is independent of the standard, namely a token that represents the incoming code. That token is generally independent of the standard because, in most cases, each of the standards provides some code that produces it.
The inverse quantizer 79 has arithmetic capability. It performs multiplication and addition, and is able to handle all three compression standards, which are configured by parameters. For example, a flag bit in the control ROM tells the inverse quantizer whether or not to add a constant K. Another flag tells it whether to add another constant. When the CODING_STANDARD token flows through the inverse quantizer, the inverse quantizer stores it in a register. When data tokens subsequently pass through, the inverse quantizer remembers which standard is in force and looks up the parameters that it must apply to the processing element in order to perform the proper operation. For example, the inverse quantizer looks up whether K is set to 0 or to 1 for the particular compression standard and applies this to the processing circuit.
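The parameter lookup can be sketched as a small table indexed by the remembered standard. The values of K shown for each standard are invented for illustration only, the arithmetic is reduced to one multiply and one add, and each data word is modelled as a (coefficient, scale) pair; none of this is taken from the source.

```python
# Hypothetical ROM contents: the value of K used under each standard.
K_FOR_STANDARD = {"MPEG": 1, "JPEG": 0, "H261": 1}


class ToyQuantizerControl:
    def __init__(self):
        self.standard = None    # register that remembers CODING_STANDARD

    def on_token(self, name, payload):
        if name == "CODING_STANDARD":
            self.standard = payload          # remember it as it flows by
            return (name, payload)
        if name == "DATA":
            k = K_FOR_STANDARD[self.standard]  # parameter lookup
            # Toy arithmetic: multiply, then add the selected constant.
            return (name, [c * q + k for c, q in payload])
        return (name, payload)
```

The point of the sketch is only that the data path is fixed while the parameters applied to it are selected by a previously seen control token.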
Similarly, the Huffman decoder contains a number of tables, some for JPEG, some for MPEG and some for H.261. Most of these tables in fact serve more than one of the compression standards. Which table is used depends on the syntax of the standard. The Huffman decoder works on commands received from the state machine, which tell it which table to use. Accordingly, no piece of state is passed directly into the Huffman decoder itself and stored there to tell it which kind of coding is being processed. Rather, this information is held by the Parser state machine and the Huffman decoder in combination.
As regards the spatial decoder of the present invention, address generation is modified in a manner similar to that shown in Figure 10, in that a number of pieces of information, such as the coding standard, are decoded from tokens. The coding standard and additional information are recorded in registers, and this affects the progress of the address generator state machine as it steps through and counts the macroblocks in the system one after another. The last stage is the prediction filter 179 (Figure 17), which operates in one of two modes, H.261 or MPEG, and these are easily identified.
7, multi-standard coding
The system of the present invention also provides a combination of standard-independent index generation circuits, which are placed strategically throughout the system in combination with the token decode circuits. For example, the system may be used specifically to decode the H.261 video standard, the MPEG video standard or the JPEG video standard. These three compression coding standards specify similar processes to be applied to the arriving data, but the structure of the data streams differs. As noted above, one function of the start code detector is to detect MPEG start codes, H.261 start codes and JPEG marker codes and to convert them all into one form, namely a control token that includes a token stream embodying the current coding standard. Control tokens pass through the pipeline processor and are used (that is, decoded) in the state machines to which they are relevant. They also pass through the state machines to which they are not relevant. Data tokens are treated in the same way, in that they are processed only in those state machines that have been configured by control tokens to process them. In the remaining state machines they pass through unchanged.
More specifically, according to the present invention a control token may consist of more than one word. In that case a bit known as the extension bit is set to 1, which specifies that additional words are used in the token to carry additional information. Some of these additional control bits act as indices, indicating the information to be used in the corresponding state machine to create a set of standard-independent index signals. The remainder of the token is used to indicate and identify the internal processing control function, which is standard for all the data streams passing through the pipeline processor. In one form of the invention, the token extension is used to carry the coding standard, which is decoded by the relevant token decode circuits distributed throughout the machine. The token extension is also used to reconfigure the action identification circuits 39 of the stages throughout the machine wherever this is appropriate for operation under the new coding standard. The token decode circuit can also indicate whether a control token is related to one of the selected standards that the circuit was designed to handle.
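The role of the extension bit can be shown by grouping a flat stream of words back into tokens. The sketch assumes each word is modelled as an (extension_bit, value) pair; the word width and field layout are not specified here and are left out.

```python
def split_tokens(words):
    """Group (extension_bit, value) words into tokens.

    An extension bit of 1 means more words of the same token follow;
    an extension bit of 0 marks the last word of the token.
    """
    tokens = []
    current = []
    for ext, value in words:
        current.append(value)
        if ext == 0:
            tokens.append(current)
            current = []
    if current:
        raise ValueError("stream ended in the middle of a token")
    return tokens
```

A two-word token followed by a one-word token, for example, arrives as three words with extension bits 1, 0 and 0.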
More specifically, an MPEG start code or a JPEG marker is followed by an 8-bit value, whereas an H.261 start code is followed by a 4-bit value. Accordingly, when the start code detector 51 detects an MPEG start code or a JPEG marker, it indicates that the following 8 bits contain the value associated with that start code. Independently of this, it then creates a signal indicating that the code is an MPEG start code or a JPEG marker and not an H.261 start code. First, the 8-bit value is entered into a decode circuit, part of which creates a signal representing the index and flag used within the current circuit to handle the tokens passing through it. This is also used to insert portions of the control token that will be examined later to determine which standard is being handled. In this sense, one part of the control token indicates the type of operation to be performed on the data that follows, and another part indicates that the token relates to the MPEG standard. As noted above, this operational information is used to reconfigure the processing stages in the system so that they perform the different functions required by the various standards for which the processing stages were designed.
For example, an H.261 start code is associated with the 4-bit value that immediately follows it. The start code detector passes this value into the token generator state machine. The value is applied to an 8-bit decoder, which produces a 3-bit start number. The start number is used to identify the start of a picture, and the picture number is indicated by the value of the start number.
The system also includes a multi-stage parallel processing pipeline that operates on the two-wire interface principle described above. Each stage comprises a machine generally of the form shown in Figure 10. The token decode circuit 33 directs the tokens currently entering the state machine to the action identification circuit 39 or to the processing unit 36, as appropriate. The processing unit has previously been reconfigured by an earlier control token into the form needed to handle the current coding standard, which is now entering the processing stage and is carried by the next data token. Furthermore, in accordance with this aspect of the invention, a succeeding state machine in the processing pipeline can be operating under one coding standard (for instance H.261) while a preceding stage is operating under another (for instance MPEG). The same two-wire interface is used to carry both control tokens and data tokens.
The system of the present invention also uses control tokens that are needed to decode a number of coding standards with a fixed number of reconfigurable processing stages. More specifically, the PICTURE_END control token is used because it is important to have an indication of when a picture actually ends. Accordingly, in designing a multi-standard machine it is necessary to create additional control tokens inside the multi-standard pipeline processor so that the processing functions can tell which standard decoding technique is to be used. One such control token is the PICTURE_END token. The PICTURE_END token is used to indicate that the current picture has finished, to force the buffers to be flushed, and to push the current picture through the decoder to the display.
8, multi-standard processing circuit - second mode of operation
A compression circuit that depends on the coding standard, in the form of the start code detector described above, is suitably interconnected over an appropriate bus to a standard-independent compression circuit. The standard-dependent circuit is connected to a standard-independent combining circuit, and through the same bus to an additional bus. The standard-independent circuit applies additional input to the standard-dependent circuit, and the latter passes information back to the standard-independent circuit. Information leaving the standard-independent circuit is applied to the output over another suitable bus. Table 600 illustrates that the multiple standards applied as input to the standard-dependent start code detector 51 include certain bit streams that have standard-dependent meanings within each coded bit stream.
9, start code detector
As indicated above, the start code detector according to the present invention can take MPEG, JPEG and H.261 bit streams and generate from them a sequence of proprietary tokens that are meaningful to the rest of the decoder. As an example of how multi-standard decoding is achieved, the MPEG (1 and 2) picture_start_code, the H.261 picture_start_code and the JPEG start_of_scan (SOS) marker are treated as equivalent by the start code detector, and all generate an internal PICTURE_START token. Similarly, the MPEG sequence_start_code and the JPEG SOI (start_of_image) marker both generate a machine sequence_start token. The H.261 standard, however, has no equivalent start code. Accordingly, in response to the first H.261 picture_start_code, the start code detector generates a sequence_start token.
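The equivalences just listed can be written down as a lookup table. In the sketch below the standard and code names are plain strings, only the codes named in the paragraph are modelled, and it is assumed (the text does not say so explicitly) that the first H.261 picture_start_code yields a SEQUENCE_START token followed by a PICTURE_START token.

```python
# (standard, code seen in the bit stream) -> internal token name
TOKEN_FOR = {
    ("MPEG", "picture_start_code"): "PICTURE_START",
    ("H261", "picture_start_code"): "PICTURE_START",
    ("JPEG", "SOS"): "PICTURE_START",
    ("MPEG", "sequence_start_code"): "SEQUENCE_START",
    ("JPEG", "SOI"): "SEQUENCE_START",
}


def translate(standard, code_names):
    """Translate a list of start/marker code names into internal tokens."""
    tokens = []
    sequence_started = False
    for code in code_names:
        token = TOKEN_FOR.get((standard, code))
        if token is None:
            continue            # codes not modelled in this sketch
        if token == "SEQUENCE_START":
            sequence_started = True
        elif standard == "H261" and not sequence_started:
            # H.261 has no sequence start code, so one is synthesised
            # when the first picture start code is seen.
            tokens.append("SEQUENCE_START")
            sequence_started = True
        tokens.append(token)
    return tokens
```

Downstream stages then see the same token names whichever standard produced them.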
None of the images described above is used directly, other than in the start code detector (SCD). Rather, a machine PICTURE_START token, for example, is deemed to be equivalent to the PICTURE_START images contained in the bit stream. It must also be borne in mind that the machine PICTURE_START is not itself a direct image of the PICTURE_START in the standard. Rather, it is a control token that is used in combination with other control tokens to provide standard-independent decoding, which emulates the handling of the images under each of the compression coding standards. The combination of control tokens with the indices and/or flags generated by the token decode circuit portion of the respective state machine is unique in itself. The combination of control tokens with the reconfiguration of circuits according to the information carried by the control tokens is likewise unique in itself. A typical reconfigurable state machine is described later.
Referring again to Table 600, the left-hand column lists the names of a group of standard images, and the right-hand column lists the machine-dependent control tokens used to emulate those standard coded signals that are either absent from, or unused in, the standard image.
With reference to Table 600 it can be seen that, as noted above, the start code detector generates a machine sequence_start signal whenever it decodes any one of the standard signals shown in the table. The start code detector generates tokens such as sequence_start, group_start, sequence_end, slice_start, user_data, extra_data and PICTURE_START, which are applied to the two-wire interface used throughout the system. The configuration of each stage that operates in conjunction with these control tokens is determined by the contents of the tokens, or by indices generated from the token contents. Each stage is thereby prepared to handle the data it expects to receive when picture data tokens arrive at that stage.
As noted above, one of the compression standards, H.261, has no sequence_start image in its data stream, nor does it have a PICTURE_END image. The start code detector identifies where a picture ends in the incoming bit stream and creates a PICTURE_END token. In this regard, the system of the present invention is intended to carry data words that are fully packed, so that every bit position of the registers chosen in the practice of the invention contains information. For this reason, 15 bits were selected as the number of bits passed between two start codes. Of course, one of ordinary skill in the art will appreciate that more or fewer than 15 bits could be selected. In other words, all 15 bits of a data word passed from the start code detector into the DRAM interface are required for proper operation. The start code detector therefore creates extra bits, called padding, which it inserts into the last word of a data token. For purposes of illustration, 15 bits have been selected.
To perform padding in accordance with the present invention, a binary 0 followed by a number of binary 1s is automatically inserted to complete the 15-bit data word. The data is then passed through the coded data buffer to the Huffman decoder, which removes the padding. In this way an arbitrary number of bits can be passed through a buffer of fixed size and width.
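The padding rule is reversible because the last 0 in a padded word always marks where the real data ends. The sketch below models a word as a string of '0' and '1' characters for readability. It assumes the final word holds fewer than 15 data bits; how a final word that is already full is handled is not stated in the text and is not modelled.

```python
WORD_BITS = 15


def pad(bits):
    """Pad a partial last word with a 0 followed by 1s up to 15 bits."""
    if not 0 <= len(bits) < WORD_BITS:
        raise ValueError("expected fewer than 15 data bits")
    return bits + "0" + "1" * (WORD_BITS - len(bits) - 1)


def unpad(word):
    """Remove the padding: drop the trailing 1s and the 0 before them."""
    if len(word) != WORD_BITS:
        raise ValueError("expected a 15-bit word")
    # The padding 0 is the last 0 in the word, since only 1s follow it.
    return word[: word.rindex("0")]
```

For instance, the four data bits 1011 are padded to 101101111111111, and removing the padding recovers 1011 even when the data itself ends in a 0.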
In one embodiment, the slice_start control token is used to identify a slice of a picture. The slice_start control token is used to divide the picture into smaller regions. The size of a region is chosen by the encoder. The start code detector identifies this unique pattern of slice_start code so that the machine-dependent state machine stages located downstream of the start code detector can divide the picture being received into smaller regions. The region size is chosen by the encoder, recognized by the start code detector, and used by the recombination circuitry and control tokens to decompress the coded picture. Slice start codes are used principally for error recovery.
Start codes provide a unique method of starting up the decoder, which is described in more detail later. Placing the start code detector ahead of the coded data buffer, rather than after the coded data buffer and ahead of the Huffman decoder and video demultiplexer, has a number of advantages. Locating the start code detector ahead of the first buffer allows it to: 1) assemble tokens; 2) decode the standard control signals, such as start codes; 3) pad the bit stream before the data enters the buffer; and 4) create the proper sequence of control tokens to empty the buffer, pushing the available data from the buffer into the Huffman decoder.
Most of the control tokens output by the start code detector directly reflect syntactic elements of the various picture and video coding standards. The start code detector converts these syntactic elements into control tokens. In addition to these natural tokens, some unique and/or machine-dependent tokens are generated. The unique tokens include tokens designed specifically for use in the system of the present invention; they are unique in themselves and help to give the invention its multi-standard character. Examples of these unique tokens are PICTURE_END and CODING_STANDARD.
Tokens are also introduced to remove some of the syntactic differences between the coding standards and to operate in cooperation with error conditions. The automatic generation of tokens takes place after the serial analysis of the standard-dependent data. The spatial decoder therefore responds in the same way to two kinds of token: those supplied directly to the input of the spatial decoder, that is, to the SCD, and those generated after start codes have been detected in the coded data. A sequence of extra tokens is inserted into the two-wire interface in order to control the multi-standard nature of the invention.
MPEG and H.261 coded video streams contain standard-dependent, non-data, identifiable bit patterns, which are hereafter called start images and/or standard-dependent codes. The codes that play a similar role in JPEG are marker codes. These start/marker codes are used to identify significant parts of the syntax of the coded data stream. The analysis of start/marker codes performed by the start code detector is the first stage in parsing the coded data.
The start/marker code patterns are designed so that they can be identified without decoding the entire bit stream. In accordance with the present invention, they can therefore be used to help with error recovery and with decoder start-up. The start code detector provides facilities for detecting errors in the coded data structure and assists decoder start-up. Both the decoder start-up procedure and the error detection capability of the start code detector are described in more detail later.
The above discussion has dealt mainly with the machine-dependent bit stream characteristics and their relationship to the addressing characteristics of the present invention. The following describes the standard-dependent bit stream characteristics of the coded data with reference to the start code detector.
Each of the standard compression coding systems uses a unique start code configuration, or image, selected to identify that particular standard. Each start code also carries with it a start code value. The start code value identifies, within the language of the standard, the type of operation associated with that start code. In the multi-standard decoder of the present invention, compatibility is based, as described above, on the arrangement of control tokens and data tokens. Index signals, including flag signals, are generated by circuitry within each state machine and are described below where appropriate.
The start codes and/or marker codes contained in each standard, along with other standard words as opposed to data words, are sometimes referred to as images, to avoid confusion with the codes and/or machine-dependent codes, which refer to the contents of the control and/or data tokens used in the machine. Furthermore, the term start code is often used as a generic term referring to MPEG and H.261 start codes as well as JPEG marker codes. Marker codes and start codes serve the same purpose. In addition, the word "flush" is used both to refer to the FLUSH token and as a verb, for example when referring to flushing the shift registers of the start code detector (including the signal "flushed"). To avoid confusion, the FLUSH token is always written in upper case. All other uses of the word (verb or noun) are in lower case.
The standard-dependent coded input picture stream comprises data and start images of varying lengths. The start images carry a value that tells the user what operation is to be performed, in accordance with the standard, on the data immediately following. In the multi-standard pipeline processing system of the present invention, however, compatibility with several standards is required, and the system has been optimized to perform all operations under all the standards. In many cases, therefore, unique start control tokens must be created that are not only compatible with the values contained in the standard images of the coded signal but can also control the stages so as to emulate the operations of that standard, each standard being represented by specified parameters well known to those skilled in the art. All such standards are incorporated in this specification by reference.
It is important to understand the relationship between tokens that, alone or in combination with other control tokens, emulate the non-data information contained in the standard bit streams. Each state machine generates its own set of index signals, including flag signals, to carry out processing within that state machine. The values carried in each standard can be used to access machine-dependent control signals so as to emulate the handling of the standard data and non-data signals. For example, the slice_start token is a two-word token, which is then entered onto the two-wire interface described above.
The data input to the system of the present invention may come from any suitable data source, for example disk or tape. The data source supplies 8-bit data to the first functional stage of the spatial decoder, the start code detector (Figure 11). The start code detector contains three shift registers: the first is 8 bits wide, the next is 24 bits wide, and the last is 15 bits wide. Each shift register is part of the two-wire interface. Data from the data source is loaded into the first register as a single 8-bit byte during one timing cycle. The contents of the first shift register are then shifted, one bit at a time, into the decode (second) shift register. After 24 cycles the 24-bit register is full.
An 8-bit byte is loaded into the first shift register once every 8 cycles. Each byte is loaded into the value shift register 221 (Figure 20), and 8 further cycles are then used to empty it and load shift register 224. Emptying takes 8 cycles, so after three such operations, or 24 cycles, the 24-bit register holds three bytes. The value decode shift register 230 is still empty.
Suppose that a PICTURE_START word is now in the 24-bit shift register. The detection circuit recognizes the PICTURE_START code pattern and provides a start signal as its output. Once the detector has detected a start, the byte that follows is the value associated with that start code, and this byte currently sits in the value register 221.
Because the contents of the detection shift register have been identified as a start code, they must be removed from the two-wire interface so that these three bytes are not processed any further. The decode register is emptied, and the value decode shift register waits for the value to be shifted all the way across into it.
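As an illustration only, the detection step can be sketched in software. The sketch below is not the shift-register circuit described above: it assumes byte-aligned MPEG-style start codes with the 24-bit prefix 0x000001, and the function name `scan_start_codes` and the tuple output format are hypothetical.

```python
from typing import Iterator, Tuple

START_PREFIX = b"\x00\x00\x01"  # 24-bit MPEG start code prefix (byte-aligned assumption)


def scan_start_codes(stream: bytes) -> Iterator[Tuple[str, int]]:
    """Yield ("start", value) for each start code, ("data", byte) otherwise."""
    i = 0
    while i < len(stream):
        has_value = i + 3 < len(stream)
        if has_value and stream[i:i + 3] == START_PREFIX:
            # The three prefix bytes are removed from the stream; only the
            # value byte that follows them is kept and reported.
            yield ("start", stream[i + 3])
            i += 4
        else:
            yield ("data", stream[i])
            i += 1
```

In this sketch the three prefix bytes never reach the output, which mirrors the requirement that the recognized start code be removed so that it is not processed again as data.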
The value associated with PICTURE_START now sits in the low-order bits of the value decode shift register. In the spatial decoder, the signal corresponding to the PICTURE_START signal of the standard is called SD PICTURE_START. The SD PICTURE_START signal itself will now be carried in the token header, and the value will be carried in the extension word of the token header.
10. Tokens
In the practice of the present invention, a token is a universal adaptation unit in the form of an interactive interfacing messenger package for control and/or data. It is adapted for use with a reconfigurable processing stage (RPS), that is, a stage which reconfigures itself to perform various operations in response to a token it has recognized.
To perform different functions, tokens may be either position dependent or position independent with respect to the processing stages. Tokens may also be metamorphic: they can be altered by a processing stage and then passed down the pipeline to perform further functions. Tokens may interact with all of the stages or with fewer than all of them, and in this sense they may interact with adjacent and/or non-adjacent stages. A token may be position dependent for some functions and position independent for others, and the specific interaction of a token with a stage may be conditioned by the previous processing history of that stage.
The PICTURE_END token is a way of signalling the end of a picture in a multi-standard decoder.
A multi-standard token is a way of mapping MPEG, JPEG and H.261 data streams onto a single decoder that uses a mixture of standard-dependent and standard-independent hardware and control tokens.
The SEARCH_MODE token is a technique for searching MPEG, JPEG and H.261 data streams that allows random access and enhanced error recovery.
The STOP_AFTER_PICTURE token is a method of achieving a clean end to decoding: it signals the end of a picture and clears the decoder pipeline, for example on a channel change.
In addition, padding a token is a way of passing an arbitrary number of bits through a buffer of fixed size and fixed width.
The present invention is directed to a pipeline processing system of variable configuration that uses tokens and a two-wire interface. Using control tokens and data tokens in combination with the two-wire interface facilitates a multi-standard system, which has extended operating capabilities compared with systems that do not use control tokens.
The control tokens are generated by circuitry inside the decoder processor. They emulate the operation of many different types of standard-dependent signals that are passed into the serial pipeline processor for handling. The method used is to study all the parameters of the multiple standards selected for the serial processor, noting 1) the similarities of the standards, 2) their differences, and 3) their needs and requirements, and then 4) to select the correct token functions so that all the standard signals sent into the serial processor are processed effectively. The function of the tokens is to emulate the standards. A control token serves partly as an emulation/translation between the standard-dependent signals and partly as an element for communicating control information through the pipeline processor.
In prior art systems, a dedicated machine is designed by well-known methods to recognize the standard, and dedicated circuitry is then set up through the microprocessor interface (MPI). Signals from the microprocessor are used to control the flow of data through the dedicated downstream components. The selection, timing and organization of the decompression function are all under the control of fixed logic circuitry, assisted by signals from the microprocessor.
In contrast, the system of the present invention configures each downstream functional stage under the control of control tokens. Obtaining necessary and/or alternative control from the MPU (microprocessor unit) is provided as an option.
Tokens provide a practical format for communicating information through the pipeline processor of the decompression circuit. In the design selected below and used in the preferred embodiment, each token word is at least 8 bits wide, and a token may extend over one or more words. The address field is variable in length and may extend over several words. In the preferred embodiment, the address is no longer than 8 bits. This is not a limitation on the scope of the invention; it merely limits the number of selected processing steps that can be accomplished with these tokens. Note, under the extension bit label, that the extension bit in words 1 and 2 is 1, which means that further words follow. The extension bit in the last word is 0, which marks the end of the token.
Tokens may also vary in bit length. For example, a token word of 9 bits becomes 10 bits once the extension bit is added. In the design of the present invention, the width of each output bus is also variable. The output of the spatial decoder is 9 bits wide or, when the extension bit is included, 10 bits wide. In the preferred embodiment, the only token that uses these extra bits is the data token; all other tokens ignore them. It should be understood that this is not a limitation but merely one implementation.
Through the use of the data token and control token configuration, the length of the data carried by a data token can be varied, in the sense of the number of bits per word. For example, as discussed earlier, data bits in one word of a data token can be combined with data bits in another word of the same data token to form an 11-bit or 10-bit address, used to access the random access memories distributed throughout the serial decompression processor. This provides an additional degree of variability and makes it easy to extend the versatility of the system considerably.
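The role of the extension bit can be illustrated with a short sketch. It assumes 8-bit payload words with the extension bit stored as bit 8 of each word; the bit position and the helper names `pack_token` and `split_tokens` are assumptions made for this example, not details taken from the specification.

```python
from typing import Iterable, List

EXT_BIT = 1 << 8  # assumption: extension bit sits directly above an 8-bit payload


def pack_token(payload: Iterable[int]) -> List[int]:
    """Set the extension bit on every word of a token except the last."""
    words = [w & 0xFF for w in payload]
    if not words:
        raise ValueError("a token needs at least one word")
    return [w | EXT_BIT for w in words[:-1]] + [words[-1]]


def split_tokens(stream: Iterable[int]) -> List[List[int]]:
    """Cut a word stream into tokens wherever the extension bit is 0."""
    tokens, current = [], []
    for word in stream:
        current.append(word & 0xFF)
        if not word & EXT_BIT:  # extension bit 0 marks the final word
            tokens.append(current)
            current = []
    return tokens
```

Because the end of a token is marked in-band, a stage that does not recognize a token can still find its boundaries and pass it along unchanged.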
As described earlier, data tokens carry data from one processing stage to the next. The characteristics of a data token therefore change as it passes through the decoder. For example, at the input of the spatial decoder, data tokens carry bit-serial coded data packed into 8-bit words. Here there is no limit on the length of each token. By contrast, and to illustrate the versatility of this aspect of the invention, at the output of the spatial decoder circuit each data token carries exactly 64 words, and each word is 9 bits wide. More specifically, the standard coding allows messages of different lengths to encode pictures of different density and detail. The first picture of a group normally has the largest number of data bits, because it must give the processing unit as much information as possible with which to begin decompression. Later words are typically shorter, because they contain difference signals obtained by comparing the first word with the second position on the scan information field.
As required by the standard coding systems, these words are interspersed with one another, so that varying amounts of data arrive at the input of the spatial decoder. After the spatial decoder has done its work, however, its output is provided at a pixel-format rate suitable for display on a screen. To interface with the different display standards used around the world (such as NTSC, PAL and SECAM), the output rate of the spatial decoder may vary with time. The video formatter converts this variable picture rate into a constant picture rate suitable for display. The picture data, however, is still carried by data tokens of 64 words.
11. DRAM Interface
Each of the three decoder chips uses a single high-performance, configurable DRAM interface. In general, the DRAM interface on each chip is substantially the same, but the interfaces differ in how they handle channel priorities. The interface is designed to drive directly the DRAM used by the spatial decoder, the temporal decoder and the video formatter. Typically, no external logic, buffers or other components are needed to connect the DRAM interface to the DRAM in those systems.
According to the present invention, the interface is configurable in two ways:
1. The detailed timing of the interface can be configured to suit a variety of different DRAM types.
2. The width of the data interface to the DRAM can be configured to give a cost/performance trade-off for different applications.
In general, the DRAM interface is a standard-independent block implemented on each of the three chips of the system. Again, these are the spatial decoder, the temporal decoder and the video formatter. Referring again to Figures 11, 12 and 13, these figures show block diagrams of the relationship between the DRAM interface and the remaining blocks of the spatial decoder, temporal decoder and video formatter, respectively. On each chip, the DRAM interface connects the chip to an external DRAM. External DRAM is used because, at present, it is not practical to fabricate on chip the relatively large amount of DRAM required. Note that each chip has its own external DRAM and its own DRAM interface.
Furthermore, although the DRAM interface is independent of the compression standard, it must still be configured to implement each of the multiple standards, H.261, JPEG and MPEG. How the DRAM interface is reconfigured for multi-standard operation is described further later in this specification.
Understanding the operation of the DRAM interface requires an understanding of the relationship between the DRAM interface and the address generator, and of how the two communicate using the two-wire interface.
In general, as its name implies, the address generator generates the addresses that the DRAM interface needs in order to address the DRAM (that is, to read from or write to a particular DRAM address). With the two-wire interface, a read or write occurs only when the DRAM interface has both data (from the preceding stages of the pipeline) and a valid address (from the address generator). As discussed further below, using a separate address generator simplifies the construction of both the address generator and the DRAM interface.
In the present invention, the DRAM interface can run from a clock that is asynchronous both to the address generator and to the clocks of the stages through which the data passes. Special techniques are used to handle this asynchronous operation.
Data is typically transferred between the DRAM interface and the rest of the chip in blocks of 64 bytes (the only exception being prediction data in the temporal decoder). Transfers take place by means of a device called an "alternating buffer". This is essentially a pair of RAMs operated in a double-buffered arrangement: while the DRAM interface fills or empties one RAM, another part of the chip empties or fills the other RAM. Each alternating buffer has a separate bus associated with it, which carries an address from the address generator.
In the present invention, each chip has four alternating buffers, but their functions differ in each case. In the spatial decoder, one alternating buffer is used to transfer coded data to the DRAM, another to read coded data from the DRAM, a third to transfer tokenized data to the DRAM, and the fourth to read tokenized data from the DRAM. In the temporal decoder, one alternating buffer is used to write intra or predicted picture data to the DRAM, a second to read intra or predicted picture data from the DRAM, and the remaining two to read forward and backward prediction data. In the video formatter, one alternating buffer is used to transfer data to the DRAM, and the other three are used to read data from the DRAM, one each for luminance (Y) and for the red and blue color difference data (Cr and Cb, respectively).
The following describes the operation of a hypothetical DRAM interface that has one write alternating buffer and one read alternating buffer. This is essentially the same as the operation of the DRAM interface of the spatial decoder. The operation is illustrated in Figure 23.
Figure 23 shows that the control interfaces between the address generator 301, the DRAM interface 302 and the remaining stages of the chip that pass data are all two-wire interfaces. The address generator 301 may generate addresses as a result of receiving control tokens, or it may simply generate a fixed sequence of addresses (for example, for the FIFO buffers of the spatial decoder). The DRAM interface treats the two-wire interface associated with the address generator 301 in a special way. Instead of holding the accept line high whenever it is ready to receive an address, it waits for the address generator to supply a valid address, processes that address, and then sets the accept line high for one clock period. In this way it implements a request/acknowledge (REQ/ACK) protocol.
A unique feature of the DRAM interface 302 is its ability to communicate independently with the address generator 301 and with the stages that supply or accept data. For example, the address generator may generate an address associated with the data in the write alternating buffer (Figure 24), but no action is taken until the write alternating buffer signals that a block of data is ready to be written to the external DRAM. Similarly, the write alternating buffer may hold a block of data that is ready to be written to the external DRAM, but no action is taken until an address is supplied on the appropriate bus by the address generator 301. Furthermore, once one RAM of the write alternating buffer has been filled with data, the other RAM can also be completely filled and switched over to the DRAM interface side before the data input is stalled (that is, before the accept signal of the two-wire interface is set low).
In understanding the operation of the DRAM interface 302 of the present invention, it is important to note that, in a properly configured system, the DRAM interface must be able to transfer data between the alternating buffers and the external DRAM 303 at least as fast as the sum of the average data rates between all the alternating buffers and the rest of the chip.
Each DRAM interface 302 decides which alternating buffer it will service next. In general this is done either by round robin (the buffer serviced next is the available buffer that has least recently had a turn) or by a priority encoder (in which some alternating buffers have higher priority than others). In both cases, an additional request can come from a refresh request generator, and this request has higher priority than all other requests. The refresh request is generated by a refresh counter, which can be programmed through the MPI.
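A minimal sketch of the round-robin variant, with the refresh request overriding everything else, is given below for illustration. The class name `RoundRobinArbiter`, the `REFRESH` marker and the request list format are hypothetical and chosen only for this example.

```python
from typing import List, Optional

REFRESH = -1  # hypothetical marker returned when the refresh request wins


class RoundRobinArbiter:
    """Pick the next buffer to service; refresh beats every buffer request."""

    def __init__(self, n_buffers: int) -> None:
        self.n = n_buffers
        self.last = n_buffers - 1  # so that buffer 0 is first in line

    def grant(self, requests: List[bool], refresh: bool = False) -> Optional[int]:
        if refresh:
            return REFRESH
        # Walk the buffers starting just after the one served most recently.
        for step in range(1, self.n + 1):
            idx = (self.last + step) % self.n
            if requests[idx]:
                self.last = idx
                return idx
        return None  # nothing is waiting
```

A priority encoder would differ only in the search order: it would always scan the buffers from highest to lowest priority instead of starting after the last one served.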
Referring now to Figure 24, there is shown a block diagram of a write alternating buffer. The write alternating buffer interface contains two RAMs, RAM1 311 and RAM2 312. As discussed further herein, data from the preceding stage is written into RAM1 311 and RAM2 312 under the control of the write address 313 and control 314. From RAM1 311 and RAM2 312 the data is written into the DRAM 315. When data is written into the DRAM 315, the DRAM row address is supplied by the address generator and the column address is supplied by the write address and control, as also described later. In operation, valid data appears at the input 316 (data in). Typically the data is received from the preceding stage. As each item of data is accepted from the preceding stage it is written into RAM1 311, and the write address control then increments the RAM1 address so that the next item can be written into RAM1. Data continues to be written into RAM1 311 until either there is no more data or RAM1 is full. When RAM1 is full, the input side gives up control and sends a signal to the read side to indicate that RAM1 is now ready to be read. This signal passes between two asynchronous clock domains and therefore passes through three synchronizing flip-flops.
If RAM2 312 is empty, the next item of data arriving at the input side is written into RAM2. Otherwise, this happens once RAM2 312 has been emptied. When the round robin or priority encoder (whichever the particular chip uses) indicates that it is now this alternating buffer's turn to be read, the DRAM interface reads the contents of RAM1 311 and writes them to the external DRAM 315. A signal is then sent back across the asynchronous interface to indicate that RAM1 311 is ready to be filled again.
If the DRAM interface empties RAM1 311 and switches it back before the input side has filled RAM2 312, data can be accepted by the alternating buffer continuously. Otherwise, when RAM2 is full, the alternating buffer sets its accept signal low until RAM1 has been switched back for use by the input side.
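The behaviour of the write buffer described above can be modelled in a few lines. The following is a software sketch only, assuming a single clock domain (the synchronizing flip-flops are not modelled); the class name `AlternatingWriteBuffer` and its methods are hypothetical.

```python
from typing import List, Optional


class AlternatingWriteBuffer:
    """Two RAMs: the input side fills one while the DRAM side drains the other."""

    def __init__(self, size: int = 64) -> None:
        self.size = size
        self.rams: List[List[int]] = [[], []]
        self.ready = [False, False]  # RAM is full and handed to the DRAM side
        self.fill = 0                # index of the RAM the input side writes to

    @property
    def accept(self) -> bool:
        # Accept goes low only when the RAM we would fill is still waiting
        # to be drained, i.e. both RAMs are full.
        return not self.ready[self.fill]

    def write(self, item: int) -> bool:
        if not self.accept:
            return False  # input side is stalled
        ram = self.rams[self.fill]
        ram.append(item)
        if len(ram) == self.size:
            self.ready[self.fill] = True
            self.fill ^= 1  # switch to the other RAM
        return True

    def drain(self) -> Optional[List[int]]:
        """DRAM side: take the oldest full RAM, or None if none is full."""
        idx = self.fill if self.ready[self.fill] else self.fill ^ 1
        if not self.ready[idx]:
            return None
        block, self.rams[idx] = self.rams[idx], []
        self.ready[idx] = False
        return block
```

As long as `drain` is called before the second RAM fills, `accept` never goes low, which corresponds to the continuous-acceptance case described above.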
According to the present invention, a read alternating buffer operates in a similar way, but with the input and output data buses reversed.
The DRAM interface of the present invention is designed to maximize the available memory bandwidth. Each 8 × 8 block of data is stored within a single DRAM page. Full use can therefore be made of the DRAM fast page access modes, in which one row address is supplied followed by many column addresses. Specifically, the row address is supplied by the address generator and the column addresses are supplied by the DRAM interface, as discussed further below.
In addition, the interface allows the data bus to the external DRAM to be 8, 16 or 32 bits wide. The amount of DRAM used can therefore be matched to the size and bandwidth requirements of the particular application.
In this example (which works in the same way as the DRAM interface of the spatial decoder), the address generator supplies the DRAM interface with a block address for each of the read and write alternating buffers. This address is used as the DRAM row address. The six bits of column address are supplied by the DRAM interface itself, and these six bits are also used as the address for the alternating buffer RAM. The data bus to the alternating buffers is 32 bits wide. Therefore, if the bus to the external DRAM is narrower than 32 bits, two or four external DRAM accesses must be made before the next word is read from a write alternating buffer or written to a read alternating buffer (read and write refer to the direction of transfer relative to the external DRAM).
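The relationship between bus width and access count follows directly from the figures given above (a 32-bit buffer bus and 64-byte blocks). It is shown here as a short calculation; the function names are hypothetical.

```python
BUFFER_BUS_BITS = 32  # width of the data bus to the alternating buffers
BLOCK_BYTES = 64      # one 8 x 8 block of bytes


def accesses_per_word(dram_bus_bits: int) -> int:
    """External DRAM accesses needed to move one 32-bit buffer word."""
    if dram_bus_bits not in (8, 16, 32):
        raise ValueError("the DRAM bus is 8, 16 or 32 bits wide")
    return BUFFER_BUS_BITS // dram_bus_bits


def accesses_per_block(dram_bus_bits: int) -> int:
    """External DRAM accesses needed to move one 64-byte block."""
    return (BLOCK_BYTES * 8) // BUFFER_BUS_BITS * accesses_per_word(dram_bus_bits)
```

With a 16-bit or 8-bit bus this gives the two or four accesses per word mentioned above; a full block then takes 32 or 64 accesses within the same DRAM page.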
The situation is more complicated for the temporal decoder and the video formatter. Addressing in the temporal decoder is more complicated because of its prediction aspects, which are discussed further in this section. Addressing in the video formatter is more complicated because of the multi-standard aspects of the video output, which are discussed further in the sections relating to the video formatter.
As mentioned earlier, the temporal decoder has four alternating buffers. Two of them are used to read and write decoded intra (I) and predicted (P) picture data, and operate as described above. The other two are used to receive prediction data. These two buffers are the more interesting ones.
In general, prediction data is offset from the position of the block being processed, as specified by the x and y values of the motion vector. The block of data to be retrieved therefore does not generally coincide with the block boundaries of the data as it was originally encoded (and written into the DRAM). This is illustrated in Figure 25, in which the shaded area represents the block being formed and the dotted outline represents the block from which it is being predicted. The address generator converts the address specified by the motion vector into a block offset (a whole number of blocks, shown by the large arrow) and a pixel offset (shown by the small arrow).
In the address generator, the frame pointer, the base block address and the vector offset are added to form the address of the block to be retrieved from the DRAM. If the pixel offset is zero, only one request is generated. If there is an offset in either the x or the y direction, two requests are generated, namely the original block address and the one immediately below. If there is an offset in both x and y, four requests are generated. For each block to be retrieved, the address generator calculates start and stop addresses; this is best illustrated by an example.
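The split of a vector into block and pixel offsets, and the resulting number of requests, can be sketched as follows. The sketch assumes whole-pixel vectors and 8 × 8 blocks, and expresses block positions as hypothetical (block x, block y) pairs rather than real DRAM addresses; the function name `block_requests` is also an assumption.

```python
from typing import List, Tuple

BLOCK = 8  # pixels per block side


def block_requests(mv_x: int, mv_y: int) -> Tuple[Tuple[int, int], List[Tuple[int, int]]]:
    """Return the pixel offset and the list of blocks that must be fetched."""
    # divmod floors toward minus infinity, so negative vectors still give
    # a pixel offset in the range 0..7.
    bx, px = divmod(mv_x, BLOCK)
    by, py = divmod(mv_y, BLOCK)
    requests = [(bx, by)]
    if px:
        requests.append((bx + 1, by))      # neighbour in x
    if py:
        requests.append((bx, by + 1))      # neighbour in y
    if px and py:
        requests.append((bx + 1, by + 1))  # diagonal neighbour
    return (px, py), requests
```

A vector that is a whole number of blocks yields one request, an offset in one direction yields two, and an offset in both directions yields four.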
Consider a pixel offset of (1,1), shown by the shaded area in Figure 26. The address generator makes four requests, labelled A to D in the figure. The problem to be solved is how to supply the required sequence of row addresses quickly. The answer is to use a "start/stop" technique, described below.
Consider block A in Figure 26. Reading must start at position (1,1) and end at position (7,7). Assume for the moment that one byte is read at a time (that is, an 8-bit DRAM interface). The x value of the coordinate pair forms the three least significant bits of the address, and the y value forms the three most significant bits. The start values of x and y are both 1, so the address is 9. Data is read from this address and the x value is incremented. This is repeated until the x value reaches its stop value. At that point the y value is incremented by 1 and the x start value is reloaded, giving address 17. As each byte of data is read, the x value is again incremented until it reaches its stop value. The process repeats until both the x and y values have reached their stop values. The address sequence 9, 10, 11, 12, 13, 14, 15, 17, ..., 23, 25, ..., 31, 33, ..., 57, ..., 63 is thus generated.
Similarly, the start and stop coordinates for block B are (1,0) and (7,0), for block C (0,1) and (0,7), and for block D (0,0) and (0,0).
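The start/stop address generation for an 8-bit interface can be reproduced with the following sketch, in which x forms the three least significant address bits and y the three most significant bits, as stated above. The function name `block_addresses` is hypothetical.

```python
from typing import List, Tuple

Coord = Tuple[int, int]  # (x, y), each in the range 0..7


def block_addresses(start: Coord, stop: Coord) -> List[int]:
    """Addresses read within one block, from start to stop inclusive."""
    (x0, y0), (x1, y1) = start, stop
    addresses = []
    for y in range(y0, y1 + 1):        # y advances when x reaches its stop value
        for x in range(x0, x1 + 1):    # x is reloaded with its start value
            addresses.append((y << 3) | x)
    return addresses
```

For block A, `block_addresses((1, 1), (7, 7))` begins 9, 10, ..., 15, 17 and ends at 63, matching the sequence given above.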
The next question is where this data should be written. Clearly, looking at block A, the data read from address 9 should be written to address 0 of the alternating buffer, the data read from address 10 should be written to address 1 of the alternating buffer, and so on. Similarly, the data read from address 8 in block B should be written to address 15 of the alternating buffer, and the data read from address 16 should be written to address 15 of the alternating buffer. This function turns out to have a very simple implementation, outlined below.
Consider block A. When reading begins, the alternating buffer address register is loaded with the inverse of the stop value. The inverse of the y stop value forms the 3 most significant bits, and the inverse of the x stop value forms the 3 least significant bits. In this case, while the DRAM interface is reading address 9 of the external DRAM, the alternating buffer address is zero. The alternating buffer address register is then incremented whenever the external DRAM address register is incremented, consistent with proper prediction addressing.
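One possible reading of the inverse-stop technique is sketched below. It assumes that the x and y fields of the buffer address each start at the 3-bit inverse of the corresponding stop value and then count in step with the DRAM x and y counters; that interpretation, and the function name `buffer_placements`, are assumptions made for this example.

```python
from typing import List, Tuple

Coord = Tuple[int, int]  # (x, y), each in the range 0..7


def buffer_placements(start: Coord, stop: Coord) -> List[Tuple[int, int]]:
    """Return (DRAM address, buffer address) pairs for one block request."""
    (x0, y0), (x1, y1) = start, stop
    buf_x0 = ~x1 & 7  # 3-bit inverse of the x stop value
    buf_y0 = ~y1 & 7  # 3-bit inverse of the y stop value
    pairs = []
    for dy, y in enumerate(range(y0, y1 + 1)):
        for dx, x in enumerate(range(x0, x1 + 1)):
            dram = (y << 3) | x
            buf = ((buf_y0 + dy) << 3) | (buf_x0 + dx)
            pairs.append((dram, buf))
    return pairs
```

Under this reading, block A places DRAM address 9 at buffer address 0 and DRAM address 10 at buffer address 1, and the four requests A to D together fill each of the 64 buffer locations exactly once.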
The discussion so far has concentrated on an 8-bit DRAM interface. For a 16-bit or 32-bit interface a few minor modifications are needed. First, the pixel offset vector must be "clipped" so that it points to a 16-bit or 32-bit boundary. In the example used above, for block A, the first DRAM read points to address 0, and the data at addresses 0 to 3 is read. Second, the unwanted data must be discarded. This is done by writing all the data into the alternating buffer (which must now be physically larger than in the 8-bit case) and reading it out with an offset. When MPEG half-pixel interpolation is performed, 9 bytes in x and/or y must be read from the DRAM interface. In this case the address generator supplies the appropriate start and stop addresses. Some additional logic is used in the DRAM interface, but the way in which the DRAM interface operates does not change fundamentally.
The final point to note about the temporal decoder DRAM interface of the present invention is that additional information must be supplied to the prediction filters to indicate what processing the data requires. This information consists of the following:
a "last byte" signal, indicating the last byte of a transfer (of 64, 72 or 81 bytes);
an H.261 flag;
a bidirectional prediction flag;
two bits indicating the dimensions of the block (8 or 9 bytes in x and in y); and
a two-bit number indicating the order of the blocks.
The "last byte" flag can be generated as the data is read out of the alternating buffer. The other signals are derived from the address generator and are piped through the DRAM interface, so that they are associated with the correct block of data when the prediction filter reads it out of the alternating buffer.
In the video formatter, data is written into the external DRAM block by block but is read out in raster order. Writing is the same as already described for the spatial decoder. Reading is slightly more complicated.
The data in the external DRAM of the video formatter is organized so that at least 8 blocks of data fit into a single page. These 8 blocks are 8 consecutive horizontal blocks. During rasterizing, 8 bytes must be read from each of these 8 consecutive blocks and written into the alternating buffer (that is, the same row of each of the eight blocks).
Consider the top row (assuming a byte-wide interface). The x address (the three least significant bits) is set to zero, as is the y address (the three most significant bits). The x address is then incremented as each of the first 8 bytes is read out. At that point the upper part of the address (bit 6 and above, the least significant bit being bit 0) is incremented and the x address (the three least significant bits) is reset to zero. This process is repeated until all 64 bytes have been read. If the interface to the external DRAM is 16 or 32 bits wide, the x address is simply incremented by two or by four, respectively, instead of by one.
In the present invention, the address generator can signal to the DRAM interface that fewer than 64 bytes are to be read (this may be required at the beginning or end of a raster line), although a multiple of 8 bytes is always read. This is achieved with start and stop values. The start value is used for the upper part of the address (bit 6 and above), and the stop value is compared with the start value to generate the signal that indicates when reading should stop.
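The raster read of one picture row across the blocks of a page can be sketched as below. The address layout follows the description above (x in bits 0 to 2, y in bits 3 to 5, block number in bit 6 and above); the function name `raster_row_addresses` and its parameters are hypothetical, with `first_block` and `last_block` standing in for the start and stop values.

```python
from typing import List


def raster_row_addresses(row: int, first_block: int = 0, last_block: int = 7,
                         bytes_per_access: int = 1) -> List[int]:
    """Addresses read to fetch one row (0..7) from consecutive blocks of a page."""
    addresses = []
    for block in range(first_block, last_block + 1):  # upper address bits (6 and above)
        for x in range(0, 8, bytes_per_access):        # x steps by 1, 2 or 4
            addresses.append((block << 6) | (row << 3) | x)
    return addresses
```

With the default arguments this yields 64 addresses; narrowing the block range reads fewer bytes, always in multiples of 8, as described for the start and end of a raster line.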
The timing section of the DRAM interface of the present invention uses timing chains so that the edges of the DRAM signals can be placed to a precision of one quarter of the system clock period. Two clocks from the phase-locked loop, 90° apart in phase, are used. They are combined to form a nominal double-rate clock. Each chain is then built from two shift registers in parallel, working on opposite phases of the double-rate clock.
First, there is one chain that generates the page start cycle and another that generates the read/write/refresh cycles. The length of each cycle can be programmed by the microprocessor. Once programmed, the page start chain has a fixed length, whereas the length of the other cycle chain varies as appropriate during a page start.
On reset, each chain is cleared and a pulse is created. The pulse travels along the chains and is steered by state information from the DRAM interface. This pulse generates the DRAM interface clock. Each DRAM interface clock period corresponds to one DRAM cycle; consequently, because DRAM cycles differ in length, the DRAM interface clock does not run at a constant rate.
In addition, further timing chains combine the pulses from the chains described above with information from the DRAM interface to generate the output strobes and enable signals, such as notcas, notras, notwe and notbe.
12. Prediction Filters
Referring again to Figures 12, 17 and 18, and in particular to Figure 12, there is shown a block diagram of the temporal decoder, which includes the prediction filters. The relationship between the prediction filters and the rest of the temporal decoder is shown in more detail in Figure 17. The structural elements of the prediction filter are shown in Figures 18 and 28. A detailed description of the operation of the prediction filters can be found in the section "More Detailed Description of the Invention".
In general, according to the present invention, the prediction filters are used in the MPEG and H.261 modes and are not used in the JPEG mode. Recall that in the JPEG mode the temporal decoder simply passes the data through to the video formatter and performs no substantive decoding beyond that done by the spatial decoder. Referring again to Figure 18, in the MPEG mode the forward and backward prediction filters are identical, and they filter the MPEG forward and backward prediction blocks respectively. In the H.261 mode, however, only the forward prediction filter is used, because H.261 has no backward prediction.
The two prediction filters of the present invention are substantially identical. Referring again to Figures 18 and 28, and in particular to Figure 28, there is shown a block diagram of the structure of a prediction filter. Each prediction filter consists of four stages in series. Data enters the formatting stage 331, where it is placed in a form that can readily be filtered. In the next stage 332, a one-dimensional (1-D) prediction is performed on the X coordinate. After the dimension buffer stage 333 has performed the necessary transposition, a 1-D prediction is performed on the Y coordinate in stage 334. How the prediction is carried out is described in more detail later. Which filtering operations are needed is defined by the compression standard. In the case of H.261, the filtering performed is similar to that of a low-pass filter.
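The four-stage arrangement (filter in x, transpose, filter in y) is that of a separable two-dimensional filter, which can be sketched as follows. The example uses the 1/4, 1/2, 1/4 taps of the H.261 loop filter with edge samples passed through unfiltered; rounding to integers is omitted, and the function names are hypothetical, so this is an illustration of the structure rather than a bit-exact model of the circuit.

```python
from typing import List, Sequence

Block = List[List[float]]


def filter_1d(row: Sequence[float], taps: Sequence[int] = (1, 2, 1)) -> List[float]:
    """Low-pass one row; the first and last samples are left unfiltered."""
    out = list(row)
    total = sum(taps)
    for i in range(1, len(row) - 1):
        out[i] = (taps[0] * row[i - 1] + taps[1] * row[i] + taps[2] * row[i + 1]) / total
    return out


def transpose(block: Block) -> Block:
    """Swap rows and columns (the role of the dimension buffer)."""
    return [list(col) for col in zip(*block)]


def prediction_filter(block: Block) -> Block:
    """1-D filter in x, transpose, 1-D filter in y, transpose back."""
    after_x = [filter_1d(row) for row in block]
    after_y = [filter_1d(row) for row in transpose(after_x)]
    return transpose(after_y)
```

Transposing between the two passes lets the same 1-D filter hardware structure serve both dimensions, which is why a dimension buffer sits between the x and y stages.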
Referring again to Figure 17, multi-standard operation requires that the prediction filters be reconfigurable, so that they perform either MPEG or H.261 filtering, or perform no filtering at all in the JPEG mode. Like many other reconfigurable aspects of the three-chip system, the prediction filters are reconfigured by means of tokens. Tokens are also used to inform the address generator of the particular operating mode. In this way the address generator can supply the prediction filters with the addresses of the required data, which differ considerably between MPEG and JPEG.
13. Access to registers
Most registers accessible through the microprocessor interface (MPI) can be modified only when the stage associated with them is stopped. Each group of registers is therefore typically associated with an access register. A value of zero in an access register indicates that the group of registers associated with it must not be modified. Writing 1 to an access register requests that the stage stop. The stage may not stop immediately, however, so the access register of the stage holds the value zero until the stage has stopped.
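The access-register handshake of this section can be sketched in software as follows. The `Stage` class is a hypothetical stand-in for a pipeline stage, the register name in the usage example is invented, and raising an exception on an unguarded write is a modelling choice: the text only says that the result of such a write is undefined.

```python
class Stage:
    """Hypothetical model of a stage guarded by an access register."""

    def __init__(self, cycles_to_stop=3):
        self._cycles_to_stop = cycles_to_stop
        self._cycles_left = cycles_to_stop
        self._stop_requested = False
        self._access = 0          # value read back from the access register
        self.config = {}          # the register group guarded by it

    def write_access(self, value):
        if value == 1:
            self._stop_requested = True        # request that the stage stop
        else:
            self._stop_requested = False       # release the stage again
            self._access = 0
            self._cycles_left = self._cycles_to_stop

    def read_access(self):
        return self._access

    def tick(self):
        """One clock: the stage stops only after finishing pending work."""
        if self._stop_requested and self._access == 0:
            self._cycles_left -= 1
            if self._cycles_left == 0:
                self._access = 1

    def write_config(self, name, value):
        if self._access != 1:
            raise RuntimeError("stage not stopped: result would be undefined")
        self.config[name] = value


def configure(stage, name, value, max_polls=100):
    """Write 1, poll until 1 is read back, modify, then write 0."""
    stage.write_access(1)
    for _ in range(max_polls):
        if stage.read_access() == 1:
            break
        stage.tick()              # stands in for time passing in hardware
    else:
        raise TimeoutError("stage never stopped")
    stage.write_config(name, value)
    stage.write_access(0)


if __name__ == "__main__":
    stage = Stage()
    configure(stage, "example_register", 7)
    print(stage.config)
```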
Any user software that works through the MPI must therefore, after writing 1 to a request access register, wait until 1 is read back from that access register. If the user writes a value to a configuration register while its access register is set to zero, the result is undefined.
14. MPI
A standard byte-wide microprocessor interface (MPI) is used on all circuits in the spatial decoder and the temporal decoder. The MPI operates asynchronously to the various clocks of the spatial decoder and temporal decoder. Table A.6.1 in the more detailed description below shows the various MPI signals used on this interface. The character of each signal is given in the input/output column, its name in the signal name column, and a description of its function in the description column. The electrical specifications of the MPI are given in Table A.6.2. All specifications are classified by type, and the types are given in the column headed Symbol. What each symbol represents is explained in the Parameter column. The actual specifications are given in the corresponding Min, Max and Units columns.
The DC operating conditions are given in Table A.6.3. The column headings in that table are the same as those of Table A.6.2. The DC electrical characteristics are given in Table A.6.4, whose column headings are the same as those described for Tables A.6.2 and A.6.3.
15. MPI read timing
The AC characteristics of the MPI read timing are illustrated in Figure 54. Each line in the figure is labelled with the corresponding signal name, and timing is given in nanoseconds. The detailed MPI read timing characteristics are given in Table A.6.5. The column headed Number identifies the signal that corresponds to the signal name given in the Characteristic column. The columns labelled MIN and MAX give, respectively, the minimum time for which the signal is present and the maximum time for which the signal is available. The Units column gives the unit of measurement used to describe the signal.
16. MPI write timing
A general description of the MPI write timing diagram is shown in Figure 54. The figure shows the name of each signal associated with MPI write timing. The names, characteristics and other physical properties of the signals are shown in Table 6.6.
17. Keyhole address locations
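The keyhole mechanism that this section describes (an address register, a data register, and an address that increments after every data access) can be sketched in a few lines. The class name, the byte-wide storage and the wrap-around at the end of the extended address space are assumptions made for illustration only.

```python
class Keyhole:
    """Hypothetical model of one keyhole onto an extended address space."""

    def __init__(self, size):
        self._mem = [0] * size    # locations hidden behind the keyhole
        self.address = 0          # keyhole address register

    def write_address(self, addr):
        """Select a location; needed before each access for random access."""
        self.address = addr % len(self._mem)

    def read_data(self):
        """Read the keyhole data register, then auto-increment the address."""
        value = self._mem[self.address]
        self._advance()
        return value

    def write_data(self, value):
        """Write the keyhole data register, then auto-increment the address."""
        self._mem[self.address] = value & 0xFF
        self._advance()

    def _advance(self):
        # Wrap-around is an assumption; the text does not say what happens
        # at the end of the extended address space.
        self.address = (self.address + 1) % len(self._mem)


if __name__ == "__main__":
    kh = Keyhole(16)
    kh.write_address(4)
    for v in (10, 20, 30):        # sequential writes need no new address
        kh.write_data(v)
    kh.write_address(4)
    print([kh.read_data() for _ in range(3)])
```

Sequential transfers therefore cost one address write followed by a run of data accesses, while random access costs an address write per access.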
In the present invention, certain memory-mapped locations that are accessed infrequently are placed behind keyhole registers. A keyhole has two registers associated with it: the keyhole address register and the keyhole data register. The keyhole address specifies a location within an extended address space. A read or write of the keyhole data register accesses the location specified by the keyhole address register. After the keyhole data register has been accessed, the associated keyhole address register is incremented. Random access within the extended address space is possible only by writing a new value to the keyhole address register before each access. A circuit in the present invention may have more than one keyhole memory map; however, different keyholes do not interact with one another.
18. PICTURE_END
Referring again to Figure 11, there is shown a general block diagram of the spatial decoder used in the present invention. The function of PICTURE_END is described with reference to this block diagram. The PICTURE_END function has the multi-standard advantage that it can handle H.261 encoded picture information, MPEG signals and JPEG signals.
As previously described, the system of Figure 11 is interconnected by the two-wire interface described above. Each functional block is arranged to operate according to the state machine configuration shown in Figure 10.
In general, in accordance with the present invention, the PICTURE_END function begins at the start code detector, which generates a PICTURE_END control token. The PICTURE_END control token passes unchanged through the start-up control circuit to the DRAM interface, where it is used to flush the write swing buffer of the DRAM interface. Recall that the contents of a swing buffer are written to RAM only when the buffer is full. A picture may, however, end at a point where the buffer is not full, which would leave picture data stuck in the buffer. The PICTURE_END token forces this data out of the swing buffer.
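The effect of PICTURE_END on a partly filled write buffer can be illustrated with a minimal sketch. It models a single buffer rather than a true pair of swing buffers, and the class and method names are hypothetical.

```python
class SwingBufferWriter:
    """Hypothetical write-side buffer that normally writes out only when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = []
        self.dram = []            # bursts written to the external RAM

    def push(self, word):
        self.buffer.append(word)
        if len(self.buffer) == self.capacity:
            self._write_out()     # normal case: write only when full

    def picture_end(self):
        """PICTURE_END forces a partly filled buffer to be written out."""
        if self.buffer:
            self._write_out()

    def _write_out(self):
        self.dram.append(list(self.buffer))
        self.buffer.clear()


if __name__ == "__main__":
    writer = SwingBufferWriter(capacity=4)
    for word in range(6):
        writer.push(word)
    print(writer.dram, writer.buffer)   # two words are still held back
    writer.picture_end()
    print(writer.dram, writer.buffer)   # nothing is left behind
```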
Because the present invention is a multi-standard machine, the machine operates differently for each compression standard. More particularly, the machine is fully described as operating according to machine-dependent action cycles. For each compression standard, a certain number of cycles is selected from the total available action cycles, either by the combination of control tokens and/or output signals from the MPU, or by the design of the tokens themselves. In this sense the present invention is organized so that information is held back from subsequent blocks until all of the information has been collected in the upstream block; the system waits until the data is ready to be passed to the next stage. The PICTURE_END signal is applied to the coded data buffer in this way, and the control portion of the PICTURE_END signal causes the contents of the data buffer to be read out and applied to the Huffman decoder and video demultiplexer circuit.
Another benefit of the PICTURE_END control token is that it identifies the end of a picture to the Huffman decoder and video demultiplexer, even when the picture is shorter than typically expected and/or fewer signals than typically expected have been applied to the Huffman decoder and video demultiplexer circuit. In this case, the information held in the coded data buffer is applied to the Huffman decoder and video demultiplexer as a complete picture. The Huffman decoder and video demultiplexer can therefore still process the data according to the design of the system.
Another benefit of the PICTURE_END control token is that it empties the coded data buffer completely, so that no stray information is inadvertently left in the off-chip DRAM or in the swing buffers.
Another benefit of the PICTURE_END function is its use in error recovery. Suppose, for example, that the amount of data held in the coded data buffer is less than the amount typically used to describe the spatial information of a single picture. The last picture would then be held in the data buffer until the swing buffer was full; by definition, however, that buffer would never fill. At some point the machine would determine that an error condition existed. Instead, once the PICTURE_END token is decoded and forces the data in the coded data buffer to be applied to the Huffman decoder and video demultiplexer, the last picture can be decoded and the information emptied from the buffers. As a result, the machine does not enter an error recovery mode and can continue to process the coded data successfully.
A further benefit of using the PICTURE_END token is that the serial pipeline processor continues to process data without interruption. Because the PICTURE_END token is used, the serial pipeline processor is configured to handle less than the expected amount of data and can therefore continue processing. A prior art machine would typically stop itself because an error condition had arisen. As previously described, the coded data buffer counts macroblocks as they enter its storage area. In addition, the Huffman decoder and video demultiplexer generally know how much information is expected for decoding each picture; that is, the state machine portion of the Huffman decoder and video demultiplexer knows the number of blocks that it will process during each picture recovery cycle. When the correct number of blocks does not arrive from the coded data buffer, an error recovery routine would typically result. Because the PICTURE_END control token has reconfigured the Huffman decoder and video demultiplexer, however, they can continue to operate, since the reconfiguration informs them that they are indeed handling the proper amount of information.
Referring again to Figure 10, the token decode portion of the buffer manager detects the PICTURE_END control token generated by the start code detector. Under normal operation, the swing buffers operate as described previously: a buffer register is first filled and then emptied. In other words, a swing buffer that is only partly filled with data is not emptied until it is completely full and/or it is known that the time to empty it has arrived. The PICTURE_END control token is decoded in the token decode portion of the buffer manager, and it forces a swing buffer that is only partly filled to empty itself into the coded data buffer. The contents are then passed to the Huffman decoder and video demultiplexer, either directly or through the DRAM interface.
19. Flush operation
Another advantage of the PICTURE_END control token is that it works together with the FLUSH token. The FLUSH token is associated neither with reconfiguring the control state machines nor with supplying data to the system. Rather, it completes prior partial information for the machine-dependent state machines. Each such state machine treats the FLUSH control token as information to be disregarded. The FLUSH token is therefore used to fill the remaining empty part of the coded data buffer, so that a complete set of information can be sent to the Huffman decoder and video demultiplexer. In this respect the FLUSH token resembles buffer padding.
The token decoder in the Huffman circuit recognizes the FLUSH token and ignores the pseudo-data that the FLUSH token has forced into it. The Huffman decoder therefore operates only on the data content of the last picture buffer as it existed before the PICTURE_END token and FLUSH token arrived. A further benefit of using the PICTURE_END token, alone or in combination with the FLUSH token, is the reconfiguration and/or reorganization of the Huffman decoder circuit. On the arrival of the PICTURE_END token, the Huffman decoder circuit knows that it will have less information than is normally expected with which to decode the last picture. The Huffman decoder circuit finishes processing the information contained in the last picture and outputs this information through the DRAM to the inverse modeller. After the last picture has been identified, the Huffman decoder immediately enters its reset mode and readjusts for the arrival of the next picture information.
20. Flush function
In accordance with the present invention, the FLUSH token is passed through the entire pipeline processor to ensure that the buffers are emptied and that the other circuits are reconfigured to await the arrival of new data. More specifically, the present invention comprises a combination of a PICTURE_END token, padding words and a FLUSH token, which indicates to the serial pipeline processor that picture processing for the current picture format is complete. Thereafter, the various state machines need to be reconfigured to await the arrival of new data for new processing. Note also that the FLUSH token acts as a special reset for the system. The FLUSH token resets each stage as it passes through, but allows subsequent stages to continue processing, which prevents loss of data. In other words, the FLUSH token is a variable reset rather than an absolute reset.
21. STOP_AFTER_PICTURE
The STOP_AFTER_PICTURE function is used to shut down the processing of the serial pipeline decompression circuit at a logical point in its operation. At that point a PICTURE_END token is generated, indicating that data has finished coming in from the data input line, and the padding operation is completed. The padding operation fills partially empty DATA tokens. A FLUSH token is then generated, which passes through the serial pipeline system and pushes all of the information out of the registers, forcing the registers back into their neutral stand-by condition. In other words, the PICTURE_END token signals the end of a picture, whereas STOP_AFTER_PICTURE signals the end of all current processing.
22. Multi-standard search mode
Another feature of the present invention is the use of a SEARCH_MODE control token, which is used to reconfigure the input of the serial pipeline processor to examine the incoming bit stream. When a search mode is set, the start code detector searches only for a specified start code or marker of any of the compression standards. It will be appreciated, however, that other images from other data bit streams can also be used for this purpose. Accordingly, such images can be used throughout the present invention to produce another embodiment which, in addition to having reconfigurable circuits, uses a combination of control tokens and DATA tokens to provide similar processing.
In the present invention, the search mode is appropriate in a number of situations, including: 1) when a break in the data bit stream occurs; 2) when the user breaks the data bit stream by purposely changing channels (for example, when data arrives over a cable carrying compressed digital video); or 3) when the user activates fast forward or fast reverse on a controllable data source such as a CD or optical disc. In general, a search mode is appropriate whenever the user interrupts the normal operation of the serial pipeline at a point where the machine does not expect an interruption.
When any of the search modes is set, the start code detector looks for incoming start code patterns that are suitable for creating the machine-independent tokens. All data that enters the start code detector before a standard-dependent start code pattern is identified is treated as meaningless and discarded, and the machine stands idle while it waits for this information.
The start code detector can assume any of a number of configurations. For example, one of these configurations allows a search for a group of pictures or a higher start code. This mode causes the start code detector to discard all of its input and to look for the group_start standard pattern. When such a pattern is recognized, the start code detector generates a GROUP_START token, and the search mode is then reset automatically.
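The group-start search configuration just described can be sketched as follows. The byte pattern is the MPEG group_start_code (0x000001B8); the function name, the tuple representation of tokens and the use of one DATA token per byte are simplifications assumed for illustration.

```python
GROUP_START_CODE = bytes([0x00, 0x00, 0x01, 0xB8])   # MPEG group_start_code


def start_code_detector(data, search_mode=True):
    """Hypothetical detector: returns (tokens, search mode still set?)."""
    tokens = []
    i = 0
    while i < len(data):
        if data[i:i + 4] == GROUP_START_CODE:
            tokens.append(("GROUP_START",))
            search_mode = False          # the search mode resets automatically
            i += 4
        elif search_mode:
            i += 1                       # input before the pattern is discarded
        else:
            tokens.append(("DATA", data[i]))
            i += 1
    return tokens, search_mode


if __name__ == "__main__":
    stream = bytes([0xAA, 0xBB, 0x00, 0x00, 0x01, 0xB8, 0x12, 0x34])
    print(start_code_detector(stream))
```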
It is important to note that the Huffman decoder and video demultiplexer is a single circuit that operates on several input signals. These input signals include the CODING_STANDARD signal as well as the standard-independent set-up signals. The CODING_STANDARD signal conveys, directly from the incoming bit stream, the information required by the Huffman decoder and video demultiplexer. The Huffman decoder and video demultiplexer circuit itself, however, operates on standard-independent sequences of signals.
This mode of operation was chosen because it is the most efficient and because, by its design, it can be used wherever special control tokens are used. With these special tokens, standard-dependent information can be conveyed to the Huffman decoder and video demultiplexer in place of the actual signals themselves.
23. Inverse modeller
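The expansion performed by the inverse modeller, which this section describes, can be sketched as a simple run-length decoder. The representation of the input as (run of zeros, coefficient) pairs and the zero-filling of the remainder of the block are assumptions made for illustration; the function name is hypothetical.

```python
def expand_runs(pairs, block_size=64):
    """Expand (zero_run, level) pairs into a block of exactly 64 values."""
    values = []
    for zero_run, level in pairs:
        values.extend([0] * zero_run)    # the zeros between coefficients
        values.append(level)             # the quantized coefficient itself
    if len(values) > block_size:
        raise ValueError("run-length data overflows the block")
    values.extend([0] * (block_size - len(values)))   # pad to 64 (assumption)
    return values


if __name__ == "__main__":
    block = expand_runs([(0, 12), (2, -3), (5, 1)])
    print(len(block), block[:10])
```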
The inverse modeller is a feature common to all three standards, and it is the same for all three. In general, the DATA tokens in the token buffer contain information about the values of the quantized coefficients and about the number of zeros between the coefficients (a form of run-length coding). The inverse modeller of the present invention has been adapted for use with tokens; it simply expands the information about runs of zeros so that each DATA token contains the requisite 64 values. Thereafter, the values in the DATA tokens are quantized coefficients that can be used by the inverse quantizer.
24. Inverse quantizer
The inverse quantizer of the present invention is a required element of the decoding sequence, but it has been implemented in such a way as to allow the entire IC set to handle multi-standard data. In addition, the inverse quantizer has been adapted for use with tokens. The inverse quantizer lies between the inverse modeller and the inverse DCT (IDCT).
For example, in the present invention an adder in the inverse quantizer is used to add a constant to the pel decode number before the data moves on to the IDCT.
The IDCT uses the pel decode number, which varies with the standard used to encode the information. For the information to be decoded correctly, the inverse quantizer adds the value 1024 to the decode number before the data continues on to the IDCT.
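The text does not say which value receives the offset of 1024 or why. One familiar situation in which that figure arises is JPEG-style level shifting: with an orthonormal 8 × 8 DCT, adding 1024 to the DC coefficient before the inverse transform is equivalent to adding 128 to every reconstructed pixel. The sketch below assumes that the offset is applied to the DC term and checks the equivalence numerically with a direct (slow) IDCT; it is an illustration, not the circuit described here.

```python
import math


def idct_8x8(coeffs):
    """Direct orthonormal 8 x 8 inverse DCT (slow reference form)."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * 8 for _ in range(8)]
    for x in range(8):
        for y in range(8):
            total = 0.0
            for u in range(8):
                for v in range(8):
                    total += (c(u) * c(v) * coeffs[u][v]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            out[x][y] = total / 4
    return out


def add_dc_offset(coeffs, offset=1024):
    """Return a copy of the block with the offset added to the DC term."""
    shifted = [list(row) for row in coeffs]
    shifted[0][0] += offset          # assumption: the offset targets the DC term
    return shifted


if __name__ == "__main__":
    block = [[0] * 8 for _ in range(8)]
    block[0][0], block[0][1], block[2][3] = -40, 25, -7   # illustrative values
    plain = idct_8x8(block)
    shifted = idct_8x8(add_dc_offset(block))
    print(shifted[0][0] - plain[0][0])   # difference between the two results
```

Doing the shift as a single addition in the coefficient domain avoids a separate adder for each of the 64 output pixels.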
The adders are located in the inverse quantizer. Using these adders, which are already present, to standardize the data before it reaches the IDCT eliminates the need for additional circuitry or software in the IC for handling data compressed under the various standards. Other operations that allow multi-standard operation are performed during a "post-quantization function", discussed below.
The control tokens that accompany the data are decoded, and the various standardization routines that the inverse quantizer needs to perform are identified, as described in detail below. The implementation of these "post-quantization" functions avoids duplicated circuitry and allows the IC to handle multi-standard coded data.
25. Huffman decoder and parser
Referring again to Figures 11 and 27, the spatial decoder includes a Huffman decoder, which decodes data that has been Huffman-coded under the different compression standards.
Each of the JPEG, MPEG and H.261 standards requires certain data to be Huffman-coded, but the Huffman decoding required by each standard differs in some significant ways. Rather than designing and fabricating three separate Huffman decoders, one for each standard, the spatial decoder of the present invention identifies the aspects common to each Huffman decoder and fabricates these aspects only once, thereby saving valuable die space. In addition, a clever multi-part algorithm is used that makes further aspects of each standard's Huffman decoder common to the other standards, which would otherwise not be possible.
Briefly, the Huffman decoder 321 works in conjunction with the other units shown in Figure 27. These other units are the parser state machine 322, the input shifter 323, the index-to-data unit 324, the ALU 325 and the token formatter 326. As previously described, the connections between these blocks are governed by the two-wire interface. The operation of these units is described in more detail later; the description here concentrates on certain aspects of the Huffman decoder that support multi-standard operation in accordance with the present invention.
The parser state machine of the present invention is a programmable state machine that coordinates the operation of the other blocks of the video parser. In response to data, the parser state machine generates a control word that accompanies that data and is passed to the other blocks of the system to control their operation. Because the blocks are connected by the two-wire interface, passing the control word alongside the associated data is not only useful but essential: data and control then arrive at the same time. In Figure 27, the passing of the control word is indicated by the control line 327, which lies beneath the data line 328 connecting the blocks. Among other things, this code word identifies the particular standard that is being decoded.
The Huffman decoder 321 also performs certain control functions. In particular, the Huffman decoder 321 contains a state machine that can control certain functions of the index-to-data unit 324 and the ALU 325. Control of these units by the Huffman decoder is necessary for proper decoding of block-level information. Having the parser state machine 322 make these decisions would take too much time.
An important aspect of the Huffman decoder of the present invention is its ability to invert the coded data bits as they are read into the Huffman decoder. This is needed to decode H.261-style Huffman codes, because the particular type of Huffman code used by H.261 (and, in substance, by MPEG) has the opposite polarity to the codes used by JPEG. The use of an inverter therefore allows substantially the same table to be used by the Huffman decoder for all three standards. Other aspects of how the Huffman decoder implements all three standards are discussed in more detail in the section "More detailed description of the invention".
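The effect of the inverter can be seen with a toy prefix code. The table below is invented for illustration and is not taken from any of the three standards; the point is only that a code set of the opposite polarity can be decoded with the same table once every incoming bit is inverted.

```python
# Toy prefix code, invented for illustration only.
TABLE = {"0": "A", "10": "B", "110": "C", "111": "D"}


def decode(bits, invert=False):
    """Decode a string of '0'/'1' characters, optionally inverting each bit."""
    symbols = []
    code = ""
    for bit in bits:
        if invert:
            bit = "1" if bit == "0" else "0"   # the inverter on the input path
        code += bit
        if code in TABLE:
            symbols.append(TABLE[code])
            code = ""
    if code:
        raise ValueError("trailing bits do not form a complete code")
    return symbols


if __name__ == "__main__":
    print(decode("010110111"))                  # native polarity
    print(decode("101001000", invert=True))     # opposite polarity, same table
```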
The second part of the multi-part algorithm is performed in the index-to-data unit 324. This unit has a look-up table that provides the actual Huffman-decoded data. The entries of the table are organized according to the index numbers generated by the Huffman decoder.
The ALU 325 implements the remaining parts of the multi-part algorithm. In particular, the ALU handles sign extension. The ALU also includes a register file that holds the vector predictions and DC predictions, whose use is discussed in the sections relating to the prediction filters. In addition, the ALU includes counters that count through the structure of the picture being decoded by the spatial decoder. In particular, the dimensions of the picture are programmed into registers associated with the counters, which facilitates detection of the start of a picture and of the start of each macroblock.
In accordance with the present invention, the token formatter 326 (TF) assembles the decoded data into DATA tokens, which are then passed on to the other stages or blocks of the spatial decoder.
In the present invention, the input shifter 323 receives data from a FIFO that buffers the data passed by the start code detector. The data received by the input shifter is generally of two types: DATA tokens and start codes. The start code detector has already replaced the start codes with their respective tokens, as discussed further in the section on tokens. Note that most of the data will be DATA tokens that require decoding.
The input shifter 323 passes data serially to the Huffman decoder 321. Control tokens, on the other hand, it passes in parallel. In the Huffman decoder, the Huffman-coded data is decoded in accordance with the first part of the multi-part algorithm. In particular, each Huffman code is identified and then replaced by an index number.
The Huffman decoder 321 also identifies certain data that requires special handling by the other blocks shown in Figure 27. This data includes end of block and escape. In the present invention, time is saved by detecting these in the Huffman decoder 321 rather than in the index-to-data unit 324.
The index number is then passed to the index-to-data unit 324, which is essentially a look-up table. In accordance with one aspect of the algorithm, the look-up table is little more than the Huffman code table specified by JPEG. In general, the look-up table uses the condensed data format that JPEG specifies for transferring alternate JPEG tables.
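The split between index generation and table look-up can be illustrated with the condensed table form that JPEG uses to transmit Huffman tables: a list giving the number of codes of each length (BITS) followed by the values in code order (HUFFVAL). The sketch below follows the general decoding procedure of the JPEG specification rather than the circuit described here, and the table contents are invented for illustration.

```python
def build_tables(bits):
    """Derive per-length decode tables from a JPEG-style BITS list.

    bits[k] is the number of codes of length k + 1.
    """
    mincode, maxcode, valptr = {}, {}, {}
    code = 0     # next canonical code value at the current length
    index = 0    # position of that code in the value list
    for length, count in enumerate(bits, start=1):
        if count:
            valptr[length] = index
            mincode[length] = code
            maxcode[length] = code + count - 1
            code += count
            index += count
        code <<= 1   # longer codes continue after the ones already used
    return mincode, maxcode, valptr


def decode_index(bitstring, bits):
    """First step: turn one Huffman code into (index, bits consumed)."""
    mincode, maxcode, valptr = build_tables(bits)
    code = 0
    for length, bit in enumerate(bitstring, start=1):
        code = (code << 1) | int(bit)
        if length in maxcode and mincode[length] <= code <= maxcode[length]:
            return valptr[length] + code - mincode[length], length
    raise ValueError("no complete code found")


def index_to_data(index, huffval):
    """Second step: a plain table look-up keyed by the index."""
    return huffval[index]


BITS = [1, 1, 2]          # one 1-bit code, one 2-bit code, two 3-bit codes
HUFFVAL = [5, 3, 7, 0]    # illustrative values only

if __name__ == "__main__":
    idx, used = decode_index("1101", BITS)
    print(idx, used, index_to_data(idx, HUFFVAL))
```

Because the first step yields only a position in code order, the second step needs nothing beyond the value list exactly as it is transmitted.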
From the index-to-data unit 324, the decoded index number or other data is passed, together with the accompanying control word, to the ALU 325, which performs the operations described above.
From the ALU 325, the data and control word are passed to the token formatter 326 (TF). In the token formatter, the data is combined with the control word as necessary to form tokens, which are then passed to the subsequent stages of the spatial decoder. Note that at this point there are as many tokens as the system will use.
26. Inverse discrete cosine transform
In accordance with the present invention, the inverse discrete cosine transform (IDCT) decompresses data related to the frequency of the DC component of the picture. When a particular picture is being compressed, the frequency of the light in the picture is quantized, which reduces the overall amount of information that needs to be stored. The IDCT takes this quantized data and decompresses it back into frequency information.
The IDCT operates on a portion of the picture that is 8 × 8 pixels in size. The mathematics performed on this data is largely determined by the particular standard used to encode the data. In the present invention, however, significant use is made of the mathematical operations that the standards have in common, in order to avoid unnecessary duplication of circuitry.
By using a particular scaling order, the symmetry between the upper and lower halves of the algorithm is increased, so that common mathematical operations can be reused without additional circuitry.
The IDCT responds to a number of multi-standard tokens. The first part of the IDCT checks the incoming data to ensure that the DATA tokens are of the correct size for processing. In fact, the token stream can be corrected in some situations if the error is not too large.
27. Buffer manager
The buffer manager of the present invention receives the incoming video information and supplies the address generators with information on the timing of data arrival, display and frame rate. Multiple buffers are used to allow changes in both the presentation rate and the display rate. Typically, the presentation and display rates vary according to the data that was encoded and the monitor on which the information is displayed. The rate at which data arrives generally varies with errors in encoding or decoding, or with the source material used to create the data. When information arrives at the buffer manager, it is decompressed. The data is, however, in an order that is useful to the decompression circuits rather than to the particular display unit. When a block of data enters the buffer manager, the buffer manager supplies information to the address generator so that the block can be placed in the order that the display device can use. In doing so, the buffer manager takes into account the frame rate conversion needed to adjust the incoming data blocks so that they can be displayed on the particular display device in use.
In the present invention, the primary task of the buffer manager is to supply information to the address generators. It also has to interface with other elements of the system, however. For example, there is an interface to an input FIFO, which passes tokens to the buffer manager; the buffer manager in turn passes these tokens to the write address generator.
The buffer manager also interfaces with the display address generator, from which it receives information on whether the display device is ready to display new data. The buffer manager also confirms that the display address generator has cleared information from a buffer for display.
The buffer manager of the present invention keeps track of whether each buffer is empty, full, ready for use or in use. It also keeps track of the presentation number associated with the particular data in each buffer. In this way the buffer manager determines the state of the buffers, in part by making only one buffer ready for display at a time. Once a buffer has been displayed, it is in a "vacant" state. When the buffer manager receives a PICTURE_START, FLUSH, valid or access token, it determines the state of each buffer and its readiness to accept new data. For example, the PICTURE_START token causes the buffer manager to cycle through the buffers to find one that can accept new data.
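The bookkeeping described above can be sketched as a small state tracker. The state names, the method names and the rule of displaying the lowest presentation number first are assumptions made for illustration; only the ideas of per-buffer state, presentation numbers, a single buffer ready for display at a time, and the search triggered by PICTURE_START come from the text.

```python
from enum import Enum


class State(Enum):
    VACANT = "vacant"     # displayed or never used; can accept new data
    IN_USE = "in use"     # a picture is currently being written
    FULL = "full"         # holds a complete picture awaiting display
    READY = "ready"       # the one buffer offered for display


class BufferManager:
    """Hypothetical tracker for the state of each picture buffer."""

    def __init__(self, count=3):
        self.states = [State.VACANT] * count
        self.presentation = [None] * count

    def picture_start(self, presentation_number):
        """Cycle through the buffers to find one that can accept new data."""
        for i, state in enumerate(self.states):
            if state is State.VACANT:
                self.states[i] = State.IN_USE
                self.presentation[i] = presentation_number
                return i
        return None       # nothing vacant: the incoming picture must wait

    def picture_complete(self, index):
        self.states[index] = State.FULL

    def mark_ready(self):
        """Offer one full buffer for display, never more than one at a time."""
        if State.READY in self.states:
            return None
        full = [i for i, s in enumerate(self.states) if s is State.FULL]
        if not full:
            return None
        index = min(full, key=lambda i: self.presentation[i])
        self.states[index] = State.READY
        return index

    def displayed(self, index):
        """Once displayed, the buffer returns to the vacant state."""
        self.states[index] = State.VACANT
        self.presentation[index] = None


if __name__ == "__main__":
    bm = BufferManager(count=2)
    first = bm.picture_start(0)
    bm.picture_complete(first)
    print(bm.mark_ready(), bm.states)
```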
The buffer manager can also be configured to handle the multi-standard requirements dictated by the tokens it receives. For example, in the H.261 standard, data may be skipped during display. If such a token arrives at the buffer manager, the data to be skipped is flushed from the buffer in which it is stored.
Thus, through management of the buffers, data can be displayed efficiently according to the compression standard with which it was encoded, the rate at which it is decoded, and the particular type of display device in use.
It is believed that the foregoing description sets out the general concepts, systems and operation of the present invention in sufficient detail to enable one of ordinary skill in the art to make and use the invention with all of its attendant features, objects and advantages. Nevertheless, to facilitate a further, more thorough understanding of the invention, and to provide additional detail relating to more specific, commercial implementations of its various embodiments, the reader is referred to the further description and explanation below.
What follows is a more detailed description of a multi-standard video decoder chip set. It is divided into three main sections: A, B and C. For convenience of organization, clarity and explanation, this additional disclosure is set out in the following sections.
Description of the features common to the chips in the chip set:
Tokens
Two-wire interface
DRAM interface
Microprocessor interface (MPI)
Clocks
Description of the spatial decoder chip
Description of the temporal decoder chip
A.1
The first description section covers the majority of the circuit design issues associated with using the chip set.
A.1.1 Typographic conventions
A small set of typographic conventions is used to emphasize certain classes of information:
NAMES_OF_TOKENS: token names
wire_name: signal names (separate styles for active-high and active-low signals)
register_name: register names
A.2 Video decoder family
30 MHz operation
Decodes MPEG, JPEG and H.261
Coded data rates up to 25 Mb/s
Video data rates up to 21 MB/s
MPEG resolutions up to 704 × 480, 30 Hz, 4:2:0
Flexible chroma sampling formats
Full JPEG baseline decoding
Glue-less page-mode DRAM interface
208-pin PQFP package
Independent coded data and decoder clocks
Re-orders MPEG picture sequence
The video decoder family provides a low chip-count solution for implementing high-resolution digital video decoders. The chip set is configurable to support three different video and picture coding systems: JPEG, MPEG and H.261.
Full JPEG baseline picture decoding is supported. 720 × 480, 30 Hz, 4:2:2 JPEG-encoded video can be decoded in real time.
CIF (Common Interchange Format) and QCIF H.261 video can be decoded. Full-feature MPEG video with formats up to 740 × 480, 30 Hz, 4:2:0 can be decoded.
Note: the above values merely illustrate one embodiment of the invention and are not limiting. Accordingly, it will be appreciated that other values and/or ranges may be used.
A.2.1 System configuration
A.2.1.1 Output formatting
In each of the examples given below, some form of output formatter is required to take the data presented at the output of the Spatial Decoder or Temporal Decoder and reformat it for a computer or display system. The details of this formatting vary between applications. In a simple case, all that is required is an address generator that takes the block-formatted data output by the decoder chip and writes it into memory in raster order.
The pixel formatter is a single-chip VLSI device that provides a wide range of output formatting functions.
A.2.1.2 JPEG still picture decoding
A single Spatial Decoder, with no off-chip DRAM, can rapidly decode baseline JPEG images. The Spatial Decoder supports all features of baseline JPEG. However, the image size that can be decoded is limited by the size of the output buffer provided by the user. The characteristics of the output formatter may limit the chroma sampling formats and color spaces that can be supported.
A.2.1.3 JPEG video decoding
Adding off-chip DRAM to the Spatial Decoder allows it to decode JPEG-encoded video pictures in real time. The size and speed of the buffers required depend on the video and coded data rates. The Temporal Decoder is not required to decode JPEG-encoded video. However, if a Temporal Decoder is present in a multi-standard decoder chip-set, it simply passes the data through without alteration or modification when the system is configured for JPEG operation.
A.2.1.4 H.261 decoding
An H.261 video decoder requires both a Spatial Decoder and a Temporal Decoder. The DRAM interfaces on both devices are configurable, so that the quantity of DRAM required for proper operation can be reduced when working with small picture formats and low coded data rates. Typically, the Spatial Decoder and the Temporal Decoder each require a single 4 Mb (e.g. 512K × 8) DRAM.
A.2.1.5 MPEG decoding
The configuration required for MPEG operation is the same as for H.261. However, as will be appreciated by those of ordinary skill in the art, larger DRAM buffers may be required to support the larger picture formats possible with MPEG.
A.3 Tokens
A.3.1 Token format
In accordance with the present invention, tokens provide an extensible format for communicating information through the decoder chip-set. Although in the present invention each word of a token is at least 8 bits wide, one of ordinary skill in the art will appreciate that tokens can be of any width. In addition, a single token can span one or more words; this is accomplished using an extension bit in each word. The formats of the tokens are summarized in Table A.3.1.
The extension bit indicates whether a token continues into another word. It is set to 1 in all words of a token except the last one. If the first word of a token has an extension bit of 0, the token is only one word long.
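The extension-bit rule can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that each word is available as an (extension bit, 8-bit data) pair; the function name and the sample stream are hypothetical and are not part of the chip-set.

```python
def split_tokens(words):
    """Group (extension_bit, data) words into tokens.

    The extension bit is 1 on every word of a token except the last,
    so a word with extension bit 0 closes the current token.
    """
    tokens, current = [], []
    for ext, data in words:
        current.append(data)
        if ext == 0:            # last word of this token
            tokens.append(current)
            current = []
    return tokens


# Hypothetical stream: a one-word token followed by a three-word token.
stream = [(0, 0x17), (1, 0xF2), (1, 0x02), (0, 0xC0)]
tokens = split_tokens(stream)
```

With this stream the first word forms a token on its own, because its extension bit is 0, and the remaining three words form a second token.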
Each token is identified by an address field that starts in bit 7 of the first word of the token. The address field is of variable length and can potentially extend over multiple words (in the current chips no address is more than 8 bits long; however, one of ordinary skill in the art will appreciate that addresses can be of any length).
Some interfaces transfer more than 8 bits of data. For example, the output of the Spatial Decoder is 9 bits wide (10 bits including the extension bit). The only token that takes advantage of these extra bits is the DATA token. A DATA token can have as many bits as are necessary to carry out processing at a particular place in the system. All other tokens ignore the extra bits.
A.3.2 The DATA token
The DATA token carries data from one processing stage to the next. Consequently, the characteristics of this token change as it passes through the decoder. Furthermore, the meaning of the data carried by a DATA token varies with the token's position in the system; that is, the data is position dependent. In this regard, the data may be either frequency-domain or pixel-domain data, depending on where the DATA token is within the Spatial Decoder. For example, at the input of the Spatial Decoder, DATA tokens carry bit-serial coded video data packed into 8-bit words. Here, there is no limit on the length of each token. By contrast, at the output of the Spatial Decoder each DATA token carries exactly 64 words, and each word is 9 bits wide.
A.3.3 Using token formatted data
In some applications, circuitry may need to connect directly to the input or output of the decoder or chip-set. In most cases it is sufficient to collect DATA tokens and to detect the few tokens that provide synchronization information (such as PICTURE_START). In this regard, see section A.16, "Connecting to the output of the Spatial Decoder", and section A.19, "Connecting to the output of the Temporal Decoder".
As discussed above, observing the activity on the extension bit is sufficient to identify when each new token starts. Likewise, the extension bit signals the last word of the current token. In addition, the address field can be tested to identify the token. Unwanted or unrecognized tokens can be consumed (and discarded) without knowledge of their content. A recognized token, however, causes an appropriate action to occur.
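As a hedged illustration of testing the address field, the Python sketch below matches the first word of a token against a handful of address prefixes copied from Table A.3.1 and drops anything it does not recognize. The dictionary, the function names and the choice of entries are assumptions made for the example; a real stage would recognize only the tokens relevant to it.

```python
# A few addresses from Table A.3.1, written as bit strings starting at bit 7.
# The addresses form a prefix code, so at most one entry can match a word.
ADDRESS_PREFIXES = {
    "001": "QUANT_SCALE",
    "010": "PREDICTION_MODE",
    "100": "MVD_FORWARDS",
    "101": "MVD_BACKWARDS",
    "000001": "DATA",
    "00010010": "PICTURE_START",
    "00010111": "FLUSH",
}


def identify(first_word):
    """Return the token name for the first word of a token, or None."""
    bits = format(first_word & 0xFF, "08b")
    for prefix, name in ADDRESS_PREFIXES.items():
        if bits.startswith(prefix):
            return name
    return None


def keep_recognized(tokens):
    """Label recognized tokens and discard the rest without reading them."""
    labelled = []
    for words in tokens:
        name = identify(words[0])
        if name is not None:
            labelled.append((name, words))
    return labelled


result = keep_recognized([[0x12, 0x03], [0xF2, 0x02, 0xC0], [0x17]])
```

Here the second token is discarded only because its address is absent from this reduced table; the remaining bits of a matched first word (for example the component ID in a DATA token) are left for the caller to interpret.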
Furthermore, the data input to the Spatial Decoder can be supplied either as bytes of coded data or in DATA tokens (see section A.10, "Coded data input"). Supplying tokens through the coded data port or through the MPI allows many features of the decoder chip-set to be configured from the data stream. This provides an alternative to performing the configuration through the MPI.
Table A.3.1 Token list
7 6 5 4 3 2 1 0   Token name                  Reference
0 0 1             QUANT_SCALE
0 1 0             PREDICTION_MODE
0 1 1             (reserved)
1 0 0             MVD_FORWARDS
1 0 1             MVD_BACKWARDS
0 0 0 0 1         QUANT_TABLE
0 0 0 0 0 1       DATA
1 1 0 0 0 0       COMPONENT_NAME
1 1 0 0 0 1       DEFINE_SAMPLING
1 1 0 0 1 0       JPEG_TABLE_SELECT
1 1 0 0 1 1       MPEG_TABLE_SELECT
1 1 0 1 0 0       TEMPORAL_REFERENCE
1 1 0 1 0 1       MPEG_DCH_TABLE
1 1 0 1 1 0       (reserved)
1 1 0 1 1 1       (reserved)
1 1 1 0 0 0 0     (reserved) SAVE_STATE
1 1 1 0 0 0 1     (reserved) RESTORE_STATE
1 1 1 0 0 1 0     TIME_CODE
1 1 1 0 0 1 1     (reserved)
0 0 0 0 0 0 0 0   NULL
0 0 0 0 0 0 0 1   (reserved)
0 0 0 0 0 0 1 0   (reserved)
0 0 0 0 0 0 1 1   (reserved)
0 0 0 1 0 0 0 0   SEQUENCE_START
0 0 0 1 0 0 0 1   GROUP_START
0 0 0 1 0 0 1 0   PICTURE_START
0 0 0 1 0 0 1 1   SLICE_START
0 0 0 1 0 1 0 0   SEQUENCE_END
0 0 0 1 0 1 0 1   CODING_STANDARD
0 0 0 1 0 1 1 0   PICTURE_END
0 0 0 1 0 1 1 1   FLUSH
0 0 0 1 1 0 0 0   FIELD_INFO
0 0 0 1 1 0 0 1   MAX_COMP_ID
0 0 0 1 1 0 1 0   EXTENSION_DATA
0 0 0 1 1 0 1 1   USER_DATA
0 0 0 1 1 1 0 0   DHT_MARKER
0 0 0 1 1 1 0 1   DQT_MARKER
0 0 0 1 1 1 1 0   (reserved) DNL_MARKER
0 0 0 1 1 1 1 1   (reserved) DRI_MARKER
1 1 1 0 1 0 0 0   (reserved)
1 1 1 0 1 0 0 1   (reserved)
1 1 1 0 1 0 1 0   (reserved)
1 1 1 0 1 0 1 1   (reserved)
1 1 1 0 1 1 0 0   BIT_RATE
1 1 1 0 1 1 0 1   VBV_BUFFER_SIZE
1 1 1 0 1 1 1 0   VBV_DELAY
1 1 1 0 1 1 1 1   PICTURE_TYPE
1 1 1 1 0 0 0 0   PICTURE_RATE
1 1 1 1 0 0 0 1   PEL_ASPECT
1 1 1 1 0 0 1 0   HORIZONTAL_SIZE
1 1 1 1 0 0 1 1   VERTICAL_SIZE
1 1 1 1 0 1 0 0   BROKEN_CLOSED
1 1 1 1 0 1 0 1   CONSTRAINED
1 1 1 1 0 1 1 0   (reserved) SPECTRAL_LIMIT
1 1 1 1 0 1 1 1   DEFINE_MAX_SAMPLING
1 1 1 1 1 0 0 0   (reserved)
1 1 1 1 1 0 0 1   (reserved)
1 1 1 1 1 0 1 0   (reserved)
1 1 1 1 1 0 1 1   (reserved)
1 1 1 1 1 1 0 0   HORIZONTAL_MBS
1 1 1 1 1 1 0 1   VERTICAL_MBS
1 1 1 1 1 1 1 0   (reserved)
1 1 1 1 1 1 1 1   (reserved)
A.3.4 Description of tokens
This section documents the tokens that, in accordance with the present invention, are implemented in the Spatial Decoder and the Temporal Decoder; see Table A.3.2. Note: "r" signifies bits that are currently reserved and have the value 0. All integers are unsigned unless otherwise noted.
Table A.3.2 Tokens provided in the Spatial Decoder and Temporal Decoder
E 7 6 5 4 3 2 1 0   Description
1 1 1 1 0 1 1 0 0   BIT_RATE (test information only). Carries the MPEG bit rate parameter R. Generated by the Huffman decoder when decoding an MPEG bitstream. b: an 18-bit integer as defined by MPEG.
1 r r r r r r b b
1 b b b b b b b b
0 b b b b b b b b
1 1 1 1 1 0 1 0 0   BROKEN_CLOSED. Carries two MPEG flag bits. c: closed_gop; b: broken_link.
0 r r r r r r c b
1 0 0 0 1 0 1 0 1   CODING_STANDARD. s: an 8-bit integer indicating the current coding standard. The values currently assigned are: 0 = H.261, 1 = JPEG, 2 = MPEG.
0 s s s s s s s s
1 1 1 0 0 0 0 c c   COMPONENT_NAME. Communicates the relationship between a component ID and the component name. See ... c: 2-bit component ID; n: 8-bit component "name".
0 n n n n n n n n
1 1 1 1 1 0 1 0 1   CONSTRAINED. c: carries the constrained parameters flag decoded from the MPEG bitstream.
0 r r r r r r r c
1 0 0 0 0 0 1 c c   DATA. Carries data through the decoder chip-set. c: a 2-bit integer component ID (see A.3.5.1). This field is not defined for tokens that carry coded data (rather than pixel information).
1 d d d d d d d d
0 d d d d d d d d
1 1 1 1 1 0 1 1 1   DEFINE_MAX_SAMPLING. Maximum horizontal and vertical sampling numbers. These describe the maximum number of blocks horizontally/vertically in any component of a macroblock. See A.3.5.2. h: 2-bit horizontal sampling number; v: 2-bit vertical sampling number.
1 r r r r r r h h
0 r r r r r r v v
1 1 1 0 0 0 1 c c   DEFINE_SAMPLING. Horizontal and vertical sampling numbers for a particular color component. See A.3.5.2. c: 2-bit component ID; h: 2-bit horizontal sampling number; v: 2-bit vertical sampling number.
1 r r r r r r h h
0 r r r r r r v v
0 0 0 0 1 1 1 0 0   DHT_MARKER. Informs the Video Demux that the DATA token that follows contains the specification of a Huffman table, described using the JPEG "define Huffman table segment" syntax. This token is only valid when the coding standard is JPEG. It is generated by the start code detector during JPEG decoding when a DHT marker is encountered in the data stream.
0 0 0 0 1 1 1 1 0   DNL_MARKER. Informs the Video Demux that the DATA token that follows contains the JPEG parameter NL, which specifies the number of lines in a frame. It is generated by the start code detector during JPEG decoding when a DNL marker is encountered in the data stream.
0 0 0 0 1 1 1 0 1   DQT_MARKER. Informs the Video Demux that the DATA token that follows contains the specification of a quantization table, described using the JPEG "define quantization table segment" syntax. This token is only valid when the coding standard is JPEG. The Video Demux generates a QUANT_TABLE token containing the new quantization table information. It is generated by the start code detector during JPEG decoding when a DQT marker is encountered in the data stream.
0 0 0 0 1 1 1 1 1   DRI_MARKER. Informs the Video Demux that the DATA token that follows contains the JPEG parameter Ri, which specifies the number of minimum coding units between restart markers. It is generated by the start code detector during JPEG decoding when a DRI marker is encountered in the data stream.
1 0 0 0 1 1 0 1 0   EXTENSION_DATA (JPEG). Informs the Video Demux that the DATA token that follows contains extension data. See A.11.3, "Conversion of start codes to tokens", and A.14.6, "Receiving user and extension data". In JPEG operation the 8-bit field v carries the JPEG marker value. This allows the class of extension data to be identified.
0 v v v v v v v v
0 0 0 0 1 1 0 1 0   EXTENSION_DATA (MPEG). Informs the Video Demux that the DATA token that follows contains extension data. See A.11.3, "Conversion of start codes to tokens", and A.14.6, "Receiving user and extension data".
1 0 0 0 1 1 0 0 0   FIELD_INFO. Carries information about the picture that follows, to aid its display. This function is not signalled by any existing coding standard. t: if the picture is an interlaced frame, this bit indicates whether the upper field is first (t = 0) or second. p: if pictures are fields, this bit indicates whether the next picture is the upper (p = 0) or lower field of the frame. f: a 3-bit number indicating the position of the field in the 8-field PAL sequence.
0 r r r t p f f f
0 0 0 0 1 0 1 1 1   FLUSH. Used to indicate the end of the current coded data and to push the end of the data stream through the decoder.
0 0 0 0 1 0 0 0 1   GROUP_START. Generated when the group of pictures start code is found when decoding MPEG, or when the frame marker is found when decoding JPEG.
1 1 1 1 1 1 1 0 0   HORIZONTAL_MBS. h: a 13-bit integer indicating the horizontal width of the picture in macroblocks.
1 r r r h h h h h
0 h h h h h h h h
1 1 1 1 1 0 0 1 0   HORIZONTAL_SIZE. h: a 16-bit integer indicating the horizontal width of the picture in pixels. It can take any integer value.
1 h h h h h h h h
0 h h h h h h h h
1 1 1 0 0 1 0 c c   JPEG_TABLE_SELECT. Informs the inverse quantizer which quantization table to use for the specified color component. c: 2-bit component ID (see A.3.5.1); t: 2-bit integer table number.
0 r r r r r r t t
1 0 0 0 1 1 0 0 1   MAX_COMP_ID. m: a 2-bit integer indicating the maximum value of component ID (see A.3.5.1) that will be used in the next picture.
0 r r r r r r m m
1 1 1 0 1 0 1 c c   MPEG_DCH_TABLE. Configures which DC coefficient Huffman table should be used for color component cc. c: 2-bit component ID (see A.3.5.1); t: 2-bit integer table number.
0 r r r r r r t t
0 1 1 0 0 1 1 d n   MPEG_TABLE_SELECT. Informs the inverse quantizer whether to use the default or the user-defined quantization table for intra or non-intra information. n: 0 indicates intra information, 1 indicates non-intra; d: 0 indicates the default table, 1 indicates the user-defined table.
1 1 0 1 d v v v v   MVD_BACKWARDS. Carries one component (horizontal or vertical) of the backward motion vector. d: 0 indicates the x component, 1 indicates the y component; v: a 12-bit two's complement number whose least significant bit provides half-pixel resolution.
0 v v v v v v v v
1 1 0 0 d v v v v   MVD_FORWARDS. Carries one component (horizontal or vertical) of the forward motion vector. d: 0 indicates the x component, 1 indicates the y component; v: a 12-bit two's complement number whose least significant bit provides half-pixel resolution.
0 v v v v v v v v
0 0 0 0 0 0 0 0 0   NULL. Does nothing.
1 1 1 1 1 0 0 0 1   PEL_ASPECT. p: a 4-bit integer as defined by MPEG.
0 r r r r p p p p
0 0 0 0 1 0 1 1 0   PICTURE_END. Inserted by the start code detector to indicate the end of the current picture.
1 1 1 1 1 0 0 0 0   PICTURE_RATE. p: a 4-bit integer as defined by MPEG.
0 r r r r p p p p
1 0 0 0 1 0 0 1 0   PICTURE_START. Indicates the start of a new picture. n: a 4-bit picture index allocated to the picture by the start code detector.
0 r r r r n n n n
1 1 1 1 0 1 1 1 1   PICTURE_TYPE (MPEG). p: a 2-bit integer indicating the picture coding type of the picture that follows: 0 = intra, 1 = predicted, 2 = bidirectionally predicted, 3 = DC intra.
0 r r r r r r p p
1 1 1 1 0 1 1 1 1   PICTURE_TYPE (H.261). Indicates whether various H.261 options are on (1) or off (0). These options are always off for MPEG and JPEG. s: split screen indicator; d: document camera; f: freeze picture release. Source picture format: q = 0 QCIF, q = 1 CIF.
1 r r r r r r 0 1
0 r r s d f q 1 1
0 0 1 0 h y x b f   PREDICTION_MODE. A set of flag bits indicating the prediction mode for the macroblocks that follow. f: forward prediction; b: backward prediction; x: reset forward vector prediction; y: reset backward vector prediction; h: enable H.261 loop filter.
0 0 0 1 s s s s s   QUANT_SCALE. Informs the inverse quantizer of a new scale factor. s: a 5-bit integer in the range 1 to 31. The value 0 is reserved.
1 0 0 0 0 1 r t t   QUANT_TABLE. Loads the specified inverse quantizer table with 64 8-bit unsigned integers. The values are in zig-zag order. t: a 2-bit integer specifying the inverse quantizer table to be loaded.
1 q q q q q q q q
0 q q q q q q q q
0 0 0 0 1 0 1 0 0   SEQUENCE_END. The MPEG sequence_end_code and the JPEG EOI marker cause this token to be generated.
0 0 0 0 1 0 0 0 0   SEQUENCE_START. Generated by the MPEG sequence start code.
1 0 0 0 1 0 0 1 1   SLICE_START. Corresponds to the MPEG slice_start, the H.261 GOB and the JPEG resync interval. The meaning of the 8-bit integer s differs between coding standards. MPEG: slice vertical position minus 1. H.261: group of blocks number minus 1. JPEG: resynchronization interval identification (4 least significant bits only).
0 s s s s s s s s
1 1 1 0 1 0 0 t t   TEMPORAL_REFERENCE. t: carries the temporal reference. For MPEG this is a 10-bit integer. For H.261 only the 5 least significant bits are used; the most significant bits are always 0.
0 t t t t t t t t
1 1 1 1 0 0 1 0 d   TIME_CODE. The MPEG time_code. d: drop frame flag; h: 5-bit integer specifying hours; m: 6-bit integer specifying minutes; s: 6-bit integer specifying seconds; p: 6-bit integer specifying pictures.
1 r r r h h h h h
1 r r m m m m m m
1 r r s s s s s s
0 r r p p p p p p
1 0 0 0 1 1 0 1 1   USER_DATA (JPEG). Informs the Video Demux that the DATA token that follows contains user data. See A.11.3, "Conversion of start codes to tokens", and A.14.6, "Receiving user and extension data". In JPEG operation the 8-bit field v carries the JPEG marker value. This allows the class of user data to be identified.
0 v v v v v v v v
0 0 0 0 1 1 0 1 1   USER_DATA (MPEG). Informs the Video Demux that the DATA token that follows contains user data. See A.11.3, "Conversion of start codes to tokens", and A.14.6, "Receiving user and extension data".
1 1 1 1 0 1 1 0 1   VBV_BUFFER_SIZE. s: a 10-bit integer as defined by MPEG.
1 r r r r r r s s
0 s s s s s s s s
1 1 1 1 0 1 1 1 0   VBV_DELAY. b: a 16-bit integer as defined by MPEG.
1 b b b b b b b b
0 b b b b b b b b
1 1 1 1 1 1 1 0 1   VERTICAL_MBS. v: a 13-bit integer indicating the vertical size of the picture in macroblocks.
1 r r r v v v v v
0 v v v v v v v v
1 1 1 1 1 0 0 1 1   VERTICAL_SIZE. v: a 16-bit integer indicating the vertical size of the picture in pixels. It can take any integer value.
1 v v v v v v v v
0 v v v v v v v v
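To make the field layouts in Table A.3.2 concrete, the following Python sketch unpacks two of the multi-word tokens. It assumes that the bits of a field are carried most significant bit first across successive words, which the table layout suggests but does not state explicitly, and the function names are hypothetical.

```python
def horizontal_size(words):
    """Unpack the 16-bit h field of a HORIZONTAL_SIZE token (3 words)."""
    if words[0] != 0xF2:                  # address 1111 0010
        raise ValueError("not a HORIZONTAL_SIZE token")
    return (words[1] << 8) | words[2]     # assumed: high byte first


def motion_vector(words):
    """Unpack an MVD_FORWARDS or MVD_BACKWARDS token (2 words).

    Returns (axis, displacement in pixels).
    """
    head, low = words
    axis = "y" if head & 0x10 else "x"    # d bit: 0 = x, 1 = y
    raw = ((head & 0x0F) << 8) | low      # 12-bit two's complement value
    if raw & 0x800:                       # sign bit set
        raw -= 0x1000
    return axis, raw / 2                  # LSB is worth half a pixel


width = horizontal_size([0xF2, 0x02, 0xC0])
mv_x = motion_vector([0x8F, 0xFF])        # MVD_FORWARDS, x, all ones
mv_y = motion_vector([0x90, 0x03])        # MVD_FORWARDS, y, value 3
```

In this sketch a raw value of all ones decodes to minus half a pixel, which is the smallest backward step the 12-bit field can express.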
A.3.5 Numbers signalled in tokens
A.3.5.1 Component identification number
In accordance with the present invention, the component ID number is a 2-bit integer that specifies a color component. Typically, this 2-bit field is placed in the header of a DATA token. For MPEG and H.261 the relationship is straightforward; see Table A.3.3.
Table A.3.3 Component ID for MPEG and H.261
Component ID   MPEG or H.261 color component
0              Luminance (Y)
1              Blue difference signal (Cb/U)
2              Red difference signal (Cr/V)
3              Not used
For JPEG the situation is more complex, because JPEG does not restrict the color components that can be used. The decoder chips permit up to four different color components in each scan. The IDs are allocated sequentially as the specification of each color component arrives at the decoder.
A.3.5.2 Horizontal and vertical sampling numbers
For each of the four color components there is a specification of the number of blocks arranged horizontally and vertically in a macroblock. This specification is a 2-bit integer that is one less than the number of blocks.
For example, in MPEG (or H.261) with 4:2:0 chroma sampling (Figure A.15.4) and component IDs allocated as in Table A.3.3, the sampling numbers are as follows.
Table A.3.4 Sampling numbers for 4:2:0/MPEG
Component ID   Horizontal sampling number   Width in blocks   Vertical sampling number   Height in blocks
0              1                            2                 1                          2
1              0                            1                 0                          1
2              0                            1                 0                          1
3              Not used                     Not used          Not used                   Not used
For JPEG, see A.3.5.1. Note: JPEG requires a 2:1:1 structure for its macroblocks when processing 4:2:2 data.
Table A.3.5 Sampling numbers for 4:2:2 JPEG
Component ID   Horizontal sampling number   Width in blocks   Vertical sampling number   Height in blocks
Y              1                            2                 0                          1
U              0                            1                 0                          1
V              0                            1                 0                          1
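The relationship between sampling numbers and block counts can be checked with a short Python sketch. The dictionaries simply restate Tables A.3.4 and A.3.5; the function name is hypothetical.

```python
def blocks_per_macroblock(sampling):
    """Map each component to its block count in one macroblock.

    `sampling` maps a component ID to (horizontal, vertical) sampling
    numbers; each number is one less than the corresponding block count.
    """
    return {cid: (h + 1) * (v + 1) for cid, (h, v) in sampling.items()}


mpeg_420 = {0: (1, 1), 1: (0, 0), 2: (0, 0)}          # Table A.3.4
jpeg_422 = {"Y": (1, 0), "U": (0, 0), "V": (0, 0)}    # Table A.3.5

mpeg_blocks = blocks_per_macroblock(mpeg_420)
jpeg_blocks = blocks_per_macroblock(jpeg_422)
```

For the 4:2:0 case this gives four luminance blocks and one block for each color difference component, six blocks per macroblock in total.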
A.3.6 Special token formats
In accordance with the present invention, tokens such as the DATA token and the QUANT_TABLE token are used in their "extended form" within the decoder chip-set. In the extended form the token includes some data. DATA tokens can contain coded data or pixel data. QUANT_TABLE tokens contain quantizer table information.
Furthermore, "non-extended form" versions of these tokens are defined in the present invention as "empty". This token format provides a place in the token stream that can subsequently be filled by an extended version of the same token. This format is mainly applicable to encoders and is therefore not described further here.
Table A.3.6 Tokens for different standards
Token name    MPEG    JPEG    H.261
  BIT_RATE     /
  BROKEN_CLOSED     /
  CODING_STANDARD     /     /     /
  COMPONENT_NAME     /
  CONSTRAINED     /
  DATA     /     /     /
  DEFINE_MAX_SAMPLING     /     /     /
  DEFINE_SAMPLING     /     /     /
  DHT_MARKER     /
  DNL_MARKER     /
  DQT_MARKER     /
  DRI_MARKER     /
 EXTENSION_DATA     /     /
 FIELD_INFO
 FLUSH     /     /     /
 GROUP_START     /     /
 HORIZONTAL_MBS     /     /     /
 HORIZONTAL_SIZE     /     /     /
 JPEG_TABLE_SELECT     /
 MAX_COMP_ID     /     /     /
 MPEG_DCH_TABLE     /
 MPEG_TABLE_SELECT     /
 MVD_BACKWARDS     /
 MVD_FORWARDS     /     /
 NULL     /     /     /
 PEL_ASPECT     /
 PICTURE_END     /     /     /
 PICTURE_RATE     /
 PICTURE_START     /     /     /
 PICTURE_TYPE     /     /     /
 PREDICTION_MODE     /     /     /
 QUANT_SCALE     /     /
 QUANT_TABLE     /     /
 SEQUENCE_END     /     /
 SEQUENCE_START     /     /     /
 SLICE_START     /     /     /
 TEMPORAL_REFERENCE     /     /
 TIME_CODE     /
 USER_DATA     /     /
 VBV_BUFFER_SIZE     /
 VBV_DELAY     /
 VERTICAL_MBS     /     /     /
 VERTICAL_SIZE     /     /     /
A.3.7 Use of tokens for different standards
In accordance with the present invention, each standard uses a different subset of the defined tokens; see Table A.3.6.
A.4 The two-wire interface
A.4.1 Two-wire interfaces and the token port
A simple two-wire valid/accept protocol is used at all levels in the chip-set to control the flow of information. Data is transferred between blocks only when both the sender and the receiver are observed to be ready at the rising edge of the clock. There are three possible cases:
1) Data transfer
2) Receiver not ready
3) Sender not ready
If the sender is not ready (case 3), the input of the receiver must wait. If the receiver is not ready (case 2), the sender continues to present the same data on its output until it is accepted by the receiver.
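The valid/accept handshake can be modelled, under simplifying assumptions, with a few lines of Python. The sketch treats each list index as one rising clock edge and ignores all electrical timing; the function and argument names are hypothetical.

```python
def simulate(valid, accept, data):
    """Return the words transferred across a two-wire interface.

    valid[i]  - the sender has a word ready at clock edge i
    accept[i] - the receiver is ready at clock edge i
    data      - the words the sender wants to send, in order
    """
    received, index = [], 0
    for v, a in zip(valid, accept):
        if index >= len(data):
            break
        if v and a:
            # Both sides ready at this edge: the word is transferred.
            received.append(data[index])
            index += 1
        # Otherwise nothing moves: the sender keeps presenting
        # data[index], or the receiver waits for the sender.
    return received


words = simulate(valid=[1, 1, 0, 1], accept=[1, 0, 1, 1], data=["A", "B", "C"])
```

In this trace word A moves on the first edge, word B is held while the receiver is not ready and again while the sender is not ready, and it is finally transferred on the fourth edge.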
When token information is transferred between devices, the two-wire interface between them is called a token port.
A.4.2 Where used
In accordance with the present invention, the decoder chip-set uses two-wire interfaces to connect the three chips. In addition, the coded data input to the Spatial Decoder is also a two-wire interface.
A.4.3 Bus signals
The width of the data word transferred by the two-wire interface varies according to the needs of the interface concerned (see Figure 35, "Tokens on interfaces wider than 8 bits"). For example, 12-bit coefficients are input to the inverse discrete cosine transform (IDCT), but only 9 bits are output.
Table A.4.1 Two-wire interface data width
Interface                              Data width (bits)
Coded data input to Spatial Decoder    8
Output port of Spatial Decoder         9
Input port of Temporal Decoder         9
Output port of Temporal Decoder        8
Input port of pixel formatter          8
In addition to the data signals, three other signals are transferred by the two-wire interface:
valid
accept
extension
A.4.3.1 The extension signal
The extension signal corresponds to the token extension bit described earlier.
A.4.4 Design considerations
The two-wire interface is intended for short-range, point-to-point communication between chips.
The decoder chips should be placed adjacent to each other so as to minimize the length of the PCB tracks between chips. Where possible, track lengths should be kept below 25 mm. The PCB track capacitance should be kept to a minimum.
The clock distribution should be designed to minimize the clock skew between chips. If there is any clock skew, it should be arranged so that "receiving chips" see the clock before "sending chips".
All chips communicating through two-wire interfaces should operate from the same digital power supply.
A.4.5 Interface timing
Table A.4.2 Two-wire interface timing
Num.   Characteristic             30 MHz min.   30 MHz max.   Unit   Note a, b
1      Input signal set-up time   5                           ns
2      Input signal hold time     0                           ns
3      Output signal drive time                 23            ns
4      Output signal hold time    2                           ns
a. The figures in Table A.4.2 may vary with the design.
b. The maximum signal loading is 20 pF.
Note 1: Figure A.16.3 shows the two-wire interface between the system demultiplexer chip and the coded data port of the Spatial Decoder operating from the main decoder clock. This is optional, because this two-wire interface can work from the coded data clock, which can be asynchronous to the decoder clock. See section A.10.5, "Coded data clock". Similarly, the display interface of the pixel formatter can operate from a clock that is asynchronous to the main decoder clock.
A.4.6 Signal levels
The two-wire interface uses CMOS inputs and outputs. VIH(min) is approximately 70% of VDD and VIL(max) is approximately 30% of VDD. The values shown in Table A.4.3 are those for VIH and VIL at their respective worst-case VDD. VDD = 5.0 ± 0.25 V.
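The worst-case input thresholds follow from the stated percentages by simple arithmetic, sketched here in Python with integer millivolts to avoid rounding surprises. The sketch assumes that the worst case for VIH(min) is the highest supply voltage and the worst case for VIL(max) is the lowest; the text does not say which supply limit applies to which threshold.

```python
VDD_NOMINAL_MV = 5000      # VDD = 5.0 V
VDD_TOLERANCE_MV = 250     # +/- 0.25 V


def input_thresholds_mv():
    """Return (VIH min, VIL max) in millivolts at worst-case VDD."""
    vdd_max = VDD_NOMINAL_MV + VDD_TOLERANCE_MV   # 5250 mV
    vdd_min = VDD_NOMINAL_MV - VDD_TOLERANCE_MV   # 4750 mV
    vih_min = vdd_max * 70 // 100                 # 70% of the highest VDD
    vil_max = vdd_min * 30 // 100                 # 30% of the lowest VDD
    return vih_min, vil_max


vih_min_mv, vil_max_mv = input_thresholds_mv()
```

Under that assumption the results are 3675 mV and 1425 mV, which round to the 3.68 V and 1.43 V limits quoted for the inputs.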
Table A.4.3 DC characteristics
Symbol   Parameter                  Min.        Max.        Unit
VIH      Input logic "1" voltage    3.68        VDD + 0.5   V
VIL      Input logic "0" voltage    GND - 0.5   1.43        V
VOH      Output logic "1" voltage   VDD - 0.1               V (a)
                                    VDD - 0.4               V (b)
VOL      Output logic "0" voltage               0.1         V (c)
                                                0.4         V (d)
IIN      Input leakage current                  ±10         μA
a. IOH ≤ 1 mA
b. IOH ≤ 4 mA
c. IOL ≤ 1 mA
d. IOL ≤ 4 mA
A.4.7 Control clock
In general, the clock that controls transfers across a two-wire interface is the chip's decoder_clock. The coded data port input to the Spatial Decoder is the exception; it is controlled by coded_clock. The clock signals are described further below.
A.5 DRAM interface
A.5.1 The DRAM interface
A single high-performance, configurable DRAM interface is used on each of the video decoder chips. In general, the DRAM interface on each chip is substantially the same; however, the interfaces differ in how they handle channel priorities. The interface is designed to drive the DRAM used by each decoder chip directly. Typically, in most systems, no external logic, buffers or components are required to connect the DRAM interface to the DRAMs.
A.5.2 Interface signals
Table A.5.1 DRAM interface signals
Signal name       I/O   Description
DRAM_data(31:0)   I/O   The 32-bit-wide DRAM data bus. Optionally, this bus can be configured to be 16 or 8 bits wide. See section A.5.8.
DRAM_addr(10:0)   O     The 22-bit-wide DRAM interface address is time multiplexed over this 11-bit-wide bus.
RAS               O     The DRAM row address strobe signal.
CAS(3:0)          O     The DRAM column address strobe signal. One signal is provided for each byte of the interface's data bus. All the CAS signals are driven simultaneously.
WE                O     The DRAM write enable signal.
OE                O     The DRAM output enable signal.
DRAM_enable       I     When low, this input signal makes all the output signals on the interface go high impedance. Note: on-chip data processing is not stopped when the DRAM interface is high impedance, so errors will occur if the chip attempts to access DRAM while DRAM_enable is low.
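The time multiplexing of the 22-bit address over the 11-bit DRAM_addr bus can be pictured with the Python sketch below. Which half of the address is presented with RAS and which with CAS is not stated in Table A.5.1; the sketch assumes the upper 11 bits form the row address, and the function name is hypothetical.

```python
ADDR_BITS = 22
BUS_BITS = 11
BUS_MASK = (1 << BUS_BITS) - 1


def split_address(addr):
    """Split a 22-bit address into the two values driven on DRAM_addr.

    Returns (row, column); the row-first, upper-half-first ordering
    is an assumption made for this illustration.
    """
    if not 0 <= addr < (1 << ADDR_BITS):
        raise ValueError("address does not fit in 22 bits")
    row = (addr >> BUS_BITS) & BUS_MASK     # presented while RAS is asserted
    column = addr & BUS_MASK                # presented while CAS is asserted
    return row, column


row, column = split_address((5 << 11) | 9)
```

Recombining the two halves as (row << 11) | column returns the original address, so no address bits are lost by the multiplexing.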
In accordance with the present invention, the interface can be configured in two ways:
The detailed timing of the interface can be configured to accommodate a variety of different DRAM types.
The "width" of the DRAM interface can be configured to provide a cost/performance trade-off in different applications.
A.5.3 Configuring the DRAM interface
Usually, there are three groups of registers and DRAM interface to interrelate: interface timing configured register, interface bus configuration register and refresh configuration register. Refresh configuration register (register among the Table A .5.4) should be disposed at last. A.5.3.1 the situation after resetting
After resetting, according to the present invention, DRAM interface start-up operation is with a series of default timing parameters (corresponding with the slowest mode of operation). Beginning, the DRAM interface will be carried out the refresh cycle (comprising other all transmission) continuously. This will proceed to a value and be written into the refresh interval register. Then the DRAM interface can be carried out the transmission of other type between the refresh cycle. A.5.3.2 bus configuration
Bus configuration (register among the Table A .5.3) should only not carry out finishing when data transmit at interface. Be right after after resetting and before a value was written into the refresh interval register, interface was placed in this state. If necessary, only in that interface can reconfigure afterwards when being attempted without transmitting. See temporal decoder chip access function resister (A.18.3.1) and spatial decoder buffer management access function resister (A.13.1.1). A.5.3.3 interface timing configured
In accordance with the present invention, modification of the interface timing configuration is controlled by the interface_timing_access register. Writing 1 to this register allows the interface timing registers (in Table A.5.2) to be modified. While interface_timing_access = 1, the DRAM interface continues operation with its previous configuration. After writing 1, the user should wait until 1 can be read back from interface_timing_access before writing to any of the interface timing registers. When configuration is complete, 0 should be written to interface_timing_access. The new configuration is then transferred to the DRAM interface.
A.5.3.4 Refresh configuration
The refresh interval of the DRAM interface of the present invention can only be configured once following reset. The interface continually executes refresh cycles until refresh_interval is configured, which prevents any other data transfers. Data transfers can start once a value has been written to refresh_interval.
As is well known in the art, DRAMs typically require a "pause" of between 100 μs and 500 μs after power is first applied, followed by a number of refresh cycles, before normal operation is possible. Accordingly, these DRAM start-up requirements should be satisfied before a value is written to refresh_interval.
A.5.3.5 Read access to configuration registers
All the DRAM interface registers of the present invention can be read at any time.
A.5.4 Interface timing (ticks)
The DRAM interface timing is derived from a clock that runs at four times the rate of the device's input clock (decoder_clock). This clock is generated by an on-chip phase-locked loop (PLL).
For brevity, periods of this high-speed clock are referred to as "ticks".
A.5.5 Interface registers
Table A.5.2 Interface timing configuration registers
Register name     Size/Direction     Reset state     Description
interface_timing_access     1 bit / rw     0     This register enables access to the DRAM interface timing configuration registers. The configuration registers should not be modified while this register holds the value 0. After 0 is written to this register, the DRAM interface starts to use the new values in the timing configuration registers.
page_start_length     5 bit / rw     0     Specifies the length of the access start in ticks. The minimum value that can be used is 4 (meaning 4 ticks). 0 selects the maximum length of 32 ticks.
transfer_cycle_length     4 bit / rw     0     Specifies the length of the fast page read or write cycle in ticks. The minimum value that can be used is 4 (meaning 4 ticks). 0 selects the maximum length of 16 ticks.
refresh_cycle_length     4 bit / rw     0     Specifies the length of the refresh cycle in ticks. The minimum value that can be used is 4 (meaning 4 ticks). 0 selects the maximum length of 16 ticks.
RAS_falling     4 bit / rw     0     Specifies the number of ticks after the beginning of the access start at which RAS falls. The minimum value that can be used is 4 (meaning 4 ticks). 0 selects the maximum length of 16 ticks.
CAS_falling     4 bit / rw     8     Specifies the number of ticks after the beginning of a read cycle, write cycle or access start at which CAS falls. The minimum value that can be used is 1 (meaning 1 tick). 0 selects the maximum length of 16 ticks.
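The access procedure of A.5.3.3, applied to the registers of Table A.5.2, can be sketched as follows. This is only an illustrative model: `FakeRegisterBus`, its `read`/`write` methods and the `grant_after_reads` delay are hypothetical stand-ins for whatever register access layer a real system provides, and the example register values are arbitrary choices within the documented ranges.

```python
class FakeRegisterBus:
    """Hypothetical stand-in for a real register access layer."""

    def __init__(self, grant_after_reads=2):
        self.regs = {"interface_timing_access": 0}
        self._pending = 0
        self._grant_after = grant_after_reads

    def write(self, name, value):
        if name == "interface_timing_access" and value == 1:
            # Access is requested; it is not necessarily granted at once.
            self._pending = self._grant_after
        else:
            self.regs[name] = value

    def read(self, name):
        if name == "interface_timing_access" and self._pending > 0:
            self._pending -= 1
            if self._pending == 0:
                self.regs[name] = 1  # access is now granted
        return self.regs.get(name, 0)


def configure_timing(bus, settings, max_polls=1000):
    """Request access, wait for 1 to read back, write registers, release."""
    bus.write("interface_timing_access", 1)
    for _ in range(max_polls):
        if bus.read("interface_timing_access") == 1:
            break
    else:
        raise TimeoutError("access to the timing registers was not granted")
    for name, value in settings.items():
        bus.write(name, value)
    # Writing 0 hands the new configuration over to the DRAM interface.
    bus.write("interface_timing_access", 0)


bus = FakeRegisterBus()
configure_timing(bus, {"RAS_falling": 6, "CAS_falling": 3})
```

The polling loop reflects the requirement to wait until 1 can be read back before any timing register is written; the bounded `max_polls` is an assumption added so that the sketch cannot spin forever.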
Table A.5.3 Interface bus configuration registers
Register name     Size/Direction     Reset state     Description
DRAM_data_width     2 bit / rw     0     Specifies the number of bits used on the DRAM interface data bus DRAM_data[31:0]. See A.5.8.
row_address_bits     2 bit / rw     0     Specifies the number of bits used for the row address portion of the DRAM interface address bus. See A.5.10.
DRAM_enable     1 bit / rw     1     Writing the value 0 to this register forces the DRAM interface into a high impedance state. 0 is read from this register if either the DRAM_enable signal is low or 0 has been written to the register.
CAS_strength, RAS_strength, addr_strength, DRAM_data_strength, OEWE_strength     3 bit / rw     6     These 3-bit registers configure the output drive strength of the DRAM interface signals. This allows the interface to be configured for a variety of different loads. See A.5.13.
A.5.6 Interface operation
The DRAM interface uses fast page mode. Three different types of access are supported:
Read
Write
Refresh
Each read or write access transfers a burst of 1 to 64 bytes to a single DRAM page address. Read and write transfers are not mixed within a single access, and each successive access is treated as a random access to a new DRAM page.
Table A.5.4 Refresh configuration registers
Register name     Size/Direction     Reset state     Description
refresh_interval     8 bit / rw     0     This value specifies the interval between refresh cycles in units of 16 decoder_clock cycles. Values in the range 1 to 255 can be configured. The value 0 is automatically loaded after reset and forces the DRAM interface to execute refresh cycles continuously until a valid refresh interval is configured. It is recommended that refresh_interval be configured only once after each reset.
no_refresh     1 bit / rw     0     Writing the value 1 to this register prevents the execution of any refresh cycles.
A.5.7 Access structure
Each access is composed of two parts:
Access start
Data transfer
In the present invention, each access begins with an access start, which is followed by one or more data transfer cycles. In addition, there are read, write and refresh variants of both the access start and the data transfer cycle.
On completion of the last data transfer of a particular access, the interface enters its default state (see A.5.7.3) and remains in this state until a new access is ready to begin. If a new access is ready to begin when the previous access completes, the new access begins immediately.
A.5.7.1 Access start
The access start provides the page address for the read or write transfers and establishes some initial signal conditions. In accordance with the present invention, there are three different access starts:
Start of read
Start of write
Start of refresh
Table A.5.5 DRAM interface timing parameters
Num.     Characteristic     Min.     Max.     Unit     Note
5     RAS precharge period set by register RAS_falling     4     16     tick     -
6     Access start duration set by register page_start_length     4     32     tick     -
7     CAS precharge length set by register CAS_falling     1     16     tick     a
8     Fast page read or write cycle length set by register transfer_cycle_length     4     16     tick     -
9     Refresh cycle length set by register refresh_cycle     4     16     tick     -
a. This value must be less than RAS_falling to ensure that a CAS-before-RAS refresh occurs.
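The tick counts in Table A.5.5 can be related to real time using the rule from A.5.4 that the tick clock runs at four times decoder_clock. The following is a minimal sketch; the function names are illustrative only, and the 30 MHz figure in the usage example is the maximum input clock frequency quoted elsewhere in this document.

```python
def tick_period_ns(decoder_clock_hz):
    """Period of one tick; the tick clock runs at 4 x decoder_clock."""
    return 1e9 / (4 * decoder_clock_hz)


def register_to_ticks(value, max_ticks):
    """A register value of 0 selects the maximum length (16 or 32 ticks)."""
    return max_ticks if value == 0 else value


def register_to_ns(value, max_ticks, decoder_clock_hz):
    """Duration selected by a timing register, in nanoseconds."""
    return register_to_ticks(value, max_ticks) * tick_period_ns(decoder_clock_hz)


# Reset value of CAS_falling (8) with a 30 MHz decoder_clock.
cas_falling_ns = register_to_ns(8, 16, 30e6)
```

With a 30 MHz decoder_clock one tick is about 8.3 ns, so the reset value of CAS_falling (8 ticks) corresponds to roughly 66.7 ns.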
In each case, the timing of RAS and the row address is controlled by the registers RAS_falling and page_start_length. The state of OE and DRAM_data[31:0] is held from the end of the previous data transfer until RAS falls. The three different access start types differ only in how they drive OE and DRAM_data[31:0] when RAS falls. See Figure 43.
A.5.7.2 Data transfer
In the present invention, there are different types of data transfer cycle:
Fast page read cycle
Fast page late write cycle
Refresh cycle
A refresh start can only be followed by a single refresh cycle. A read (or write) start can be followed by one or more fast page read (or write) cycles. At the start of a read cycle, CAS is driven high and the new column address is driven.
Furthermore, an early write cycle is used. WE is driven low at the start of the first write transfer and remains low until the end of the last write transfer. The output data is driven with the address.
Because a CAS-before-RAS refresh cycle is initiated by the refresh start, there is no interface signal activity during the refresh cycle. The purpose of the refresh cycle is to meet the minimum RAS low period required by the DRAM.
A.5.7.3 Interface default state
In the present invention, the interface signals enter a default state at the end of an access:
RAS, CAS and WE high
data and OE remain in their previous state
addr remains stable
A.5.8 Data bus width
The two-bit register DRAM_data_width allows the width of the DRAM interface's data path to be configured. This allows the DRAM cost to be minimized when working with small picture formats.
Table A.5.6 Configuring DRAM_data_width
DRAM_data_width     Data bus configuration
0 (a)     8-bit-wide data bus on DRAM_data[31:24] (b)
1     16-bit-wide data bus on DRAM_data[31:16] (b)
2     32-bit-wide data bus on DRAM_data[31:0]
a. Default after reset.
b. Unused signals are held high impedance.
A.5.9 Row address width
The number of bits taken from the middle section of the 24-bit internal address to provide the row address is configured by the register row_address_bits.
Table A.5.7 Configuring row_address_bits
row_address_bits     Width of row address
1     10 bits on DRAM_addr[9:0]
2     11 bits on DRAM_addr[10:0]
A.5.10 Address bits
A 24-bit address is generated on-chip. How this address is used to form the row and column addresses depends on the width of the data bus and the number of bits selected for the row address. Some configurations do not permit all the internal address bits to be used and therefore produce "hidden bits".
Similarly, the row address is extracted from the middle portion of the address. Accordingly, this maximizes the rate at which the DRAM is naturally refreshed.
Table A.5.8 Mapping between internal and external addresses
Row address width     Row address translation (internal → external)     Data bus width     Column address translation (internal → external)
9     [14:6] → [8:0]     8     [19:15] → [10:6], [5:0] → [5:0]
9     [14:6] → [8:0]     16     [20:15] → [10:5], [5:1] → [4:0]
9     [14:6] → [8:0]     32     [21:15] → [10:4], [5:2] → [3:0]
10     [15:6] → [9:0]     8     [19:16] → [10:6], [5:0] → [5:0]
10     [15:6] → [9:0]     16     [20:16] → [10:5], [5:1] → [4:0]
10     [15:6] → [9:0]     32     [21:16] → [10:4], [5:2] → [3:0]
11     [16:6] → [10:0]     8     [19:17] → [10:6], [5:0] → [5:0]
11     [16:6] → [10:0]     16     [20:17] → [10:5], [5:1] → [4:0]
11     [16:6] → [10:0]     32     [21:17] → [10:4], [5:2] → [3:0]
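The row address and low-order column address parts of this mapping can be sketched in a few lines. This is a partial illustration only: the function name is hypothetical, and the high-order column bits are deliberately left out, since only the row and low-order column fields are modeled here.

```python
def split_address(addr, row_bits, bus_width):
    """Return (row address, low-order column address) for an internal address.

    row_bits is the row address width (9, 10 or 11) and bus_width is the
    data bus width in bits (8, 16 or 32). High-order column bits are not
    modeled in this sketch.
    """
    if row_bits not in (9, 10, 11):
        raise ValueError("row address width must be 9, 10 or 11")
    if bus_width not in (8, 16, 32):
        raise ValueError("data bus width must be 8, 16 or 32")
    # The row address always starts at internal bit 6.
    row = (addr >> 6) & ((1 << row_bits) - 1)
    # Internal bits [5:0] address bytes within a 64-byte burst; wider
    # buses drop the bits that select a byte within one bus word.
    shift = {8: 0, 16: 1, 32: 2}[bus_width]
    low_column = (addr & 0x3F) >> shift
    return row, low_column


row, col = split_address((0x155 << 6) | 0x2D, row_bits=9, bus_width=32)
```

A 64-byte burst needs 6, 5 or 4 low-order column bits on an 8-, 16- or 32-bit bus respectively, which is why the shift grows with the bus width.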
A.5.10.1 Low-order column address bits
The least significant 4 to 6 bits of the column address are used to provide addresses for fast page mode transfers of up to 64 bytes. The number of address bits required to control these transfers depends on the width of the data bus (see A.5.8).
A.5.10.2 Decoding the row address to access more DRAM banks
Where only a single bank of DRAM is used, the width of the row address depends on the type of DRAM used. Applications that require more memory than can typically be provided by a single DRAM bank can configure a wider row address and then decode some of the row address bits to select a single DRAM bank.
Note: The row address is extracted from the middle of the internal address. If some bits of the row address are decoded to select a DRAM bank, then all possible values of these "bank select bits" must select a DRAM bank. Otherwise, holes will be left in the address space.
A.5.11 DRAM interface enable
In the present invention, there are two ways to make all the output signals on the DRAM interface go high impedance: the DRAM_enable register and the DRAM_enable signal. Both the register and the signal must be at logic 1 for the drivers on the DRAM interface to operate. If either is low, the interface is taken to high impedance.
Note: On-chip data processing is not stopped when the DRAM interface is high impedance. Errors will therefore occur if the chip attempts to access DRAM while the interface is high impedance.
In accordance with the present invention, the ability to take the DRAM interface to high impedance is provided so that other devices can test or use the DRAM controlled by the Spatial Decoder (or the Temporal Decoder) when the Spatial Decoder (or the Temporal Decoder) is not in use. It is not intended to allow other devices to share the memory during normal operation.
A.5.12 Refresh
Unless disabled by writing to the register no_refresh, the DRAM interface automatically refreshes the DRAM using a CAS-before-RAS refresh cycle at an interval determined by the register refresh_interval.
The value in refresh_interval specifies the interval between refresh cycles in units of 16 decoder_clock cycles. Values in the range 1 to 255 can be configured. The value 0 is automatically loaded after reset and forces the DRAM interface to execute refresh cycles continuously (once enabled) until a valid refresh interval is configured. It is recommended that refresh_interval be configured only once after each reset.
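As a worked illustration of choosing a refresh_interval value, the sketch below assumes a DRAM that requires all of its rows to be refreshed within a given period and that each refresh cycle refreshes one row. The function name and the example DRAM figures (1024 rows every 16 ms) are assumptions chosen for illustration, not values taken from this document.

```python
def refresh_interval_value(decoder_clock_hz, refresh_period_s, rows):
    """Largest refresh_interval (1..255) that still refreshes every row in time.

    One unit of refresh_interval is 16 decoder_clock cycles.
    """
    per_row_s = refresh_period_s / rows   # time budget per refresh cycle
    unit_s = 16 / decoder_clock_hz        # duration of one interval unit
    value = int(per_row_s / unit_s)       # round down: never refresh too slowly
    if value < 1:
        raise ValueError("required refresh rate is faster than interval 1")
    return min(value, 255)                # register is 8 bits wide


# Assumed example: 30 MHz decoder_clock, 1024 rows every 16 ms.
value = refresh_interval_value(30e6, 16e-3, 1024)
```

For the assumed example the per-row budget is 15.625 μs and one interval unit is about 0.533 μs, giving a register value of 29.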
While reset is asserted, the DRAM interface cannot refresh the DRAM. However, the reset time required by the decoder chips is sufficiently short that it should be possible to reset them and reconfigure the DRAM interface before the contents of the DRAM decay.
A.5.13 Signal strengths
The drive strength of the outputs of the DRAM interface can be configured by the user using the 3-bit registers CAS_strength, RAS_strength, addr_strength, DRAM_data_strength and OEWE_strength. The most significant bit of this 3-bit value selects either a fast or a slow edge rate. The two less significant bits configure the output for different load capacitances.
The default strength after reset is 6, which configures the outputs to take approximately 10 ns to drive a signal between GND and VDD when loaded with 24 pF.
Table A.5.9 Output strength configurations
Strength value     Drive characteristics
0     Approx. 4 ns/V into a 6 pF load
1     Approx. 4 ns/V into a 12 pF load
2     Approx. 4 ns/V into a 24 pF load
3     Approx. 4 ns/V into a 48 pF load
4     Approx. 2 ns/V into a 6 pF load
5     Approx. 2 ns/V into a 12 pF load
6 (a)     Approx. 2 ns/V into a 24 pF load
7     Approx. 2 ns/V into a 48 pF load
a. Default after reset.
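The encoding in Table A.5.9 follows directly from the bit layout described above: the most significant bit selects the edge rate and the two low bits select the load. A minimal sketch, with a hypothetical function name:

```python
def decode_strength(value):
    """Decode a 3-bit strength value into (ns per volt, load in pF)."""
    if not 0 <= value <= 7:
        raise ValueError("strength is a 3-bit value")
    # MSB set selects the faster edge rate (about 2 ns/V instead of 4 ns/V).
    ns_per_volt = 2 if value & 0b100 else 4
    # The two low bits select a 6, 12, 24 or 48 pF load.
    load_pf = 6 << (value & 0b011)
    return ns_per_volt, load_pf


ns_per_volt, load_pf = decode_strength(6)   # reset default
swing_time_ns = ns_per_volt * 5.0           # assuming a 5 V swing
```

For the reset default of 6 this gives 2 ns/V into 24 pF, which over a 5 V swing is the approximately 10 ns quoted above.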
When an output is configured appropriately for the load it is driving, it will meet the AC electrical characteristics specified in Tables A.5.13 to A.5.16. When appropriately configured, each output is approximately matched to its load and, therefore, minimal overshoot will occur after a signal transition.
A.5.14 Electrical specifications
All information provided in this section is merely illustrative of one embodiment of the present invention and is included by way of example and not necessarily by way of limitation.
Table A.5.10 Maximum ratings
Symbol     Parameter     Min.     Max.     Unit
VDD     Supply voltage relative to GND     -0.5     6.5     V
VIN     Input voltage on any pin     GND-0.5     VDD+0.5     V
TA     Operating temperature     -40     +85     °C
TS     Storage temperature     -55     +150     °C
Table A.5.10 sets forth maximum ratings for the illustrative embodiment only. For this particular embodiment, stresses below those listed in the table should be used to ensure reliability of operation.
Table A.5.11 DC operating conditions
Symbol     Parameter     Min.     Max.     Unit
VDD     Supply voltage relative to GND     4.75     5.25     V
GND     Ground     0     0     V
VIH     Input logic "1" voltage     2.0     VDD+0.5     V
VIL     Input logic "0" voltage     GND-0.5     0.8     V
TA     Operating temperature     0     70     °C (a)
a. With TBA linear ft/min transverse airflow.
Table A.5.12 DC electrical characteristics
Symbol     Parameter     Min.     Max.     Unit
VOL     Output logic "0" voltage     -     0.4     V (a)
VOH     Output logic "1" voltage     2.8     -     V
IO     Output current     -     ±100     μA (b)
IOZ     Output off-state leakage current     -     ±20     μA
IIZ     Input leakage current     -     ±10     μA
IDD     RMS power supply current     -     500     mA
CIN     Input capacitance     -     5     pF
COUT     Output/IO capacitance     -     5     pF
a. AC parameters are specified using VOLmax = 0.8 V as the measurement level.
b. This is the steady-state drive capability of the interface. Transient currents may be much greater.
A.5.14.1 AC characteristics
Table A.5.13 Differences from nominal values for a strobe
Num.     Parameter     Min.     Max.     Unit     Note (a)
10     Cycle time     -2     +2     ns
11     Cycle time     -2     +2     ns
12     High pulse     -5     +2     ns
13     Low pulse     -11     +2     ns
14     Cycle time     -8     +2     ns
a. The driver strength of the signal must be configured appropriately for its load, as one of ordinary skill in the art will appreciate.
Table A.5.14 Differences from nominal values between two strobes
Num.     Parameter     Min.     Max.     Unit     Note (a)
15     Strobe-to-strobe delay     -3     +3     ns
16     Low hold time     -13     +3     ns
17     Strobe-to-strobe precharge, e.g. tCRP, tRCS, tRCH, tRRH, tRPC (b)     -9     +3     ns
-     Precharge pulse between any two CAS signals on wide DRAMs, e.g. tCP, or between RAS rising and CAS falling, e.g. tRPC (b)     -5     +2     ns
18     Precharge before disable     -12     +3     ns
a. The driver strength of the two signals must be configured appropriately for their loads.
b. For the definition of each timing symbol, refer to a general DRAM data book.
Table A.5.15 Differences from nominal values between a bus and a strobe
Num.     Parameter     Min.     Max.     Unit     Note (a)
19     Set-up time     -12     +3     ns
20     Hold time     -12     +3     ns
21     Address access time     -12     +3     ns
22     Next valid after strobe     -12     +3     ns
a. The driver strength of the bus and the strobe must be configured appropriately for their loads.
Table A.5.16 Differences from nominal values between a bus and a strobe
Num.     Parameter     Min.     Max.     Unit     Note
23     Read data set-up time before the CAS signal starts to rise     0     -     ns
24     Read data hold time after the CAS signal starts to go high     0     -     ns
When reading from DRAM, the DRAM interface samples DRAM_data[31:0] as the CAS signals rise.
Table A.5.17 Cross-reference between "standard" DRAM parameter names and timing parameter numbers
Parameter name     Number     Parameter name     Number     Parameter name     Number
  tPC
    10  tRSH     16  tRHCP  tCPRH    18
  tRC     11  tCSH  tASR    19
  tRP     12  tRWL  tASC
  tCP  tCWL  tDS
  tCPN  tRAC  tRAH    20
  tRAS     13  tOAC/tOE  tCAH
  tCAS  tCHR  tDH
  tCAC  tCRP
    17  tAR
  tWP  tRCS  tAA    21
  tRASP  tRCH  tRAL
  tRASC  tRRH  tRAD    22
  tACP/tCPA     14  tRPC
  tRCD
    15 tCP
  tCSR  tRPC
A.6 Microprocessor interface (MPI)
A standard byte-wide microprocessor interface (MPI) is used on all chips in the video decoder chip-set. However, one of ordinary skill in the art will appreciate that microprocessor interfaces of other widths may also be used. The MPI operates synchronously to the various decoder chip clocks.
A.6.1 MPI signals
Table A.6.1 MPI interface signals
Signal name     Input/Output     Description
enable[1:0]     Input     Two active-low chip enables. Both must be low to enable accesses via the MPI.
rw     Input     High indicates that a device wishes to read values from the video chip. This signal should be stable while the chip is enabled.
addr[n:0]     Input     The address specifies one of 2^n locations in the chip's memory map. This signal should be stable while the chip is enabled.
data[7:0]     Output     8-bit-wide data I/O port. These pins are high impedance if either enable signal is high.
irq     Output     An active-low, open-collector interrupt request signal.
A.6.2 MPI electrical specifications
Table A.6.2 Absolute maximum ratings
Symbol     Parameter     Min.     Max.     Unit
VDD     Supply voltage relative to GND     -0.5     6.5     V
VIN     Input voltage on any pin     GND-0.5     VDD+0.5     V
TA     Operating temperature     -40     +85     °C
TS     Storage temperature     -55     +150     °C
Table A.6.3 DC operating conditions
Symbol     Parameter     Min.     Max.     Unit
VDD     Supply voltage relative to GND     4.75     5.25     V
GND     Ground     0     0     V
VIH     Input logic "1" voltage     2.0     VDD+0.5     V (a)
VIL     Input logic "0" voltage     GND-0.5     0.8     V (a)
TA     Operating temperature     0     70     °C (b)
a. AC input parameters are measured at a 1.4 V measurement level.
b. With TBA linear ft/min transverse airflow.
Table A.6.4 DC electrical characteristics
Symbol     Parameter     Min.     Max.     Unit
IDD     RMS power supply current     -     500     mA
CIN     Input capacitance     -     5     pF
COUT     Output/IO capacitance     -     5     pF
a. Io ≤ I0∝min
b. This is the steady-state drive capability of the interface. Transient currents may be much greater.
c. When asserted, the open-collector irq output pulls down with an impedance of 100 Ω or less.
A.6.2.1 AC characteristics
Table A.6.5 MPI read timing
Num.     Characteristic     Min.     Max.     Unit     Note (a)
25     Enable low period     100     -     ns
26     Enable high period     50     -     ns
27     Address or rw set-up to chip enable     0     -     ns
28     Address or rw hold from chip disable     0     -     ns
29     Output turn-on time     -     20     ns
30     Read data access time     -     70     ns     b
31     Read data hold time     5     -     ns
32     Read data turn-off time     -     20     ns
a. The choice, in this example, of enable[0] to start the cycle and enable[1] to end it is arbitrary. These signals are of equal status.
b. The access time is specified for a maximum load of 50 pF on each of data[7:0]. Larger loads may increase the access time.
Table A.6.6 MPI write timing
Num.     Characteristic     Min.     Max.     Unit     Note
33     Write data set-up time     15     -     ns     a
34     Write data hold time     0     -     ns
a. The choice, in this example, of enable[0] to start the cycle and enable[1] to end it is arbitrary. These signals are of equal status.
A.6.3 Interrupts
In accordance with the present invention, an "event" is the term used to describe an on-chip condition that a user might want to observe. An event can indicate an error or can be informative to the user's software.
There are two single-bit registers associated with each interrupt or "event": the state event register and the state mask register.
A.6.3.1 State event register
The state event register is a read/write register whose value is set to 1 by a condition occurring within the circuit. The register is set to 1 even if the condition was only transient and has since gone away. The register is then guaranteed to remain set to 1 until the user's software resets it (or the entire chip is reset).
The register is set to 0 by writing the value 1.
Writing 0 to the register leaves the register unchanged.
The register must be set to 0 by user software before another occurrence of the condition can be observed.
The register is reset to 0 on reset.
A.6.3.2 State mask register
The state mask register is a read/write register that enables the generation of an interrupt request if the corresponding state event register is set. If the state event is already set when 1 is written to the state mask register, an interrupt request is issued immediately.
The value 1 enables interrupts.
The register is cleared to 0 on reset.
Unless stated otherwise, a block stops operation after generating an interrupt request and restarts operation once either the state event or the state mask register has been cleared.
A.6.3.3 Event and mask bits
Event bits and mask bits are grouped into corresponding bit positions in consecutive bytes in the memory map (see Tables A.9.6 and A.17.6). This allows interrupt service software to use the value read from the mask registers as a mask for the value in the event registers, in order to identify which event generated the interrupt.
A.6.3.4 Chip event and mask
Each chip has a single "global" event bit that summarizes the event activity on the chip. The chip event register presents the OR of all the on-chip events that have 1 in their mask bit.
A 1 in the chip mask bit allows the chip to generate interrupts. A 0 in the chip mask bit prevents any on-chip event from generating an interrupt request.
Writing 1 or 0 to the chip event has no effect. It is cleared only when all the events (enabled by a 1 in their mask bits) have been cleared.
A.6.3.5 The irq signal
The irq signal is asserted if both the chip event bit and the chip event mask are set.
The irq signal is an active-low, "open collector" output that requires an off-chip pull-up resistor. When active, the irq output is pulled down by an impedance of 100 Ω or less.
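The relationship between event bits, mask bits, the chip event and irq described in A.6.3.3 to A.6.3.5 can be modeled in software as follows. This is an illustrative sketch only; the function names and the representation of registers as lists of byte values are assumptions.

```python
def pending_events(event_byte, mask_byte):
    """Events that are both set and enabled, using the mask byte as a mask."""
    return event_byte & mask_byte


def chip_event(event_bytes, mask_bytes):
    """OR of all on-chip events that have 1 in their mask bit."""
    return any(pending_events(e, m) for e, m in zip(event_bytes, mask_bytes))


def irq_asserted(event_bytes, mask_bytes, chip_mask):
    """irq is asserted when both the chip event and the chip mask are set."""
    return chip_event(event_bytes, mask_bytes) and bool(chip_mask)


# One event/mask byte pair: events 1 and 3 are set, events 1 and 2 are enabled.
active = pending_events(0b1010, 0b0110)
```

In this model `irq_asserted` returning True corresponds to the irq pin being pulled low, since the signal is active low.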
A pull-up resistor of approximately 4 kΩ should be suitable for most applications.
A.6.4 Accessing registers
A.6.4.1 Stopping circuits to enable access
In the present invention, most registers can only be modified if the block with which they are associated is stopped. Therefore, groups of registers are normally associated with an access register.
The value 0 in an access register indicates that the group of registers associated with that access register should not be modified. Writing 1 to an access register requests that a block be stopped. However, the block may not stop immediately, and the block's access register will hold the value 0 until it has stopped.
Accordingly, user software should wait (after writing 1 to request access) until 1 is read from the access register. If the user writes a value to a configuration register while its access register is set to 0, the results are undefined.
A.6.4.2 Registers holding integers
The least significant bit of any byte in the memory map is the one associated with the signal data[0].
Registers that hold integer values wider than 8 bits are split over either 2 or 4 consecutive byte locations in the memory map. The byte ordering is "big endian", as shown in Figure 55. However, no assumptions should be made about the order in which bytes are written into multi-byte registers.
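The big-endian layout of a multi-byte register can be illustrated as below, taking "big endian" in its usual sense of the most significant byte occupying the lowest address. The helper names are hypothetical.

```python
def to_register_bytes(value, size):
    """Split an unsigned integer into `size` bytes, most significant first."""
    return list(value.to_bytes(size, "big"))


def from_register_bytes(data):
    """Reassemble an unsigned integer from bytes read in address order."""
    return int.from_bytes(bytes(data), "big")


# A 16-bit register value occupying two consecutive byte locations.
layout = to_register_bytes(0x1234, 2)
```

Here `layout` lists the byte values in increasing address order, so the first entry is the most significant byte.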
Unused bits in the memory map return 0 when read, except for unused bits in registers holding signed integers. In this case, the most significant bits of the register are sign extended. For example, a 12-bit signed register is sign extended to fill a 16-bit memory map location (2 bytes). A 16-bit memory map location holding a 12-bit unsigned integer returns 0 from its most significant bits.
A.6.4.3 Keyholed address locations
In the present invention, certain less frequently accessed memory map locations have been placed behind "keyholes". A "keyhole" has two registers associated with it: a keyhole address register and a keyhole data register.
The keyhole address specifies a location within an extended address space. A read or write operation on the keyhole data register accesses the location specified by the keyhole address register.
After the keyhole data register has been accessed, the associated keyhole address register increments. Random access within the extended address space is only possible by writing a new value to the keyhole address register for each access.
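The keyhole mechanism can be modeled as a small class in which every access to the data register advances the address register. This is a behavioral sketch under stated assumptions: the class name, the size of the extended address space and the wrap-around at the end of that space are all hypothetical.

```python
class Keyhole:
    """Behavioral model of a keyhole address/data register pair."""

    def __init__(self, size):
        self.address = 0            # keyhole address register
        self._space = [0] * size    # extended address space behind the keyhole

    def write_data(self, value):
        """Write via the keyhole data register, then auto-increment."""
        self._space[self.address] = value & 0xFF
        self._advance()

    def read_data(self):
        """Read via the keyhole data register, then auto-increment."""
        value = self._space[self.address]
        self._advance()
        return value

    def _advance(self):
        # Wrap-around is an assumption made only to keep the model safe.
        self.address = (self.address + 1) % len(self._space)


keyhole = Keyhole(size=8)
keyhole.address = 2          # select the starting location
keyhole.write_data(0xAA)     # lands at location 2
keyhole.write_data(0xBB)     # lands at location 3
```

Sequential locations can therefore be accessed with one address write followed by repeated data accesses, while a random access pattern needs the address register rewritten before each access.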
A chip in accordance with the present invention can have more than one "keyholed" memory map. There is no interaction between the different keyholes.
A.6.5 Special registers
A.6.5.1 Unused registers
Registers or bits described as "not used" are locations in the memory map that have not been used in the current implementation of the device. In general, the value 0 can be read from these locations. Writing 0 to these locations has no effect.
As one of ordinary skill in the art will appreciate, in order to maintain compatibility with future variants of these products, user software should not depend on values read from unused locations. Similarly, when configuring the device, these locations should either be avoided or set to the value 0.
A.6.5.2 Reserved registers
Similarly, registers or bits described as "reserved" in the present invention have undocumented effects on the behavior of the device and should not be accessed.
A.6.5.3 Test registers
Furthermore, registers or bits described as "test registers" control various aspects of the device's testability. These registers therefore have no application in the normal use of the device and need not be accessed by normal device configuration and control software.
A.7 Clocks
In accordance with the present invention, many different clocks can be identified in the video decoder system. Examples of these clocks are illustrated in Figure 56.
As data passes between the different clock regimes within the video decoder chip-set, it is resynchronized (on-chip) to each new clock. In the present invention, the maximum frequency of any input clock is 30 MHz. However, one of ordinary skill in the art will appreciate that other frequencies, including those greater than 30 MHz, may also be used. On each chip, the microprocessor interface (MPI) operates synchronously to the chip clocks. In addition, the Image Formatter can generate a low-frequency audio clock that is synchronous to the decoded video's picture rate. Accordingly, this clock can be used to provide audio/video synchronization.
A.7.1 Spatial Decoder clock signals
The Spatial Decoder has two different (and potentially asynchronous) clock inputs:
Table A.7.1 Spatial Decoder clocks
Signal name     Input/Output     Description
coded_clock     Input     This clock controls data transfer into the coded data port of the Spatial Decoder. On-chip, this clock controls the processing of the coded data until it reaches the coded data buffer.
decoder_clock     Input     The decoder clock controls the majority of the processing functions on the Spatial Decoder. The decoder clock also controls the transfer of data out of the Spatial Decoder through its output port.
A.7.2 Temporal Decoder clock signals
The Temporal Decoder has only one clock input:
Table A.7.2 Temporal Decoder clocks
Signal name     Input/Output     Description
decoder_clock     Input     The decoder clock controls all of the processing functions on the Temporal Decoder. The decoder clock also controls the transfer of data into the Temporal Decoder through its input port and out through its output port.
A.7.3 electrology characteristic
The requirement of Table A .7.3 input clock
NumberCharacteristic       30MHzUnitNote
MinimumMaximum
    35Clock cycle     33     ns
    36The clockhigh period     13     ns
    37The clockhigh period     13     ns
The Table A .7.4 clock initial conditions symbolic parameter V of minimax unitIHInput logic " 1 " voltage 3.68 VDD+0.5   V VILInput logic " 0 " voltage GND-0.5 1.43 V IOZInput leakage current ± 10 μ A are the level of CMOS A.7.3.1
Clock input signal is the CMOS input. V1HminApproximately beVDD70% and V1LmaxApproximately beVDD30%. Value shown in the Table A .7.4 is those V1HAnd V1LAt its poorest V separatelyDDIn the situation. VDD=5.0 ± 0.25V is clock stability A.7.3.2
In the present invention, the chip clocks used to drive the DRAM interface and the chip-to-chip interfaces are derived from the input clock signals. The timing specifications of these interfaces assume that the input clock timing is stable to within ±100 ps (picoseconds).
A.8 JTAG
As circuit boards become more densely populated, it is increasingly difficult to verify the connections between components using traditional methods such as in-circuit testing with a bed-of-nails fixture. In an attempt to solve the access problem and to standardize the methodology, the Joint Test Action Group (JTAG) was formed. The work of this group culminated in the "Standard Test Access Port and Boundary-Scan Architecture", now adopted by the IEEE as standard 1149.1. The spatial decoder and the temporal decoder comply with this standard.
The standard uses a boundary-scan chain that serially connects each digital signal pin on the device. The test circuitry is transparent in normal operation, but in test mode the boundary-scan chain allows test patterns to be shifted in and applied to the pins of the device. The resulting signals appearing on the circuit board at the inputs of the JTAG device can be scanned out and checked by relatively simple test equipment. In this way, the interconnections between components can be tested, as can areas of logic on the circuit board.
All JTAG operations are performed through the test access port (TAP), which consists of five pins. The trst (test reset) pin resets the JTAG circuitry to ensure that the device does not power up in test mode. The tck (test clock) pin is used to clock serial test patterns into the tdi (test data input) pin and out of the tdo (test data output) pin. Finally, the operating mode of the JTAG circuitry is set by clocking the appropriate sequence of bits into the tms (test mode select) pin.
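How tms selects the operating mode can be illustrated with the TAP controller state machine defined by IEEE 1149.1. The sketch below models only the standard 16-state transition table, sampled once per tck rising edge; it is not a model of the decoder's private instructions, and the helper names (`step`, `run`) are illustrative.

```python
def _column(kind, next_select):
    """Transitions for one scan column (DR or IR): state -> (tms=0, tms=1)."""
    def s(name):
        return f"{name}-{kind}"
    return {
        f"Select-{kind}-Scan": (s("Capture"), next_select),
        s("Capture"): (s("Shift"), s("Exit1")),
        s("Shift"): (s("Shift"), s("Exit1")),
        s("Exit1"): (s("Pause"), s("Update")),
        s("Pause"): (s("Pause"), s("Exit2")),
        s("Exit2"): (s("Shift"), s("Update")),
        s("Update"): ("Run-Test/Idle", "Select-DR-Scan"),
    }


# Complete TAP controller table: state -> (next if tms=0, next if tms=1).
TAP = {
    "Test-Logic-Reset": ("Run-Test/Idle", "Test-Logic-Reset"),
    "Run-Test/Idle": ("Run-Test/Idle", "Select-DR-Scan"),
    **_column("DR", "Select-IR-Scan"),
    **_column("IR", "Test-Logic-Reset"),
}


def step(state, tms):
    """Advance one tck rising edge with the given tms value (0 or 1)."""
    return TAP[state][tms]


def run(state, tms_bits):
    """Apply a sequence of tms values and return the final state."""
    for bit in tms_bits:
        state = step(state, bit)
    return state
```

A useful property of this table is that holding tms high for five tck cycles returns the controller to Test-Logic-Reset from any state, which is why a tms-only reset is possible even when trst is tied to the chip reset.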
The JTAG standard is extensible, allowing chip manufacturers to add features as appropriate. The spatial decoder and the temporal decoder provide 9 user instructions, including the 3 mandatory JTAG instructions. The additional instructions allow a degree of internal device testing to be performed and provide additional external test flexibility. For example, all device outputs can be held by a simple JTAG sequence.
For full details of the available devices and of how to use the JTAG port instructions, see the following JTAG application note.
A.8.1 connecting the JTAG pins in a non-JTAG system
Table A.8.1 how to connect the JTAG inputs
Signal  Direction  Explanation
trst    Input      This pin has an internal pull-up but must be taken low at power-up, even if the JTAG features are not used. This can be achieved by connecting trst in common with the chip reset pin, reset.
tdi     Input      These pins have internal pull-ups and may be left unconnected if the JTAG circuitry is not used.
tms     Input
tck     Input      This pin has no pull-up and should be tied to ground if the JTAG circuitry is not used.
tdo     Output     High impedance except during JTAG scan operations. If JTAG is not used, this pin may be left unconnected.
A.8.2 level of conformance to IEEE 1149.1
A.8.2.1 rules
All rules are adhered to, although the following should be noted:
Table A.8.2 JTAG rules
Rule         Explanation
3.1.1(b)     The trst pin is provided.
3.5.1(b)     Guaranteed for all public instructions (see IEEE 1149.1 5.2.1(c)).
5.2.1(c)     Guaranteed for all public instructions. For some private instructions, the TDO pin may be active during any of the states Capture-DR, Exit-1-DR, Exit-2-DR and Pause-DR.
5.3.1(a)     Power-on reset is achieved by use of the trst pin.
6.2.1(e,f)   The BYPASS instruction code is loaded in the Test-Logic-Reset state.
7.1.1(d)     Unallocated instruction codes are equivalent to BYPASS.
7.2.1(c)     There is no device ID register.
7.8.1(b)     Single-step operation requires external control of the system clock.
7.9.1(...)   There is no RUNBIST facility.
7.11.1(...)  There is no IDCODE instruction.
7.12.1(...)  There is no USERCODE instruction.
8.1.1(b)     There is no device identification register.
8.2.1(c)     Guaranteed for all public instructions. The apparent length of the path from tdi to tdo may change under certain circumstances while private instruction codes are loaded.
8.3.1(d-i)   Guaranteed for all public instructions. While private instruction codes are loaded, data may be loaded at times other than on the rising edge of tck.
10.4.1(e)    During INTEST, the system clock pin must be controlled externally.
10.6.1(c)    During INTEST, output pins are controlled by data shifted in via tdi.
A.8.2.2 recommendations
Table A.8.3 recommendations met
Recommendation  Explanation
3.2.1(b)        tck is a high-impedance CMOS input.
3.3.1(c)        tms has a high-impedance pull-up.
3.6.1(d)        (Applies to use of the chip.)
3.7.1(a)        (Applies to use of the chip.)
6.1.1(e)        The SAMPLE/PRELOAD instruction code is loaded during Capture-IR.
7.2.1(f)        The INTEST instruction is provided.
7.7.1(g)        Zeros are loaded at the system output pins during EXTEST.
7.7.2(h)        All system outputs may be set to high impedance.
7.8.1(f)        Zeros are loaded at the system input pins during INTEST.
8.1.1(d,e)      Design-specific test data registers are not publicly accessible.
Table A.8.4 recommendations not implemented
Recommendation  Explanation
10.4.1(f)       During EXTEST, the signal driven into the on-chip logic from the system clock pin is that supplied externally.
A.8.2.3 permissions
Table A.8.5 permissions met
Permission   Explanation
3.2.1(c)     Guaranteed for all public instructions.
6.1.1(f)     The instruction register is not used to capture design-specific information.
7.2.1(g)     Several additional public instructions are provided.
7.3.1(a)     Several private instruction codes are allocated.
7.3.1(c)     (Rule?) Such instruction codes are documented.
7.4.1(f)     Additional codes have the same effect as BYPASS.
10.1.1(i)    Each output pin has its own three-state control.
10.3.1(h)    A parallel latch is provided.
10.3.1(i,j)  During EXTEST, input pins are controlled by data shifted in via tdi.
10.6.1(d,e)  Three-state cells are not forced inactive in the Test-Logic-Reset state.
A.9 spatial decoder
30 MHz operation
Decodes MPEG, JPEG and H.261
Coded data rates up to 25 Mb/s
Video data rates up to 21 MB/s
Variable chroma sampling formats
Full JPEG baseline decoding
Glue-less DRAM interface
Single +5 V supply
208-pin PQFP package
Maximum power consumption 2.5 W
Independent coded data and decoder clocks
Uses standard page-mode DRAM
The spatial decoder is a configurable VLSI decoder chip for use in a variety of JPEG, MPEG and H.261 picture and video decoding applications.
In a minimum configuration, with no off-chip DRAM, the spatial decoder is a single-chip, high-speed JPEG decoder. Adding DRAM allows the spatial decoder to decode JPEG-encoded video pictures. 720 × 480, 30 Hz, 4:2:2 "JPEG video" can be decoded in real time.
With the temporal decoder, the spatial decoder can be used to decode H.261 and MPEG (as well as JPEG). 704 × 480, 30 Hz, 4:2:0 MPEG video can be decoded.
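The picture formats quoted above can be related to the 21 MB/s video data rate in the feature list with a short calculation. This is a sketch under stated assumptions: 8 bits per sample, active pixels only (no blanking), and an average of 2 samples per pixel for 4:2:2 and 1.5 for 4:2:0. The function name is illustrative.

```python
# Average number of 8-bit samples per pixel for each chroma format.
SAMPLES_PER_PIXEL = {"4:2:2": 2.0, "4:2:0": 1.5}


def video_bytes_per_second(width, height, picture_rate_hz, chroma_format):
    """Decoded video data rate in bytes per second, counting one byte
    per luminance or chrominance sample."""
    samples = width * height * SAMPLES_PER_PIXEL[chroma_format]
    return int(samples * picture_rate_hz)
```

Under these assumptions the 720 × 480, 30 Hz, 4:2:2 case works out to 20,736,000 bytes per second, which is consistent with the quoted figure of about 21 MB/s, and the 704 × 480, 30 Hz, 4:2:0 case to 15,206,400 bytes per second.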
It is noted again that the above values merely illustrate typical values for one embodiment of the present invention and are given by way of example, not limitation. Accordingly, those of ordinary skill in the art will recognize that other values and/or ranges can also be used.
A.9.1 spatial decoder signals
Table A.9.1 spatial decoder signals
Signal name      I/O  Pin number                                                   Explanation
coded_clock      I    182                                                          Coded data or tokens for the spatial decoder. See sections A.10.1 and A.4.1.
coded_data[7:0]  I    172, 171, 169, 168, 167, 166, 164, 163
coded_extn       I    174
coded_valid      I    162
coded_accept     O    161
byte_mode        I    176
enable[1:0]      I    126, 127                                                     Microprocessor interface (MPI). See section A.6.1.
rw               I    125
addr[6:0]        I    136, 135, 133, 132, 131, 130, 128
data[7:0]        O    152, 151, 149, 147, 145, 143, 141, 140
irq              O    154
DRAM_data[31:0]  I/O  15, 17, 19, 20, 22, 25, 27, 30, 31, 33, 35, 38, 39, 42, 44, 47, 49, 57, 59, 61, 63, 66, 68, 70, 72, 74, 76, 79, 81, 83, 84, 85   DRAM interface. See section A.5.2.
DRAM_addr[10:0]  O    184, 186, 188, 189, 192, 193, 195, 197, 199, 200, 203
RAS              O    11
CAS[3:0]         O    2, 4, 6, 8
WE               O    12
OE               O    204
DRAM_enable      I    112
out_data[8:0]    O    88, 89, 90, 92, 93, 94, 95, 97, 98                           Output port. See section A.4.1.
out_extn         O    87
out_valid        O    99
out_accept       I    100
tck              I    115                                                          JTAG port. See section A.8.
tdi              I    116
tdo              O    120
tms              I    117
trst             I    121
decoder_clock    I    177                                                          Main decoder clock. See section A.7.
reset            I    160                                                          Reset.
Table A.9.2 spatial decoder test signals
Signal name  I/O  Pin number  Explanation
tph0ish      I    122         If override = 1, tph0ish and tph1ish are the inputs for the on-chip two-phase clock. For normal operation set override = 0; tph0ish and tph1ish are then ignored (connect to GND or VDD).
tph1ish      I    123
override     I    110
chiptest     I    111         Set chiptest = 0 for normal operation.
tloop        I    114         Connect to GND or VDD during normal operation.
ramtest      I    109         If ramtest = 1, testing of the on-chip RAMs is enabled. Set ramtest = 0 for normal operation.
pllselect    I    178         If pllselect = 0, the on-chip phase-locked loops are disabled. Set pllselect = 1 for normal operation.
ti           I    180         Two clocks required by the DRAM interface during test operation. Connect to GND or VDD during normal operation.
tq           I    179
pdout        O    207         These two pins connect to an external filter for the phase-locked loop.
pdin         I    206
Table A.9.3 spatial decoder pin assignment
Signal name    Pin  Signal name  Pin  Signal name    Pin  Signal name     Pin
nc             208  nc           156  nc             104  nc              52
test pin       207  nc           155  nc             103  nc              51
test pin       206  irq          154  nc             102  nc              50
GND            205  nc           153  VDD            101  DRAM_data[15]   49
OE             204  data[7]      152  out_accept     100  nc              48
DRAM_addr[0]   203  data[6]      151  out_valid       99  DRAM_data[16]   47
VDD            202  nc           150  out_data[0]     98  nc              46
nc             201  data[5]      149  out_data[1]     97  GND             45
DRAM_addr[1]   200  nc           148  GND             96  DRAM_data[17]   44
DRAM_addr[2]   199  data[4]      147  out_data[2]     95  nc              43
GND            198  GND          146  out_data[3]     94  DRAM_data[18]   42
DRAM_addr[3]   197  data[3]      145  out_data[4]     93  VDD             41
nc             196  nc           144  out_data[5]     92  nc              40
DRAM_addr[4]   195  data[2]      143  VDD             91  DRAM_data[19]   39
VDD            194  nc           142  out_data[6]     90  DRAM_data[20]   38
DRAM_addr[5]   193  data[1]      141  out_data[7]     89  nc              37
DRAM_addr[6]   192  data[0]      140  out_data[8]     88  GND             36
nc             191  nc           139  out_extn        87  DRAM_data[21]   35
GND            190  VDD          138  GND             86  nc              34
DRAM_addr[7]   189  nc           137  DRAM_data[0]    85  DRAM_data[22]   33
DRAM_addr[8]   188  addr[6]      136  DRAM_data[1]    84  VDD             32
VDD            187  addr[5]      135  DRAM_data[2]    83  DRAM_data[23]   31
DRAM_addr[9]   186  GND          134  VDD             82  DRAM_data[24]   30
nc             185  addr[4]      133  DRAM_data[3]    81  nc              29
DRAM_addr[10]  184  addr[3]      132  nc              80  GND             28
GND            183  addr[2]      131  DRAM_data[4]    79  DRAM_data[25]   27
coded_clock    182  addr[1]      130  GND             78  nc              26
VDD            181  VDD          129  nc              77  DRAM_data[26]   25
test pin       180  addr[0]      128  DRAM_data[5]    76  nc              24
test pin       179  enable[0]    127  nc              75  VDD             23
test pin       178  enable[1]    126  DRAM_data[6]    74  DRAM_data[27]   22
decoder_clock  177  rw           125  VDD             73  nc              21
byte_mode      176  GND          124  DRAM_data[7]    72  DRAM_data[28]   20
GND            175  test pin     123  nc              71  DRAM_data[29]   19
coded_extn     174  test pin     122  DRAM_data[8]    70  GND             18
Table A.9.3 spatial decoder pin assignment (continued)
Signal name    Pin  Signal name  Pin  Signal name    Pin  Signal name     Pin
nc             173  trst         121  GND             69  DRAM_data[30]   17
coded_data[7]  172  tdo          120  DRAM_data[9]    68  nc              16
coded_data[6]  171  nc           119  nc              67  DRAM_data[31]   15
VDD            170  VDD          118  DRAM_data[10]   66  VDD             14
coded_data[5]  169  tms          117  VDD             65  nc              13
coded_data[4]  168  tdi          116  nc              64  WE              12
coded_data[3]  167  tck          115  DRAM_data[11]   63  RAS             11
coded_data[2]  166  test pin     114  nc              62  nc              10
GND            165  GND          113  DRAM_data[12]   61  GND             9
coded_data[1]  164  DRAM_enable  112  GND             60  CAS[0]          8
coded_data[0]  163  test pin     111  DRAM_data[13]   59  nc              7
coded_valid    162  test pin     110  nc              58  CAS[1]          6
coded_accept   161  test pin     109  DRAM_data[14]   57  VDD             5
reset          160  nc           108  VDD             56  CAS[2]          4
VDD            159  nc           107  nc              55  nc              3
nc             158  nc           106  nc              54  CAS[3]          2
nc             157  nc           105  nc              53  nc              1
(Note) The meanings of the signal names in Table A.9.3 are given in sections A.7.1 to A.9.1.
A.9.1.1 "nc" no-connect pins
The pins labeled nc in Table A.9.3 are currently unused. These pins should be left unconnected.
A.9.1.2 VDD and GND pins
As will be appreciated by those skilled in the art, all of the VDD and GND pins provided should be connected to the appropriate power supply. Correct device operation cannot be guaranteed unless all of the VDD and GND pins are correctly used.
A.9.1.3 test pin connections for normal operation
Nine pins on the spatial decoder are reserved for internal test use.
Table A.9.4 default test pin connections
Pin number  Connection
            Connect to GND for normal operation
            Connect to VDD for normal operation
            Leave open circuit for normal operation
A.9.1.4 JTAG pins for normal operation
See section A.8.1.
A.9.2 spatial decoder memory map
Table A.9.5 overview of the spatial decoder memory map
Address (hex)  Register group                               See table
0x00...0x03    Interrupt service area                       A.9.6
0x04...0x07    Input circuit registers                      A.9.7
0x08...0x0F    Start code detector registers
0x10...0x15    Buffer start-up control registers            A.9.8
0x16...0x17    Not used
0x18...0x23    DRAM interface configuration registers       A.9.9
0x24...0x26    Buffer manager access and keyhole registers  A.9.10
0x27           Not used
0x28...0x2F    Huffman decoder registers                    A.9.13
0x30...0x39    Inverse quantizer registers                  A.9.14
0x3A...0x3B    Not used
0x3C           Reserved
0x3D...0x3F    Not used
0x40...0x7F    Scratchpad registers
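The memory map in Table A.9.5 can be held as a small lookup table, for example in host software that needs to classify an MPI address. The sketch below is illustrative only: the group names are paraphrased from the table, unused ranges are represented by None, and the function name is an assumption. The 0x00 to 0x7F bound follows from the 7-bit addr[6:0] bus.

```python
# (first address, last address, register group) taken from Table A.9.5.
MEMORY_MAP = [
    (0x00, 0x03, "interrupt service area"),
    (0x04, 0x07, "input circuit registers"),
    (0x08, 0x0F, "start code detector registers"),
    (0x10, 0x15, "buffer start-up control registers"),
    (0x16, 0x17, None),                       # not used
    (0x18, 0x23, "DRAM interface configuration registers"),
    (0x24, 0x26, "buffer manager access and keyhole registers"),
    (0x27, 0x27, None),                       # not used
    (0x28, 0x2F, "Huffman decoder registers"),
    (0x30, 0x39, "inverse quantizer registers"),
    (0x3A, 0x3B, None),                       # not used
    (0x3C, 0x3C, "reserved"),
    (0x3D, 0x3F, None),                       # not used
    (0x40, 0x7F, "scratchpad registers"),
]


def region_for(addr):
    """Return the register group containing a 7-bit MPI address,
    or None if the address falls in an unused range."""
    if not 0x00 <= addr <= 0x7F:
        raise ValueError("addr[6:0] is 7 bits wide")
    for first, last, name in MEMORY_MAP:
        if first <= addr <= last:
            return name
    raise AssertionError("memory map has a gap")  # unreachable if the table is complete
```

Storing the map as data rather than as a chain of conditionals makes it easy to check that the ranges are contiguous and cover the whole address space.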
Table A.9.6 interrupt service area registers
Address (hex)  Bit  Register name                    Reference
0x00           7    chip_event                       CED_EVENT_0
               6    Not used
               5    illegal_length_count_event       SCD_ILLEGAL_LENGTH_COUNT
               4    Reserved; may read 1 or 0        SCD_JPEG_OVERLAPPING_START
               3    overlapping_start_event          SCD_NON_JPEG_OVERLAPPING_START
               2    unrecognised_start_event         SCD_UNRECOGNISED_START
               1    stop_after_picture_event         SCD_STOP_AFTER_PICTURE
               0    non_aligned_start_event          SCD_NON_ALIGNED_START
0x01           7    chip_mask                        CED_MASK_0
               6    Not used
               5    illegal_length_count_mask
               4    Reserved; write 0 to this location   SCD_JPEG_OVERLAPPING_START
               3    non_jpeg_overlapping_start_mask
               2    unrecognised_start_mask
               1    stop_after_picture_mask
               0    non_aligned_start_mask
0x02           7    idct_too_few_event               IDCT_DEFF_NUM
               6    idct_too_many_event              IDCT_SUPER_NUM
               5    accept_enable_event              BS_STREAM_END_EVENT
               4    target_met_event                 BS_TARGET_MET_EVENT
               3    counter_flushed_too_early_event  BS_FLUSH_BEFORE_TARGET_MET_EVENT
               2    counter_flushed_event            BS_FLUSH_EVENT
               1    parser_event                     DEMUX_EVENT
               0    huffman_event                    HUFFMAN_EVENT
0x03           7    idct_too_few_mask
               6    idct_too_many_mask
               5    accept_enable_mask
               4    target_met_mask
               3    counter_flushed_too_early_mask
               2    counter_flushed_mask
               1    parser_mask
               0    huffman_mask
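A byte read from one of the event registers in Table A.9.6 can be turned into a list of event names by testing each bit position. The following sketch covers only the bit layout shown in the table; how events are cleared and how the mask registers gate the interrupt is not modeled, and the dictionary and function names are illustrative.

```python
# Bit position -> event name, from Table A.9.6 (unused and reserved bits omitted).
EVENTS_0X00 = {
    7: "chip_event",
    5: "illegal_length_count_event",
    3: "overlapping_start_event",
    2: "unrecognised_start_event",
    1: "stop_after_picture_event",
    0: "non_aligned_start_event",
}
EVENTS_0X02 = {
    7: "idct_too_few_event",
    6: "idct_too_many_event",
    5: "accept_enable_event",
    4: "target_met_event",
    3: "counter_flushed_too_early_event",
    2: "counter_flushed_event",
    1: "parser_event",
    0: "huffman_event",
}


def decode_events(value, names):
    """List the named events whose bits are set, most significant first."""
    return [names[bit] for bit in sorted(names, reverse=True)
            if (value >> bit) & 1]
```

Bits that the table marks as unused or reserved are simply ignored by this decoder, so a value such as 0x50 read from address 0x00 yields an empty list.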
Table A.9.7 start code detector and input circuit registers
Address (hex)  Bits  Register name                                            Reference
0x04           7     coded_busy
               6     enable_mpi_input
               5     coded_extn
               4:0   Not used
0x05           7:0   coded_data
0x06           7:0   Not used
0x07           7:0   Not used
0x08           7:1   Not used
               0     start_code_detector_access (also input_circuit_access)   CED_SCD_ACCESS
0x09           7:4   Not used                                                 CED_SCE_STATUS
               3     stop_after_picture
               2     discard_extension_data
               1     discard_user_data
               0     ignore_non_aligned
0x0A           7:5   Not used                                                 CED_SCD_CONTROL
               4     insert_sequence_start
               3     discard_all_data
               2:0   start_code_search
0x0B           7:0   Scratchpad register length_count
0x0C           7:0
0x0D           7:2   Not used
               1:0   start_code_detector_coding_standard
0x0E           7:0   start_value
0x0F           7:4   Not used
               3:0   picture_number
Table A.9.8 buffer start-up registers
Address (hex)  Bits  Register name       Reference
0x10           7:1   Not used
               0     startup_access      CED_BS_ACCESS
0x11           7:3   Not used
               2:0   bit_count_prescale  CED_BS_PRESCALE
0x12           7:0   bit_count_target    CED_BS_TARGET
0x13           7:0   bit_count           CED_BS_COUNT
0x14           7:1   Not used
               0     offchip_queue       CED_BS_QUEUE
0x15           7:1   Not used
               0     enable_stream       CED_BS_ENABLE_NXT_STM
Table A.9.9 DRAM interface configuration registers
Address (hex)  Bits  Register name                                  Reference
0x18           7:5   Not used
               4:0   page_start_length                              CED_IT_PAGE_START_LENGTH
0x19           7:4   Not used
               3:0   read_cycle_length
0x1A           7:4   Not used
               3:0   write_cycle_length
0x1B           7:4   Not used
               3:0   refresh_cycle_length
0x1C           7:4   Not used
               3:0   CAS_falling
0x1D           7:4   Not used
               3:0   RAS_falling
0x1E           7:1   Not used
               0     interface_timing_access
0x1F           7:0   refresh_interval
0x20           7     Not used
               6:4   DRAM_addr_strength[2:0]
               3:1   CAS_strength[2:0]
               0     RAS_strength[2]
0x21           7:6   RAS_strength[1:0]
               5:3   OEWE_strength[2:0]
               2:0   DRAM_data_strength[2:0]
0x22           7     Not used (access bit for pad strength etc.?)   CED_DRAM_CONFIGURE
               6     zero_buffers
               5     DRAM_enable
               4     no_refresh
               3:2   row_address_bits[1:0]
               1:0   DRAM_data_width[1:0]
0x23           7:0   Scratchpad register                            CED_PLL_RES_CONFIG
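In Table A.9.9 the 3-bit RAS_strength field straddles two bytes: bit 2 sits in bit 0 of address 0x20, and bits 1:0 sit in bits 7:6 of address 0x21. The sketch below shows one way to pack and unpack the five strength fields according to that layout. The function names are illustrative, and the meaning of each strength value is not modeled.

```python
def pack_strengths(addr, cas, ras, oewe, data):
    """Pack five 3-bit drive-strength fields into the bytes for
    addresses 0x20 and 0x21, following the layout in Table A.9.9."""
    for value in (addr, cas, ras, oewe, data):
        if not 0 <= value <= 7:
            raise ValueError("each strength field is 3 bits wide")
    # 0x20: bit 7 unused, 6:4 DRAM_addr, 3:1 CAS, 0 RAS[2]
    reg_0x20 = (addr << 4) | (cas << 1) | (ras >> 2)
    # 0x21: 7:6 RAS[1:0], 5:3 OEWE, 2:0 DRAM_data
    reg_0x21 = ((ras & 0b11) << 6) | (oewe << 3) | data
    return reg_0x20, reg_0x21


def unpack_strengths(reg_0x20, reg_0x21):
    """Inverse of pack_strengths: recover (addr, cas, ras, oewe, data)."""
    addr = (reg_0x20 >> 4) & 0b111
    cas = (reg_0x20 >> 1) & 0b111
    ras = ((reg_0x20 & 0b1) << 2) | ((reg_0x21 >> 6) & 0b11)
    oewe = (reg_0x21 >> 3) & 0b111
    data = reg_0x21 & 0b111
    return addr, cas, ras, oewe, data
```

Because RAS_strength is split, software that changes it has to update both bytes; keeping the packing in one helper avoids updating only half of the field.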
Table A.9.10 buffer manager access and keyhole registers
Address (hex)  Bits  Register name                   Reference
0x24           7:1   Not used
               0     buffer_manager_access
0x25           7:6   Not used
               5:0   buffer_manager_keyhole_address
0x26           7:0   buffer_manager_keyhole_data
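The keyhole registers in Table A.9.10 give access to the extended address space listed in Table A.9.11. The sketch below illustrates the idea with a toy model and rests on several assumptions that this section does not confirm: that an extended location is reached by first writing its address to buffer_manager_keyhole_address and then accessing buffer_manager_keyhole_data, that the address does not auto-increment, and that multi-byte values such as cdb_base are stored most significant byte first (suggested by only bits 1:0 being used at 0x01). The role of buffer_manager_access is not modeled, and MockChip is a hypothetical stand-in for the real MPI.

```python
KEYHOLE_ADDR = 0x25   # buffer_manager_keyhole_address (bits 5:0)
KEYHOLE_DATA = 0x26   # buffer_manager_keyhole_data


class MockChip:
    """Toy register file with a single keyhole (hypothetical model)."""

    def __init__(self):
        self.regs = {}
        self.extended = [0] * 64   # reachable with a 6-bit keyhole address

    def write(self, addr, value):
        if addr == KEYHOLE_DATA:
            self.extended[self.regs.get(KEYHOLE_ADDR, 0) & 0x3F] = value & 0xFF
        else:
            self.regs[addr] = value & 0xFF

    def read(self, addr):
        if addr == KEYHOLE_DATA:
            return self.extended[self.regs.get(KEYHOLE_ADDR, 0) & 0x3F]
        return self.regs.get(addr, 0)


def keyhole_write(chip, ext_addr, value):
    chip.write(KEYHOLE_ADDR, ext_addr)   # select the extended location
    chip.write(KEYHOLE_DATA, value)      # then write through the keyhole


def keyhole_read(chip, ext_addr):
    chip.write(KEYHOLE_ADDR, ext_addr)
    return chip.read(KEYHOLE_DATA)


def write_cdb_base(chip, value):
    """Write the 18-bit cdb_base across extended addresses 0x01..0x03."""
    for offset, shift in enumerate((16, 8, 0)):
        keyhole_write(chip, 0x01 + offset, (value >> shift) & 0xFF)


def read_cdb_base(chip):
    value = 0
    for offset in range(3):
        value = (value << 8) | keyhole_read(chip, 0x01 + offset)
    return value & 0x3FFFF
```

A keyhole trades access speed for address space: two MPI locations expose a table that would otherwise need its own range in the 7-bit memory map.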
Table A.9.11 buffer manager extended address space
Address (hex)  Bits  Register name  Reference
0x00           7:0   Not used
0x01           7:2   Not used
               1:0   cdb_base
0x02           7:0
0x03           7:0
0x04           7:0   Not used
0x05           7:2   Not used
               1:0   cdb_length
0x06           7:0
0x07           7:0
0x08           7:0   Not used
0x09           7:0   cdb_read
0x0A           7:0
0x0B           7:0
0x0C           7:0   Not used
0x0D           7:0   cdb_number
0x0E           7:0
0x0F           7:0
0x10           7:0   Not used
0x11           7:0   tb_base
0x12           7:0
0x13           7:0
0x14           7:0   Not used
0x15           7:0   tb_length
0x16           7:0
0x17           7:0
0x18           7:0   Not used
0x19           7:0   tb_read
0x1A           7:0
0x1B           7:0
0x1C           7:0   Not used
0x1D           7:0   tb_number
0x1E           7:0
0x1F           7:0
0x20           7:0   Not used
0x21           7:0   buffer_limit
0x22           7:0
0x23           7:0
0x24           7:4   Not used
               3     cdb_full
               2     cdb_empty
               1     tb_full
               0     tb_empty
Table A.9.12 video demux registers
Address (hex)  Bits  Register name            Reference
0x28           7     demux_access             CED_H_CTRL[7]
               6:4   huffman_error_code[2:0]  CED_H_CTRL[5:4]
               3:0   Private Huffman control bits: [3] selects special CBP, [2] selects 4/8-bit fixed-length CBP
0x29           7:0   parser_error_code        CED_H_DMUX_ERR
0x2A           7:4   Not used
               3:0   demux_keyhole_address    CED_H_KEYHOLE_ADDR
0x2B           7:0
0x2C           7:0   demux_keyhole_data       CED_H_KEYHOLE
0x2D           7     dummy_last_picture       CED_H_ALU_REG0, r_dummy_last_frame_bit
               6     field_info               CED_H_ALU_REG0, r_field_info_bit
               5:1   Not used
               0     continue                 CED_H_ALU_REG0, r_continue_bit
0x2E           7:0   rom_revision             CED_H_ALU_REG1
0x2F           7:0   Private register
0x2F           7     CED_H_TRACE_EVENT: write 1 to single step; reads back 1 when the step has completed
               6     CED_H_TRACE_MASK: set to 1 to enter single-step mode
               5     CED_H_TRACE_RST: partial reset when sequenced 1, 0
               4:0   Not used
Table A.9.13 video demux extended address space
Address (hex)  Bits  Register name  Reference
0x00...0x0F    7:0   Not used
0x10           7:0   horiz_pels     r_horiz_pels
0x11           7:0
0x12           7:0   vert_pels      r_vert_pels
0x13           7:0
0x14           7:2   Not used
               1:0   buffer_size    r_buffer_size
0x15           7:0
0x16           7:4   Not used
               3:0   pel_aspect     r_pel_aspect
0x17           7:2   Not used
               1:0   bit_rate       r_bit_rate
0x18           7:0
0x19           7:0
0x1A           7:4   Not used
               3:0   pic_rate       r_pic_rate
0x1B           7:1   Not used
               0     constrained    r_constrained
0x1C           7:0   picture_type
0x1D           7:0   h261_pic_type
0x1E           7:2   Not used
               1:0   broken_closed
0x1F           7:5   Not used
               4:0   prediction_mode
0x20           7:0   vbv_delay
0x21           7:0
0x22           7:0   Private register: MPEG full_pel_fwd, JPEG pending_frame_change
0x23           7:0   Private register: MPEG full_pel_bwd, JPEG restart_index
0x24           7:0   Private register: horiz_mb_copy
0x25           7:0   pic_number
0x26           7:2   Not used
               1:0   max_h
0x27           7:2   Not used
               1:0   max_v
0x28           7:0   Private register: scratch1
0x29           7:0   Private register: scratch2
0x2A           7:0   Private register: scratch3
0x2B           7:0   MPEG unused1, H.261 ingob
0x2C           7:0   Private register: MPEG first_group, JPEG first_scan
0x2D           7:0   Private register: MPEG in_picture
0x2E           7     dummy_last_picture   r_rom_control
               6     field_info
               5:1   Not used
               0     continue
0x2F           7:0   rom_revision
0x30           7:2   Not used
               1:0   dc_huff_0
0x31           7:2   Not used
               1:0   dc_huff_1
0x32           7:2   Not used
               1:0   dc_huff_2
0x33           7:2   Not used
               1:0   dc_huff_3
0x34           7:2   Not used
               1:0   ac_huff_0
0x35           7:2   Not used
               1:0   ac_huff_1
0x36           7:2   Not used
               1:0   ac_huff_2
0x37           7:2   Not used
               1:0   ac_huff_3
0x38           7:2   Not used
               1:0   tq_0              r_tq_0
0x39           7:2   Not used
               1:0   tq_1              r_tq_1
0x3A           7:2   Not used
               1:0   tq_2              r_tq_2
0x3B           7:2   Not used
               1:0   tq_3              r_tq_3
0x3C           7:0   component_name_0  r_c_0
0x3D           7:0   component_name_1  r_c_1
0x3E           7:0   component_name_2  r_c_2
0x3F           7:0   component_name_3  r_c_3
0x40...0x53    7:0   Private registers
0x40           7:0   r_dc_pred_0
0x41           7:0
0x42           7:0   r_dc_pred_1
0x43           7:0
0x44           7:0   r_dc_pred_2
0x45           7:0
0x46           7:0   r_dc_pred_3
0x47           7:0
0x48...0x4F    7:0   Not used
0x50           7:0   r_prev_mhf
0x51           7:0
0x52           7:0   r_prev_mvf
0x53           7:0
0x54           7:0   r_prev_mhb
0x55           7:0
0x56           7:0   r_prev_mvb
0x57           7:0
0x58...0x5F    7:0   Not used
0x60           7:0   r_horiz_mbcnt
0x61           7:0
0x62           7:0   r_vert_mbcnt
0x63           7:0
0x64           7:0   horiz_macroblocks   r_horiz_mbs
0x65           7:0
0x66           7:0   vert_macroblocks    r_vert_mbs
0x67           7:0
0x68           7:0   Private register    r_restart_cnt
0x69           7:0
0x6A           7:0   restart_interval    r_restart_int
0x6B           7:0
0x6C           7:0   Private register    r_blk_h_cnt
0x6D           7:0   Private register    r_blk_v_cnt
0x6E           7:0   Private register    r_compid
0x6F           7:0   max_component_id    r_max_compid
0x70           7:0   coding_standard     r_coding_std
0x71           7:0   Private register    r_pattern
0x72           7:0   Private register    r_fwd_r_size
0x73           7:0   Private register    r_bwd_r_size
0x74...0x77    7:0   Not used
0x78           7:2   Not used
               1:0   blocks_h_0          r_blk_h_0
0x79           7:2   Not used
               1:0   blocks_h_1           r_blk_h_1
0x7A           7:2   Not used
               1:0   blocks_h_2           r_blk_h_2
0x7B           7:2   Not used
               1:0   blocks_h_3           r_blk_h_3
0x7C           7:2   Not used
               1:0   blocks_v_0           r_blk_v_0
0x7D           7:2   Not used
               1:0   blocks_v_1           r_blk_v_1
0x7E           7:2   Not used
               1:0   blocks_v_2           r_blk_v_2
0x7F           7:2   Not used
               1:0   blocks_v_3           r_blk_v_3
0x80...0xFF    7:0   Not used
0x100...0x10F  7:0   dc_bits_0[15:0]      CED_H_KEY_DC_CPB0
0x110...0x11F  7:0   dc_bits_1[15:0]      CED_H_KEY_DC_CPB1
0x120...0x13F  7:0   Not used
0x140...0x14F  7:0   ac_bits_0[15:0]      CED_H_KEY_AC_CPB0
0x150...0x15F  7:0   ac_bits_1[15:0]      CED_H_KEY_AC_CPB1
0x160...0x17F  7:0   Not used
0x180          7:0   dc_zssss_0           CED_H_KEY_ZSSSS_INDEX0
0x181          7:0   dc_zssss_1           CED_H_KEY_ZSSSS_INDEX1
0x182...0x187  7:0   Not used
0x188          7:0   ac_eob_0             CED_H_KEY_EOB_INDEX0
0x189          7:0   ac_eob_1             CED_H_KEY_EOB_INDEX1
0x18A...0x18B  7:0   Not used
0x18C          7:0   ac_zrl_0             CED_H_KEY_ZRL_INDEX0
0x18D          7:0   ac_zrl_1             CED_H_KEY_ZRL_INDEX1
0x18E...0x1FF  7:0   Not used
0x200...0x2AF  7:0   ac_huffval_0[161:0]  CED_H_KEY_AC_ITOD_0
0x2B0...0x2BF  7:0   dc_huffval_0[11:0]   CED_H_KEY_DC_ITOD_0
0x2C0...0x2FF  7:0   Not used
0x300...0x3AF  7:0   ac_huffval_1[161:0]  CED_H_KEY_AC_ITOD_1
0x3B0...0x3BF  7:0   dc_huffval_1[11:0]   CED_H_KEY_DC_ITOD_1
0x3C0...0x7FF  7:0   Not used
0x800...0xACF  7:0   Private registers
0x800...0x80F  7:0   CED_KEY_TCOEFF_CPB
0x810...0x81F  7:0   CED_KEY_CBP_CPB
0x820...0x82F  7:0   CED_KEY_MBA_CPB
0x830...0x83F  7:0   CED_KEY_MVD_CPB
0x840...0x84F  7:0   CED_KEY_MTYPE_I_CPB
0x850...0x85F  7:0   CED_KEY_MTYPE_P_CPB
0x860...0x86F  7:0   CED_KEY_MTYPE_B_CPB
0x870...0x88F  7:0   CED_KEY_MTYPE_H261_CPB
0x880...0x900  7:0   Not used
0x901          7:0   CED_KEY_HDSTROM_0
0x902          7:0   CED_KEY_HDSTROM_1
0x903...0x90F  7:0   CED_KEY_HDSTROM_2
0x910...0xABF  7:0   Not used
0xAC0          7:0   CED_KEY_DMX_WORD_0
0xAC1          7:0   CED_KEY_DMX_WORD_1
0xAC2          7:0   CED_KEY_DMX_WORD_2
0xAC3          7:0   CED_KEY_DMX_WORD_3
0xAC4          7:0   CED_KEY_DMX_WORD_4
0xAC5          7:0   CED_KEY_DMX_WORD_5
0xAC6          7:0   CED_KEY_DMX_WORD_6
0xAC7          7:0   CED_KEY_DMX_WORD_7
0xAC8          7:0   CED_KEY_DMX_WORD_8
0xAC9          7:0   CED_KEY_DMX_WORD_9
0xACA...0xACB  7:0   Not used
0xACC          7:0   CED_KEY_DMX_AINCR
0xACD          7:0
0xACE          7:0   CED_KEY_DMX_CC
0xACF          7:0
Table A.9.14 inverse quantizer registers
Address (hex)  Bits  Register name                                          Reference
               7:0   Not used
0x30           7:1   Not used
               0     iq_access
0x31           7:2   Not used
               1:0   iq_coding_standard
0x32           7:5   Not used
               4:0   Scratchpad register iq_scale
0x33           7:2   Not used
               1:0   Scratchpad register iq_component
0x34           7:2   Not used
               1:0   Scratchpad register inverse_quantiser_prediction_mode
0x35           7:0   Scratchpad register jpeg_indirection
0x36           7:2   Not used
               1:0   Scratchpad register mpeg_indirection
0x37           7:0   Not used
0x38           7:0   iq_table_keyhole_address
0x39           7:0   iq_table_keyhole_data
Table A.9.15 iq (inverse quantization) table extended address space
Address (hex)  Register name
0x00:0x3F      JPEG inverse quantization table 0; MPEG default intra table
0x40:0x7F      JPEG inverse quantization table 1; MPEG default non-intra table
0x80:0xBF      JPEG inverse quantization table 2; MPEG downloaded intra table
0xC0:0xFF      JPEG inverse quantization table 3; MPEG downloaded non-intra table
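Table A.9.15 divides the 256-location keyhole space into four tables of 64 entries each, so the extended address of an entry follows directly from the table number and the entry index. A minimal sketch follows; the function name is illustrative, and the ordering of coefficients within a table is not specified in this section.

```python
ENTRIES_PER_TABLE = 64   # each table spans 0x40 addresses in Table A.9.15


def iq_table_address(table, index):
    """Extended address of entry `index` (0..63) in table `table` (0..3)."""
    if not 0 <= table <= 3:
        raise ValueError("there are four tables, numbered 0 to 3")
    if not 0 <= index < ENTRIES_PER_TABLE:
        raise ValueError("each table holds 64 entries")
    return table * ENTRIES_PER_TABLE + index
```

The result fits in the 8-bit iq_table_keyhole_address register at 0x38, with the table number occupying the top two bits and the entry index the lower six.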
A.10 coded data input
A system according to the present invention must know which video standard is being input for processing. Thereafter, the system can accept either pre-existing tokens or raw byte data, which is then placed into tokens by the start code detector.
Consequently, coded data and configuration tokens can be supplied to the spatial decoder via two routes:
The coded data input port
The microprocessor interface (MPI)
Which route is chosen depends on the application and the system environment. For example, at low data rates it might be possible to use a single microprocessor both to control the decoder chip set and to demultiplex the system bitstream. In this case, the coded data could be input through the MPI. Alternatively, a high coded data rate might require that coded data be supplied via the coded data port.
In some applications it may be appropriate to use a mixture of MPI and coded data port input.
A.10.1 coded data port
Table A.10.1 Coded data port signals
Signal name | Input/Output | Description
coded_clock | Input | A clock, running at up to 30 MHz, that controls the operation of the input circuit.
coded_data[7:0] | Input | The standard 11 wires required to implement a token port transferring 8-bit data values. See section A.4 for an electrical description of this interface. Off-chip circuitry must assemble the coded data into tokens.
coded_extn | Input | See coded_data[7:0].
coded_valid | Input | See coded_data[7:0].
coded_accept | Output | See coded_data[7:0].
byte_mode | Input | When high, this signal indicates that information is transferred across the coded data port in byte mode rather than token mode.
The coded data port of the present invention can be operated in two modes: token mode and byte mode.
A.10.1.1 Token mode
In the present invention, if byte_mode is low, the coded data port operates as a token port in the normal way and accepts tokens under the control of coded_valid and coded_accept. See section A.4 for details of the electrical operation of this interface.
The signal byte_mode is sampled at the same time as coded_data[7:0], coded_extn and coded_valid, that is, on the rising edge of coded_clock.
A.10.1.2 Byte mode
If, however, byte_mode is high, a byte of data is transferred on coded_data[7:0] under the control of the two-wire interface control signals coded_valid and coded_accept. In this case, coded_extn is ignored. The bytes are assembled on-chip into DATA tokens until the input mode is changed.
1) First word ("head") of a token supplied in token mode.
2) Last word of the token supplied (coded_extn goes low).
3) First byte of data supplied in byte mode. A new DATA token is automatically created on-chip.
A.10.2 Supplying data via the MPI
Tokens can be supplied to the spatial decoder through the MPI by accessing the coded data input registers.
A.10.2.1 Writing tokens via the MPI
The coded data registers of the present invention are grouped into two bytes in the memory map to allow efficient data transfer. The 8 data bits, coded_data[7:0], are in one location, and the control registers coded_busy, enable_mpi_input and coded_extn are in a second location (see Table A.9.7).
When configured for token input via the MPI, the current token is extended with the current value of coded_extn each time a value is written into coded_data[7:0]. Software must set coded_extn to 0 before writing the last word of any token to coded_data[7:0].
For example, a DATA token is started by writing 1 to coded_extn and then 0x04 to coded_data[7:0]. The start of this new DATA token then passes into the spatial decoder for processing.
Each time a new 8-bit value is written to coded_data[7:0], the current token is extended. coded_extn need only be accessed again when terminating the current token, for example to introduce another token. The last word of the current token is indicated by writing 0 to coded_extn and then writing the last word of the current token to coded_data[7:0].
Table A.10.2 Coded data input registers
Register name | Size/Direction | Reset state | Description
coded_extn | 1 / rw | Undefined | Tokens can be supplied to the spatial decoder via the MPI by writing to these registers.
coded_data[7:0] | 8 / w | Undefined | See coded_extn.
coded_busy | 1 / r | 1 | This status register indicates whether the spatial decoder is able to accept tokens written to coded_data[7:0]. The value 1 indicates that the interface is busy and cannot accept data. The behaviour is undefined if the user attempts to write to coded_data[7:0] while coded_busy = 1.
enable_mpi_input | 1 / rw | 0 | The value in this register controls whether coded data is input to the spatial decoder through the coded data port (0) or through the MPI (1).
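The register sequence for writing a token through the MPI can be sketched in software as follows. This is a minimal model only: the class FakeMpi and the function write_token are hypothetical names introduced for illustration, and the model simply records each word together with the value of coded_extn in force when it was written. It is not a description of the actual hardware or of any real driver.

```python
class FakeMpi:
    """Hypothetical software model of the coded data registers (not the real hardware)."""

    def __init__(self):
        self.coded_busy = 0   # 1 would mean the interface cannot accept data
        self.coded_extn = 0   # extension bit applied to the next word written
        self.accepted = []    # (word, extn) pairs passed on to the decoder model

    def write_coded_data(self, value):
        # Writing while coded_busy = 1 is undefined in the text; this model rejects it.
        if self.coded_busy:
            raise RuntimeError("coded_data written while coded_busy = 1")
        self.accepted.append((value & 0xFF, self.coded_extn))


def write_token(mpi, words):
    """Write one token: extn = 1 for every word except the last, which uses extn = 0."""
    if not words:
        raise ValueError("a token needs at least one word")
    last = len(words) - 1
    for index, word in enumerate(words):
        wanted_extn = 0 if index == last else 1
        if mpi.coded_extn != wanted_extn:  # coded_extn is rewritten only when it changes
            mpi.coded_extn = wanted_extn
        while mpi.coded_busy:              # poll until the interface is ready
            pass
        mpi.write_coded_data(word)


mpi = FakeMpi()
write_token(mpi, [0x04, 0x12, 0x34])  # 0x04 is the head word that starts a DATA token
```

In this sketch coded_extn is touched twice per token at most, once at the head and once before the last word, which matches the observation that it need only be accessed again when the current token is terminated.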
Before each write to coded_data[7:0], coded_busy should be checked to see whether the interface is ready to accept more data.
A.10.3 Switching between input modes
Provided suitable precautions are observed, the data input mode can be changed dynamically. In general, the transfer of a token by any one route should be completed before the mode is switched.
Table A.10.3 Switching data input modes
Previous mode | Next mode | Behaviour
Byte | Token | The on-chip circuitry uses the last byte supplied in byte mode as the last byte of the DATA token that it was constructing (that is, the extn bit is set to 0) before accepting the next token.
Byte | MPI input | As above.
Token | Byte | The off-chip circuitry supplying the token in token mode is responsible for completing the token (that is, with the extn bit of the last byte of information set to 0) before selecting byte mode.
Token | MPI input | Access to the input through the MPI is not granted (that is, coded_busy remains set to 1) until the off-chip circuitry supplying the token in token mode has completed the token (that is, with the extn bit of the last byte of information set to 0).
MPI input | Byte | The control software must have completed the token (that is, with the extn bit of the last byte of information set to 0) before enable_mpi_input is set to 0.
MPI input | Token | As above.
The first byte supplied in byte mode causes a DATA token header to be generated on-chip. Any further bytes transferred in byte mode are then appended to this DATA token until the input mode changes. Recall that a DATA token can contain as many bits as are necessary.
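The way byte-mode input is wrapped into a DATA token can be illustrated with a short sketch. It assumes that the head word of a DATA token is 0x04 (the value written to start a DATA token through the MPI) and that the payload holds every byte supplied before the input mode changes. The function name bytes_to_data_token is hypothetical.

```python
DATA_HEAD = 0x04  # head word of a DATA token, as used in the MPI example in the text


def bytes_to_data_token(payload: bytes):
    """Return the (word, extn) pairs of one DATA token carrying the given bytes.

    Assumption: 'payload' is everything supplied in byte mode before the mode
    changes, so its final byte becomes the last word of the token (extn = 0).
    """
    if not payload:
        raise ValueError("no bytes were supplied in byte mode")
    words = [(DATA_HEAD, 1)]                 # head is generated on the first byte
    last = len(payload) - 1
    for index, byte in enumerate(payload):
        words.append((byte, 0 if index == last else 1))
    return words
```

For example, bytes_to_data_token(b"\x00\x00\x01") yields the head word followed by three data words, with the extension bit low only on the final word.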
The MPI register bit coded_busy and the signal coded_accept indicate the interface on which the spatial decoder is ready to accept data. Correct observation of these signals ensures that no data is lost.
A.10.4 Rate of accepting coded data
In the present invention, the input circuit passes tokens to the start code detector (see section A.11). The start code detector analyses the data in DATA tokens bit-serially. Its normal rate of processing is one bit per clock cycle (of coded_clock). Accordingly, it typically decodes one byte of coded data every 8 cycles of coded_clock. However, extra processing cycles are occasionally required, for example when a non-DATA token is supplied or when a start code is encountered in the coded data. When this happens, the start code detector is unable to accept more information for a short period.
After the start code detector, data passes into a first logical coded data buffer. If this buffer fills, the start code detector is unable to accept more information.
Consequently, no more coded data (or other tokens) is accepted from either the coded data port or the MPI while the start code detector is unable to accept more information. This is indicated by the state of the signal coded_accept and the register coded_busy.
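As a rough worked example of the figures above, the peak acceptance rate follows directly from the clock frequency. The sketch below assumes the ideal case of exactly one bit per coded_clock cycle, with no extra cycles for non-DATA tokens, start codes or a full buffer, so it gives an upper bound rather than a sustained rate. The function name peak_coded_rate is hypothetical.

```python
def peak_coded_rate(coded_clock_hz: float):
    """Return (bits per second, bytes per second) at one bit per coded_clock cycle."""
    bits_per_second = coded_clock_hz       # one bit is processed per clock cycle
    bytes_per_second = coded_clock_hz / 8  # hence one byte every 8 clock cycles
    return bits_per_second, bytes_per_second


# At the maximum coded_clock frequency of 30 MHz given in Table A.10.1:
bits, byts = peak_coded_rate(30e6)
```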
By using coded_accept and/or coded_busy, the user is guaranteed that no coded information is lost. However, as one of ordinary skill in the art will appreciate, the system must either be able to buffer newly arriving coded data or be able to stop new data from arriving whenever the spatial decoder is unable to accept data.
A.10.5 Coded data clock
In accordance with the present invention, the coded data port, the input circuit and other functions in the spatial decoder are controlled by coded_clock. Furthermore, this clock can be asynchronous to the main decoder_clock. Data transfer is synchronized to decoder_clock on-chip.
A.11 Start code detector
A.11.1 Start codes
As is well known in the art, MPEG and H.261 coded video streams contain identifiable bit patterns called start codes. Marker codes serve a similar function in JPEG. Start/marker codes identify significant parts of the syntax of the coded data stream. The analysis of start/marker codes performed by the start code detector is the first stage in parsing the coded data. The start code detector is the first block on the spatial decoder following the input circuit.
Start/marker code patterns are designed so that they can be identified without decoding the entire bit stream. Thus, in accordance with the present invention, they can be used to help with error recovery and decoder start-up. The start code detector provides facilities to detect errors in the coded data structure and to assist the start-up of the decoder.
A.11.2 Start code detector registers
As previously discussed, many of the start code detector registers are in constant use by the start code detector. Accessing these registers is therefore unreliable while the start code detector is processing data. The user is responsible for ensuring that the start code detector is halted before accessing its registers.
The register start_code_detector_access is used to halt the start code detector and so allow access to its registers. The start code detector also halts after it generates an interrupt.
There are further constraints on when the start code search and discard-all-data modes can be initiated. These are described in A.11.8 and A.11.5.1.
Table A.11.1 Start code detector registers
Register name | Size/Direction | Reset state | Description
start_code_detector_access | 1 / rw | 0 | Writing 1 to this register requests that the start code detector halt to allow access to its registers. The user should wait until the value 1 can be read back from this register, indicating that operation has halted and access is possible.
illegal_length_count_event | 1 / rw | 0 | An illegal length count event occurs if, while decoding JPEG data, a length count field is found carrying a value less than 2. This should only occur as the result of an error in the JPEG data. If the mask register is set to 1, an interrupt can be generated and the start code detector halts. Behaviour following an error is undefined if the error is suppressed (mask register set to 0). See A.11.4.1.
illegal_length_count_mask | 1 / rw | 0 | See illegal_length_count_event.
jpeg_overlapping_start_event | 1 / rw | 0 | If the coding standard is JPEG and the sequence 0xFF 0xFF is found while looking for a marker code, this event occurs. This sequence is a legal stuffing sequence. If the mask register is set to 1, an interrupt can be generated and the start code detector halts. See A.11.4.2.
jpeg_overlapping_start_mask | 1 / rw | 0 | See jpeg_overlapping_start_event.
overlapping_start_event | 1 / rw | 0 | If the coding standard is MPEG or H.261 and an overlapping start code is found while looking for a start code, this event occurs. If the mask register is set to 1, an interrupt can be generated and the start code detector halts. See A.11.4.2.
overlapping_start_mask | 1 / rw | 0 | See overlapping_start_event.
Table A.11.1 Start code detector registers (continued)
Register name | Size/Direction | Reset state | Description
unrecognised_start_event | 1 / rw | 0 | This event occurs if an unrecognised start code is encountered. If the mask register is set to 1, an interrupt can be generated and the start code detector halts. While the start code detector is halted, the start code value read from the bit stream is available in the register start_value. See A.11.4.3. During normal operation, start_value contains the value of the most recently decoded start/marker code. In H.261 operation, only the 4 least significant bits of start_value are used; the 4 most significant bits are 0.
unrecognised_start_mask | 1 / rw | 0 | See unrecognised_start_event.
start_value | 8 / ro | Undefined | See unrecognised_start_event.
stop_after_picture_event | 1 / rw | 0 | If the register stop_after_picture is set to 1, a stop-after-picture event is generated after the end of a picture has passed through the start code detector. If the mask register is set to 1, an interrupt can be generated and the start code detector halts. See A.11.5.1. stop_after_picture is not reset to 0 after the end of a picture has been detected, and so must be cleared directly.
stop_after_picture_mask | 1 / rw | 0 | See stop_after_picture_event.
stop_after_picture | 1 / rw | 0 | See stop_after_picture_event.
Table A.11.1 Start code detector registers (continued)
Register name | Size/Direction | Reset state | Description
non_aligned_start_event | 1 / rw | 0 | When ignore_non_aligned is set to 1, start codes that are not byte-aligned are ignored (treated as normal data). When ignore_non_aligned is set to 0, H.261 and MPEG start codes are detected regardless of byte alignment, and the non-aligned start event is generated. If the mask register is set to 1, the event causes an interrupt and the start code detector halts. See A.11.6. If the coding standard is configured as JPEG, ignore_non_aligned is ignored and the non-aligned start event is never generated.
non_aligned_start_mask | 1 / rw | 0 | See non_aligned_start_event.
ignore_non_aligned | 1 / rw | 0 | See non_aligned_start_event.
discard_extension_data | 1 / rw | 1 | When these registers are set to 1, extension or user data that cannot be decoded by the spatial decoder is discarded by the start code detector. See A.11.3.3.
discard_user_data | 1 / rw | 1 | See discard_extension_data.
discard_all_data | 1 / rw | 0 | When set to 1, all data and tokens are discarded by the start code detector. This continues until a FLUSH token is supplied or the register is directly cleared to 0. The FLUSH token that resets this register is discarded and is not output by the start code detector. See A.11.5.1.
insert_sequence_start | 1 / rw | 1 | See A.11.7.
Table A.11.1 Start code detector registers (continued)
Register name | Size/Direction | Reset state | Description
start_code_search | 3 / rw | 5 | When this register is set to 0, the start code detector operates normally. When it is set to a higher value, the start code detector discards data until a start code of the specified type is detected. When the specified start code is detected, this register is set to 0 and normal operation resumes. See A.11.3.
start_code_detector_coding_standard | 2 / rw | 0 | This register configures the coding standard used by the start code detector. The register can be loaded directly or by a CODING_STANDARD token. Whenever the start code detector generates a CODING_STANDARD token (see A.11.7.4), the token carries the detector's current coding standard configuration. This token then configures the coding standard used by all other parts of the decoder chip set. See A.21.1 and A.11.7.
picture_number | 4 / rw | 0 | Each time the start code detector detects a picture start code in the data stream (or the H.261 or JPEG equivalent), a PICTURE_START token is generated carrying the current value of picture_number. This register is then incremented.
length_count | 16 / ro | 0 | This register contains the current value of the JPEG length count. The register is modified under the control of the coded data clock and should only be read via the MPI while the start code detector is halted.
A.11.3 Conversion of start codes to tokens
In normal operation, the function of the start code detector is to identify start codes in the data stream and convert them into the appropriate start code tokens. In the simplest case, data is supplied to the start code detector in a single long DATA token. The output of the start code detector is a number of shorter DATA tokens interleaved with start code tokens.
Alternatively, in accordance with the present invention, the input data to the start code detector can be divided into a number of shorter DATA tokens. There is no restriction on how the coded data is divided into DATA tokens, other than that each DATA token must contain 8 × n bits, where n is an integer.
Other tokens can be supplied directly to the input of the start code detector. In this case, the tokens pass through the start code detector, without processing, to the other stages of the spatial decoder. These tokens can only be inserted immediately before the position of a start code in the coded data.
A.11.3.1 Start code formats
Three different start code formats are recognized by the start code detector of the present invention. The format is configured through the register start_code_detector_coding_standard.
Table A.11.3 Start code formats
Coding standard | Start code pattern (hex) | Size of start code value
MPEG | 0x00 0x00 0x01 <value> | 8 bit
JPEG | 0xFF <value> | 8 bit
H.261 | 0x00 0x01 <value> | 4 bit
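The conversion of one long run of coded data into shorter DATA tokens interleaved with start codes can be sketched for the MPEG format in Table A.11.3. The sketch is a simplification: it works on whole bytes and therefore finds byte-aligned start codes only, whereas the device analyses the data bit-serially, and it ignores overlapping start codes. The function name split_mpeg_start_codes and the tuple representation of tokens are hypothetical.

```python
MPEG_PREFIX = b"\x00\x00\x01"  # MPEG start code pattern; one value byte follows it


def split_mpeg_start_codes(data: bytes):
    """Split coded data into ("DATA", bytes) and ("START_CODE", value) items.

    Simplified model: byte-aligned start codes only, no overlap handling.
    """
    items = []
    run_start = 0   # start of the data run not yet emitted
    search_from = 0
    while True:
        index = data.find(MPEG_PREFIX, search_from)
        if index < 0 or index + 3 >= len(data):
            break   # no further complete start code (pattern plus value byte)
        if index > run_start:
            items.append(("DATA", data[run_start:index]))
        items.append(("START_CODE", data[index + 3]))
        search_from = run_start = index + 4
    if run_start < len(data):
        items.append(("DATA", data[run_start:]))
    return items
```

Each DATA item produced this way holds a whole number of bytes, so it satisfies the 8 × n bit requirement stated for DATA tokens.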
A.11.3.2 Start code token equivalents
Having detected a start code, the start code detector examines the value associated with the start code and generates an appropriate token. In general, the tokens are named after the relevant MPEG syntax, although one of ordinary skill in the art will appreciate that the tokens could follow other naming conventions. The coding standard currently selected configures the relationship between start code value and generated token. This relationship is shown in Table A.11.4.
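For the MPEG case, the relationship between start code value and token can be sketched as a simple lookup. The values used are the standard MPEG video start code values (picture 0x00, slices 0x01 to 0xAF, user data 0xB2, sequence header 0xB3, extension 0xB5, sequence end 0xB7, group 0xB8). The function name mpeg_start_code_token is hypothetical, and returning None for any other value is an assumption made for illustration; in the device such values would be candidates for the unrecognised start code event.

```python
def mpeg_start_code_token(value):
    """Return the token name for an 8-bit MPEG start code value, or None if not listed."""
    fixed = {
        0x00: "PICTURE_START",
        0xB2: "USER_DATA",
        0xB3: "SEQUENCE_START",
        0xB5: "EXTENSION_DATA",
        0xB7: "SEQUENCE_END",
        0xB8: "GROUP_START",
    }
    if value in fixed:
        return fixed[value]
    if 0x01 <= value <= 0xAF:
        return "SLICE_START"  # the token also carries an 8-bit field set from the start code
    return None               # assumption: treated as an unrecognised start code
```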
Table A.11.4 Tokens from start code values
Start code token generated | MPEG start code value (hex) | H.261 start code value (hex) | JPEG marker value (hex) | JPEG marker name
PICTURE_START | 0x00 | 0x00 | 0xDA | SOS
SLICE_START (a) | 0x01 to 0xAF | 0x01 to 0x0C | 0xD0 to 0xD7 | RST0 to RST7
SEQUENCE_START | 0xB3 | | 0xD8 | SOI
SEQUENCE_END | 0xB7 | | 0xD9 | EOI
GROUP_START | 0xB8 | | 0xC0 | SOF0 (b)
USER_DATA | 0xB2 | | 0xE0 to 0xEF | APP0 to APPF
 | | | 0xFE | COM
EXTENSION_DATA | 0xB5 | | 0xC8 | JPG
 | | | 0xF0 to 0xFD | JPG0 to JPGD
 | | | 0x02 to 0xBF | RES
 | | | 0xC1 to 0xCB | SOF1 to SOF11
 | | | 0xCC | DAC
DHT_MARKER | | | 0xC4 | DHT
DNL_MARKER | | | 0xDC | DNL
DQT_MARKER | | | 0xDB | DQT
DRI_MARKER | | | 0xDD | DRI
(a) This token contains an 8-bit data field that is loaded with a value determined by the start code.
(b) Indicates the start of baseline DCT coded data.
A.11.3.3 Extended features of the coding standards
The coding standards provide a number of mechanisms that allow data to be embedded in the data stream whose use is not currently defined by the coding standard. This may be application-specific "user data" that provides extra facilities for a particular manufacturer. Alternatively, it may be "extension data". The coding standards authorities have reserved the right to use extension data to add features to the coding standards in the future.
Two distinct mechanisms are employed. JPEG precedes blocks of user and extension data with marker codes. H.261, however, inserts "extra information", indicated by an extra information bit in the coded data. MPEG can use both of these techniques.
In accordance with the present invention, MPEG/JPEG blocks of user and extension data preceded by start/marker codes can be detected by the start code detector. H.261/MPEG "extra information" is detected by the Huffman decoder of the present invention. See A.14.7, "Receiving extra information".
The registers discard_extension_data and discard_user_data allow the start code detector to be configured to discard user data and extension data. If this data is not discarded at the start code detector, it can be accessed when it reaches the video demultiplexer. See A.14.6, "Receiving user and extension data".
The spatial decoder of the present invention supports the baseline features of JPEG. The non-baseline features of JPEG are viewed as extension data by the spatial decoder. Therefore, all JPEG marker codes that precede data for non-baseline JPEG are treated as extension data.
A.11.3.4 JPEG table definitions
JPEG supports downloaded Huffman and quantization tables. In JPEG data, the definitions of these tables are preceded by the marker codes DHT and DQT. When these marker codes are detected, the start code detector generates the tokens DHT_MARKER and DQT_MARKER. These tokens indicate to the video demultiplexer that the DATA token that follows contains coded data describing a Huffman or quantization table (using the formats described in JPEG).
A.11.4 Error detection
The start code detector can detect certain errors in the coded data and provides some facilities to allow the decoder to recover after an error is detected (see A.11.8, "Start code search").
A.11.4.1 Illegal JPEG length count
Most JPEG marker codes have a 16-bit length count field associated with them. This field indicates how much data is associated with the marker code. Length counts of 0 and 1 are illegal. An illegal length should only occur following a data error. In the present invention, this generates an interrupt if illegal_length_count_mask is set to 1.
Recovery from errors in JPEG data is likely to require additional application-specific data because of the difficulty of searching for start codes in JPEG data (see A.11.8.1).
A.11.4.2 Overlapping start/marker codes
In the present invention, overlapping start codes should only occur following a data error. An MPEG byte-aligned overlapping start code is illustrated in Figure 64. Here, the start code detector first sees a pattern that looks like a picture start code. Next, the start code detector sees that this picture start code is overlapped by a group start code. Accordingly, the start code detector generates an overlapping start event. Furthermore, the start code detector generates an interrupt and halts if overlapping_start_mask is set to 1.
It is difficult to tell which of the two start codes is the correct one and which was caused by a data error. However, the start code detector of the present invention discards the first start code and, after the overlapping start code event has been serviced, continues decoding the second start code as if it were correct. If there is a series of overlapping start codes, the start code detector discards all but the last (generating an event for each overlapping start code).
Similar errors are possible in non-byte-aligned systems (H.261 or possibly MPEG). In this case, the state of ignore_non_aligned must also be considered. Figure 65 illustrates an example in which the first start code found is byte-aligned but overlaps a non-aligned start code. If ignore_non_aligned is set to 1, the second, overlapping start code is treated as data by the start code detector and so no overlapping start code event occurs. This conceals a possible data communications error. If ignore_non_aligned is set to 0, the start code detector sees the second, non-aligned start code and sees that it overlaps the first start code.
A.11.4.3 Unrecognised start codes
The start code detector can generate an interrupt when an unrecognised start code is detected (if unrecognised_start_mask = 1). The value of the start code that caused the interrupt can be read from the register start_value.
The start code value 0xB4 (sequence error) is used in MPEG decoder systems to indicate a channel or media error. For example, this start code may be inserted into the data by an ECC circuit if it detects an error that it cannot correct.
A.11.4.4 Sequence of event generation
In the present invention, certain coded data patterns (probably indicating an error condition) cause more than one of the above error conditions to occur within a short space of time. Consequently, the order in which the start code detector examines the coded data for error conditions is:
1) Non-aligned start codes
2) Overlapping start codes
3) Unrecognised start codes
Thus, if a non-aligned start code overlaps another, later, start code, the first event generated is the one associated with the non-aligned start code. After this event has been serviced, the start code detector resumes operation and detects the overlapping start code a short time later.
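The fixed order of the checks, together with the event and mask register pairs of Table A.11.1, can be sketched as follows. This is an illustrative model only: the function report_events, the short event names and the use of a set to describe which conditions a start code exhibits are all hypothetical. The model reports each event in checking order and notes whether its mask would cause an interrupt (and a halt) at that point.

```python
# Checking order stated in the text: non-aligned, then overlapping, then unrecognised.
EVENT_ORDER = ("non_aligned_start", "overlapping_start", "unrecognised_start")


def report_events(conditions, masks):
    """Return [(event name, interrupts)] in the order the checks are applied.

    conditions: set of event names that apply to the current start code.
    masks: dict mapping event name to its mask register value (0 or 1).
    """
    return [
        (name, masks.get(name, 0) == 1)
        for name in EVENT_ORDER
        if name in conditions
    ]


events = report_events(
    {"overlapping_start", "non_aligned_start"},
    {"non_aligned_start": 1, "overlapping_start": 0},
)
```

With these inputs the non-aligned event is reported first and interrupts, and the overlapping event follows without an interrupt because its mask is 0.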
The start code detector only attempts to recognise the start code after all tests for non-aligned and overlapping start codes are complete.
A.11.5 Decoder start-up and shutdown
The start code detector provides facilities that allow the current decoding task to be completed cleanly and a new task to be started.
There are limitations on the use of these techniques with JPEG coded video, because the data segments can contain values that emulate marker codes (see A.11.8.1).
A.11.5.1 Clean end to decoding
The start code detector can be configured to generate an interrupt and halt once the data for the current picture is complete. This is done by setting stop_after_picture = 1 and stop_after_picture_mask = 1.
Once the end of a picture has passed through the start code detector, a FLUSH token is generated (see A.11.7.2), an interrupt is generated and the start code detector halts. Note that the picture just completed will be decoded in the normal way. In some applications, however, it may be appropriate to detect the FLUSH arriving at the output of the decoder chip set, since this indicates the end of the current video sequence. For example, the display could freeze on the last picture output.
When the start code detector halts, data from the "old" video sequence may be "trapped" in user-implemented buffers between the media and the decoder chips. Setting the register discard_all_data causes the spatial decoder to consume and discard this data. This continues until a FLUSH token reaches the start code detector or discard_all_data is reset via the microprocessor interface.
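The effect of discard_all_data on the incoming token stream can be modelled as a simple filter. The sketch assumes that tokens are represented by their names as strings and that the register is cleared only by a FLUSH token (clearing it through the MPI is not modelled). The function name apply_discard_all is hypothetical.

```python
def apply_discard_all(tokens, discard_all_data=True):
    """Return (tokens that get through, final state of discard_all_data).

    While discarding, every token is dropped. A FLUSH token ends discarding
    and is itself not output, as described for the discard_all_data register.
    """
    output = []
    discarding = discard_all_data
    for token in tokens:
        if discarding:
            if token == "FLUSH":
                discarding = False  # FLUSH resets the register and is consumed
            continue
        output.append(token)
    return output, discarding


passed, still_discarding = apply_discard_all(
    ["DATA", "SLICE_START", "FLUSH", "SEQUENCE_START", "DATA"]
)
```

Here the two tokens left over from the old sequence and the FLUSH are dropped, and the new sequence passes through unchanged.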
Having discarded any data from the "old" sequence, the decoder is now ready to start work on a new sequence.
A.11.5.2 When to start discard-all mode
Discard-all mode starts immediately after a 1 is written to the discard_all_data register. The result is unpredictable if this is done while the start code detector is actively processing data.
Discard-all mode can be safely initiated after any of the start code detector events (non-aligned start event, etc.) has generated an interrupt.
A.11.5.3 Starting a new sequence
If it is not known where the start of a new coded video sequence lies within some coded data, the start code search mechanism can be used. This discards any unwanted data that precedes the start of the sequence.
A.11.5.4 Jumping between sequences
This section illustrates an application of some of the techniques described above, the aim being to "jump" from one part of one coded video sequence to another. In this example, the filing system only allows access to "blocks" of data. This block structure might be derived from the sector size of a disc or from the block size of an error correction system. Thus, the positions of the entry and exit points in the coded video data may be unrelated to the block structure of the filing system.
The stop_after_picture and discard_all_data mechanisms allow unwanted data from the old video sequence to be discarded. Inserting a FLUSH token after the end of the last filing system data block resets the discard_all_data mode. The start code search mode can then be used to discard any data in the next data block that precedes a suitable entry point.
A.11.6 Byte alignment
As is well known in the art, the different coding schemes treat the byte alignment of start/marker codes in the data stream quite differently.
For example, H.261 views communications as bit-serial, so there is no concept of byte alignment of start codes. By setting ignore_non_aligned = 0, the start code detector is able to detect start codes with any bit alignment. By setting non_aligned_start_mask = 0, the start code non-alignment interrupt is suppressed.
By contrast, JPEG was designed for a computer environment in which byte alignment is guaranteed. Therefore, marker codes should only be detected when byte-aligned.
When the coding standard is configured as JPEG, the register ignore_non_aligned is ignored and the non-aligned start event is never generated. However, setting ignore_non_aligned = 1 and non_aligned_start_mask = 0 is recommended to ensure compatibility with future products.
MPEG, on the other hand, was designed to meet the needs of both communications (bit-serial) and computer (byte-oriented) systems. Start codes in MPEG data should normally be byte-aligned. However, the standard is designed to allow bit-serial searching for start codes (no MPEG bit pattern, with any bit alignment, looks like a start code unless it is a start code). Thus, an MPEG decoder can be designed that tolerates loss of byte alignment in serial data communications.
If a non-aligned start code is found, it normally indicates that a communication error has previously occurred. If the error is a "bit slip" in a bit-serial communications system, data containing this error will already have been passed to the decoder. This error is likely to cause other errors within the decoder. However, new data arriving at the start code detector can continue to be decoded after this loss of byte alignment.
By setting ignore_non_aligned = 0 and non_aligned_start_mask = 1, an interrupt can be generated if a non-aligned start code is detected. The response will depend upon the application. All subsequent start codes will be non-aligned (until byte alignment is restored). Accordingly, setting non_aligned_start_mask = 0 after byte alignment has been lost may be appropriate.
Table A.11.5 Configuring for byte alignment
Register | MPEG | JPEG | H.261
ignore_non_aligned | 0 | 1 | 0
non_aligned_start_mask | 1 | 0 | 0
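The settings in Table A.11.5 can be captured as a small configuration table in control software. In this sketch, write_register is a hypothetical callback standing in for whatever register access routine the host system provides, and configure_byte_alignment is likewise an illustrative name; only the register names and values come from the table.

```python
# Register settings per coding standard, taken from Table A.11.5.
BYTE_ALIGNMENT_CONFIG = {
    "MPEG":  {"ignore_non_aligned": 0, "non_aligned_start_mask": 1},
    "JPEG":  {"ignore_non_aligned": 1, "non_aligned_start_mask": 0},
    "H.261": {"ignore_non_aligned": 0, "non_aligned_start_mask": 0},
}


def configure_byte_alignment(write_register, standard):
    """Apply the byte alignment settings for the given coding standard.

    write_register(name, value) is a hypothetical register-write routine.
    """
    for name, value in BYTE_ALIGNMENT_CONFIG[standard].items():
        write_register(name, value)


writes = []
configure_byte_alignment(lambda name, value: writes.append((name, value)), "JPEG")
```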
A.11.7 Automatic token generation
In the present invention, most of the tokens output by the start code detector directly reflect syntactic elements of the various picture and video coding standards. In addition to these "natural" tokens, some useful "invented" tokens are generated. Examples of these proprietary tokens are PICTURE_END and CODING_STANDARD. Tokens are also introduced to remove some of the syntactic differences between the coding standards and to "tidy up" under error conditions.
This automatic token generation is done after the serial analysis of the coded data (see Figure 61, "Start code detector"). Therefore, the system responds identically to tokens supplied directly to the input of the spatial decoder via the start code detector and to tokens generated by the start code detector following the detection of start codes in the coded data.
A.11.7.1 Indicating the end of a picture
In general, the coding standards do not explicitly signal the end of a picture. However, the start code detector of the present invention generates a PICTURE_END token upon detecting information that indicates that the current picture has been completed.
The tokens that cause PICTURE_END to be generated are: SEQUENCE_START, GROUP_START, PICTURE_START, SEQUENCE_END and FLUSH.
A.11.7.2 Stop after picture end option
If the register stop_after_picture is set, the start code detector halts after a PICTURE_END token has passed through. A FLUSH token is then inserted after the PICTURE_END to "push" the tail end of the coded data through the decoder and to reset the system (see A.11.5.1).
A.11.7.3 Introducing sequence start for H.261
H.261 does not have a syntactic element equivalent to sequence start (see Table A.11.4). If the register insert_sequence_start is set, the start code detector ensures that there is a SEQUENCE_START token before the next PICTURE_START; that is, if the start code detector does not see a SEQUENCE_START before a PICTURE_START, one is introduced. If a SEQUENCE_START is already present, no further one is introduced.
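The automatic generation of PICTURE_END and the introduction of SEQUENCE_START can be sketched as a pass over the token stream. Several points here are assumptions made for illustration rather than statements from the text: tokens are represented by their names as strings, PICTURE_END is emitted only when a picture is actually in progress, and SEQUENCE_START is introduced once, before the first PICTURE_START that has not been preceded by one. The function name add_automatic_tokens is hypothetical.

```python
# Tokens whose arrival indicates that the current picture is complete.
PICTURE_ENDERS = {"SEQUENCE_START", "GROUP_START", "PICTURE_START",
                  "SEQUENCE_END", "FLUSH"}


def add_automatic_tokens(tokens, insert_sequence_start=True):
    """Insert PICTURE_END and, optionally, a missing SEQUENCE_START."""
    output = []
    in_picture = False             # assumption: PICTURE_END needs an open picture
    seen_sequence_start = False
    for token in tokens:
        if in_picture and token in PICTURE_ENDERS:
            output.append("PICTURE_END")
            in_picture = False
        if token == "SEQUENCE_START":
            seen_sequence_start = True
        elif token == "PICTURE_START":
            if insert_sequence_start and not seen_sequence_start:
                output.append("SEQUENCE_START")  # introduced for H.261-style streams
                seen_sequence_start = True
            in_picture = True
        output.append(token)
    return output


result = add_automatic_tokens(
    ["PICTURE_START", "SLICE_START", "PICTURE_START", "SLICE_START", "FLUSH"]
)
```

In this example a SEQUENCE_START is introduced ahead of the first picture, and a PICTURE_END is placed before the second PICTURE_START and before the FLUSH.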
This function should not be used with MPEG or JPEG.
A.11.7.4 Setting the coding standard for each sequence
All SEQUENCE_START tokens leaving the start code detector are always preceded by a CODING_STANDARD token. This token is loaded with the start code detector's current coding standard. It sets the coding standard for the entire decoder chip set for each new video sequence.
A.11.8 Start code search
The start code detector of the present invention can be used to search through a coded data stream for a specified type of start code. This allows the decoder to restart decoding from a specified level within the syntax of some coded data (after discarding any data that precedes it). Applications of this include:
Start code search begins immediately after a non-zero value is written to the start_code_search register. The result is unpredictable if this is done while the start code detector is actively processing data. Therefore, before initiating a start code search, the start code detector should be halted so that no data is being processed. The start code detector is always in this state if any of the start code detector events (non-aligned start event, etc.) has just generated an interrupt.
A.11.8.1 Limitations on using start code search with JPEG
Most JPEG marker codes have a 16-bit length count field associated with them. This field indicates the length of the data segment associated with the marker code. This segment may contain values that emulate marker codes. In normal operation, the start code detector does not look for start codes in these segments.
If random access into some JPEG coded data "lands" in such a segment, the start code search mechanism cannot be used reliably. In general, JPEG coded video requires additional external information to identify entry points for random access.
A.12 Decoder start-up control
A.12.1 Overview of decoder start-up
In a decoder, video shows a bit of time that usually is delayed after coded data obtains first. In the middle of this postpones, accumulate in the buffer of coded data in decoder. This buffer pre-filled guaranteed that buffer never can be empty in decoding, thereby and this guaranteed the decoder new image of in the normal space, can decoding.
Usually, a correct initial decoder needs two kinds of gimmicks. The first, must there be a kind of means to measure existing how many data and be provided for decoder. The second, must there be a kind of gimmick to prevent the demonstration of a new video string. Spatial decoder among the present invention provides a digit counter near its input, have how many data to arrive with measurement, carries near its output
After jumping into an encoded data files, start a decoder (for example, arbitrary access) at a unknown position.
In data, look for a known point to help the recovery behind place's error in data.
For example, Table A .11.6 is shown as the difference configuration of start_code_search and the MPEG initial code looked for. Equivalence H.261 with JPEG initial/the mark code can see in Table A .11.4.
Table A .11.6 initial code is looked for pattern
start_code_searchFor ... and the initial code of seeking
    0aNormal operating
    1Keep (will show as and abandon data)
    2
    3Sequence is initial
start_code_searchFor ... and the initial code of seeking
    4Group or sequence are initial
    5bImage, group or sequence are initial
    6Sheet, image, group or sequence are initial
    7Next initial or mark code
A. look in the pattern at this, the FLUSH token is positioned at detector for initial code.
B. this is the default mode after resetting.
When a nonzero value is written into the start_code_search register, detector for initial code abandons all arrival data with beginning, until the initial code of appointment is found. Then the start_code_search register will be reset to 0 and normal operation will continue. Supplied initial with the new video string that prevents from being output of an out gate.
The control of these facilities has three levels of complexity:
Output gate always open
Basic control
Advanced control
If the output gate is always open, picture output begins as soon as possible after coded data starts to arrive at the decoder. This is appropriate for decoding still pictures, or where display is being delayed by some other mechanism.
The difference between basic and advanced control relates to how many short video streams the decoder's buffers can hold at any one time. Basic control is sufficient for most applications. Advanced control, however, allows user software to help the decoder manage the start-up of several very short video streams.
A.12.2 MPEG video buffer verifier
MPEG describes a "video buffer verifier" (VBV) for use with constant data rate systems. The VBV information allows a decoder to pre-fill its buffers before it starts to display pictures. Again, this pre-filling ensures that the decoder's buffers never empty during decoding.
In summary, each MPEG picture carries a vbv_delay parameter. This parameter specifies how long the coded data buffer of an "ideal decoder" should fill with coded data before the first picture is decoded. Once the start-up delay has been observed for the first picture, the requirements of all subsequent pictures are met automatically.
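As a worked illustration, a vbv_delay value can be turned into an approximate number of bits for a constant bit rate stream. The sketch below assumes that vbv_delay is expressed in periods of a 90 kHz clock, as in the MPEG video syntax, and that the coded data rate is known in bits per second. The function name and the example figures are illustrative only.

```python
VBV_CLOCK_HZ = 90_000  # vbv_delay is counted in 90 kHz clock periods


def vbv_delay_to_bits(vbv_delay, bit_rate_bps):
    """Bits that arrive during the start-up delay at a constant bit rate."""
    return vbv_delay * bit_rate_bps // VBV_CLOCK_HZ


# Example: a 0.2 s start-up delay (18000 clock periods) at 1.5 Mbit/s.
print(vbv_delay_to_bits(18_000, 1_500_000))  # 300000
```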
MPEG therefore specifies the start-up requirement as a delay. In a constant bit rate system, however, a delay is easily converted to a bit count. This is the basis on which the start-up control of the Spatial Decoder of the present invention operates.
A.12.3 Definition of a stream
In this application, the term "stream" is used to avoid confusion with the MPEG term "sequence". A stream means a quantity of video data that is "of interest" to an application. A stream can therefore be many MPEG sequences, or it can be a single picture.
The decoder start-up facilities described in this chapter are concerned with meeting the VBV requirements of the first picture in a stream. The requirements of subsequent pictures in that stream are met automatically.
A.12.4 Start-up control registers
Table A.12.1 Decoder start-up registers
Register name | Size (bits) / direction | Reset state | Description
startup_access (CED_BS_ACCESS) | 1 / rw | 0 | Writing 1 to this register requests that the bit counter and gate opening logic stop, to allow access to their configuration registers.
bit_count (CED_BS_COUNT) | 8 / rw | 0 | This bit counter is incremented as coded data leaves the start code detector. The number of bits required to increment bit_count once is approximately 2^(bit_count_prescale+1) × 512. The bit counter starts counting after a FLUSH token passes through it. When the bit count target is met, it is reset to 0 and then stops incrementing.
bit_count_prescale (CED_BS_PRESCALE) | 3 / rw | 0 | (same entry as bit_count)
bit_count_target (CED_BS_TARGET) | 8 / rw | Undefined | This register specifies the bit count target. A target met event is generated whenever the following condition becomes true: bit_count >= bit_count_target
target_met_event (BS_TARGET_MET_EVENT) | 1 / rw | 0 | This event occurs when the bit count target is met. If the mask register is set to 1, an interrupt can be generated; however, the bit counter will NOT stop processing data. The event occurs when the bit counter increments to its target. It also occurs if a target value is written that is less than or equal to the current value of the bit counter. Writing 0 to bit_count_target will always generate a target met event.
target_met_mask | 1 / rw | 0 | (same entry as target_met_event)
counter_flushed_event (BS_FLUSH_EVENT) | 1 / rw | 0 | This event occurs when a FLUSH token passes through the bit count circuit. If the mask register is set to 1, an interrupt can be generated and the bit counter will stop.
counter_flushed_mask | 1 / rw | 0 | (same entry as counter_flushed_event)
counter_flushed_too_early_event (BS_FLUSH_BEFORE_TARGET_MET_EVENT) | 1 / rw | 0 | This event occurs if a FLUSH token passes through the bit count circuit and the bit count target has not been met. If the mask register is set to 1, an interrupt can be generated and the bit counter will stop. See A.12.10.
counter_flushed_too_early_mask | 1 / rw | 0 | (same entry as counter_flushed_too_early_event)
offchip_queue (CED_BS_QUEUE) | 1 / rw | 0 | Setting this register to 1 configures the gate opening logic to require microprocessor support. When this register is set to 0, the output gate control logic automatically controls the operation of the output gate. See A.12.6 and A.12.7.
enable_stream (CED_BS_ENABLE_NXT_STM) | 1 / rw | 0 | When an off-chip queue is in use, writing to enable_stream controls the behaviour of the output gate after the end of a stream passes through it. A 1 in this register enables the output gate to open. The register is reset when an accept_enable interrupt is generated.
accept_enable_event (BS_STREAM_END_EVENT) | 1 / rw | 0 | This event indicates that a FLUSH token has passed through the output gate (causing it to close) and that an enable was available to allow the gate to open. If the mask register is set to 1, an interrupt can be generated and the register enable_stream will be reset. See A.12.7.1.
accept_enable_mask | 1 / rw | 0 | (same entry as accept_enable_event)
A.12.5 Output gate always open
The output gate can be configured to remain open. This configuration is appropriate where still pictures are being decoded, or where some other mechanism is available to manage the start-up of the video decoder.
The following configuration is required after reset (having gained access to the start-up control logic by writing 1 to startup_access):
Set offchip_queue = 1
Set enable_stream = 1
Ensure that all the decoder start-up event mask registers are set to 0, so that they cannot generate interrupts (this is the default state after reset).
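The configuration steps above can be sketched as a sequence of register writes. Here the register file is modelled as a plain Python dictionary, standing in for whatever microprocessor interface access routine a real system provides. That stand-in, the function name, and the final write of 0 to startup_access to release access are assumptions. The mask register names are those listed in Table A.12.1.

```python
START_UP_MASKS = (
    "target_met_mask",
    "counter_flushed_mask",
    "counter_flushed_too_early_mask",
    "accept_enable_mask",
)


def configure_output_gate_always_open(regs):
    """Apply the 'output gate always open' settings to a register dictionary."""
    regs["startup_access"] = 1   # request access to the start-up control logic
    regs["offchip_queue"] = 1    # gate opening logic expects software support
    regs["enable_stream"] = 1    # load an enable so the gate may open
    for mask in START_UP_MASKS:  # masks at 0: no interrupts are generated
        regs[mask] = 0
    regs["startup_access"] = 0   # assumption: writing 0 releases access
    return regs


print(configure_output_gate_always_open({}))
```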
(See A.12.7.1 for an explanation of why this holds the output gate open.)
A.12.6 Basic operation
In the present invention, basic control of the start-up logic is sufficient for most MPEG video applications. In this mode, the bit counter communicates directly with the output gate. The output gate closes automatically as the end of a video stream, indicated by a FLUSH token, passes through it. The gate remains closed until an enable is provided by the bit counter when a stream has reached its start-up bit count.
The following configuration is required after reset (having gained access to the start-up control logic by writing 1 to startup_access):
Set bit_count_prescale appropriately for the expected range of coded data rates.
Set counter_flushed_too_early_mask = 1 so that this error condition can be detected.
Two interrupt service routines are required:
Video Demux service, to obtain the value of vbv_delay for the first picture in each new stream.
Counter flushed too early service, to respond to this condition.
The Video Demux (also known as the video parser) can generate an interrupt when it decodes the vbv_delay for a new video stream (that is, the first picture to arrive at the Video Demux after a FLUSH). The interrupt service routine should compute an appropriate value for bit_count_target and write it. When the bit counter reaches this target, it inserts an enable into a short queue between the bit counter and the output gate. When the output gate opens, it removes an enable from this queue.
A.12.6.1 Starting a new stream shortly after another finishes
As an example, the MPEG stream that is about to finish is called A and the MPEG stream that is about to start is called B. A FLUSH token should be inserted after the end of A. This pushes the last of its coded data through the decoder and alerts the various sections of the decoder to expect a new stream.
Normally, the bit counter will have been reset to zero when A met its start-up conditions. After the FLUSH, the bit counter starts counting the bits in stream B. When the Video Demux has decoded the vbv_delay from the first picture in stream B, an interrupt is generated so that the bit counter can be configured.
As the FLUSH marking the end of stream A passes through the output gate, the gate closes. It remains closed until B meets its start-up conditions. Depending on a number of factors, such as the start-up delay of stream B and the depth of the buffers, B may already have met its start-up conditions when the output gate closes. In that case, an enable will be waiting in the queue and the output gate will open immediately. Otherwise, stream B must wait until it meets its start-up conditions.
A.12.6.2 A succession of short streams
The capacity of the queue between the bit counter and the output gate is sufficient to allow 3 separate video streams to have met their start-up conditions and to be waiting for a previous stream to finish being decoded. In the present invention, this situation only arises when very short streams are being decoded or when the off-chip buffers are very large compared with the picture format being decoded.
In Figure 69, stream A is being decoded and the output gate is open. Streams B and C have met their start-up conditions and are entirely contained within the buffers managed by the Spatial Decoder. Stream D is still arriving at the input of the Spatial Decoder.
Streams B and C each have an enable in the queue. B can therefore start as soon as stream A has finished. Similarly, C can follow immediately after B.
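The interaction between the bit counter, the enable queue and the output gate can be modelled in a few lines. This is a behavioural sketch only: the class and method names are hypothetical, the gate is assumed to start closed, and returning False when the queue is full is a simplification of the hardware stalling its input.

```python
from collections import deque


class StartupModel:
    """Toy model of the enable queue between the bit counter and output gate."""

    QUEUE_CAPACITY = 3  # three streams may be waiting, per the text above

    def __init__(self):
        self.enables = deque()
        self.gate_open = False  # assumption: the gate starts closed

    def target_met(self):
        """A stream reached its bit count target; queue an enable if possible."""
        if len(self.enables) >= self.QUEUE_CAPACITY:
            return False  # queue full: real hardware would hold up the input
        self.enables.append("enable")
        self._try_open()
        return True

    def flush_at_gate(self):
        """A FLUSH (end of stream) passed the gate: close, then try to reopen."""
        self.gate_open = False
        self._try_open()

    def _try_open(self):
        # Opening the gate removes one enable from the queue.
        if not self.gate_open and self.enables:
            self.enables.popleft()
            self.gate_open = True


model = StartupModel()
model.target_met()                     # stream A: gate opens
for _ in "BCD":
    model.target_met()                 # B, C and D queue up behind A
print(len(model.enables))              # 3
model.flush_at_gate()                  # A ends, B starts immediately
print(model.gate_open, len(model.enables))  # True 2
```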
If A is still passing through the output gate when D meets its start-up target, an enable is added to the queue, filling it. If no enable has been removed from the queue by the time the end of D passes the bit counter (that is, A is still passing through the output gate), no new stream can start to pass through the bit counter. Coded data is therefore held up at the input until A completes and an enable is removed from the queue as the output gate opens to allow B through.
A.12.7 Advanced operation
In accordance with the present invention, advanced control of the start-up logic allows user software to extend indefinitely the length of the enable queue described in A.12.6, "Basic operation". This level of control is only needed when the video decoder must accommodate a series of short video streams longer than that described in A.12.6.2, "A succession of short streams".
In addition to the configuration required for basic operation, the following configuration is required after reset (having gained access to the start-up control logic by writing 1 to startup_access):
Set offchip_queue = 1
Set accept_enable_mask = 1 to enable an interrupt when an enable is removed from the queue.
Set target_met_mask = 1 to enable an interrupt when a stream's bit count target is met.
Two additional interrupt service routines are required:
Accept enable interrupt
Target met interrupt
When a target met interrupt occurs, the service routine should add an enable to its off-chip enable queue.
A.12.7.1 Output gate logic behaviour
Writing a 1 to the enable_stream register loads an enable into a short queue.
When a FLUSH (marking the end of a stream) passes through the output gate, the gate closes. If an enable is available at the end of the queue, the gate opens and an accept_enable_event is generated. If accept_enable_mask is set to 1, an interrupt can be generated and an enable is removed from the end of the queue (the register enable_stream is reset).
However, if accept_enable_mask is set to 0, no interrupt is generated following the accept_enable_event and the enable is NOT removed from the end of the queue. This mechanism can be used to hold the output gate open, as described in A.12.5.
A.12.8 Bit counting
The bit counter starts counting after a FLUSH token passes through it. This FLUSH token indicates the end of the current video stream. The bit counter then continues counting until it meets the bit count target set in the bit_count_target register. A target met event is then generated, and the bit counter resets to 0 and waits for the next FLUSH token.
The bit counter also stops incrementing when it reaches its maximum count (255).
A.12.9 Bit count prescale
In the present invention, 2^(bit_count_prescale+1) × 512 bits are required to increment the bit counter once. bit_count_prescale is a 3-bit register that can hold a value between 0 and 7.
Table A.12.2 Example bit counter ranges
n | Range (bits) | Resolution (bits)
0 | 0 to 262144 | 1024
1 | 0 to 524288 | 2048
7 | 0 to 31457280 | 122880
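The prescale arithmetic can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and rounding the target up to the next whole count is an assumed policy rather than something the text specifies.

```python
MAX_COUNT = 255  # bit_count is an 8-bit register


def bits_per_count(prescale):
    """Bits needed to increment bit_count once: 2^(prescale + 1) * 512."""
    if not 0 <= prescale <= 7:
        raise ValueError("bit_count_prescale is a 3-bit value (0 to 7)")
    return (2 ** (prescale + 1)) * 512


def target_register_value(target_bits, prescale):
    """Smallest bit_count_target covering target_bits, clamped to 8 bits."""
    step = bits_per_count(prescale)
    counts = -(-target_bits // step)  # ceiling division (assumed rounding)
    return min(MAX_COUNT, counts)


print(bits_per_count(0), bits_per_count(1))   # 1024 2048
print(target_register_value(300_000, 1))      # 147
```

For prescale values of 0 and 1 this gives 1024 and 2048 bits per count, matching the resolutions in Table A.12.2. For a prescale of 7 the formula evaluates to 131072 bits per count.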
The bit count is approximate, because some elements of the video stream will already have been tokenised (for example, the start codes) and the stream therefore includes non-data tokens.
A.12.10 Counter flushed too early
If a FLUSH token arrives at the bit counter before the bit count target is reached, an event is generated that can cause an interrupt (if counter_flushed_too_early_mask = 1). If an interrupt is generated, the bit counter circuit stops, preventing further data input. It is the responsibility of user software to decide when to open the output gate after this event has occurred. The output gate can be made to open by writing 0 as the bit count target. These circumstances should only arise when trying to decode video streams that last only a few pictures.
A.13 Buffer management
The Spatial Decoder manages two logical data buffers: the coded data buffer (CDB) and the token buffer (TB).
The CDB buffers coded data between the start code detector and the input of the Huffman decoder. This provides buffering for low data rate coded video data. The TB buffers data between the output of the Huffman decoder and the input of the spatial video decoding circuitry (inverse modeller, quantiser and DCT). This second logical buffer allows the processing time to include a spread, so as to accommodate pictures with varying amounts of data.
Both buffers are physically held in a single off-chip DRAM array. The addresses for these buffers are generated by the buffer manager.
A.13.1 Buffer manager registers
The Spatial Decoder buffer manager is intended to be configured once, immediately after the device is reset. In normal operation there is no need to reconfigure the buffer manager.
After reset is removed from the Spatial Decoder, the buffer manager is halted (with its access register buffer_manager_access set to 1), awaiting configuration. After the registers have been configured, buffer_manager_access can be set to 0 and decoding can begin.
Most of the registers used in the buffer manager cannot be accessed reliably while the buffer manager is operating. Before any of the buffer manager registers are accessed, buffer_manager_access must be set to 1. It is therefore important to observe the protocol of waiting until the value 1 can be read back from buffer_manager_access. The time taken to obtain and release access should be taken into account when polling registers such as cdb_full and cdb_empty to monitor buffer conditions.
Table A.13.1 Buffer manager registers
Register name | Size (bits) / direction | Reset state | Description
buffer_manager_access | 1 / rw | 1 | This access bit stops the operation of the buffer manager so that its various registers can be accessed reliably. See A.6.4.1. Note: this access register is unusual in that its default state after reset is 1; that is, after reset the buffer manager is halted, awaiting configuration via the MPI.
buffer_manager_keyhole_address | 6 / rw | Undefined | Keyhole access to the extended address space used for the buffer manager registers shown below. See A.6.4.3 for more information about accessing registers through a keyhole.
buffer_manager_keyhole_data | 8 / rw | Undefined | (same entry as buffer_manager_keyhole_address)
buffer_limit | 18 / rw | Undefined | Specifies the overall size of the DRAM array attached to the Spatial Decoder. All buffer addresses are calculated modulo this buffer size, so they wrap round within the DRAM provided.
cdb_base, tb_base | 18 / rw | Undefined | These registers point to the base of the coded data (cdb) and token (tb) buffers.
cdb_length, tb_length | 18 / rw | Undefined | These registers specify the length (i.e. size) of the coded data (cdb) and token (tb) buffers.
cdb_read, tb_read | 18 / ro | Undefined | These registers hold an offset from the buffer base, indicating where data will be read from next.
cdb_number, tb_number | 18 / ro | Undefined | These registers show how much data is currently held in the buffers.
cdb_full, tb_full | 1 / ro | Undefined | These registers are set to 1 if the coded data (cdb) or token (tb) buffer is full.
cdb_empty, tb_empty | 1 / ro | Undefined | These registers are set to 1 if the coded data (cdb) or token (tb) buffer is empty.
A.13.1.1 Buffer manager pointer values
Typically, data is transferred between the Spatial Decoder and the off-chip DRAM in 64-byte bursts (using the DRAM's fast page mode). All the buffer pointer and length registers refer to these 64-byte (512-bit) blocks of data. The 18-bit buffer manager registers therefore describe a linear address space of 256K blocks (that is, 128 Mbit).
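The block arithmetic above can be sketched as follows. The constant and helper names are illustrative and are not part of the register interface.

```python
BLOCK_BYTES = 64                 # one DRAM burst
BLOCK_BITS = BLOCK_BYTES * 8     # 512 bits
REGISTER_BITS = 18               # width of the pointer and length registers


def blocks_to_bits(blocks):
    """Convert a register value in 64-byte blocks to a number of bits."""
    return blocks * BLOCK_BITS


def bytes_to_blocks(num_bytes):
    """Blocks needed to hold num_bytes, rounding up to a whole block."""
    return -(-num_bytes // BLOCK_BYTES)


address_space_blocks = 1 << REGISTER_BITS           # 262144 blocks (256K)
address_space_bits = blocks_to_bits(address_space_blocks)
print(address_space_blocks, address_space_bits)     # 262144 134217728
```

The same conversion applies to values read back from cdb_number or tb_number, which are also expressed in 512-bit blocks.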
The 64-byte transfer is independent of the width (8, 16 or 32 bits) of the DRAM interface.
A.13.2 Use of the buffer manager registers
The buffer manager for use of spatial decoder has the register of two kinds of similar buffers of two cover definition. The upper physical limit of buffering limit register (buffer_limit) definition memory space. All addresses are counted as mould take this and are calculated.
Within the restriction of obtainable memory, the scope of each buffer is by two register definitions: snubber base location (cdb_base and tb_base) and buffer length (cdb_ length and tb_length). So far described all registers must be configured before buffer can be used.
The current state of each buffer all can be observed in 4 registers. Buffer read register (cdb_read and tb_read) indication is with respect to a side-play amount of snubber base location. This volume is moved the next data in address and will be read out. The current data volume that is kept by buffer of number of buffers register (cdb _ number and tb_number) indication. Mode bit cdb _ full, tb_full, cdb_etmpty and tb_empty indication buffer are empty or full.
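One reading of this addressing scheme is sketched below: a block address is formed by adding the read offset to the buffer base and wrapping modulo buffer_limit. The function names are hypothetical, and the exact order of operations inside the buffer manager is an assumption.

```python
BLOCK_BYTES = 64  # registers count 64-byte blocks


def dram_block_address(base, read_offset, buffer_limit):
    """Block address of the next read, wrapped within the DRAM array."""
    return (base + read_offset) % buffer_limit


def dram_byte_address(base, read_offset, buffer_limit):
    """Same address expressed in bytes rather than blocks."""
    return dram_block_address(base, read_offset, buffer_limit) * BLOCK_BYTES


# Example: a buffer based near the top of a 1024-block array wraps to block 6.
print(dram_block_address(1000, 30, 1024))  # 6
print(dram_byte_address(1000, 30, 1024))   # 384
```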
As described in A.13.1.1, the units of all the registers above are 512-bit blocks of data. Accordingly, the value read from cdb_number should be multiplied by 512 to obtain the number of bits in the coded data buffer.
A.13.3 Zero buffers
Still picture applications (for example, using JPEG) that have no "real-time" requirement will not need the large off-chip buffers supported by the buffer manager. In this case, the DRAM interface can be configured (by writing 1 to the zero_buffers register) to ignore the buffer manager and to provide a 128-bit on-chip FIFO for each of the coded data buffer and the token buffer.
The zero buffers option may also be appropriate for applications that operate at low data rates and with small picture formats.
Note: the zero_buffers register is part of the DRAM interface and should therefore be set only during the post-reset configuration of the DRAM interface.
A.13.4 Buffer operation
Data transfer through the buffers is controlled by a handshake protocol. This guarantees that no data errors occur if a buffer fills or empties. If a buffer fills, the circuitry attempting to send data to the buffer is halted until there is space in the buffer. If the buffer remains full, further processing stages "upstream" of the buffer halt, until the Spatial Decoder is unable to accept data at its input port. Similarly, if a buffer empties, the circuitry attempting to remove data from the buffer halts until data becomes available.
As described in A.13.2, the position and size of the coded data and token buffers are specified by the buffer base and length registers. The user is responsible for configuring these registers and for ensuring that the memory used by the two buffers does not conflict.
A.14 Video Demux
The Video Demux, also known as the video parser, completes the task of converting coded data into tokens that was started by the start code detector. The Video Demux contains four main processing blocks: the parser state machine, the Huffman decoder (including an ITOD), the macroblock counter and the ALU.
The parser, or state machine, follows the syntax of the coded video data and instructs the other units. The Huffman decoder converts variable length coded (VLC) data into integers. The macroblock counter keeps track of which part of the picture is being decoded. The ALU performs the necessary arithmetic calculations.
A.14.1 Video Demux registers
Table A.14.1 Top-level Video Demux registers
Register name | Size (bits) / direction | Reset state | Description
demux_access (CED_H_CTRL[7]) | 1 / rw | 0 | This access bit stops the operation of the Video Demux so that its various registers can be accessed reliably. See A.6.4.1.
huffman_error_code (CED_H_CTRL[6:4]) | 3 / ro | | When the Video Demux stops following the generation of a huffman_event interrupt request, this 3-bit register holds a value indicating why the interrupt was generated. See A.14.5.1.
parser_error_code (CED_H_DMUX_ERR) | 8 / ro | | When the Video Demux stops following the generation of a parser_event interrupt request, this 8-bit register holds a value indicating why the interrupt was generated. See A.14.5.2.
demux_keyhole_address (CED_H_KEYHOLE_ADDR) | 12 / rw | Undefined | Keyhole access to the Video Demux's extended address space. See A.6.4.3 for more information about accessing registers through a keyhole. Tables A.14.2, A.14.3 and A.14.4 describe the registers that can be accessed through the keyhole.
demux_keyhole_data (CED_H_KEYHOLE) | 8 / rw | Undefined | (same entry as demux_keyhole_address)
dummy_last_picture (CED_H_ALU_REG0, r_rom_control, r_dummy_last_frame_bit) | 1 / rw | 0 | When this register is set to 1, the Video Demux generates information for a "dummy" intra picture as the last picture of an MPEG sequence. This function is useful when the Temporal Decoder is configured for automatic picture reordering (see A.18.3.5, "Picture sequence reordering"), to flush the last P or I picture out of the Temporal Decoder. No "dummy" picture is required if: the Temporal Decoder is not configured for reordering; another MPEG sequence will be decoded immediately (as this will also flush out the last picture); or the coding standard is not MPEG.
field_info (CED_H_ALU_REG0, r_rom_control, r_field_info_bit) | 1 / rw | 0 | When this register is set to 1, the first byte of any MPEG extra_information_picture is placed in the FIELD_INFO token. See A.14.7.1.
continue (CED_H_ALU_REG0, r_rom_control, r_continue_bit) | 1 / rw | 0 | This register allows user software to control how much extra, user or extension data it wants to receive when such data is detected by the decoder. See A.14.6 and A.14.7.
rom_revision (CED_H_ALU_REG1, r_rom_revision) | 8 / ro | 0 | Immediately following reset, this register holds a copy of the microcode ROM revision number. This register is also used to present to control software data values read from the coded data. See A.14.6, "Receiving user and extension data", and A.14.7, "Receiving extra information".
huffman_event | 1 / rw | 0 | A Huffman event is generated if an error is found in the coded data. See A.14.5.1 for a description of these events. If the mask register is set to 1, an interrupt can be generated and the Video Demux will stop. If the mask register is set to 0, no interrupt is generated and the Video Demux will attempt to recover from the error.
huffman_mask | 1 / rw | 0 | (same entry as huffman_event)
parser_event | 1 / rw | 0 | A parser event is generated in response to errors in the coded data or to the arrival at the Video Demux of information that requires software intervention. See A.14.5.2 for a description of these events. If the mask register is set to 1, an interrupt can be generated and the Video Demux will stop. If the mask register is set to 0, no interrupt is generated and the Video Demux will attempt to continue.
parser_mask | 1 / rw | 0 | (same entry as parser_event)
Table A.14.2 Video Demux picture construction registers
Register name | Size (bits) / direction | Reset state | Description
component_name_0, component_name_1, component_name_2, component_name_3 | 8 / rw | Undefined | During JPEG operation, the register component_name_n holds an 8-bit value indicating (to an application) which colour component has the component ID n.
horiz_pels | 16 / rw | | These registers hold the horizontal and vertical dimensions, in pixels, of the video being decoded. See A.14.2.
vert_pels | 16 / rw | Undefined | (same entry as horiz_pels)
horiz_macroblocks | 16 / rw | Undefined | These registers hold the horizontal and vertical dimensions, in macroblocks, of the video being decoded. See A.14.2.
vert_macroblocks | 16 / rw | Undefined | (same entry as horiz_macroblocks)
max_h | 2 / rw | Undefined | These registers hold the macroblock width and height in blocks (8 × 8 pixels). The values 0 to 3 indicate a width/height of 1 to 4 blocks. See A.14.2.
max_v | 2 / rw | Undefined | (same entry as max_h)
max_component_id | 2 / rw | Undefined | The values 0 to 3 indicate that 1 to 4 different video components are currently being decoded. See A.14.2.
Nf | 8 / rw | Undefined | During JPEG operation, this register holds the parameter Nf (number of image components in frame).
blocks_h_0, blocks_h_1, blocks_h_2, blocks_h_3 | 2 / rw | Undefined | For each of the 4 colour components, the registers blocks_h_n and blocks_v_n hold the number of blocks horizontally and vertically in a macroblock for the colour component with component ID n. See A.14.2.
blocks_v_0, blocks_v_1, blocks_v_2, blocks_v_3 | 2 / rw | Undefined | (same entry as blocks_h_n)
tq_0, tq_1, tq_2, tq_3 | 2 / rw | Undefined | The 2-bit value held by the register tq_n indicates which inverse quantisation table is used when decoding data with component ID n.
A.14.1 Register loading and token generation
Many of the registers in the Video Demux hold values that relate directly to parameters normally communicated in the coded picture/video data. For example, the horiz_pels register corresponds to the MPEG sequence header information horizontal_size and to the JPEG frame header parameter X. These registers are loaded by the video decoder when the appropriate coded data is decoded. These registers are also associated with a token. For example, the register horiz_pels is associated with the token HORIZONTAL_SIZE. The token is generated by the video decoder when (or soon after) the coded data is decoded. The token can also be supplied directly to the input of the Spatial Decoder. In this case, the value carried by the token configures the video decoder register associated with it.
Table A.14.3 Video Demux Huffman table registers
Register name | Size (bits) / direction | Reset state | Description
dc_huff_0, dc_huff_1, dc_huff_2, dc_huff_3 | 2 / rw | | The 2-bit value held by the register dc_huff_n indicates which Huffman decoding table is used when decoding the DC coefficients of data with component ID n. Similarly, ac_huff_n indicates the table used when decoding AC coefficients. Baseline JPEG requires 2 Huffman tables, and the table values available are 0 and 1.
ac_huff_0, ac_huff_1, ac_huff_2, ac_huff_3 | 2 / rw | | (same entry as dc_huff_n)
dc_bits_0[15:0], dc_bits_1[15:0] | 8 / rw | | Each of these is a table of 16 eight-bit values. They provide the BITS information (see the JPEG Huffman table specification) that forms part of the description of the 2 DC and 2 AC Huffman tables. See A.14.3.1.
ac_bits_0[15:0], ac_bits_1[15:0] | 8 / rw | | (same entry as dc_bits_n)
dc_huffval_0[11:0], dc_huffval_1[11:0] | 8 / rw | | Each of these is a table of 12 eight-bit values. They provide the HUFFVAL information (see the JPEG Huffman table specification) that forms part of the description of the 2 DC Huffman tables. See A.14.3.1.
ac_huffval_0[161:0], ac_huffval_1[161:0] | 8 / rw | | Each of these is a table of 162 eight-bit values. They provide the HUFFVAL information (see the JPEG Huffman table specification) that forms part of the description of the 2 AC Huffman tables. See A.14.3.1.
dc_zssss_0, dc_zssss_1 | 8 / rw | | These 8-bit registers hold values that are "special cased" to accelerate the decoding of certain frequently used JPEG VLCs. dc_zssss: the magnitude of the DC coefficient is 0. ac_eob: end of block. ac_zrl: a run of 16 zeros.
ac_eob_0, ac_eob_1 | 8 / rw | | (same entry as dc_zssss_n)
ac_zrl_0, ac_zrl_1 | 8 / rw | | (same entry as dc_zssss_n)
Table A.14.4 Other video distributor registers
Register name   Size/direction   Reset state   Description
buffer_size   10   rw   This register is loaded when decoding MPEG data with a value indicating the size of the VBV buffer required by an ideal decoder. The value is not used by the decoder chip. However, the value it holds may be useful to user software when configuring the coded data buffer size and when determining whether the decoder is capable of decoding a particular MPEG data file.
pel_aspect   4   rw   This register is loaded when decoding MPEG data with a value indicating the pixel aspect ratio. The value is a 4-bit integer used as an index into a table defined by MPEG. See the MPEG standard for the definition of this table. The value is not used by the decoder chip, but it may be useful to user software when configuring a display or output device.
bit_rate   18   rw   This register is loaded when decoding MPEG data with a value indicating the coded data rate. See the MPEG standard for the definition of this value. The value is not used by the decoder chip, but it may be useful to user software when configuring the decoder start-up registers.
pic_rate   4   rw   This register is loaded when decoding MPEG data with a value indicating the picture rate. See the MPEG standard for the definition of this value. The value is not used by the decoder chip, but it may be useful to user software when configuring a display or output device.
constrained   1   rw   This register is loaded when decoding MPEG data to indicate whether the coded data meets the MPEG constrained parameters. See the MPEG standard for the definition of this flag. The value is not used by the decoder chip, but it may be useful to user software when determining whether the decoder is capable of decoding a particular MPEG data file.
Table A.14.4 Other video distributor registers (continued)
[Figure A9510324602691: table rows reproduced as an image in the original]
Register name   Size/direction   Reset state   Description
vbv_delay   16   rw   This register is loaded when decoding MPEG data with a value indicating the minimum start-up delay before decoding should begin. See the MPEG standard for the definition of this value. The value is not used by the decoder chip, but it may be useful to user software when configuring the decoder start-up registers.
pic_number   8   rw   This register holds the picture number of the picture currently being decoded by the video distributor. The number is generated by the start code detector when the picture arrives there. See A.11.2 for a description of the picture number.
dummy_last_picture   1   rw   0   These registers are also visible at the top level. See Table A.14.1.
field_info   1   rw   0
continue   1   rw   0
rom_revision   8   rw
coding_standard   2   ro   This register is loaded by the CODING_STANDARD token to configure the operating mode of the video distributor. See A.21.1.
restart_interval   8   rw   This register is loaded when decoding JPEG data with a value indicating the minimum start-up delay before decoding should begin. See the MPEG standard for the definition of this value.
Table A.14.5 Register to token cross reference
Register   Token   Standard   Comment
component_name_n   COMPONENT_NAME   JPEG   In coded data
    MPEG, H.261   Not used in standard
horiz_pels, vert_pels   HORIZONTAL_SIZE, VERTICAL_SIZE   MPEG, JPEG   In coded data
    H.261   Derived automatically from the picture type
horiz_macroblocks, vert_macroblocks   HORIZONTAL_MBS, VERTICAL_MBS   MPEG, JPEG   Control software must derive these from the horizontal and vertical picture dimensions
    H.261   Derived automatically from the picture type
max_h, max_v   DEFINE_MAX_SAMPLING   MPEG   Control software must configure; the sampling structure is fixed by the standard
    JPEG   In coded data
    H.261   Configured automatically for 4:2:0 video
max_component_id   MAX_COMP_ID   MPEG   Control software must configure; the sampling structure is fixed by the standard
    JPEG   In coded data
    H.261   Configured automatically for 4:2:0 video
tq_0, tq_1, tq_2, tq_3   JPEG_TABLE_SELECT   JPEG   In coded data
    MPEG, H.261   Not used in standard
blocks_h_0 to blocks_h_3, blocks_v_0 to blocks_v_3   DEFINE_SAMPLING   MPEG   Control software must configure; the sampling structure is fixed by the standard
    JPEG   In coded data
    H.261   Configured automatically for 4:2:0 video
dc_huff_0, dc_huff_1, dc_huff_2, dc_huff_3   In scan header data   JPEG   In coded data
    MPEG_DCH_TABLE   MPEG   Control software must configure
    H.261   Not used in standard
ac_huff_0, ac_huff_1, ac_huff_2, ac_huff_3   In scan header data   JPEG   In coded data
    MPEG, H.261   Not used in standard
dc_bits_0[15:0], dc_bits_1[15:0], dc_huffval_0[11:0], dc_huffval_1[11:0], dc_zssss_0, dc_zssss_1   In the DATA token following a DHT_MARKER token   JPEG   In coded data
    MPEG   Control software must configure
    H.261   Not used in standard
ac_bits_0[15:0], ac_bits_1[15:0], ac_huffval_0[161:0], ac_huffval_1[161:0], ac_eob_0, ac_eob_1, ac_zrl_0, ac_zrl_1   In the DATA token following a DHT_MARKER token   JPEG   In coded data
    MPEG, H.261   Not used in standard
buffer_size   VBV_BUFFER_SIZE   MPEG   In coded data
    JPEG, H.261   Not used in standard
pel_aspect   PEL_ASPECT   MPEG   In coded data
    JPEG, H.261   Not used in standard
bit_rate   BIT_RATE   MPEG   In coded data
    JPEG, H.261   Not used in standard
pic_rate   PICTURE_RATE   MPEG   In coded data
    JPEG, H.261   Not used in standard
constrained   CONSTRAINED   MPEG   In coded data
    JPEG, H.261   Not used in standard
picture_type   PICTURE_TYPE   MPEG   In coded data
    JPEG, H.261   Not used in standard
broken_closed   BROKEN_CLOSED   MPEG   In coded data
    JPEG, H.261   Not used in standard
prediction_mode   PREDICTION_MODE   MPEG   In coded data
    JPEG, H.261   Not used in standard
h_261_pic_type   PICTURE_TYPE (when the standard is H.261)   MPEG, JPEG   Not relevant
    H.261   In coded data
vbv_delay   VBV_DELAY   MPEG   In coded data
    JPEG, H.261   Not used in standard
pic_number   Carried by PICTURE_START   MPEG, JPEG, H.261   Generated by the start code detector
coding_standard   CODING_STANDARD   MPEG, JPEG, H.261   Configured by control software in the start code detector
A.14.2 Picture structure
In the present invention, picture dimensions are described to the spatial decoder in two different units: pixels and macroblocks. JPEG and MPEG both communicate picture dimensions in pixels. The dimensions in pixels determine the area of the buffer that contains valid data; this may be smaller than the overall buffer size. The dimensions in macroblocks determine the buffer size required by the decoder. The macroblock dimensions must be derived from the pixel dimensions by the user. The spatial decoder registers associated with this information are horiz_pels, vert_pels, horiz_macroblocks and vert_macroblocks.
The spatial decoder registers blocks_h_n, blocks_v_n, max_h, max_v and max_component_id specify the composition of a macroblock (the minimum coded unit in JPEG). Each is a 2-bit register that can hold a value in the range 0 to 3. All except max_component_id specify a block count of 1 to 4. For example, if register max_h holds 1, a macroblock is 2 blocks wide. Similarly, max_component_id specifies the number of different color components involved.
Table A.14.6 Configuration for various macroblock formats
    2:1:1     4:2:2     4:2:0     1:1:1
 max_h    1    1    1    0
 max_v     0     1     1     0
 max_component_id     2     2     2     2
 blocks_h_0     1     1     1     0
 blocks_h_1     0     0     0     0
 blocks_h_2     0     0     0     0
 blocks_h_3     x     x     x     x
 blocks_v_0     0     1     1     0
 blocks_v_1     0     1     0     0
 blocks_v_2     0     1     0     0
 blocks_v_3     x     x     x     x
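The relationship between the pixel dimensions and the macroblock dimensions described in A.14.2 can be sketched as follows. This is a minimal illustration, not the decoder's implementation: the class and function names are hypothetical, the 8x8 block size is assumed from the JPEG, MPEG and H.261 standards, and rounding up partial macroblocks is an assumption. The register encoding (stored value plus one gives the block count) follows the text above.

```python
from dataclasses import dataclass

BLOCK_SIZE = 8  # assumed: 8x8-pixel blocks, as in JPEG, MPEG and H.261


@dataclass
class MacroblockConfig:
    """Hypothetical container for the 2-bit structure registers."""
    max_h: int        # macroblock width in blocks, minus 1
    max_v: int        # macroblock height in blocks, minus 1
    blocks_h: tuple   # blocks_h_n for each component n, minus 1
    blocks_v: tuple   # blocks_v_n for each component n, minus 1

    def macroblock_pels(self):
        """Macroblock width and height in pixels."""
        return ((self.max_h + 1) * BLOCK_SIZE, (self.max_v + 1) * BLOCK_SIZE)

    def blocks_per_component(self):
        """Number of blocks each color component contributes to a macroblock."""
        return [(h + 1) * (v + 1) for h, v in zip(self.blocks_h, self.blocks_v)]


def macroblock_dimensions(horiz_pels, vert_pels, cfg):
    """Derive (horiz_macroblocks, vert_macroblocks) from pixel dimensions.

    Partial macroblocks at the right and bottom edges are rounded up
    (an assumption made for this sketch).
    """
    mb_w, mb_h = cfg.macroblock_pels()
    return (-(-horiz_pels // mb_w), -(-vert_pels // mb_h))


# 4:2:0 column of Table A.14.6 (components 0 to 2; component 3 is "x")
CFG_420 = MacroblockConfig(max_h=1, max_v=1, blocks_h=(1, 0, 0), blocks_v=(1, 0, 0))
```

With the 4:2:0 settings, `macroblock_dimensions(352, 288, CFG_420)` gives 22 by 18 macroblocks, which agrees with the CIF values listed for H.261 in Table A.14.8.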
A.14.3 Huffman tables
A.14.3.1 JPEG style Huffman table descriptions
In the present invention, Huffman table descriptions are supplied to the spatial decoder in the format used by JPEG to communicate table descriptions between encoders and decoders. Each table description has two elements: BITS and HUFFVAL. For a full description of how the tables are encoded, the user should consult the JPEG specification.
A.14.3.1.1 BITS
BITS is a table of values that describes how many different symbols are encoded with each length of VLC. Each entry is an 8-bit value. JPEG permits VLCs of up to 16 bits in length, so each table has 16 entries.
BITS[0] describes how many different 1-bit VLCs exist, BITS[1] describes how many different 2-bit VLCs exist, and so on.
A.14.3.1.2 HUFFVAL
HUFFVAL is a table of 8-bit data values arranged in order of increasing VLC length. The size of this table depends on the number of different symbols that can be encoded by the VLCs.
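The following sketch shows how a BITS/HUFFVAL pair determines the actual code words. It follows the canonical code assignment procedure of the JPEG specification (codes of each length are numbered consecutively, and the running code is shifted left by one bit when moving to the next length). The function name is hypothetical, and the sample values are the ones listed for DC coefficient coding table 0 in Table A.14.10.

```python
def huffman_codes(bits, huffval):
    """Map each symbol in HUFFVAL to its VLC, returned as a bit string.

    bits[i] is the number of codes of length i + 1 (16 entries);
    huffval lists the symbols in order of increasing code length.
    """
    if len(bits) != 16:
        raise ValueError("BITS must have exactly 16 entries")
    if sum(bits) != len(huffval):
        raise ValueError("HUFFVAL length must equal the sum of BITS")

    codes = {}
    code = 0   # next code value at the current length
    k = 0      # index into huffval
    for length in range(1, 17):
        for _ in range(bits[length - 1]):
            codes[huffval[k]] = format(code, "0{}b".format(length))
            code += 1
            k += 1
        code <<= 1  # extend to the next code length
    return codes


# Values taken from Table A.14.10 (DC coefficient coding table 0)
DC0_BITS = [0, 2, 3, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
DC0_HUFFVAL = [1, 2, 0, 3, 4, 5, 6, 7, 8]
```

For these values, symbols 1 and 2 receive the two 2-bit codes `00` and `01`, symbols 0, 3 and 4 receive the 3-bit codes `100`, `101` and `110`, and the remaining symbols receive one code each of 4 to 7 bits.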
The JPEG specification describes in more detail how Huffman coding tables are encoded into, and decoded from, this format.
A.14.3.1.3 Configuration by tokens
In a JPEG bitstream, the DHT marker precedes the description of the Huffman tables used to code AC and DC coefficients. When the start code detector recognizes a DHT marker, it generates a DHT_MARKER token and places the Huffman table description in the following DATA token (see A.11.3.4).
The AC and DC coefficient Huffman tables in the spatial decoder can be configured by supplying DATA and DHT_MARKER tokens to the input of the spatial decoder while the spatial decoder is configured for JPEG operation. This mechanism can also be used to configure the DC coefficient Huffman tables required for MPEG operation; however, the coding standard of the spatial decoder must be set to JPEG while the tables are downloaded.
Table A.14.7 Huffman table configuration by tokens
  E  7  6  5  4  3  2  1  0   Token name
  1  0  0  0  1  0  1  0  1   CODING_STANDARD
  0  0  0  0  0  0  0  0  1   1 = JPEG
  0  0  0  0  1  1  1  0  0   DHT_MARKER
  1  0  0  0  0  0  1  x  x   DATA
  1  t  t  t  t  t  t  t  t   Th: value indicating which Huffman table is to be loaded. JPEG allows 4 tables to be downloaded. Values 0x00 and 0x01 specify DC coefficient coding tables 0 and 1; values 0x10 and 0x11 specify AC coefficient coding tables 0 and 1. This sequence can be repeated so that several tables are described in a single token.
  1  n  n  n  n  n  n  n  n   Li: 16 words carrying the BITS information
  1  n  n  n  n  n  n  n  n
  1  n  n  n  n  n  n  n  n   Vij: words carrying the HUFFVAL information (the number of words depends on the number of different symbols)
  e  n  n  n  n  n  n  n  n   e: the extension bit is 0 if this is the end of the DATA token, or 1 if another table description is contained in the same DATA token
A.14.3.1.4 Configuration by MPI
The AC and DC coefficient Huffman tables can also be written directly to registers through the MPI. See Table A.14.3.
Registers dc_bits_0[15:0] and dc_bits_1[15:0] hold the BITS values for tables 0x00 and 0x01.
Registers ac_bits_0[15:0] and ac_bits_1[15:0] hold the BITS values for tables 0x10 and 0x11.
Registers dc_huffval_0[11:0] and dc_huffval_1[11:0] hold the HUFFVAL values for tables 0x00 and 0x01.
Registers ac_huffval_0[161:0] and ac_huffval_1[161:0] hold the HUFFVAL values for tables 0x10 and 0x11.
A.14.4 Configuration for different standards
The video distributor supports the requirements of MPEG, JPEG and H.261. The coding standard is configured automatically by the CODING_STANDARD token generated by the start code detector.
A.14.4.1 H.261 Huffman tables
All the Huffman tables required to decode H.261 are held in ROM within the spatial decoder (more precisely, in the analyzer (Parser) state machine of the video distributor), so no user intervention is required.
A.14.4.2 H.261 picture structure
H.261 is defined as supporting only two picture formats: CIF and QCIF. The picture format in use is signaled in the PTYPE section of the bitstream. When this data is decoded by the spatial decoder, it is placed in the h_261_pic_type register and in the PICTURE_TYPE token. In addition, all the picture and macroblock structure registers are configured automatically.
The information in the various registers is also placed in their associated tokens (see Table A.14.5), which ensures that other decoder chips (such as the temporal decoder) are configured correctly.
A.14.4.3 MPEG Huffman tables
Most of the Huffman coding tables required to decode MPEG are held in ROM within the spatial decoder (again, in the analyzer state machine), so no user intervention is required. The exceptions are the tables required to decode the DC coefficients of intra macroblocks. Two tables are required, one for chrominance and the other for luminance. These must be configured by user software before decoding begins.
Table A.14.8 Automatic settings for H.261
Macroblock structure   CIF/QCIF   Picture structure   CIF   QCIF
  max_h     1   horiz_pels     352    176
  max_v     1   vert_pels     288    144
  max_component_id     2   horiz_macroblocks     22    11
  blocks_h_0     1   vert_macroblocks     18    9
  blocks_h_1     0
  blocks_h_2     0
  blocks_v_0     0
  blocks_v_1     1
  blocks_v_2     0
Table A.14.10 shows the sequence of tokens required to configure the DC coefficient Huffman tables in the spatial decoder. Alternatively, the same result can be obtained by writing this information to registers through the MPI.
The registers dc_huff_n control which DC coefficient Huffman table is used for each color component. Table A.14.9 shows how they should be configured for MPEG operation. This can be done directly through the MPI or by using the MPEG_DCH_TABLE token.
Table A.14.9 MPEG DC Huffman table selection through the MPI
    dc_huff_0     0
    dc_huff_1     1
    dc_huff_2     1
    dc_huff_3     x
Table A.14.10 MPEG DC Huffman table configuration
    E  [7:0]   Token name
    1  0x15   CODING_STANDARD
    0  0x01   1 = JPEG
    0  0x1C   DHT_MARKER
    1  0x04   DATA (can be any color component; 0 is used in this example)
    1  0x00   0 indicates that this Huffman table is DC coefficient coding table 0
    1     0x00   16 words carrying the BITS information, describing a total of 9 different VLCs: two 2-bit codes, three 3-bit codes, and one code each of 4, 5, 6 and 7 bits. If configuring through the MPI rather than by tokens, these values are written to the dc_bits_0[15:0] registers.
    1     0x02
    1     0x03
    1     0x01
    1     0x01
    1     0x01
    1     0x01
    1     0x00
    1     0x00
    1     0x00
    1     0x00
    1     0x00
    1     0x00
    1     0x00
    1     0x00
    1     0x00
    1     0x01   9 words carrying the HUFFVAL information. If configuring through the MPI rather than by tokens, these values are written to the dc_huffval_0[11:0] registers.
    1     0x02
    1     0x00
    1     0x03
    1     0x04
    1     0x05
    1     0x06
    1     0x07
    0     0x08
    0     0x1C     DHT_MARKER
    1     0x04   DATA (can be any color component; 0 is used in this example)
    1     0x01   1 indicates that this Huffman table is DC coefficient coding table 1
    1     0x00   16 words carrying the BITS information, describing a total of 9 different VLCs: three 2-bit codes and one code each of 3, 4, 5, 6, 7 and 8 bits. If configuring through the MPI rather than by tokens, these values are written to the dc_bits_1[15:0] registers.
    1     0x03
    1     0x01
    1     0x01
    1     0x01
    1     0x01
    1     0x01
    1     0x01
    1     0x00
    1     0x00
    1     0x00
    1     0x00
    1     0x00
    1     0x00
    1     0x00
    1     0x00
    1     0x00   9 words carrying the HUFFVAL information. If configuring through the MPI rather than by tokens, these values are written to the dc_huffval_1[11:0] registers.
    1     0x01
    1     0x02
    1     0x03
    1     0x04
    1     0x05
    1     0x06
    1     0x07
    0     0x08
    1     0xD4   MPEG_DCH_TABLE: configure component 0 to use table 0
    0     0x00
    1     0xD5   MPEG_DCH_TABLE: configure component 1 to use table 1
    0     0x01
    1     0xD6   MPEG_DCH_TABLE: configure component 2 to use table 1
    0     0x01
    1     0x15   CODING_STANDARD
    0     0x02   2 = MPEG
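The token sequence of Table A.14.10 can be generated programmatically. The sketch below is illustrative only: all function and constant names are hypothetical, each word is modeled as an (extension bit, byte) pair, and the token header values (0x15, 0x1C, 0x04, 0xD4) are copied from the table. The final coding standard value 0x02 is also copied from the table and is assumed here to select MPEG.

```python
CODING_STANDARD = 0x15   # token header values as listed in Table A.14.10
DHT_MARKER = 0x1C
DATA = 0x04              # low 2 bits carry the component ID
MPEG_DCH_TABLE = 0xD4    # low 2 bits carry the component ID
JPEG, MPEG = 0x01, 0x02  # 0x02 = MPEG is an assumption (see lead-in)


def token(words):
    """Return (extension_bit, byte) pairs; the bit is 0 only on the last word."""
    last = len(words) - 1
    return [(0 if i == last else 1, w & 0xFF) for i, w in enumerate(words)]


def huffman_table_tokens(table_id, bits, huffval, component=0):
    """DHT_MARKER token followed by one DATA token describing one table."""
    if len(bits) != 16 or sum(bits) != len(huffval):
        raise ValueError("inconsistent BITS/HUFFVAL description")
    body = [DATA | component, table_id] + list(bits) + list(huffval)
    return token([DHT_MARKER]) + token(body)


def mpeg_dc_config(tables, component_tables):
    """Build the full sequence: switch to JPEG, load tables, select, switch back."""
    seq = token([CODING_STANDARD, JPEG])
    for table_id in sorted(tables):
        bits, huffval = tables[table_id]
        seq += huffman_table_tokens(table_id, bits, huffval)
    for component, table_id in enumerate(component_tables):
        seq += token([MPEG_DCH_TABLE | component, table_id])
    seq += token([CODING_STANDARD, MPEG])
    return seq


# BITS and HUFFVAL values copied from Table A.14.10
TABLE_0 = ([0, 2, 3, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0], [1, 2, 0, 3, 4, 5, 6, 7, 8])
TABLE_1 = ([0, 3, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0], [0, 1, 2, 3, 4, 5, 6, 7, 8])
SEQUENCE = mpeg_dc_config({0: TABLE_0, 1: TABLE_1}, [0, 1, 1])
```

Each DATA token here carries a single table description, so its last HUFFVAL word has the extension bit cleared, as in the table. The same values could instead be written to the dc_bits_n, dc_huffval_n and dc_huff_n registers through the MPI.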
A.14.4.4 MPEG picture structure
The macroblock structure defined for MPEG is the same as that used by H.261. The picture dimensions are encoded in the coded data.
For standard 4:2:0 operation, the macroblock characteristics should be configured as indicated in Table A.14.8. This can be done either by writing to the registers as indicated or by supplying the equivalent tokens (see Table A.14.5) to the input of the spatial decoder.
The method used to configure the picture dimensions depends on the application. If the picture format is known before decoding begins, the picture structure registers listed in Table A.14.8 can be initialized with appropriate values. Alternatively, the picture dimensions can be decoded from the coded data and used to configure the spatial decoder. In this case the user must service the analyzer error ERR_MPEG_SEQUENCE; see A.14.8, "Changes at the MPEG sequence layer".
A.14.4.5 JPEG
Baseline JPEG offers a large number of coding options that significantly change the complexity of the control software required to operate the decoder. In general, the spatial decoder is designed so that the support required is minimal when the following condition is met:
The number of color components per frame is less than 5 (Nf ≤ 4).
A.14.4.6 JPEG Huffman tables
JPEG allows Huffman coding tables to be downloaded to the decoder. These tables are used when decoding the VLCs that describe the coefficients. Each scan allows two tables for coding DC coefficients and two tables for coding AC coefficients.
There are three different types of JPEG file: interchange format, abbreviated format for compressed image data, and abbreviated format for table data. An interchange format file contains both the compressed image data and the definitions of all the tables (Huffman, quantization and so on) needed to decode that image data. An abbreviated image data format file omits the table definitions. An abbreviated table format file contains only the table definitions.
The spatial decoder accepts all three formats. However, an abbreviated image data file can be decoded only if all the required tables have already been defined. These definitions can be supplied by either of the other two JPEG file types, or the tables can be set up by user software.
If each scan uses a different set of Huffman tables, the table definitions are placed (by the encoder) in the coded data before each scan. They are loaded automatically by the spatial decoder for use during that scan and any subsequent scans.
To improve the performance of Huffman decoding, certain commonly used symbols are handled as special cases. These are: a DC coefficient of size 0, the end of a block of AC coefficients, and a run of 16 zero AC coefficients. The values for these special cases should be written to the appropriate registers.
A.14.4.6.1 Table selection
Registers dc_huff_n and ac_huff_n control which AC and DC coefficient Huffman tables are used with which color component. During JPEG operation, these relationships are defined by the Tdj and Taj fields of the scan header.
A.14.4.7 JPEG picture structure
The spatial decoder supports two distinct levels of baseline JPEG decoding: up to four components per frame (Nf ≤ 4) and more than four components per frame (Nf > 4). If Nf > 4 is used, the control software required becomes more complex.
A.14.4.7.1 Nf ≤ 4
The frame component specification parameters contained in the JPEG frame header configure the macroblock structure registers (see Table A.14.8) when they are decoded. No user intervention is required, because all the specifications needed to decode the 4 different color components are defined.
For further details of the options provided by JPEG, the reader should consult the JPEG specification. There is also a short description of JPEG picture formats in A.16.1.
A.14.4.7.2 JPEG with more than four components
The spatial decoder can decode JPEG files containing up to 256 different color components (the maximum permitted by JPEG). However, additional user intervention is required if more than four components are to be decoded. JPEG allows a maximum of four components in any scan.
A.14.4.8 Non-standard variants
As stated above, the spatial decoder supports some picture formats beyond those defined by JPEG and MPEG.
JPEG limits minimum coded units so that they contain no more than 10 blocks per scan. This limit does not apply to the spatial decoder, which can process any minimum coded unit that can be described by blocks_h_n, blocks_v_n, max_h and max_v.
MPEG is defined only for 4:2:0 macroblocks (see Table A.14.8). However, the spatial decoder can process three-component macroblocks in other formats (for example 4:2:2).
A.14.5 Video events and errors
The video distributor can generate two types of event: analyzer (Parser) events and Huffman events. For a description of how to handle events and interrupts, see A.6.3, "Interrupts".
A.14.5.1 Huffman events
Huffman events are generated by the Huffman decoder. The event is indicated by huffman_event, and huffman_mask determines whether an interrupt is generated. If huffman_mask is set to 1, an interrupt is generated and the Huffman decoder halts. The register huffman_error_code[2:0] holds a value indicating the cause of the event.
If 1 is written to huffman_event after the interrupt has been serviced, the Huffman decoder attempts to recover from the error. Likewise, if huffman_mask is set to 0 (masking the interrupt and not halting the Huffman decoder), the Huffman decoder attempts to recover from the error automatically.
A.14.5.2 Analyzer events (Parser events)
Analyzer events are generated by the analyzer. The event is indicated by parser_event, and parser_mask determines whether an interrupt is generated. If parser_mask is set to 1, an interrupt is generated and the analyzer halts. The register parser_error_code[7:0] holds a value indicating the cause of the event.
If 1 is written to parser_event after the interrupt has been serviced, the analyzer starts operating again. If the event indicated a bitstream error, the video distributor attempts to recover from the error.
If parser_mask is set to 0, the analyzer sets its event bit but does not generate an interrupt or halt. It continues operating and attempts to recover from the error automatically.
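The event and mask behavior described above can be modeled in a few lines. This is a behavioral sketch only, not register-accurate: the class and method names are hypothetical, and clearing the event bit when 1 is written to it is an assumption (the text states only that operation resumes).

```python
class EventSource:
    """Toy model of an event/mask/error-code register group."""

    def __init__(self, mask=0):
        self.mask = mask        # e.g. parser_mask or huffman_mask
        self.event = 0          # e.g. parser_event or huffman_event
        self.error_code = 0     # e.g. parser_error_code[7:0]
        self.halted = False

    def raise_event(self, error_code):
        """Record an event; return True if an interrupt is generated."""
        self.event = 1
        self.error_code = error_code
        if self.mask == 1:
            self.halted = True  # unmasked: interrupt and halt
            return True
        return False            # masked: event bit set, processing continues

    def write_event(self, value):
        """Writing 1 after servicing the interrupt restarts the unit."""
        if value == 1:
            self.event = 0      # assumption: the write also clears the bit
            self.halted = False


def service(source):
    """Minimal service routine: read the cause, then restart the unit."""
    cause = source.error_code
    source.write_event(1)
    return cause
```

A service routine in real control software would typically inspect the error code (for example, reading new picture dimensions on ERR_MPEG_SEQUENCE) before writing 1 to the event register.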
Table A.14.11 Huffman error codes
  huffman_error_code [2] [1] [0]   Description
    0    0    0   No error. This error does not occur during normal operation.
    x    0    1   Failed to find a terminating code in the VLC within 16 bits.
    x    1    0   Found serial data when a token was expected.
    x    1    1   Found a token when serial data was expected.
    1    x    x   Information describing more than 64 coefficients for a single block was decoded, which indicates a bitstream error. The block output by the video distributor contains only 64 coefficients.
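Because bit 2 of huffman_error_code is independent of bits 1 and 0, a single value can report two conditions at once. The sketch below decodes the 3-bit value according to Table A.14.11; the function name and the message strings are illustrative, not part of the device definition.

```python
def decode_huffman_error(code):
    """Return the list of conditions reported by huffman_error_code[2:0]."""
    if not 0 <= code <= 0b111:
        raise ValueError("huffman_error_code is a 3-bit value")

    reasons = []
    if code & 0b100:
        # Bit 2: too many coefficients decoded for one block
        reasons.append("more than 64 coefficients decoded for a single block")

    low = code & 0b011  # bits 1 and 0 select one of three conditions
    if low == 0b01:
        reasons.append("no terminating code found in VLC within 16 bits")
    elif low == 0b10:
        reasons.append("serial data found when a token was expected")
    elif low == 0b11:
        reasons.append("token found when serial data was expected")

    return reasons or ["no error"]
```

For example, a value of `0b110` reports both the coefficient overflow and serial data found where a token was expected.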
Table A.14.12 Parser error codes
parser_error_code[7:0]   Description
    0x00   ERR_NO_ERROR: No error. This event does not occur during normal operation.
    0x10   ERR_EXTENSION_TOKEN: An EXTENSION_DATA token has been detected by the Parser. Detection of this token should precede a DATA token containing the extension data. See A.14.6.
    0x11   ERR_EXTENSION_DATA: Following the detection of an EXTENSION_DATA token, a DATA token containing the extension data has been detected. See A.14.6.
    0x12   ERR_USER_TOKEN: A USER_DATA token has been detected by the Parser. Detection of this token should precede a DATA token containing the user data. See A.14.6.
    0x13   ERR_USER_DATA: Following the detection of a USER_DATA token, a DATA token containing the user data has been detected. See A.14.6.
    0x20   ERR_PSPARE: H.261 PSPARE information has been detected. See A.14.7.
    0x21   ERR_GSPARE: H.261 GSPARE information has been detected. See A.14.7.
    0x22   ERR_PTYPE: The value of the H.261 picture type has changed. Register h_261_pic_type can be inspected to find the new value.
    0x30   ERR_JPEG_FRAME
    0x31   ERR_JPEG_FRAME_LAST
    0x32   ERR_JPEG_SCAN: The picture size or number has changed.
    0x33   ERR_JPEG_SCAN_COMP: A component has changed.
    0x34   ERR_DNL_MARKER
    0x40   ERR_MPEG_SEQUENCE: One of the parameters communicated in the MPEG sequence layer has changed. See A.14.8.
    0x41   ERR_EXTRA_PICTURE: MPEG extra_information_picture has been detected. See A.14.7.
    0x42   ERR_EXTRA_SLICE: MPEG extra_information_slice has been detected. See A.14.7.
    0x43   ERR_VBV_DELAY: The VBV_DELAY parameter of the first picture in a new MPEG video sequence has been detected by the video distributor. The new delay value is available in register vbv_delay. The first picture of a new sequence is defined as the first picture after a sequence end, FLUSH or reset.
    0x80   ERR_SHORT_TOKEN: An incorrectly formed token has been detected. This error does not occur during normal operation.
    0x90   ERR_H261_PIC_END_UNEXPECTED: During H.261 operation, the end of a picture was encountered at an unexpected position. This probably indicates an error in the coded data.
    0x91   ERR_GN_BACKUP: During H.261 operation, a group of blocks was encountered with a group number lower than expected. This probably indicates an error in the coded data.
    0x92   ERR_GN_SKIP_GOB: During H.261 operation, a group of blocks was encountered with a group number higher than expected. This probably indicates an error in the coded data.
    0xA0   ERR_NBASE_TAB: During JPEG operation, an attempt was made to load a Huffman table that is not supported by baseline JPEG (baseline JPEG supports only tables 0 and 1 for entropy coding).
    0xA1   ERR_QUANT_PRECISION: During JPEG operation, an attempt was made to load a quantization table that is not supported by baseline JPEG (baseline JPEG supports only 8-bit precision in quantization tables).
    0xA2   ERR_SAMPLE_PRECISION: During JPEG operation, an attempt was made to specify a sample precision greater than that supported by baseline JPEG (baseline JPEG supports only 8-bit precision).
    0xA3   ERR_NBASE_SCAN: One or more of the JPEG scan header parameters Ss, Se, Ah and Al is set to a value not supported by baseline JPEG (indicating spectral selection and/or successive approximation, which are not supported in baseline JPEG).
    0xA4   ERR_UNEXPECTED_DNL: During JPEG operation, a DNL marker was encountered in a scan that is not the first scan of a frame.
    0xA5   ERR_EOS_UNEXPECTED: During JPEG operation, an EOS marker was encountered at an unexpected position.
    0xA6   ERR_RESTART_SKIP: During JPEG operation, a restart marker was not encountered at the expected position, or the value of the restart marker was unexpected. If a restart marker is not found when one is expected, the Huffman event "Found serial data when a token was expected" is generated.
    0xB0   ERR_SKIP_INTRA: During MPEG operation, a macroblock with a macroblock address increment greater than 1 was found in an intra (I) picture. This is illegal and probably indicates a bitstream error.
    0xB1   ERR_SKIP_DINTRA: During MPEG operation, a macroblock with a macroblock address increment greater than 1 was found in a DC only (D) picture. This is illegal and probably indicates a bitstream error.
    0xB2   ERR_BAD_MARKER: During MPEG operation, a marker bit did not have the expected value. This probably indicates a bitstream error.
    0xB3   ERR_D_MBTYPE: During MPEG operation, a macroblock with a macroblock type other than 1 was found in a DC only (D) picture. This is illegal and probably indicates a bitstream error.
    0xB4   ERR_D_MBEND: During MPEG operation, a macroblock whose end-of-macroblock bit is 0 was found in a DC only (D) picture. This is illegal and probably indicates a bitstream error.
    0xB5   ERR_SVP_BACKUP: During MPEG operation, a slice was encountered with a slice vertical position lower than expected. This probably indicates an error in the coded data.
    0xB6   ERR_SVP_SKIP_ROWS: During MPEG operation, a slice was encountered with a slice vertical position higher than expected. This probably indicates an error in the coded data.
    0xB7   ERR_FST_MBA_BACKUP: During MPEG operation, a macroblock was encountered with a macroblock address lower than expected. This probably indicates an error in the coded data.
    0xB8   ERR_FST_MBA_SKIP: During MPEG operation, a macroblock was encountered with a macroblock address higher than expected. This probably indicates an error in the coded data.
    0xB9   ERR_PICTURE_END_UNEXPECTED: During MPEG operation, a PICTURE_END token was encountered at an unexpected position. This probably indicates an error in the coded data.
    0xE0 ... 0xEF   Errors reserved for the internal test program.
    0xE0   ERR_TST_PROGRAM: The test program was entered unexpectedly.
    0xE1   ERR_NO_PROGRAM: The test program is not included.
    0xE2   ERR_TST_END   End of test
    0xF0...0xFF   Reserved errors
    0xF0   ERR_UCODE_ADDR   Serious error
    0xF1   ERR_NOT_IMPLEMENTED
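As an illustration of how host software might interpret the codes in Table A.14.12, the following minimal sketch maps an 8-bit parser_error_code value to its name and to the reserved range it falls in. The function name describe_parser_error and the category labels are hypothetical; only the codes, names and the two reserved ranges come from the table.

```python
# Codes and names copied from the rows of Table A.14.12 listed above.
PARSER_ERRORS = {
    0xB0: "ERR_SKIP_INTRA",
    0xB1: "ERR_SKIP_DINTRA",
    0xB2: "ERR_BAD_MARKER",
    0xB3: "ERR_D_MBTYPE",
    0xB4: "ERR_D_MBEND",
    0xB5: "ERR_SVP_BACKUP",
    0xB6: "ERR_SVP_SKIP_ROWS",
    0xB7: "ERR_FST_MBA_BACKUP",
    0xB8: "ERR_FST_MBA_SKIP",
    0xB9: "ERR_PICTURE_END_UNEXPECTED",
    0xE0: "ERR_TST_PROGRAM",
    0xE1: "ERR_NO_PROGRAM",
    0xE2: "ERR_TST_END",
    0xF0: "ERR_UCODE_ADDR",
    0xF1: "ERR_NOT_IMPLEMENTED",
}


def describe_parser_error(code):
    """Return (name, category) for an 8-bit parser_error_code value."""
    if not 0 <= code <= 0xFF:
        raise ValueError("parser_error_code is an 8-bit value")
    name = PARSER_ERRORS.get(code, "UNKNOWN")
    if 0xE0 <= code <= 0xEF:
        category = "internal test"  # range reserved for internal test programs
    elif 0xF0 <= code <= 0xFF:
        category = "reserved"       # range of reserved errors
    else:
        category = "other"          # hypothetical label for every other code
    return name, category
```

For example, describe_parser_error(0xB9) returns the name ERR_PICTURE_END_UNEXPECTED with the category "other".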
Each standard uses a different subset of the defined parser error codes.
Table A.14.13 Parser error codes and the various standards
Token name     MPEG     JPEG     H.261
  ERR_NO_ERROR     /     /     /
  ERR_EXTENSION_TOKEN     /     /
  ERR_EXTENSION_DATA     /     /
  ERR_USER_TOKEN     /     /
  ERR_USER_DATA     /     /
  ERR_PSPARE     /
  ERR_GSPARE     /
  ERR_PTYPE     /
  ERR_JPEG_FRAME     /
  ERR_JPEG_FRAME_LAST     /
  ERR_JPEG_SCAN     /
  ERR_JPEG_SCAN_COMP     /
  ERR_DNL_MARKER     /
  ERR_MPEG_SEQUENCE     /
  ERR_EXTRA_PICTURE     /
  ERR_EXTRA_SLICE     /
  ERR_VBV_DELAY     /
  ERR_SHORT_TOKEN     /     /     /
  ERR_H261_PIC_END_UNEXPECTED     /
  ERR_GN_BACKUP     /
  ERR_GN_SKIP_GOB     /
  ERR_NBASE_TAB     /
  ERR_QUANT_PRECISION     /
  ERR_SAMPLE_PRECISION     /
  ERR_NBASE_SCAN     /
  ERR_UNEXPECTED_DNL     /
  ERR_EOS_UNEXPECTED     /
  ERR_RESTART_SKIP     /
  ERR_SKIP_INTRA     /
  ERR_SKIP_DINTRA     /
  ERR_BAD_MARKER     /
  ERR_D_MBTYPE     /
  ERR_D_MBEND     /
  ERR_SVP_BACKUP     /
  ERR_SVP_SKIP_ROWS     /
  ERR_FST_MBA_BACKUP     /
  ERR_FST_MBA_SKIP     /
  ERR_PICTURE_END_UNEXPECTED     /
  ERR_TST_PROGRAM     /     /     /
  ERR_NO_PROGRAM     /     /     /
  ERR_TST_END     /     /     /
  ERR_UCODE_ADDR     /     /     /
  ERR_NOT_IMPLEMENTED     /     /     /
A.14.6 Receiving user and extension data
MPEG and JPEG use similar mechanisms for carrying user and extension data. The data is preceded by a start/marker code. If the application is not interested in this kind of data, the start code detector can delete it (see A.11.3.3).
A.14.6.1 Identifying the source of the data
The parser events ERR_EXTENSION_TOKEN and ERR_USER_TOKEN indicate that an EXTENSION_DATA or USER_DATA token has arrived at the video parser. If these tokens were generated by the start code detector (see A.11.3.3), they carry the value of the start/marker code that caused the start code detector to generate the token (see Table A.11.4). This value can be read from the rom_revision register while the parser interrupt is being serviced. The video parser remains stopped until 1 is written to parser_event (see A.6.3, "Interrupts").
A.14.6.2 Reading the data
A DATA token carrying the extension or user data should immediately follow the EXTENSION_DATA or USER_DATA token. The arrival of this DATA token at the video parser generates an ERR_EXTENSION_DATA or ERR_USER_DATA parser event. The first byte of the DATA token can be read from the rom_revision register while the interrupt is being serviced.
After the event is cleared, the state of the video parser register continue determines what happens next. If this register holds 0, the video parser discards any remaining data in the DATA token without generating further events. If continue is set to 1, an event is generated as each byte of extension or user data arrives at the video parser. This continues until the DATA token is exhausted or continue is set to 0.
Notes:
1) The first byte of the extension/user data is always presented in the rom_revision register, regardless of the state of continue.
2) There is no event to indicate that the last byte of the extension/user data has been read.
A.14.7 Receiving extra information
H.261 and MPEG allow information that extends the coding standard to be embedded in pictures and in groups of blocks (H.261) or slices (MPEG). The mechanism differs from that used for extension and user data (described in section A.14.6). No start code precedes the data, so it cannot be deleted by the start code detector.
In H.261 operation, the parser events ERR_PSPARE and ERR_GSPARE indicate that this information has been detected. The corresponding events in MPEG are ERR_EXTRA_PICTURE and ERR_EXTRA_SLICE.
When the parser event is generated, the first byte of the extra information is presented in the rom_revision register.
The video parser register continue determines the behavior after the event is cleared. If this register holds 0, any remaining extra information is discarded by the video parser and no further events are generated. If continue is set to 1, an event is generated as each byte of extra information arrives at the video parser. This continues until the extra information is exhausted or continue is set to 0.
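The effect of the continue register can be pictured with a small software model. This is only a sketch under stated assumptions: the function name parser_events and the callback continue_after are hypothetical, and the model covers only the rule described above (the first byte always raises an event, and later bytes raise events only while continue is 1).

```python
def parser_events(data_bytes, continue_after):
    """Model which bytes are presented to software through parser events.

    data_bytes: the bytes of extra (or extension/user) information.
    continue_after: hypothetical callback invoked after each event is
        cleared; it returns the value (0 or 1) left in `continue`.
    Returns the list of byte values for which an event was generated.
    """
    presented = []
    for index, byte in enumerate(data_bytes):
        # An event is generated; the first byte is always presented.
        presented.append(byte)
        if continue_after(index, byte) == 0:
            # Remaining bytes are discarded without further events.
            break
    return presented
```

With a callback that always returns 0, only the first byte is seen. With one that always returns 1, every byte raises an event until the data is exhausted.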
Notes:
1) The first byte of the extension/user data is always presented in the rom_revision register, regardless of the state of continue.
2) There is no event to indicate that the last byte of the extension/user data has been read.
A.14.7.1 Generation of the FIELD_INFO token
In MPEG operation, if the register field_info is set to 1, the first byte of any extra_information_picture is placed in a FIELD_INFO token. This behavior is not covered by the MPEG standardization activity. Table A.3.2 shows the definition of the FIELD_INFO token.
If field_info is set to 1, no parser event is generated for the first byte of extra_information_picture. However, events are generated for any subsequent bytes of extra_information_picture. If extra_information_picture contains only a single byte, no parser event is generated.
A.14.8 Changes at the MPEG sequence layer
The MPEG sequence header describes the following characteristics of the video to be decoded:
Horizontal and vertical size
Pixel aspect ratio
Picture rate
Coded data rate
Video buffer verifier buffer size
If any of these parameters changes when the spatial decoder decodes a sequence header, the parser event ERR_MPEG_SEQUENCE is generated.
A.14.8.1 Change of picture size
If the picture size has changed, user software should read the values of horiz_pels and vert_pels and calculate the new values to be loaded into the registers horiz_macroblocks and vert_macroblocks.
A.15 Spatial decoding
In accordance with the present invention, spatial decoding takes place between the output of the token buffer and the output of the spatial decoder. Three main units are responsible for spatial decoding: the inverse modeller, the inverse quantizer and the inverse discrete cosine transformer. At the input to this section (from the token buffer), DATA tokens contain a run-and-level representation of the quantized coefficients. At the output (of the inverse DCT), DATA tokens contain 8 × 8 blocks of pixel information.
A.15.1 The inverse modeller
DATA tokens in the token buffer contain the values of the quantized coefficients together with the number of zeros (the runs) between the coefficients that are represented. The inverse modeller expands the information about runs of zeros so that each DATA token contains 64 values. At this point, the values in the DATA tokens are quantized coefficients.
The inverse modelling process is the same regardless of the coding standard in use, and requires no configuration.
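The run expansion performed by the inverse modeller can be sketched in software as follows. The representation of a DATA token as a list of (run, level) pairs, the function name expand_runs and the zero padding of trailing positions are assumptions made for illustration; the text above states only that runs of zeros are expanded so that each DATA token holds 64 values.

```python
def expand_runs(run_level_pairs, block_size=64):
    """Expand (run, level) pairs into a flat list of block_size values.

    Each pair is assumed to mean `run` zero coefficients followed by one
    coefficient of value `level`. Trailing positions are filled with zeros.
    """
    values = []
    for run, level in run_level_pairs:
        values.extend([0] * run)  # the run of zero coefficients
        values.append(level)      # the coefficient that ends the run
    if len(values) > block_size:
        raise ValueError("more than %d coefficients in block" % block_size)
    values.extend([0] * (block_size - len(values)))
    return values
```

For instance, the pairs (0, 12), (2, -3), (1, 5) expand to a 64-value list beginning 12, 0, 0, -3, 0, 5 and followed by zeros.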
For a better understanding of the modelling and inverse modelling functions, the reader may consult any of the picture coding standards.
A.15.2 Inverse quantizer
In an encoder, the quantizer divides the output of the DCT to reduce the resolution of the DCT coefficients. In a decoder, the function of the inverse quantizer is to multiply these quantized DCT coefficients to restore them to approximately their original values.
A.15.2.1 Overview of the standard quantization schemes
The quantization schemes used by the different coding standards differ significantly. For a detailed understanding of the quantization scheme used by each standard, the reader should study the relevant coding standard documents.
The register iq_coding_standard configures the operation of the inverse quantizer to meet the requirements of the different standards. In normal operation, this register is loaded automatically by the CODING_STANDARD token. For more information about coding standard configuration, see section A.21.1.
The main difference between the quantization schemes is the source of the numbers by which the quantized coefficients are multiplied. These are summarized below. There are also minor differences in the arithmetic required (rounding, etc.), which are not described here.
A.15.2.1.1 H.261 IQ overview
In H.261, a single "scale factor" is used to scale the coefficients. The encoder can change this scale factor periodically to regulate the data rate produced. Slightly different rules apply to the "DC" coefficient of intra-coded blocks.
A.15.2.1.2 JPEG IQ overview
Baseline JPEG allows a picture to contain up to 4 different color components in each scan. A 64-entry quantization table can be specified for each of these 4 color components. Each entry of these tables is used as the "scale" factor for one of the 64 quantized coefficients.
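The per-coefficient scaling described for JPEG can be illustrated with a short sketch. The function names and the use of Python's built-in rounding are assumptions; the arithmetic details (rounding, saturation) required by each standard are deliberately left out, as they are in the text.

```python
def quantize(coefficients, table):
    """Encoder side: divide each DCT coefficient by its table entry."""
    if len(coefficients) != 64 or len(table) != 64:
        raise ValueError("expected 64 coefficients and 64 table entries")
    # The rounding rule used here is illustrative only.
    return [int(round(c / q)) for c, q in zip(coefficients, table)]


def inverse_quantize(quantized, table):
    """Decoder side: multiply each quantized coefficient by its table entry."""
    if len(quantized) != 64 or len(table) != 64:
        raise ValueError("expected 64 coefficients and 64 table entries")
    return [c * q for c, q in zip(quantized, table)]
```

With a table entry of 16, a coefficient of 100 is quantized to 6 and restored to 96, which shows why the recovered values are only approximately the originals.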
The values of the JPEG quantization tables are contained in the coded JPEG data and are loaded into the quantization tables automatically.
A.15.2.1.3 MPEG IQ overview
MPEG uses quantization techniques from both H.261 and JPEG. Like JPEG, MPEG can use 4 quantization tables, each with 64 entries. However, the way the tables are used is quite different.
Two "types" of data are considered: intra and non-intra. A different table is used for each data type. Two default tables are defined by MPEG: one for intra data and the other for non-intra data (see Table A.15.2 and Table A.15.3). These default tables must be written into the quantization table memory of the spatial decoder before MPEG decoding is possible.
MPEG also allows two "downloaded" quantization tables: one for intra data and the other for non-intra data. The values of these tables are contained in the MPEG data stream and are written into the quantization table memory automatically.
The value taken from the table is modified by a scale factor.
A.15.2.2 Inverse quantizer registers
Table A.15.1 Inverse quantizer registers
Register name   Size/direction   Reset state   Description
iq_access   1   rw   0   This access bit stops the operation of the inverse quantizer so that its various registers can be accessed reliably.
iq_coding_standard   2   rw   0   This register configures the coding standard used by the inverse quantizer. It can be loaded directly or by a CODING_STANDARD token (see A.21.1).
iq_keyhole_address   8   rw   x   Keyhole access to the 4 quantization tables. For more information about accessing registers through a keyhole, see A.5.4.3.
iq_keyhole_data   8   rw   x
In the present invention, the iq_access register must be set before the quantization table memory is accessed. If an attempt is made to read the quantization table memory while iq_access is set to 0, zero values are returned.
A.15.2.3 Configuring the inverse quantizer
In normal operation, there is no need to configure the coding standard of the inverse quantizer, because it is configured automatically by the CODING_STANDARD token.
For H.261 operation, the quantization tables are not used and no special configuration is required. For JPEG operation, the tables required by the inverse quantizer are loaded automatically with information extracted from the coded data.
MPEG operation requires the default quantization tables to be loaded. This should be done while iq_access is set to 1. The values in Table A.15.2 should be written to locations 0x00 to 0x3F of the inverse quantizer's extended address space (accessible through the keyhole registers iq_keyhole_address and iq_keyhole_data). Similarly, the values in Table A.15.3 should be written to locations 0x40 to 0x7F of the inverse quantizer's extended address space.
Table A.15.2 Default MPEG table for intra-coded blocks
  i [a]   Wi,0 [b]     i   Wi,0     i   Wi,0     i   Wi,0
    0      8          16    27     32    29     48    35
    1     16          17    27     33    29     49    38
    2     16          18    26     34    27     50    38
    3     19          19    26     35    27     51    40
    4     16          20    26     36    29     52    40
    5     19          21    26     37    29     53    40
    6     22          22    27     38    32     54    48
    7     22          23    27     39    32     55    48
    8     22          24    27     40    34     56    46
    9     22          25    29     41    34     57    46
   10     22          26    29     42    37     58    56
   11     22          27    29     43    38     59    56
   12     26          28    34     44    37     60    58
   13     24          29    34     45    35     61    69
   14     26          30    34     46    35     62    69
   15     27          31    29     47    34     63    83
a. Offset from the start of the quantization table memory.
b. Quantization table value.
Table A.15.3 Default MPEG table for non-intra-coded blocks
  i   Wi,1     i   Wi,1     i   Wi,1     i   Wi,1
  0    16     16    16     32    16     48    16
  1    16     17    16     33    16     49    16
  2    16     18    16     34    16     50    16
  3    16     19    16     35    16     51    16
  4    16     20    16     36    16     52    16
  5    16     21    16     37    16     53    16
  6    16     22    16     38    16     54    16
  7    16     23    16     39    16     55    16
  8    16     24    16     40    16     56    16
  9    16     25    16     41    16     57    16
 10    16     26    16     42    16     58    16
 11    16     27    16     43    16     59    16
 12    16     28    16     44    16     60    16
 13    16     29    16     45    16     61    16
 14    16     30    16     46    16     62    16
 15    16     31    16     47    16     63    16
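Loading a default table through the keyhole registers, as described in A.15.2.3, might look like the following sketch. The class KeyholeModel is a hypothetical software stand-in for the device. The 256-location memory size and the need to write iq_keyhole_address before every data write are assumptions, since the keyhole protocol itself is defined in A.5.4.3 and not here.

```python
class KeyholeModel:
    """Toy stand-in for keyhole access to the quantization table memory."""

    def __init__(self):
        self.iq_access = 0
        self.iq_keyhole_address = 0
        self.memory = [0] * 256  # assumed size: 4 tables of 64 entries

    def write_data(self, value):
        """Model a write to iq_keyhole_data."""
        if self.iq_access != 1:
            raise RuntimeError("set iq_access to 1 before accessing the tables")
        self.memory[self.iq_keyhole_address] = value & 0xFF


def load_table(dev, base, table):
    """Write a 64-entry table starting at `base` in the extended address space."""
    if len(table) != 64:
        raise ValueError("a quantization table has 64 entries")
    dev.iq_access = 1
    for offset, value in enumerate(table):
        # Assumption: the address is written explicitly before each data write.
        dev.iq_keyhole_address = base + offset
        dev.write_data(value)
    dev.iq_access = 0


DEFAULT_NON_INTRA = [16] * 64  # values of Table A.15.3

dev = KeyholeModel()
load_table(dev, 0x40, DEFAULT_NON_INTRA)  # locations 0x40 to 0x7F
```

The intra table of Table A.15.2 would be loaded in the same way with a base of 0x00.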
A.15.2.4 Configuring tables with tokens
As an alternative to configuring the inverse quantizer tables through the MPI, they can also be initialized by tokens. These tokens can be supplied through either the coded data port or the MPI.
The QUANT_TABLE token is described in Table A.3.2. It has a two-bit field that indicates which of the 4 table locations (0 to 3) is defined by the token. For MPEG operation, the default definitions of tables 0 and 1 need to be loaded.
A.15.2.5 Quantization table values
For both JPEG and MPEG, quantization table entries are 8-bit numbers. Values from 255 to 1 are legal. The value 0 is illegal.
A.15.2.6 Order of values in the quantization tables
Quantization table values are used in "zig-zag" scan order (see the coding standards). The tables should be regarded as a one-dimensional array of 64 values (rather than an 8 × 8 matrix). Table entries at lower addresses correspond to lower-frequency DCT coefficients.
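The zig-zag ordering and the legal range of table entries can be sketched as follows. The helper names are hypothetical. The scan generated here is the conventional 8 × 8 zig-zag used by the coding standards, walking each anti-diagonal in alternating directions.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):  # s = row + col, one anti-diagonal at a time
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)  # even diagonals run from bottom-left upward
        order.extend((r, s - r) for r in rows)
    return order


def table_from_matrix(matrix):
    """Flatten an 8 x 8 matrix of table entries into 64 zig-zag ordered values."""
    table = [matrix[r][c] for r, c in zigzag_order(8)]
    if any(not 1 <= v <= 255 for v in table):
        raise ValueError("quantization table entries must be in the range 1..255")
    return table
```

The first entry of the resulting list is the "DC" position (0, 0) and the last is the highest-frequency position (7, 7).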
When quantization table values are carried by a QUANT_TABLE token, the first value after the token header is the table entry for the "DC" coefficient.
A.15.2.7 Inverse quantizer test registers
Table A.15.4 Inverse quantizer test registers
Register name   Size/direction   Reset state   Description
iq_quant_scale   5   rw   This register holds the current value of the quantization scale factor. It is loaded by the QUANT_SCALE token. It is not used during JPEG operation.
iq_component   2   rw   This register holds the 2-bit component ID taken from the header of the most recent DATA token. This value is involved in the selection of the quantization table. The register also holds the table ID of the table being loaded after a QUANT_TABLE token arrives.
iq_prediction_mode   2   rw   This register holds the 2 least significant bits of the most recent PREDICTION_MODE token.
iq_jpeg_indirection   8   rw   This register relates the 2-bit component ID of a DATA token to the table number of the quantization table that should be used. Bits 1:0 specify the table number to be used by component 0. Bits 3:2 specify the table number to be used by component 1. Bits 5:4 specify the table number to be used by component 2. Bits 7:6 specify the table number to be used by component 3. This register is loaded by the JPEG_TABLE_SELECT token.
iq_mpeg_indirection   2   rw   This two-bit register records whether the default quantization tables or the downloaded tables are used for intra and non-intra data. A 0 in a bit position indicates that the default table should be used; a 1 indicates that the downloaded table should be used. Bit 0 refers to intra data and bit 1 refers to non-intra data. The register is normally loaded by the MPEG_TABLE_SELECT token.
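The bit fields of the two indirection registers in Table A.15.4 can be decoded as in the sketch below. The function names are hypothetical; the bit positions follow the register descriptions in the table.

```python
def jpeg_table_for_component(iq_jpeg_indirection, component_id):
    """Return the 2-bit table number selected for a component ID (0 to 3)."""
    if not 0 <= component_id <= 3:
        raise ValueError("component ID is a 2-bit value")
    # Component n uses bits (2n+1):(2n) of the register.
    return (iq_jpeg_indirection >> (2 * component_id)) & 0x3


def mpeg_uses_downloaded_table(iq_mpeg_indirection, intra):
    """Return True if the downloaded table is selected for this data type."""
    bit = 0 if intra else 1  # bit 0: intra data, bit 1: non-intra data
    return bool((iq_mpeg_indirection >> bit) & 1)
```

For example, a register value of 0xE4 (binary 11 10 01 00) maps components 0, 1, 2 and 3 to tables 0, 1, 2 and 3 respectively.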
A.15.3 Inverse discrete cosine transform
The inverse discrete cosine transform processor of the present invention meets the requirements of CCITT Recommendation H.261 and IEEE specification P1180, and complies with the requirements described in the current draft revision of MPEG.
The inverse discrete cosine transform processor is the same regardless of the coding standard in use. No user configuration is required.
There are two events associated with the inverse discrete cosine transform processor.
Table A.15.5 Inverse DCT event registers
Register name   Size/direction   Reset state   Description
idct_too_few_event   1   rw   0   The inverse DCT requires every DATA token to contain exactly 64 values. If fewer than 64 values are found, the too-few event is generated. If the mask register is set to 1, an interrupt can be generated and the inverse DCT stops. This event should only occur following an error in the coded data.
idct_too_few_mask   1   rw   0
idct_too_many_event   1   rw   0   The inverse DCT requires every DATA token to contain exactly 64 values. If more than 64 values are found, the too-many event is generated. If the mask register is set to 1, an interrupt can be generated and the inverse DCT stops. This event should only occur following an error in the coded data.
idct_too_many_mask   1   rw   0
For a better understanding of the DCT and inverse DCT functions, the reader may consult any of the picture coding standards.
A.16 Connecting to the output of the spatial decoder
The output of the spatial decoder is a standard token port with 9-bit-wide data words. For more information about the electrical characteristics of the interface, see section A.4.
The tokens present at the output depend on the coding standard in use. By way of example, this section considers the output of the spatial decoder when it is configured for JPEG operation. This section also describes the token sequence observed at the output of the temporal decoder during JPEG operation, because the temporal decoder does not alter the token sequence obtained from decoding JPEG.
However, both MPEG and H.261 require the use of the temporal decoder. For information about connecting to the output of the temporal decoder when it is configured for MPEG and H.261 operation, see section A.19.
In addition, this section identifies which tokens are available at the output of the spatial decoder and which are most useful when designing circuits to display that output. Other tokens are also present, but they are not needed for displaying the output and are therefore not discussed here.
This section mainly shows:
How to identify the start and end of a sequence.
How to identify the start and end of a picture.
How to determine when to display a picture.
How to determine where in the display the picture data should be placed.
A.16.1 Structure of JPEG pictures
This section gives an overview of some features of JPEG. For full details, refer to the coding standard.
JPEG provides a variety of mechanisms for encoding single pictures. JPEG makes no attempt to describe how a number of encoded pictures could be put together to provide a mechanism for encoding video.
In accordance with the present invention, the spatial decoder supports the JPEG baseline sequential mode of operation. There are three main levels in the syntax: picture, frame and scan. A sequential picture contains only a single frame. A frame can contain from 1 to 256 different picture (color) components. These picture components can be grouped together in various ways into scans. Each scan can contain from 1 to 4 picture components (see Figure 81, "Overview of JPEG baseline sequential structure").
If a scan contains a single picture component, it is non-interleaved; if it contains more than one picture component, it is an interleaved scan. A frame can contain a mixture of interleaved and non-interleaved scans. The number of scans that a frame can contain is determined by the limit of 256 picture components that a frame can contain.
In an interleaved scan, the data is organized into minimum coded units (MCUs), which are analogous to the macroblocks used in MPEG and H.261. These MCUs are arranged in raster order within a picture. In a non-interleaved scan, the MCU is a single 8 × 8 block. These are likewise arranged in raster order.
The spatial decoder can readily decode JPEG data containing 1 to 4 different color components. Data describing a greater number of components can also be decoded. However, some reconfiguration between scans may be required to accommodate the next group of components to be decoded.
A.16.2 Token sequence
The JPEG marker codes are converted by the start code detector into tokens with names similar to those used for MPEG (see Table A.11.4 and Figure 82, "Tokenized JPEG picture").
A.17 Temporal decoder
30 MHz operation
Provides temporal decoding for MPEG and H.261 video decoders
H.261 CIF and QCIF formats
MPEG video resolutions up to 740 × 480, 30 Hz, 4:2:0
Flexible chroma sampling formats
Can re-order the MPEG picture sequence
Glue-less DRAM interface
Single +5 V supply
208-pin PQFP package
Maximum power dissipation 2.5 W
Uses standard page-mode DRAM
The temporal decoder is a companion chip to the spatial decoder. It provides the temporal decoding required by H.261 and MPEG.
The temporal decoder implements all the prediction-forming features required by MPEG and H.261. With a single 4 Mb DRAM (for example, 512k × 8), the temporal decoder can decode CIF and QCIF H.261 video. With 8 Mb of DRAM (for example, two 256k × 16), 740 × 480, 30 Hz, 4:2:0 MPEG video can be decoded.
The temporal decoder is not required for intra coding schemes (such as JPEG). If the temporal decoder is included in a multi-standard decoder, it passes decoded JPEG pictures through to its output. Note: the above values are merely illustrative of one embodiment of the invention and are not intended to be limiting. It will be appreciated that other values and ranges may be used without departing from the invention.
A.17.1 Temporal decoder signals
Table A.17.1 Temporal decoder signals
Signal name   I/O   Pin number   Description
in_data[8:0]   I   173, 172, 171, 169, 168, 167, 166, 164, 163   Input port. This is a standard two-wire interface, normally connected to the output port of the spatial decoder. See A.4 and A.18.1.
in_extn   I   174
in_valid   I   162
in_accept   O   161
enable[1:0]   I   126, 127   Microprocessor interface (MPI). See A.6.1.
rw   I   125
addr[7:0]   I   137, 136, 135, 133, 132, 131, 130, 128
data[7:0]   O   152, 151, 149, 147, 145, 143, 141, 140
irq   O   154
DRAM_data[31:0]   I/O   15, 17, 19, 20, 22, 25, 27, 30, 31, 33, 35, 38, 39, 42, 44, 47, 49, 57, 59, 61, 63, 66, 68, 70, 72, 74, 76, 79, 81, 83, 84, 85   DRAM interface. See A.5.2.
DRAM_addr[10:0]   O   184, 186, 188, 189, 192, 193, 195, 197, 199, 200, 203
RAS   O   11
CAS[3:0]   O   2, 4, 6, 8
WE   O   12
OE   O   204
DRAM_enable   I   112
out_data[7:0]   O   89, 90, 92, 93, 94, 95, 97, 98   Output port. This is a standard two-wire interface. See A.4 and A.19.
out_extn   O   87
out_valid   O   99
out_accept   I   100
tck   I   115   JTAG port. See A.8.
tdi   I   116
tdo   O   120
tms   I   117
trst   I   121
decoder_clock   I   177   The main decoder clock. See A.7.2.
reset   I   160   Reset.
Table A.17.2 Temporal decoder test signals
Signal name   I/O   Pin number   Description
tph0ish   I   122   If override = 1, tph0ish and tph1ish are the inputs for the on-chip two-phase clock. For normal operation set override = 0; tph0ish and tph1ish are then ignored (so connect them to GND or VDD).
tph1ish   I   123
override   I   110
chiptest   I   111   Set chiptest = 0 for normal operation.
tloop   I   114   Connect to GND or VDD for normal operation.
ramtest   I   109   If ramtest = 1, testing of the on-chip RAMs is enabled. Set ramtest = 0 for normal operation.
pllselect   I   178   If pllselect = 0, the on-chip phase-locked loop is disabled. Set pllselect = 1 for normal operation.
ti   I   180   Two clocks required by the DRAM interface during test operation. Connect to GND or VDD during normal operation.
tq   I   179
pdout   O   207   These two pins are the connections for an external filter for the phase-locked loop.
pdin   I   206
Table A.17.3 Temporal decoder pin assignment
Signal name      Pin   Signal name   Pin   Signal name     Pin   Signal name     Pin
nc               208   nc            156   nc              104   nc               52
test pin         207   nc            155   nc              103   nc               51
test pin         206   irq           154   nc              102   nc               50
GND              205   nc            153   VDD             101   DRAM_data[15]    49
OE               204   data[7]       152   out_accept      100   nc               48
DRAM_addr[0]     203   data[6]       151   out_valid        99   DRAM_data[16]    47
VDD              202   nc            150   out_data[0]      98   nc               46
nc               201   data[5]       149   out_data[1]      97   GND              45
DRAM_addr[1]     200   nc            148   GND              96   DRAM_data[17]    44
DRAM_addr[2]     199   data[4]       147   out_data[2]      95   nc               43
GND              198   GND           146   out_data[3]      94   DRAM_data[18]    42
DRAM_addr[3]     197   data[3]       145   out_data[4]      93   VDD              41
nc               196   nc            144   out_data[5]      92   nc               40
DRAM_addr[4]     195   data[2]       143   VDD              91   DRAM_data[19]    39
VDD              194   nc            142   out_data[6]      90   DRAM_data[20]    38
DRAM_addr[5]     193   data[1]       141   out_data[7]      89   nc               37
DRAM_addr[6]     192   data[0]       140   nc               88   GND              36
nc               191   nc            139   out_extn         87   DRAM_data[21]    35
GND              190   VDD           138   GND              86   nc               34
DRAM_addr[7]     189   addr[7]       137   DRAM_data[0]     85   DRAM_data[22]    33
DRAM_addr[8]     188   addr[6]       136   DRAM_data[1]     84   VDD              32
VDD              187   addr[5]       135   DRAM_data[2]     83   DRAM_data[23]    31
DRAM_addr[9]     186   GND           134   VDD              82   DRAM_data[24]    30
nc               185   addr[4]       133   DRAM_data[3]     81   nc               29
DRAM_addr[10]    184   addr[3]       132   nc               80   GND              28
GND              183   addr[2]       131   DRAM_data[4]     79   DRAM_data[25]    27
nc               182   addr[1]       130   GND              78   nc               26
VDD              181   VDD           129   nc               77   DRAM_data[26]    25
test pin         180   addr[0]       128   DRAM_data[5]     76   nc               24
test pin         179   enable[0]     127   nc               75   VDD              23
test pin         178   enable[1]     126   DRAM_data[6]     74   DRAM_data[27]    22
decoder_clock    177   rw            125   VDD              73   nc               21
nc               176   GND           124   DRAM_data[7]     72   DRAM_data[28]    20
GND              175   test pin      123   nc               71   DRAM_data[29]    19
in_extn          174   test pin      122   DRAM_data[8]     70   GND              18
in_data[8]       173   trst          121   GND              69   DRAM_data[30]    17
in_data[7]       172   tdo           120   DRAM_data[9]     68   nc               16
in_data[6]       171   nc            119   nc               67   DRAM_data[31]    15
VDD              170   VDD           118   DRAM_data[10]    66   VDD              14
in_data[5]       169   tms           117   VDD              65   nc               13
in_data[4]       168   tdi           116   nc               64   WE               12
in_data[3]       167   tck           115   DRAM_data[11]    63   RAS              11
in_data[2]       166   test pin      114   nc               62   nc               10
GND              165   GND           113   DRAM_data[12]    61   GND               9
in_data[1]       164   DRAM_enable   112   GND              60   CAS[0]            8
in_data[0]       163   test pin      111   DRAM_data[13]    59   nc                7
in_valid         162   test pin      110   nc               58   CAS[1]            6
in_accept        161   test pin      109   DRAM_data[14]    57   VDD               5
reset            160   nc            108   VDD              56   CAS[2]            4
VDD              159   nc            107   nc               55   nc                3
nc               158   nc            106   nc               54   CAS[3]            2
nc               157   nc            105   nc               53   nc                1
A.17.1.1 "nc" no-connect pins
The pins labelled nc in Table A.17.3 are not currently used in the present invention and are reserved for future products. These pins should be left unconnected. They should not be connected to VDD, to GND, to each other or to any other signal.
A.17.1.2 VDD and GND pins
As will be appreciated, all of the VDD and GND pins provided must be connected to the appropriate power supply. The device will not operate correctly unless all of the VDD and GND pins are correctly used.
A.17.1.3 Test pin connections for normal operation
Nine pins of the temporal decoder are reserved for internal test use.
Table A.17.4 Default test pin connections
Pin number   Connection
Connect to GND for normal operation
Connect to VDD for normal operation
Leave open circuit for normal operation
A.17.1.4 JTAG pins for normal operation
See section A.8.1.
Table A.17.5 Overview of the temporal decoder memory map
Address   Register name   See table
  0x00...0x01   Interrupt service area   A.17.6
  0x02...0x07   Not used
  0x08   Chip access   A.17.7
  0x09...0x0F   Not used
  0x10   Picture sequencing   A.17.8
  0x11...0x1F   Not used
  0x20...0x2E   DRAM interface configuration registers   A.17.9
  0x2F...0x3F   Not used
  0x40...0x53   Buffer configuration   A.17.10
  0x54...0x5F   Not used
  0x60...0xFF   Test registers   A.17.11
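The memory map of Table A.17.5 lends itself to a simple lookup, sketched below. The names MEMORY_MAP and region_for are hypothetical, and the region names are English renderings of the table entries; the address ranges are those listed in the table.

```python
# (first address, last address, region) taken from Table A.17.5.
MEMORY_MAP = [
    (0x00, 0x01, "Interrupt service area"),
    (0x08, 0x08, "Chip access"),
    (0x10, 0x10, "Picture sequencing"),
    (0x20, 0x2E, "DRAM interface configuration registers"),
    (0x40, 0x53, "Buffer configuration"),
    (0x60, 0xFF, "Test registers"),
]


def region_for(address):
    """Return the name of the memory map region containing an 8-bit address."""
    if not 0 <= address <= 0xFF:
        raise ValueError("the temporal decoder address bus is 8 bits wide")
    for first, last, name in MEMORY_MAP:
        if first <= address <= last:
            return name
    return "Not used"
```

For example, region_for(0x2A) returns "DRAM interface configuration registers", while region_for(0x54) returns "Not used".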
Table A.17.6 Interrupt service area registers
Address   Bit   Register name
  0x00   7   chip_event
         6:2   Not used
         1   chip_stopped_event
         0   count_error_event
  0x01   7   chip_mask
         6:2   Not used
         1   chip_stopped_mask
         0   count_error_mask
Table A.17.7 Chip access register
Address   Bit   Register name
  0x08   7:1   Not used
         0   chip_access
Table A.17.8 Picture sequencing
Address   Bit   Register name
  0x10   7:1   Not used
         0   MPEG_reordering
Table A.17.9 DRAM interface configuration registers
Address   Bit   Register name
  0x20   7:5   Not used
         4:0   page_start_length[4:0]
  0x21   7:4   Not used
         3:0   read_cycle_length[3:0]
  0x22   7:4   Not used
         3:0   write_cycle_length[3:0]
  0x23   7:4   Not used
         3:0   refresh_cycle_length[3:0]
  0x24   7:4   Not used
         3:0   CAS_falling[3:0]
  0x25   7:4   Not used
         3:0   RAS_falling[3:0]
  0x26   7:1   Not used
         0   interface_timing_access
  0x27   7:0   Not used
  0x28   7:6   RAS_strength[2:0]
         5:3   OEWE_strength[3:0]
         2:0   DRAM_data_strength[3:0]
  0x29   7   Not used
         6:4   DRAM_addr_strength[3:0]
         3:1   CAS_strength[3:0]
         0   RAS_strength[3]
Table A.17.9 DRAM interface configuration registers (continued)
Address   Bit   Register name
  0x28   7   Not used
         6:4   DRAM_addr_strength[3:0]
         3:1   CAS_strength[3:0]
         0   RAS_strength[3]
  0x29   7:6   RAS_strength[2:0]
         5:3   OEWE_strength[3:0]
         2:0   DRAM_data_strength[3:0]
  0x2A   7:0   refresh_interval
  0x2B   7:0   Not used
  0x2C   7:6   Not used
         5   DRAM_enable
         4   no_refresh
         3:2   row_address_bits[1:0]
         1:0   DRAM_data_width[1:0]
  0x2D   7:0   Not used
  0x2E   7:0   Test registers
Table A.17.10 Buffer configuration registers
Address  Bit   Register name
0x40     7:0   Not used
0x41     7:2
         1:0   picture_buffer_0[17:0]
0x42     7:0
0x43     7:0
0x44     7:0   Not used
0x45     7:2
         1:0   picture_buffer_1[17:0]
0x46     7:0
0x47     7:0
Table A.17.10 Buffer configuration registers (continued)
Address  Bit   Register name
0x48     7:0   Not used
0x49     7:1
         0     component_offset_0[16:0]
0x4A     7:0
0x4B     7:0
0x4C     7:0   Not used
0x4D     7:1
         0     component_offset_1[16:0]
0x4E     7:0
0x4F     7:0
0x50     7:0   Not used
0x51     7:1
         0     component_offset_2[16:0]
0x52     7:0
0x53     7:0
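The multi-byte entries in Table A.17.10 can be illustrated with a short sketch of how an 18-bit picture buffer pointer or a 17-bit component offset might be split into per-address bytes. This is only a sketch under an assumption: the table layout suggests that the most significant bits sit at the lowest address (for example, bits 17:16 of picture_buffer_0 in bits 1:0 of 0x41), but the byte order is not stated explicitly. The helper name split_register_value is hypothetical.

```python
def split_register_value(base_address, value, width_bits):
    """Split a multi-byte register value into (address, byte) pairs.

    Assumption: the most significant byte is stored at the lowest address,
    as the layout of Table A.17.10 suggests.
    """
    if not 0 <= value < (1 << width_bits):
        raise ValueError("value does not fit in %d bits" % width_bits)
    n_bytes = (width_bits + 7) // 8
    pairs = []
    for i in range(n_bytes):
        shift = 8 * (n_bytes - 1 - i)          # most significant byte first
        pairs.append((base_address + i, (value >> shift) & 0xFF))
    return pairs


# picture_buffer_0 occupies 0x41 (bits 1:0), 0x42 and 0x43.
picture_buffer_0_writes = split_register_value(0x41, 0x2ABCD, 18)

# component_offset_0 occupies 0x49 (bit 0), 0x4A and 0x4B.
component_offset_0_writes = split_register_value(0x49, 0x1FFFF, 17)
```

Each pair is one 8-bit write over the microprocessor interface; the unused upper bits of the first byte are simply written as zero.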
Table A.17.11 Scratchpad registers
Address  Bit   Register name
0x2E     7:4   PLL register
         3:0
0x60     7:6   Not used
         5:4   coding_standard[1:0]
         3:2   picture_type[1:0]
         1     H261_filt
         0     H261_s_f
0x61     7:6   component_id
         5:4   prediction_mode
         3:0   max_sampling
0x62     7:0   samp_h
0x63     7:0   samp_v
Table A.17.11 Scratchpad registers (continued)
Address  Bit   Register name
0x64     7:0   back_h
0x65     7:0
0x66     7:0   back_v
0x67     7:0
0x68     7:0   forw_h
0x69     7:0
0x6A     7:0   forw_v
0x6B     7:0
0x6C     7:0   width_in_mb
0x6D     7:0
A.18 Temporal Decoder operation
A.18.1 Data input
The input data port of the Temporal Decoder is a standard Token Port with 8-bit-wide data words. In most applications it will be connected directly to the output Token Port of the Spatial Decoder. For more information about the electrical characteristics of this interface, see section A.4.
A.18.2 Automatic configuration
Parameters relating to the pixel format of the coded video are loaded automatically into registers within the Temporal Decoder by tokens generated by the Spatial Decoder.
Table A.18.1 Configuration of the Temporal Decoder by tokens
Token             Configuration performed
CODING_STANDARD   The coding standard of the Temporal Decoder is configured automatically by the CODING_STANDARD token. This token is generated by the Spatial Decoder each time a new sequence is started (see Figure 58).
DEFINE_SAMPLING   The horizontal and vertical chroma sampling information for each color component is configured automatically by DEFINE_SAMPLING tokens.
HORIZONTAL_MBS    The horizontal width of the picture in macroblocks is configured automatically by the HORIZONTAL_MBS token.
A.18.3 Manual configuration
The user must configure (via the MPI) those parameters that are application dependent.
A.18.3.1 When to configure
The Temporal Decoder should only be configured when it is not processing data. This is its default state after reset is removed. The Temporal Decoder can be stopped for reconfiguration by writing 1 to the chip_access register. After configuration is complete, 0 should be written to chip_access.
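The stop, configure, restart sequence described above can be sketched as follows. This is a minimal illustration rather than a driver: FakeMPI is a hypothetical stand-in for the microprocessor interface that merely records writes, and reconfigure is an assumed helper name. The register addresses are taken from Tables A.17.7 and A.17.8.

```python
CHIP_ACCESS = 0x08       # chip_access, Table A.17.7
MPEG_REORDERING = 0x10   # MPEG_reordering, Table A.17.8


class FakeMPI:
    """Hypothetical stand-in for the MPI; records every register write."""

    def __init__(self):
        self.writes = []

    def write(self, address, value):
        self.writes.append((address, value & 0xFF))


def reconfigure(mpi, settings):
    """Request a stop, apply (address, value) settings, then restart."""
    mpi.write(CHIP_ACCESS, 1)        # request that the decoder stop
    # A real driver would wait here until the decoder has actually stopped
    # before touching any configuration register.
    for address, value in settings:
        mpi.write(address, value)
    mpi.write(CHIP_ACCESS, 0)        # configuration complete, resume


mpi = FakeMPI()
reconfigure(mpi, [(MPEG_REORDERING, 1)])
```

Wrapping the sequence in one function keeps the two chip_access writes paired, so a configuration write cannot be issued outside the stopped window by accident.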
For details of when the DRAM interface should be configured, see section A.5.3.
A.18.3.2 DRAM interface
The DRAM interface timing must be configured before predictively coded video (for example H.261 or MPEG) can be decoded. See section A.5, "DRAM interface".
Table A.18.2 Temporal Decoder registers
Register name        Size  Dir  Reset state  Description
chip_access          1     rw   1            Writing 1 to chip_access requests that the Temporal Decoder halt operation to allow reconfiguration. The Temporal Decoder continues operating normally until it reaches the end of the current video sequence. After reset, chip_access = 1, i.e. the Temporal Decoder is halted. When the chip halts, a chip stopped event is generated. If chip_stopped_mask = 1, an interrupt is generated.
chip_stopped_event   1     rw   0
chip_stopped_mask    1     rw   0
count_error_event    1     rw   0            The Temporal Decoder has an adder that adds predictions to error data. If the number of error data bytes differs from the number of prediction data bytes, a count error event is generated. If count_error_mask = 1, an interrupt is generated and prediction forming stops. This event should only arise following a hardware error.
count_error_mask     1     rw   0
picture_buffer_0     18    rw   x            These specify the base addresses of the picture buffers.
picture_buffer_1     18    rw   x
component_offset_0   17    rw   x            These specify the offsets from the picture buffer pointer at which each color component is stored within the buffer. Data for the component with ID = n is stored starting at the position indicated by component_offset_n. See section A.3.5.1, "Component identification number".
component_offset_1   17    rw   x
component_offset_2   17    rw   x
Table A.18.2 Temporal Decoder registers (continued)
Register name        Size  Dir  Reset state  Description
MPEG_reordering      1     rw   0            Setting this register to 1 makes the Temporal Decoder change the picture order from the non-causal MPEG picture sequence to the correct display order. See section A.18.3.5. This register should be ignored during JPEG and H.261 operation.
A.18.3.3 Numbers in the picture buffer registers
The picture buffer pointer (18-bit) and component offset (17-bit) registers specify a block (8 × 8 bytes) address, not a byte address.
A.18.3.4 Picture buffer allocation
To decode predictively coded video (H.261 or MPEG), the Temporal Decoder must manage two picture buffers. For more information about how these buffers are used, see sections A.18.4 and A.18.4.4.
The user must ensure that there is sufficient memory above each picture buffer pointer (picture_buffer_0 and picture_buffer_1) to store a single picture of the required video format without overlapping the other picture buffer. Normally, one of the picture buffer pointers will be set to 0 (i.e. the bottom of memory) and the other will be set to point to the middle of the memory space.
A.18.3.4.1 Normal configuration for MPEG or H.261
H.261 and MPEG both use a 4:1:1 ratio between the different color components (that is, there are four times as many luminance pixels as there are pixels in each chrominance component).
As described in section A.3.5.1, "Component identification number", component 0 is the luminance component and components 1 and 2 are chrominance.
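Given the 4:1:1 ratio and the component numbering above, the share of a picture buffer occupied by each component can be worked out directly. The following sketch assumes that the three components are stored contiguously in the order 0, 1, 2 and that the picture dimensions are multiples of 16; the function name and the 704 × 480 example size are illustrative. All results are block addresses (units of 8 × 8 = 64 bytes), matching the units of the component offset registers.

```python
def component_layout(width, height):
    """Return (offsets, total_blocks) for one picture, in 8x8-block units.

    Assumes contiguous storage of components 0, 1, 2 and a 4:1:1 ratio
    (each chrominance component has a quarter of the luminance pixels).
    """
    luma_blocks = (width // 8) * (height // 8)
    chroma_blocks = luma_blocks // 4
    offsets = (0, luma_blocks, luma_blocks + chroma_blocks)
    total_blocks = luma_blocks + 2 * chroma_blocks
    return offsets, total_blocks


offsets, total_blocks = component_layout(704, 480)
# offsets      -> (0, 5280, 6600)
# total_blocks -> 7920
bytes_per_picture = total_blocks * 64   # block address to byte count
```

With this layout the luminance component takes 4/6 of the buffer and each chrominance component takes 1/6, so component 1 starts 4/6 of the way into the buffer and component 2 starts 5/6 of the way in.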
One example configuration of the component offset registers is to set component_offset_0 to 0 so that component 0 starts at the picture buffer pointer. Similarly, component_offset_1 could be set to 4/6 of the picture buffer size and component_offset_2 to 5/6 of the picture buffer size.
A.18.3.5 Picture sequence reordering
MPEG uses three different picture types: intra (I), predicted (P) and bidirectionally interpolated (B). B pictures are based on predictions from two pictures: one from the future and one from the past. The picture order is modified at the encoder so that I and P pictures can be decoded from the coded data before they are needed to decode B pictures.
The picture sequence must be corrected before these pictures can be displayed. The Temporal Decoder can provide this picture reordering (by setting the register MPEG_reordering = 1). Alternatively, the user may wish to implement the picture reordering as part of the display interface function. Configuring the Temporal Decoder to provide picture reordering may reduce the video resolution that can be decoded; see section A.18.5.
A.18.4 Prediction forming
The prediction forming requirements of H.261 decoding and MPEG decoding are quite different. The CODING_STANDARD token automatically configures the Temporal Decoder to accommodate the prediction requirements of the different standards.
A.18.4.1 JPEG operation
When configured for JPEG operation, no predictions are formed, since JPEG requires no temporal decoding.
A.18.4.2 H.261 operation
In H.261, predictions are formed only from the picture just decoded. Motion vectors are specified only to integer pixel accuracy. The encoder can indicate that a low-pass filter is to be applied to the result of any prediction.
As each picture is decoded, it is written to a picture buffer in the off-chip DRAM so that it can be used in decoding the next picture. Decoded pictures appear at the output of the Temporal Decoder as they are written to the off-chip DRAM.
For details of prediction and the arithmetic operations involved, the reader is referred to the H.261 standard. The Temporal Decoder of the present invention is fully compliant with the requirements of H.261.
A.18.4.3 MPEG operation (without reordering)
The operation of the Temporal Decoder differs for each of the three MPEG picture types (I, P and B).
I pictures require no further decoding by the Temporal Decoder, but they must be stored in a picture buffer (frame store) for later use in decoding P and B pictures.
Decoding P pictures requires forming predictions from a previously decoded P or I picture. The decoded P picture is stored in a picture buffer for use in decoding P and B pictures. MPEG allows motion vectors to be specified to half-pixel accuracy. On-chip filters provide interpolation to support this half-pixel accuracy.
B pictures can require predictions from both picture buffers. As with P pictures, half-pixel motion vector resolution requires on-chip interpolation of the picture information. B pictures are not stored in the off-chip buffers; they are merely transient.
All pictures appear at the output of the Temporal Decoder as they are decoded. The picture sequence will therefore be the same as in the coded MPEG data (see the upper part of Figure 85).
For details of prediction and the arithmetic operations involved, the reader is referred to the proposed MPEG standard draft. These requirements are met by the Temporal Decoder of the present invention.
A.18.4.4 MPEG operation (with reordering)
When configured for MPEG operation with picture reordering (MPEG_reordering = 1), predictions are formed as described in section A.18.4.3. However, additional data transfers are performed to reorder the picture sequence.
B pictures are decoded as described in section A.18.4.3. However, I and P pictures are not output as they are decoded. Instead, they are written to the off-chip buffers (as previously described) and are only read out when a subsequent I or P picture arrives for decoding.
A.18.4.4.1 Decoder start-up characteristics
The output of the first I picture is delayed until the following P (or I) picture starts to be decoded. This should be taken into account when estimating the start-up characteristics of a video decoder.
A.18.4.4.2 Decoder shut-down characteristics
The Temporal Decoder relies on a subsequent P or I picture to flush the previous picture out of the off-chip buffer (frame store). This has consequences at the end of a video sequence and when starting a new video sequence. The Spatial Decoder provides a facility to generate a "fake" P/I picture at the end of a video sequence to flush out the last P (or I) picture. However, this "fake" picture will itself be flushed out when a subsequent video sequence starts.
The Spatial Decoder provides an option to suppress this "fake" picture. This is useful where it is known that a new video sequence will be supplied to the decoder immediately after the old sequence ends. The first picture of the new sequence will then flush out the last picture of the previous sequence.
A.18.5 Video resolution
The video resolution that the Temporal Decoder can support when decoding MPEG is limited by the memory bandwidth of its DRAM interface. For MPEG, two cases need to be considered: with and without MPEG picture reordering.
Sections A.18.5.2 and A.18.5.3 discuss the worst-case requirements of the current draft of the MPEG specification. Subsets of MPEG can be expected to have lower memory bandwidth requirements. For example, using only integer-resolution motion vectors or, alternatively, not using B pictures reduces the memory bandwidth requirements significantly. Such subsets are not analyzed here.
A.18.5.1 Characteristics of the DRAM interface
The number of cycles taken to transfer data across the DRAM interface depends on a number of factors:
The timing configuration of the DRAM interface to suit the DRAM used
The data bus width (8, 16 or 32 bits)
The type of data transfer:
  8 × 8 block read or write
  Prediction to half-pixel accuracy
  Prediction to integer pixel accuracy
For details of DRAM interface configuration, see section A.5, "DRAM interface".
Table A.18.3 shows how many DRAM interface "cycles" are required for each type of data transfer.
Table A.18.3 Data transfer times for the Temporal Decoder
Data bus width (bits)  Read or write 8 × 8           Form prediction (half-pixel accuracy)  Form prediction (integer pixel accuracy)
8                      1 page address + 64 transfers  4 page address + 81 transfers          4 page address + 64 transfers
16                     1 page address + 32 transfers  4 page address + 45 transfers          4 page address + 40 transfers
32                     1 page address + 16 transfers  4 page address + 27 transfers          4 page address + 24 transfers
Table A.18.4 takes the figures in Table A.18.3 and evaluates them for a "typical" DRAM. In this example a 27 MHz clock is assumed. It should be understood that although 27 MHz is used here, it is not intended as a limitation. The start of an access takes 11 ticks (102 ns) and a data transfer takes 6 ticks (56 ns).
A.18.5.2 MPEG resolution without reordering
The peak memory bandwidth load occurs when decoding B pictures. In the worst case, a B frame is formed from predictions from both picture buffers, with all predictions made to half-pixel accuracy.
Table A.18.4 Figures for a "typical" DRAM
Data bus width (bits)  Read or write 8 × 8  Form prediction (half-pixel accuracy)  Form prediction (integer pixel accuracy)
8                      3657 ns              4907 ns                                3963 ns
16                     1880 ns              2907 ns                                2185 ns
32                     991 ns               1907 ns                                1741 ns
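The relationship between Table A.18.3 and Table A.18.4 can be checked with a few lines of arithmetic. The sketch below rests on one assumption that is inferred rather than stated: a tick is taken to be one quarter of the 27 MHz clock period (about 9.26 ns), because that is the value that reproduces the quoted 102 ns for 11 ticks and 56 ns for 6 ticks. The function and constant names are illustrative.

```python
CLOCK_HZ = 27e6
TICKS_PER_CLOCK = 4                         # assumption, inferred from 11 ticks = 102 ns
TICK_NS = 1e9 / (CLOCK_HZ * TICKS_PER_CLOCK)

PAGE_START_TICKS = 11                       # start of an access
TRANSFER_TICKS = 6                          # one data transfer


def transfer_time_ns(page_addresses, transfers):
    """Time for an operation of the form 'N page address + M transfers'."""
    ticks = page_addresses * PAGE_START_TICKS + transfers * TRANSFER_TICKS
    return ticks * TICK_NS


# Entries from Table A.18.3, evaluated for the 32-bit bus.
read_write_32 = round(transfer_time_ns(1, 16))   # 991
half_pixel_32 = round(transfer_time_ns(4, 27))   # 1907
integer_32 = round(transfer_time_ns(4, 24))      # 1741
```

Under this assumption every entry of Table A.18.4 is reproduced except the 16-bit integer-accuracy prediction: 4 page addresses plus 40 transfers evaluates to about 2630 ns, whereas the tabulated 2185 ns corresponds to 4 page addresses plus 32 transfers. The source does not indicate which of the two tables is in error.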
Using the example figures in Table A.18.4, it can be seen that reading the data required for two half-pixel-accurate predictions (over a 32-bit-wide interface) takes the DRAM interface 3815 ns. The resolution that the Temporal Decoder can support is determined by the number of these predictions that can be performed within one picture time. In this example, the Temporal Decoder can process 8737 8 × 8 blocks in a single 33 ms picture period (e.g. for 30 Hz video).
If the required video format is 704 × 480, each picture contains 7920 8 × 8 blocks (taking into account the 4:2:0 chroma sampling). It can be seen that this video format consumes about 91% of the available DRAM interface bandwidth (before other factors such as DRAM refresh are taken into account). Accordingly, the Temporal Decoder can support this video format.
A.18.5.3 MPEG resolution with reordering
When MPEG picture reordering is employed, the worst case is encountered while decoding P pictures. During this time there are three loads on the DRAM interface:
Forming predictions
Writing back the result
Reading out the previous P or I picture
Using the example figures from Table A.18.3, we can find the time that each of these tasks takes when a 32-bit-wide interface is available. Forming the prediction takes 1907 ns, while the read and the write each take 991 ns, a total of 3889 ns. This allows the Temporal Decoder to process 8485 8 × 8 blocks in a 33 ms period.
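The block budget for the reordering case can be reproduced with a short calculation. The sketch below uses the 32-bit figures quoted above (1907 ns to form a prediction, 991 ns each for the read and the write) and a 33 ms picture period; the function names are illustrative, and the 704 × 480 format with 4:2:0 chroma sampling is the example format used in section A.18.5.2.

```python
PREDICTION_NS = 1907      # half-pixel prediction, 32-bit bus
READ_WRITE_NS = 991       # 8 x 8 block read or write, 32-bit bus
PICTURE_PERIOD_NS = 33_000_000


def blocks_per_period(period_ns, ns_per_block):
    """Whole 8x8 blocks that fit in one picture period."""
    return period_ns // ns_per_block


def blocks_in_picture(width, height):
    """8x8 blocks in a 4:2:0 picture: luminance plus two quarter-size chroma planes."""
    luma = (width // 8) * (height // 8)
    return luma + luma // 2


# Prediction + write-back + read-out of the previous P/I picture.
cost_ns = PREDICTION_NS + 2 * READ_WRITE_NS               # 3889 ns per block
capacity = blocks_per_period(PICTURE_PERIOD_NS, cost_ns)  # 8485 blocks
needed = blocks_in_picture(704, 480)                      # 7920 blocks
utilisation = needed / capacity                           # about 0.93
```

The same two helpers cover the case without reordering by substituting the 3815 ns cost of two half-pixel predictions.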
Therefore, processing 704 × 480 video will use approximately 93% of the available memory bandwidth (ignoring refresh).
A.18.5.4 H.261
H.261 supports only two pixel formats, CIF (352 × 288) and QCIF (176 × 144), at a picture rate of 30 Hz. A CIF picture contains 2376 8 × 8 blocks. The only memory operations required are writing 8 × 8 blocks and forming predictions with integer-accuracy motion vectors.
Using the example figures in Table A.18.4 for an 8-bit-wide memory interface, it can be seen that writing each block takes 3657 ns and forming the prediction for a block takes 3963 ns, so each block takes 7620 ns. The processing time for a single CIF picture is therefore about 18 ms, much less than the 33 ms required to support 30 Hz video.
A.18.5.5 JPEG
The JPEG "video" resolution that can be supported is determined by the capabilities of the Spatial Decoder of the present invention or of the display interface. The Temporal Decoder does not affect JPEG resolution.
A.18.6 Events and errors
A.18.6.1 Chip stopped
In the present invention, writing 1 to chip_access requests that the Temporal Decoder halt operation to allow reconfiguration. Once the request has been received, the Temporal Decoder continues operating normally until it reaches the end of the current video sequence. The Temporal Decoder is then halted.
When the chip halts, a chip stopped event is generated. If chip_stopped_mask = 1, an interrupt is generated.
A.18.6.2 Count error
The Temporal Decoder of the present invention contains an adder that adds predictions to error data. If the number of error data bytes differs from the number of prediction data bytes, a count error event is generated.
If count_error_mask = 1, an interrupt is generated and prediction forming stops.
Writing 1 to count_error_event clears the event and allows the Temporal Decoder to proceed. The DATA token that caused the error will then continue through the pipeline. However, that DATA token will not be of the correct length (64 bytes), which is likely to cause further problems. A count error should therefore only arise when a significant hardware error has occurred.
A.19 Connecting to the output of the Temporal Decoder
The output of the Temporal Decoder is a standard Token Port with 8-bit-wide data words. For further information about the electrical characteristics of the interface, see section A.4.
The tokens present at the output of the Temporal Decoder depend on the coding standard employed and, in the case of MPEG, on whether the pictures are being reordered. This section identifies which tokens are available at the output of the Temporal Decoder and which are most useful when designing circuits to display that output. Other tokens will be present but are not needed to display the output and are therefore not discussed here.
This section concentrates on showing:
How the start and end of a sequence can be identified
How the start and end of a picture can be identified
How to identify when to display a picture
How to identify where in the display the picture data should be placed
A.19.1 JPEG output
When decoding JPEG data, the token sequence output by the Temporal Decoder is the same as that seen at the output of the Spatial Decoder. Recall that JPEG requires no processing by the Temporal Decoder. However, the Temporal Decoder tests intra data tokens for negative values (resulting from the finite arithmetic precision of the IDCT in the Spatial Decoder), which are replaced by zero.
For further discussion of the output sequence observed during JPEG operation, see section A.16.
A.19.2 H.261 output
A.19.2.1 Start and end of a session
H.261 video data contains no start or end signal in the video stream; these are implied by the application. For example, the sequence starts when the telecommunications link is established and ends when the line is dropped. The highest layer in the video syntax is therefore the "picture layer".
According to the present invention, the start code detector of the Spatial Decoder allows SEQUENCE_START and CODING_STANDARD tokens to be inserted automatically before the first PICTURE_START. See sections A.11.7.3 and A.11.7.4.
At the end of an H.261 session (i.e. when the line is dropped), the user should insert a FLUSH token after the end of the coded data. This has a number of effects (see appendix A.31.1):
It ensures that PICTURE_END is generated to indicate the end of the last picture.
It ensures that the end of the coded data is pushed through the decoder.
A.19.2.2 Acquiring pictures
Each picture is composed of a hierarchy of elements referred to as layers in the syntax. The token sequence at the output of the Temporal Decoder reflects this structure when decoding H.261.
A.19.2.2.1 Picture layer
Each picture is preceded by a PICTURE_START token and followed immediately by a PICTURE_END token. H.261 does not itself contain a picture end; this token is inserted automatically by the start code detector of the Spatial Decoder.
After the PICTURE_START token there will be TEMPORAL_REFERENCE and PICTURE_TYPE tokens. The TEMPORAL_REFERENCE token carries a 10-bit number, of which only the 5 least significant bits (LSBs) are used in H.261, that indicates when the picture should be displayed. Any display system should take this into account, because an H.261 encoder can omit pictures from the sequence (to achieve lower data rates). An omitted picture can be detected because the temporal reference then increases by more than one between successive pictures.
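The omission test described above amounts to a modular difference between successive temporal references. The sketch below is one way a display system might perform it, assuming the value wraps modulo 32 because only the 5 least significant bits are used in H.261; the function name is hypothetical.

```python
H261_TEMPORAL_REFERENCE_BITS = 5


def omitted_pictures(previous_reference, current_reference,
                     bits=H261_TEMPORAL_REFERENCE_BITS):
    """Number of pictures the encoder omitted between two decoded pictures.

    Only the low `bits` bits of each TEMPORAL_REFERENCE value are compared,
    and the difference is taken modulo 2**bits to allow for wrap-around.
    """
    modulus = 1 << bits
    step = (current_reference - previous_reference) % modulus
    return step - 1


gap_none = omitted_pictures(3, 4)     # consecutive pictures, nothing omitted
gap_two = omitted_pictures(3, 6)      # pictures 4 and 5 were omitted
gap_wrap = omitted_pictures(31, 1)    # wrap-around: picture 0 was omitted
```

A display system would typically repeat the previously displayed picture for each omitted picture period.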
Next, the PICTURE_TYPE token carries information about the pixel format. A display system may examine this information to determine whether a CIF or QCIF picture is being decoded. However, information about the pixel format is also available by examining registers in the Huffman decoder (see the Huffman decoder section).
A.19.2.2.2 Group of blocks layer
Each H.261 picture is composed of a number of "groups of blocks". Each of these is preceded by a SLICE_START token (derived from the H.261 group number and group start code). This token carries an 8-bit value indicating where in the display the group of blocks should be placed. This gives the decoder an opportunity to resynchronize after data errors. It also gives the encoder a mechanism for skipping blocks when no additional information is needed to describe an area of the picture. By the time SLICE_START reaches the output of the Temporal Decoder, this information is effectively redundant, since the Spatial Decoder and the Temporal Decoder have already used it to ensure that each picture contains the correct number of blocks and that they are in the correct positions. It should therefore be possible to compute where to place a block of data output by the Temporal Decoder using only a count of the number of blocks output since the start of the picture.
The number carried by SLICE_START is one less than the H.261 group of blocks number (for more information, see the H.261 standard). Figure 94 shows the positioning of H.261 groups of blocks within CIF and QCIF pictures. Note: in the present invention, the block numbers shown are the same as the numbers carried by SLICE_START; this differs from the H.261 convention for numbering these groups.
Between SLICE_START (which indicates the start of each group of blocks) and the first macroblock there may be other tokens. These can be ignored, since they are not needed to display the picture data.
A.19.2.2.3 Macroblock layer
The sequence of macroblocks within each group of blocks is defined by H.261. No special token carries information describing the position of each macroblock. The user should count through the macroblock sequence to determine where to display each piece of information.
Figure 96 shows the order in which macroblocks are positioned within each group of blocks.
Each macroblock contains 6 DATA tokens. The order of the DATA tokens within each group of 6 is defined by the H.261 macroblock structure. Each DATA token should contain exactly 64 data bytes for an 8 × 8 pixel area of a single color component. The color component is carried as a 2-bit number in the DATA token (see section A.3.5.1). However, the order of the color components is defined by H.261.
Each group of DATA tokens is preceded by a number of tokens communicating information about motion vectors, quantizer scale factors and so on. These tokens are not needed to display the picture and can therefore be ignored.
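The counting approach described above can be sketched for a CIF picture. The figures referred to in this section are not reproduced here, so the sketch relies on the arrangement defined in the H.261 recommendation, which is assumed to match them: 12 groups of blocks arranged two across and six down, each containing 33 macroblocks in 3 rows of 11, transmitted in order, with every macroblock present at the decoder output. The group index used is the zero-based value, as carried by SLICE_START. The function name is hypothetical.

```python
DATA_TOKENS_PER_MACROBLOCK = 6
MACROBLOCKS_PER_GOB = 33        # 3 rows of 11 macroblocks (H.261)
MACROBLOCKS_PER_GOB_ROW = 11
GOB_ROWS_OF_MACROBLOCKS = 3
GOBS_ACROSS_CIF = 2             # CIF: 2 groups across, 6 down
MACROBLOCK_SIZE = 16            # luminance pixels per macroblock side


def locate_data_token_cif(token_count):
    """Map a running DATA-token count (0-based, since PICTURE_START) to
    (gob_index, x, y, block_in_macroblock), where (x, y) is the top-left
    luminance pixel of the macroblock containing the token."""
    macroblock_index, block_in_mb = divmod(token_count, DATA_TOKENS_PER_MACROBLOCK)
    gob_index, mb_in_gob = divmod(macroblock_index, MACROBLOCKS_PER_GOB)
    gob_row, gob_col = divmod(gob_index, GOBS_ACROSS_CIF)
    mb_row, mb_col = divmod(mb_in_gob, MACROBLOCKS_PER_GOB_ROW)
    x = (gob_col * MACROBLOCKS_PER_GOB_ROW + mb_col) * MACROBLOCK_SIZE
    y = (gob_row * GOB_ROWS_OF_MACROBLOCKS + mb_row) * MACROBLOCK_SIZE
    return gob_index, x, y, block_in_mb


first = locate_data_token_cif(0)
last = locate_data_token_cif(12 * 33 * 6 - 1)   # final token of a CIF picture
```

The position of the individual 8 × 8 block within the macroblock then follows from block_in_macroblock together with the color component number carried in the DATA token itself.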
Each DATA token contains 64 data bytes for an 8 × 8 area of a single color component, stored in raster order.
A.19.3 MPEG output
MPEG has more layers in its syntax. These include the video sequence and the group of pictures.
A.19.3.1 MPEG sequence layer
A sequence can have multiple entry points (sequence starts) but should have only a single exit point (sequence end). When an MPEG sequence header code is decoded, the Spatial Decoder generates a CODING_STANDARD token followed by a SEQUENCE_START token.
After SEQUENCE_START there will be a number of tokens carrying sequence header information that describes the video format and so on. For the information signaled in the sequence header, see the MPEG standard draft; for how this data is converted into tokens, see Table A.3.2. The information describing the video format is also available in registers in the Huffman decoder.
If an MPEG sequence has several entry points, this sequence header information can occur several times within the sequence.
A.19.3.2 Group of pictures layer
An MPEG group of pictures (GOP) provides a different type of "entry point" from that provided at a sequence start. The sequence header provides information about the picture/video format. Accordingly, if a decoder has no knowledge of the video format used in a sequence, it must start at a sequence start. However, once the video format has been configured into the decoder, it should be possible to start decoding at any group of pictures.
MPEG does not limit the number of pictures in a group. However, in many applications a group will correspond to about 0.5 seconds, as this provides a reasonable granularity of random access.
The start of a group of pictures is indicated by a GROUP_START token. The header information supplied after GROUP_START includes two useful tokens: TIME_CODE and BROKEN_CLOSED.
TIME_CODE carries a subset of the SMPTE time code information. This is useful in synchronizing the video decoder with other signals. BROKEN_CLOSED carries the MPEG closed_gop and broken_link bits. For more information about the implications of random access and decoding edited video sequences, see section A.19.3.8.
A.19.3.3 Picture layer
The start of a new picture is indicated by the PICTURE_START token. After this token there will be TEMPORAL_REFERENCE and PICTURE_TYPE tokens. The temporal reference information may be useful if the Temporal Decoder is not configured to provide picture reordering. The picture type information may be useful if a display system wishes to treat B pictures at the start of an open GOP specially (see section A.19.3.8).
Each picture is composed of a number of slices.
A.19.3.4 Slice layer
Section A.19.2.2.2 discusses the groups of blocks used in H.261. Slices in MPEG serve a similar function. However, the slice structure is not fixed by the standard. The 8-bit value carried by the SLICE_START token is one less than the "slice vertical position" communicated by MPEG. For a description of the slice layer, see the MPEG standard draft.
By the time SLICE_START reaches the output of the Temporal Decoder, this information is effectively redundant, since the Spatial Decoder and the Temporal Decoder have already used it to ensure that each picture contains the correct number of blocks in the correct positions. It should therefore be possible to compute where to place a block of data output by the Temporal Decoder using only a count of the number of blocks output since the start of the picture.
For a discussion of the effects of using MPEG picture reordering, see section A.19.3.7.
A.19.3.5 Macroblock layer
Each macroblock contains 6 blocks. These appear at the output of the Temporal Decoder in raster order (as specified by the draft MPEG specification).
A.19.3.6 Block layer
Each macroblock contains 6 DATA tokens. The order of the DATA tokens within each group of 6 is defined by the draft MPEG specification (this is the same as the H.261 macroblock structure). Each DATA token should contain exactly 64 data bytes for an 8 × 8 pixel area of a single color component. The color component is carried as a 2-bit number in the DATA token (see section A.3.5.1). However, the order of the color components is defined by MPEG.
Each group of DATA tokens is preceded by a number of tokens communicating information about motion vectors, quantizer scale factors and so on. These tokens are not needed to display the picture and can therefore be ignored.
A.19.3.7 Effects of MPEG picture reordering
As described in section A.18.3.5, the Temporal Decoder can be configured to provide MPEG picture reordering (MPEG_reordering = 1). The output of P and I pictures is delayed until the Temporal Decoder starts decoding the next P/I picture in the data stream. At the output of the Temporal Decoder, the DATA tokens of the newly decoded P/I picture are replaced with the DATA tokens from the older P/I picture.
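The effect of this delay on picture order can be modeled in a few lines. The following is a behavioral sketch only, not a description of the hardware: pictures are represented as strings such as "P3" (type followed by display number), and the function name is hypothetical. Each I or P picture is held back until the next I or P picture arrives, while B pictures pass straight through.

```python
def reorder_pictures(coded_order):
    """Model MPEG picture reordering.

    Returns (display_order, held_picture). The held picture is the last
    I/P picture, which stays in the buffer until a later I/P picture
    (or a "fake" picture at the end of the sequence) flushes it out.
    """
    display_order = []
    held = None
    for picture in coded_order:
        if picture[0] in ("I", "P"):
            if held is not None:
                display_order.append(held)   # older I/P picture is released
            held = picture                   # new I/P picture is delayed
        else:
            display_order.append(picture)    # B pictures are output as decoded
    return display_order, held


coded = ["I0", "P3", "B1", "B2", "P6", "B4", "B5"]
shown, pending = reorder_pictures(coded)
# shown   -> ["I0", "B1", "B2", "P3", "B4", "B5"]
# pending -> "P6"
```

The pending picture illustrates the shut-down behavior: the last I or P picture of a sequence is not output until something follows it in the data stream.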
When resetting the P/I image, outside image is written into chip during image buffer, visual PICTURE_START, TEMPORAL_REFERENCE and PICTURE_TYPE token temporarily are stored on the chip. When image was read out to show, the token of these storages was resumed. Correspondingly, the P/I image of rearrangement has correct PICTURE_START, TEMPORAL_ REFERENCE and PICTURE_TYPE value.
All other tokens below the image layer are not rearranged. When the P/I image after resetting was read out to show, it picked up the non-DATA token of low level of the image that just has been rearranged. Thereby the token of these image subsection layers should be left in the basket. A.19.3.8 arbitrary access and editor's sequence
Spatial decoder provides equipment to help carrying out correct video decode for the video data of the MPEG video data behind the editor and formation after arbitrary access. A.19.3.8.1 open GOP (Open Gops)
One group of picture (GOP) can be with the beginning of category-B image, and this B image is to be got by the prediction of the image of the P class among the previous GOP, and this is called " Open GOP ". Figure 107 is illustrated it. Figure 17 and 18 is B images that second GOP begins. If GOP is open to the outside world, encoder may use from the prediction ofP image 16 and Iimage 19 this two images is encoded so. As selection, encoder also can be limited to the prediction of only using fromI image 19. In this case, second GOP is one " closing GOP ".
If a decoder just begins video decode at first GOP, when running into second GOP, it will not have any problem, even GOP is open. TheP image 16 because it has been decoded. Yet if decoder has carried out an arbitrary access and begun decoding at second GOP, its can not decode B17 and B18 is if these two images are (that is to say, if GOP is open) that rely on P16.
If first GOP that spatial decoder of the present invention runs into after once resetting is open GOP, perhaps it receives a FLUSH token, and it will suppose an arbitrary access to opening GOP has occured. In this case, the Huffman decoder will be take general fashion as B the image usage data. Yet it will export the B image of predicting with (0,0) motion vector outside the I image. The result will be that visual B17 and B18 (going up in the example) will be identical with I19.
This behavior ensures that the MPEG VBV rules are obeyed. It also ensures that the B pictures are present in the output, at the positions in the output stream that other data channels expect. For example, the MPEG system layer provides presentation time information that relates audio data to video data. A video presentation time stamp refers to the first displayed picture in a GOP, that is, the picture with a temporal reference of 0. In the example above, the first displayed picture after a random access to the second GOP is B17.
The BROKEN_CLOSED token carries the MPEG closed_gop bit. At the output of the temporal decoder it is therefore possible to determine whether the B pictures output are genuine or are "substitutes" introduced by the spatial decoder. Some applications may wish to take special measures when these substitute pictures occur.
A.19.3.8.2 Edited video
If an application edits an MPEG video sequence, it may break the link between two GOPs. If the GOP that follows the edit is an open GOP, the B pictures at the start of that GOP can no longer be decoded correctly. The application that edits the MPEG data can set the broken_link bit in the GOP after the edit to tell the decoder that it cannot decode these B pictures.
If the spatial decoder encounters a GOP with a broken link, the Huffman decoder consumes the data for the B pictures in the normal way. However, it outputs B pictures that are predicted from the I picture with (0,0) motion vectors. As a result, pictures B17 and B18 (in the example above) are identical to I19.
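The two cases described above (random access into an open GOP, and a GOP with a broken link) can be summarized as a single decision. The following is a minimal sketch only: the structure `gop_info`, its field names and the function `leading_b_are_substitutes` are hypothetical names introduced for illustration and are not taken from the hardware.

```c
#include <stdbool.h>

/* Hypothetical summary of the GOP state relevant to the decision. */
typedef struct {
    bool closed_gop;        /* MPEG closed_gop bit from the GOP header      */
    bool broken_link;       /* MPEG broken_link bit from the GOP header     */
    bool first_after_entry; /* first GOP seen after a reset or FLUSH token  */
} gop_info;

/* Returns true when the leading B pictures of this GOP would be output as
 * substitutes, i.e. predicted from the I picture with (0,0) motion vectors. */
bool leading_b_are_substitutes(const gop_info *g)
{
    if (g->broken_link)
        return true;                               /* edited stream */
    return g->first_after_entry && !g->closed_gop; /* random access, open GOP */
}
```

A downstream stage that sees the BROKEN_CLOSED token could apply the same test to decide whether the B pictures it receives are genuine.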
The BROKEN_CLOSED token carries the MPEG broken_link bit. At the output of the temporal decoder it is therefore possible to determine whether the B pictures output are genuine or are "substitutes" introduced by the spatial decoder. Some applications may wish to take special measures when these substitute pictures occur.
A.20 Late Write DRAM Interface
The interface is configurable in two ways:
The detailed timing of the interface can be configured to suit a variety of DRAM types.
The "width" of the DRAM interface can be configured to give a good cost/performance trade-off.
Table A.20.1 DRAM interface signals
Signal name | I/O | Description
DRAM_data[31:0] | I/O | The 32-bit wide DRAM data bus. Optionally, this bus can be configured to be 16 or 8 bits wide.
DRAM_addr[10:0] | O | The 22-bit wide DRAM interface address is time-multiplexed over this 11-bit wide bus.
RAS | O | The DRAM row address strobe signal.
CAS[3:0] | O | The DRAM column address strobe signals, one for each byte of the interface data bus. All CAS signals are driven simultaneously.
WE | O | The DRAM write enable signal.
OE | O | The DRAM output enable signal.
DRAM_enable | I | When this input signal is low, all the output signals on the interface go high impedance and the DRAM interface is disabled.
Table A.20.2 DRAM interface configuration registers
Register name | Size/direction | Reset state | Description
modify_DRAM_timing | 1 bit, rw | 0 | This register gives access to the DRAM interface timing configuration registers. While it holds 0, the configuration registers should not be modified. Writing 1 to this register requests access to modify the configuration registers. After 0 is written to this register, the DRAM interface starts to use the new values in the configuration registers.
page_start_length | 5 bit, rw | 0 | Specifies the length of the access start in ticks. The minimum value that can be used is 4 (meaning 4 ticks). 0 selects the maximum length of 32 ticks.
read_cycle_length | 4 bit, rw | 0 | Specifies the length of the fast page read cycle in ticks. The minimum value that can be used is 4 (meaning 4 ticks). 0 selects the maximum length of 16 ticks.
write_cycle_length | 4 bit, rw | 0 | Specifies the length of the fast page late write cycle in ticks. The minimum value that can be used is 4 (meaning 4 ticks). 0 selects the maximum length of 16 ticks.
refresh_cycle_length | 4 bit, rw | 0 | Specifies the length of the refresh cycle in ticks. The minimum value that can be used is 4 (meaning 4 ticks). 0 selects the maximum length of 16 ticks.
RAS_falling | 4 bit, rw | 0 | Specifies the number of ticks after the start of the access start at which RAS falls. The minimum value that can be used is 4 (meaning 4 ticks). 0 selects the maximum length of 16 ticks.
CAS_falling | 4 bit, rw | 8 | Specifies the number of ticks after the start of a read cycle, a write cycle or an access start at which CAS falls. The minimum value that can be used is 1 (meaning 1 tick). 0 selects the maximum length of 16 ticks.
DRAM_data_width | 2 bit, rw | 0 | Specifies the number of bits used on the DRAM interface data bus DRAM_data[31:0]. See A.20.4.
row_address_bits | 2 bit, rw | 0 | Specifies the number of bits used for the row address portion of the DRAM interface address bus. See A.20.5.
DRAM_enable | 1 bit, rw | 1 | Writing the value 0 to this register forces the DRAM interface into a high-impedance state. 0 is read back from this register if either the DRAM_enable signal is low or 0 has been written to the register.
refresh_interval | 8 bit, rw | 0 | Specifies the interval between refresh cycles in units of 16 decoder_clock cycles. Values in the range 1 to 255 can be configured. The value 0 is loaded automatically after reset and forces the DRAM interface to execute refresh cycles continuously until a valid refresh interval is configured. refresh_interval should be configured only once after each reset.
no_refresh | 1 bit, rw | 0 | Writing the value 1 to this register prevents any refresh cycles from being executed.
CAS_strength, RAS_strength, addr_strength, DRAM_data_strength, OEWE_strength | 3 bit, rw | 6 | These 3-bit registers configure the output drive strength of the DRAM interface signals. This allows the interface to be configured for various loads. See A.20.8.
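The timing registers in Table A.20.2 share one encoding rule: a non-zero value is the tick count, and 0 selects the maximum for that register. A minimal sketch of that rule follows. The function names are hypothetical, and the assumption that the maximum equals 2 to the power of the register width is read off the table (32 ticks for the 5-bit register, 16 ticks for the 4-bit registers).

```c
/* Decode a tick-count timing register: a non-zero value is the tick count
 * itself; 0 selects the maximum for that register width (2^reg_bits). */
unsigned timing_reg_to_ticks(unsigned value, unsigned reg_bits)
{
    unsigned max_ticks = 1u << reg_bits;  /* 16 for 4-bit, 32 for 5-bit */
    value &= max_ticks - 1u;              /* keep only the register's bits */
    return value == 0 ? max_ticks : value;
}

/* Returns 1 if the decoded value respects the minimum given in the table
 * (4 ticks for most registers, 1 tick for CAS_falling), otherwise 0. */
int timing_reg_is_valid(unsigned value, unsigned reg_bits, unsigned min_ticks)
{
    return timing_reg_to_ticks(value, reg_bits) >= min_ticks;
}
```

For example, a page_start_length of 0 decodes to 32 ticks, while a read_cycle_length of 3 would be rejected because it is below the 4-tick minimum.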
A.20.1 Interface timing (ticks)
In the present invention, the DRAM interface timing is derived from a clock (decoder_clock) that runs at four times the input clock rate of the device. This clock is generated by an on-chip PLL.
For brevity, periods of this high-speed clock are referred to as ticks.
A.20.2 Interface operation
The interface uses DRAM fast page mode. Three different types of access are supported:
Read
Write
Refresh
Each read or write access transfers a burst of between 1 and 64 bytes at a single DRAM page address. Read and write transfers are not mixed within a single access. Each successive access is treated as a random access to a new DRAM page.
A.20.3 Access structure
Each access is composed of two parts:
Access start
Data transfer
Each access begins with an access start, which is followed by one or more data transfer cycles. There is a read, a write and a refresh variant of both the access start and the data transfer cycle.
At the end of the last data transfer of an access, the interface enters its default state and remains there until a new access is ready to begin. If a new access is ready to begin when the last access finishes, the new access begins immediately.
A.20.3.1 Access start
The access start provides the page address for the read or write transfers and establishes some initial signal conditions. There are three different access starts:
Read start
Write start
Refresh start
In each case, the timing of RAS and the row address is controlled by the RAS_falling and page_start_length registers. The state of OE and DRAM_data[31:0] is held from the end of the previous data transfer until RAS falls. The three access start types differ only in how they drive OE and DRAM_data[31:0] while RAS is low. See Figure 109.
Table A.20.3 Access start parameters
Number | Characteristic | Min | Max | Unit | Note
38 | RAS precharge period, set by register RAS_falling | 4 | 16 | tick |
39 | Access start duration, set by register page_start_length | 4 | 32 | tick |
40 | CAS precharge period, set by register CAS_falling | 1 | 16 | tick | a
41 | Fast page read cycle length, set by register read_cycle_length | 4 | 16 | tick |
42 | Fast page write cycle length, set by register write_cycle_length | 4 | 16 | tick |
43 | WE falls one tick after CAS | | | |
44 | Refresh cycle length, set by register refresh_cycle_length | 4 | 16 | tick |
a. This value must be less than RAS_falling to ensure that a CAS-before-RAS refresh occurs.
A.20.3.2 Data transfer
There are three different types of data transfer cycle:
Fast page read cycle
Fast page late write cycle
Refresh cycle
A refresh start is followed by exactly one refresh cycle. A read (or write) start can be followed by one or more fast page read (or write) cycles.
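Because an access is an access start followed by a whole number of data transfer cycles, its duration in ticks follows directly from the configured lengths. The sketch below illustrates the arithmetic. Both function names are hypothetical, and it assumes that each fast page cycle moves exactly one bus-width word, which the text implies but does not state outright.

```c
/* Ticks taken by one access: an access start followed by n_cycles data
 * transfer cycles. All arguments are tick counts, already decoded. */
unsigned access_ticks(unsigned page_start_ticks, unsigned cycle_ticks,
                      unsigned n_cycles)
{
    return page_start_ticks + cycle_ticks * n_cycles;
}

/* Number of fast page cycles needed to move `bytes` (1..64) over a bus that
 * is `bus_bits` wide (8, 16 or 32), assuming one bus word per cycle. */
unsigned cycles_for_burst(unsigned bytes, unsigned bus_bits)
{
    unsigned bytes_per_cycle = bus_bits / 8u;
    return (bytes + bytes_per_cycle - 1u) / bytes_per_cycle; /* round up */
}
```

Under these assumptions a 64-byte burst on a 32-bit bus needs 16 cycles, so the fastest configuration (4-tick access start, 4-tick cycles) takes 68 ticks.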
At the start of the read cycle, CAS is driven high and the new column address is driven.
A late write cycle is used. WE is driven low one tick after CAS. The output data is driven one tick after the address.
Because the refresh start has already initiated a CAS-before-RAS refresh, no interface signal activity occurs during the refresh cycle. The purpose of the refresh cycle is to meet the minimum RAS low period required by the DRAM.
A.20.3.3 Interface default state
At the end of an access, the interface signals enter a default state:
RAS, CAS and WE high
Data and OE remain in their previous state
addr remains stable
A.20.4 Data bus width
The two-bit register DRAM_data_width allows the width of the DRAM interface's data path to be configured. This allows the DRAM cost to be minimized when working with small picture formats.
Table A.20.4 Configuring DRAM_data_width
DRAM_data_width | Data bus
0 (a) | 8-bit wide data bus on DRAM_data[31:24]
1 | 16-bit wide data bus on DRAM_data[31:16]
2 | 32-bit wide data bus on DRAM_data[31:0]
a. Default after reset.
b. Unused signals are held high impedance.
A.20.5 Address bits
A 24-bit address is generated on-chip. How this address is used to form the row and column addresses depends on the width of the data bus and the number of bits selected for the row address. Some configurations do not allow all the internal address bits to be used (and so produce "hidden bits").
The row address is extracted from the middle portion of the address. This maximizes the rate at which the DRAM is naturally refreshed.
A.20.5.1 Low order column address bits
The least significant 4 to 6 bits of the column address are used to provide addresses for fast page mode transfers of up to 64 bytes. The number of address bits required to control these transfers depends on the width of the data bus (see A.20.4).
A.20.5.2 Row address bits
The number of bits taken from the middle portion of the 24-bit internal address to provide the row address is configured by the register row_address_bits.
Table A.20.5 Configuring row_address_bits
row_address_bits | Width of row address
0 | 9 bits
1 | 10 bits
2 | 11 bits
The width of row address used depends on the type of DRAM used and on whether the most significant bits of the row address are decoded off-chip to access multiple banks of DRAM.
Note: The row address is extracted from the middle of the internal address. If some bits of the row address are decoded to select banks of DRAM, then all possible values of these "bank select bits" must select a bank of DRAM. Otherwise, holes will be left in the address space.
Table A.20.6 Selecting a value for row_address_bits
row_address_bits | Row address bits | Bank select | DRAM depth
0 | DRAM_addr[8:0] | (none) | 256k
1 | DRAM_addr[8:0] | DRAM_addr[9] | 256k
1 | DRAM_addr[9:0] | (none) | 512k
1 | DRAM_addr[9:0] | (none) | 1024k
2 | DRAM_addr[8:0] | DRAM_addr[10:9] | 256k
2 | DRAM_addr[9:0] | DRAM_addr[10] | 512k
2 | DRAM_addr[9:0] | DRAM_addr[10] | 1024k
2 | DRAM_addr[10:0] | (none) | 2048k
2 | DRAM_addr[10:0] | (none) | 4096k
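The register encodings in Tables A.20.4 and A.20.5, together with the 4 to 6 low-order column bits mentioned in A.20.5.1, can be captured in a few lookup helpers. This is a sketch under stated assumptions: the function names are hypothetical, and the column-bit count is derived here from the 64-byte maximum burst divided by the bus width rather than taken from a table in the text.

```c
/* Data bus width in bits for a DRAM_data_width value (Table A.20.4).
 * Returns 0 for a value the table does not list. */
unsigned data_bus_bits(unsigned dram_data_width)
{
    switch (dram_data_width) {
    case 0:  return 8;
    case 1:  return 16;
    case 2:  return 32;
    default: return 0;
    }
}

/* Row address width in bits for a row_address_bits value (Table A.20.5).
 * Returns 0 for a value the table does not list. */
unsigned row_address_width(unsigned row_address_bits)
{
    return row_address_bits <= 2u ? 9u + row_address_bits : 0u;
}

/* Low-order column bits needed to step through a 64-byte fast page burst.
 * bus_bits must be 8, 16 or 32; the result is 6, 5 or 4 respectively. */
unsigned burst_column_bits(unsigned bus_bits)
{
    unsigned transfers = 64u / (bus_bits / 8u);
    unsigned bits = 0;
    while ((1u << bits) < transfers)
        bits++;
    return bits;
}
```

The result of `burst_column_bits` matches the range quoted in A.20.5.1: 6 bits for the 8-bit bus down to 4 bits for the 32-bit bus.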
A.20.6 DRAM interface enable
There are two ways to make all the output signals on the DRAM interface go high impedance: the DRAM_enable register and the DRAM_enable signal. Both the register and the signal must be at logic 1 for the DRAM interface to operate. If either is low, the interface goes high impedance and data transfers through the interface are halted.
The ability to make the DRAM interface high impedance is provided so that other devices can test or use the DRAM controlled by the spatial decoder (or the temporal decoder) when the spatial decoder (or the temporal decoder) is not in use. It is not intended that other devices share the memory during normal operation.
A.20.7 Refresh
Unless disabled by writing to the register no_refresh, the DRAM interface automatically refreshes the DRAM using a CAS-before-RAS refresh cycle at an interval determined by the register refresh_interval.
The value in refresh_interval specifies the interval between refresh cycles in periods of 16 decoder_clock cycles. Values in the range 1 to 255 can be configured. The value 0 is loaded automatically after reset and forces the DRAM interface to execute refresh cycles continuously (once enabled) until a valid refresh interval is configured. It is recommended that refresh_interval be configured only once after each reset.
A.20.8 Signal strengths
The drive strength of the DRAM interface outputs can be configured by the user using the 3-bit registers CAS_strength, RAS_strength, addr_strength, DRAM_data_strength and OEWE_strength. The most significant bit of each 3-bit value selects a fast or a slow edge rate. The two less significant bits configure the output for different load capacitances.
The default strength after reset is 6, which configures the outputs to take approximately 10 ns to drive a signal between GND and VDD when loaded with 24 pF (see Table A.20.7).
Table A.20.7 Output strength configurations
Strength value | Drive characteristics
0 | Approximately 4 ns/V into a 6 pF load
1 | Approximately 4 ns/V into a 12 pF load
2 | Approximately 4 ns/V into a 24 pF load
3 | Approximately 4 ns/V into a 48 pF load
4 | Approximately 2 ns/V into a 6 pF load
5 | Approximately 2 ns/V into a 12 pF load
6 (a) | Approximately 2 ns/V into a 24 pF load
7 | Approximately 2 ns/V into a 48 pF load
a. Default after reset.
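The rows of Table A.20.7 follow the bit layout described above: the top bit selects the edge rate and the two low bits select the load capacitance. A minimal decoding sketch follows; the type `drive_strength` and the function `decode_strength` are hypothetical names, not part of the device.

```c
/* Decoded view of a 3-bit *_strength register value. */
typedef struct {
    int fast_edge;     /* 1: about 2 ns/V, 0: about 4 ns/V            */
    unsigned load_pf;  /* load capacitance the setting is matched to  */
} drive_strength;

drive_strength decode_strength(unsigned value)
{
    static const unsigned load_pf[4] = { 6, 12, 24, 48 };
    drive_strength d;
    d.fast_edge = (int)((value >> 2) & 1u); /* most significant bit */
    d.load_pf = load_pf[value & 3u];        /* two low bits         */
    return d;
}
```

Decoding the reset default of 6 gives the fast edge rate matched to a 24 pF load, which agrees with the footnoted row of the table.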
When an output is configured appropriately for the load it is driving, it meets the AC electrical characteristics specified in Tables A.20.11 and A.20.12. When appropriately configured, each output is approximately matched to its load, so minimal overshoot occurs after a signal transition.
A.20.9 After reset
After reset, the DRAM interface configuration registers are all reset to their default values. The most significant of these default configurations are:
The DRAM interface is disabled and allowed to go high impedance.
The refresh interval is configured to the special value 0, which means that refresh cycles are executed continuously once the interface has been re-enabled.
The DRAM interface is set to its slowest configuration.
Most DRAMs require a "pause" of between 100 μs and 500 μs after power is first applied, followed by a number of refresh cycles, before normal operation is possible.
Immediately after reset, the DRAM interface is inactive until both the DRAM_enable register and the DRAM_enable signal are set. Once they are set, the DRAM interface executes refresh cycles (approximately every 400 ns, depending on the clock frequency used) until the DRAM interface is configured.
The user is responsible for ensuring that the DRAM's "pause" after power-up is observed and that enough time elapses after enabling the DRAM interface for the required number of refresh cycles to occur before data transfers are attempted.
While reset is asserted, the DRAM interface cannot refresh the DRAM. However, the reset time required by the decoder chips is sufficiently short that it should be possible to reset them and re-enable the DRAM interface before the contents of the DRAM decay. This may be required during debugging.
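The refresh timing after reset can be estimated with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes that a back-to-back refresh access consists of one access start plus one refresh cycle, as A.20.3 describes, with nothing else in between.

```c
/* Interval between refresh cycles, in decoder_clock ticks, for a non-zero
 * refresh_interval value (units of 16 decoder_clock cycles). */
unsigned long refresh_interval_ticks(unsigned refresh_interval)
{
    return 16ul * (refresh_interval & 0xFFu);
}

/* Ticks per refresh access when refreshes run back to back, assuming one
 * access start followed by one refresh cycle. */
unsigned continuous_refresh_ticks(unsigned page_start_ticks,
                                  unsigned refresh_cycle_ticks)
{
    return page_start_ticks + refresh_cycle_ticks;
}

/* Time in nanoseconds needed for `count` back-to-back refresh accesses. */
double refresh_time_ns(unsigned count, unsigned ticks_per_refresh,
                       double tick_ns)
{
    return (double)count * ticks_per_refresh * tick_ns;
}
```

With the slowest configuration (32-tick access start, 16-tick refresh cycle) each refresh takes 48 ticks under these assumptions. For example, a tick of roughly 8.3 ns would give about 400 ns per refresh; the actual figure depends on the clock frequency used.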
Table A.20.8 Maximum ratings
Symbol | Parameter | Min | Max | Unit
VDD | Supply voltage relative to GND | -0.5 | 6.5 | V
VIN | Input voltage on any pin | GND-0.5 | VDD+0.5 | V
TA | Operating temperature | -40 | +85 | °C
TS | Storage temperature | -55 | +150 | °C
Table A.20.9 DC operating conditions
Symbol | Parameter | Min | Max | Unit
VDD | Supply voltage relative to GND | 4.75 | 5.25 | V
GND | Ground | 0 | 0 | V
VIH | Input logic 1 voltage | 2.0 | VDD+0.5 | V
VIL | Input logic 0 voltage | GND-0.5 | 0.8 | V
TA | Operating temperature | 0 | 70 | °C (a)
a. With TBA linear feet per minute of transverse airflow.
Table A.20.10 DC electrical characteristics
Symbol | Parameter | Min | Max | Unit
VOL | Output logic 0 voltage | | 0.4 | V (a)
VOH | Output logic 1 voltage | 2.8 | | V
IO | Output current | | ±100 | μA (b)
IOZ | Output off-state leakage current | | ±20 | μA
ILZ | Input leakage current | | ±10 | μA
IDD | RMS power supply current | | 500 | mA
CIN | Input capacitance | | 5 | pF
COUT | Output / IO capacitance | | 5 | pF
a. AC parameters are specified using VOLmax = 0.8 V as the measurement level.
b. This is the steady-state drive capability of the interface. Transient currents may be much greater.
A.20.10.1 AC characteristics
Table A.20.11 Differences from nominal values for a strobe
Number | Parameter | Min | Max | Unit | Note
45 | Cycle time, e.g. tPC | -2 | +2 | ns |
46 | Cycle time, e.g. tRC | -2 | +2 | ns |
47 | High pulse, e.g. tRP, tCP, tCPN | -5 | +2 | ns |
48 | Low pulse, e.g. tRAS, tCAS, tCAC, tWP, tRASP, tRASC | -11 | +2 | ns |
49 | Cycle time, e.g. tACP/tCPA | -8 | +2 | ns |
a. The drive strength of the signal must be configured appropriately for its load.
Table A.20.12 Differences from nominal values between two strobes
Number | Parameter | Min | Max | Unit | Note
50 | Strobe-to-strobe delay, e.g. tRCD, tCSR | -3 | +3 | ns |
51 | Low hold time, e.g. tRSH, tCSH, tRWL, tCWL, tRAC, tOAC/OE, tCHR | -13 | +3 | ns |
52 | Strobe-to-strobe precharge, e.g. tCRP, tRCS, tRCH, tRRH, tRPC | -9 | +3 | ns |
 | CAS precharge pulse between any two CAS signals in the DRAM, e.g. tCP, or precharge between RAS rising and CAS falling, e.g. tRPC | -5 | +2 | ns |
53 | Precharge before disable, e.g. tRHCP/CPRH | -12 | +3 | ns |
a. The drive strength of the two signals must be configured appropriately for their loads.
B.1 Start Code Detector
B.1.1 Overview
As shown in Figure 11, the Start Code Detector (SCD) is the first block in the spatial decoder. Its main purpose is to detect MPEG, JPEG and H.261 start codes in the input data stream and to replace them with the appropriate tokens. It also allows the user to access the input data stream through the MPI, and it performs pre-formatting and "tidying up" of the token data stream. Recall that the Start Code Detector can accept either raw byte data or data that has already been assembled into token format.
Typical start codes for MPEG, H.261 and JPEG are 24, 16 and 8 bits wide respectively. The Start Code Detector reads the input data a byte at a time, whether the data comes from the MPI (upi) or from the token/byte port, and shifts it through three shift registers. The first register is an 8-bit parallel-in, serial-out register. The second register has a programmable length of 16 or 24 bits and is where start codes are detected. The third register is 15 bits wide and is used to reformat the data into 15-bit tokens. Two "tag" shift registers (SR) run in parallel with the second and third shift registers. They hold tags that indicate whether the associated bit in the data register is good. Incoming bytes that are not part of a DATA token and that the Start Code Detector does not recognize are allowed to bypass the shift registers; they are output once all three shift registers have been flushed and their contents have been output successfully. Recognized non-data tokens are used to configure the Start Code Detector, trigger traps or set flags. They also bypass the shift registers and are output unchanged.
B.1.2 Major blocks
The hardware of the Start Code Detector consists of 10 state machines.
B.1.2.1 Input circuit (scdipc.sch.iplm.M)
The input circuit has three modes of operation: token, byte and MPI. These modes allow data to be input in any of the following forms: as a raw byte stream (still using the two-wire interface), as a token stream, or by the user through the MPI. In all cases the input circuit generates DATA token headers where appropriate, so it always outputs correct DATA tokens. Transitions into and out of MPI mode are synchronized to the system clock. The MPI can be forced to wait until a safe point in the data stream is reached before it gains access. The byte mode pin determines whether the input circuit is in token mode or byte mode. In addition, in any of the three modes the system can be told which coding standard is about to be decoded (so that a CODING_STANDARD token can be generated).
B.1.2.2 Token decoder (scdipnew.sch, scdipnem.M)
This block decodes the incoming tokens and issues commands to the other blocks.
Table B.1.1 Recognized input tokens
Input token | Command issued | Comments
NULL | WAIT | NULLs are removed
DATA | NORMAL | Load the next byte into the first shift register
CODING_STANDARD | BYPASS | Flush the shift registers with padding, output, and switch to bypass mode. The CODING_STANDARD register is loaded
FLUSH | BYPASS | Flush the shift registers with padding, output, and switch to bypass mode
Other (unrecognized tokens) | BYPASS | Flush the shift registers with padding, output, and switch to bypass mode
Note: After the shift registers have been flushed, a change in coding standard is passed to all blocks through the two-wire interface. This ensures that the change from one data stream to the next happens at the correct point throughout the Start Code Detector. This principle applies throughout this description, so a change in coding standard can flow through the whole chip ahead of the new stream.
B.1.2.3 JPEG (scdjpeg.sch, scdjpegm.M)
Start codes (markers) in JPEG are sufficiently different that JPEG has a state machine entirely of its own. In this invention, this state machine handles all of the JPEG marker detection, length counting and checking, and removal of data. Detected JPEG markers are flagged as start codes (with v_not_t, see below), and the command from scdipnew is overridden and forced to bypass. The mechanism is best described in code.
switch (state)
{
case (LOOKING):
    if (input == 0xff)
    {
        state = GETVALUE;        /* Found a marker */
        remove;                  /* Marker gets removed */
    }
    else
        state = LOOKING;
    break;
case (GETVALUE):
    if (input == 0xff)
    {
        state = GETVALUE;        /* Overlapping markers */
        remove;
    }
    else if (input == 0x00)
    {
        state = LOOKING;         /* Wasn't a marker */
        insert(0xff);            /* Put the 0xff back */
    }
    else
    {
        command = BYPASS;        /* Override command */
        if (lc)                  /* Does the marker have a length count? */
            state = GETLC0;
        else
            state = LOOKING;
    }
    break;
case (GETLC0):
    loadlc0;                     /* Load the top length count byte */
    state = GETLC1;
    remove;
    break;
case (GETLC1):
    loadlc1;                     /* Load the bottom length count byte */
    remove;
    state = DECLC;
    break;
case (DECLC):
    lcnt = lcnt - 2;             /* The count includes its own two bytes */
    state = CHECKLC;
    break;
case (CHECKLC):
    if (lcnt == 0)
        state = LOOKING;         /* No more to do */
    else if (lcnt < 0)
        state = LOOKING;         /* Generate Illegal_Length_Error */
    else
        state = COUNT;
    break;
case (COUNT):
    lcnt = lcnt - 1;             /* Decrement length count until 1 */
    if (lcnt <= 1)
        state = LOOKING;
    break;
}
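The marker-detection part of the listing above can be exercised in software. The following is a simplified, runnable sketch, not the hardware design: it covers only the LOOKING and GETVALUE states, leaves out the length-count states, and the function name `jpeg_scan` and its buffer-based interface are hypothetical. As in the listing, a 0xFF followed by 0x00 is treated as data and the 0xFF is put back; the 0x00 is passed through unchanged.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum { LOOKING, GETVALUE } scan_state;

/* Scan `in` for JPEG markers. Marker codes (the byte after 0xFF) go to
 * `markers`; all other bytes are copied to `data`, which must hold at least
 * n bytes. Returns the number of markers found. */
size_t jpeg_scan(const uint8_t *in, size_t n,
                 uint8_t *markers, size_t max_markers,
                 uint8_t *data, size_t *data_len)
{
    scan_state state = LOOKING;
    size_t found = 0, out = 0;

    for (size_t i = 0; i < n; i++) {
        uint8_t b = in[i];
        switch (state) {
        case LOOKING:
            if (b == 0xFF)
                state = GETVALUE;     /* possible marker: hold the 0xFF back */
            else
                data[out++] = b;
            break;
        case GETVALUE:
            if (b == 0xFF) {
                state = GETVALUE;     /* overlapping 0xFF: keep waiting */
            } else if (b == 0x00) {
                data[out++] = 0xFF;   /* was not a marker: put the 0xFF back */
                data[out++] = b;
                state = LOOKING;
            } else {
                if (found < max_markers)
                    markers[found] = b;
                found++;
                state = LOOKING;
            }
            break;
        }
    }
    *data_len = out;
    return found;
}
```

A 0xFF that is still pending when the input ends is dropped by this sketch; a streaming implementation would carry the state over to the next buffer.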
B.1.2.4 Input shifter (scinshft.sch, scinshm.M)
The basic function of this block is very simple. It takes a byte of data from the input circuit, loads it into its shift register and shifts it out. However, it also obeys the commands from the input decoder and handles the transitions into and out of bypass mode (flushing the other shift registers). When a BYPASS command is received, the byte is not loaded into the shift register. Instead, "rubbish" (tag = 1) is shifted out, forcing any data held in the other shift registers to be output. The block then waits for the "flushed" signal, which indicates that this rubbish has reached the token reconstructor. Input bytes are then passed directly to the token reconstructor.
B.1.2.5 Start code detector (scdetect.sch, scdetm.M)
This block contains two shift registers, start code detection logic and "valid contents" detection logic. The two shift registers have a programmable length of 16 or 24 bits. MPEG start codes require the full 24 bits, whereas H.261 requires only 16.
In the present invention, the first shift register carries the data. The second shift register carries tags. These tags indicate whether each bit in the data shift register is valid. There are no gaps or stalls (in the two-wire interface sense) in the shift registers, but while they are being flushed the bits they contain can be invalid (rubbish). When a start code is detected, some bits of the tag shift register are set in order to invalidate the contents of the detector shift register.
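The shift-register approach described above finds start codes at any bit position, not only on byte boundaries. The sketch below models it in software. It is an illustration under stated assumptions: the function name `find_start_code` is hypothetical, the prefix searched for is (width - 1) zero bits followed by a one (0x000001 for the 24-bit MPEG case, 0x0001 for the 16-bit H.261 case), and a simple counter stands in for the tag shift register.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the bit offset just past the first start code prefix found in
 * `buf`, or -1 if none is found. `width` is the detector length in bits
 * (16 or 24). Bits are shifted in most significant bit first. */
long find_start_code(const uint8_t *buf, size_t len, unsigned width)
{
    uint32_t mask = (width == 24) ? 0xFFFFFFu : 0xFFFFu;
    uint32_t shift_reg = 0;
    size_t valid_bits = 0;   /* stands in for the tag shift register */

    for (size_t i = 0; i < len; i++) {
        for (int bit = 7; bit >= 0; bit--) {
            shift_reg = ((shift_reg << 1) | ((buf[i] >> bit) & 1u)) & mask;
            if (valid_bits < width)
                valid_bits++;
            /* Only a completely valid register may signal a start code. */
            if (valid_bits == width && shift_reg == 1u)
                return (long)(i * 8 + (size_t)(7 - bit) + 1);
        }
    }
    return -1;
}
```

The byte that follows the returned offset is the start code "value" that the later blocks convert into a token.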
A start code can only be detected when the entire contents of the shift register are valid. Start codes that are not byte-aligned can be detected and flagged. In addition, when a start code is detected it cannot be positively identified until it has been checked for overlapping start codes. To do this, the "value" of the detected start code (the byte that follows the start code) is shifted right through scinshift and scdetect and into scoshift. If it reaches scoshift without another start code being detected, overlapping start codes have been ruled out and the start code is recognized as valid.
B.1.2.6 Output shifter (scoshift.sch, scoshm.M)
The basic function of the output shifter is to take serial data (and tags) from scdetect, pack the data into 15-bit words and output them. Its other functions are described below.
B.1.2.6.1 Data padding
The output consists of 15-bit words, but the input may contain any number of bits. So that the data can be flushed, bits are added to bring the last word up to 15 bits. These extra bits are called padding, and they must be recognized and removed by the Huffman block. Padding is defined as:
After the last data bit, insert a "zero" followed by enough "ones" to complete the 15-bit word.
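The padding rule above, and its removal in a later block, can be sketched as a pair of bit operations. The function names are hypothetical. The sketch assumes that the data bits occupy the most significant end of the 15-bit word and that between 0 and 14 data bits remain in the last word; the text does not state the bit order or what happens when the data exactly fills a word.

```c
#include <stdint.h>

/* Pad the final word of a DATA token. `bits` holds `nbits` (0..14) data
 * bits, right-aligned. The result is a 15-bit word: data first (most
 * significant), then one zero, then ones. */
uint16_t pad_last_word(uint16_t bits, unsigned nbits)
{
    unsigned fill = 15u - nbits;                       /* 1 zero + ones */
    uint16_t ones = (uint16_t)((1u << (fill - 1u)) - 1u);
    uint16_t data = (uint16_t)(bits & ((1u << nbits) - 1u));
    return (uint16_t)((((uint32_t)data << fill) | ones) & 0x7FFFu);
}

/* Number of data bits in a padded 15-bit word: strip the trailing ones and
 * the zero before them. Returns -1 if the word holds no terminating zero. */
int unpadded_length(uint16_t word)
{
    int ones = 0;
    while (ones < 15 && ((word >> ones) & 1u))
        ones++;
    if (ones == 15)
        return -1;
    return 15 - ones - 1;
}
```

Because the padding always ends in a run of ones preceded by a single zero, the receiver can find the end of the data without being told how many bits were added.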
The data word that contains the padding is output with a low extension bit, indicating that it is the end of the DATA token.
B.1.2.6.2 Generation of "flushed"
According to the present invention, generating the "flushed" signal involves detecting when all the shift registers have been flushed and signalling this to the input shifter. The output shifter generates the "flushed" signal when the "rubbish" inserted by the input shifter reaches the end of the output shifter and the output shifter has completed its padding. This "flushed" signal must pass through the token reconstructor before the input shifter can safely enter bypass mode.
B.1.2.6.3 Flagging valid start codes
If scdetect indicates that it has found a start code, padding is performed and the current data is output. The start code value (the next byte) is shifted through the detector to eliminate overlapping start codes. If no other start code has been detected by the time this "value" reaches the output shifter, it was not overlapped. The value is then output with the v_not_t (ValueNotToken) flag set to indicate that it is a start code value. If, however, another start code is detected (by scdetect) while the output shifter is waiting for the value, an overlapping start code error (overlapping_start_error) is generated. In this case the first value is discarded and the system waits for the second value. This second value may also be overlapped, so the procedure repeats until a non-overlapped start code is found.
B.1.2.6.4 Tidying up after a start code
After a good start code has been detected and output, a new DATA header is generated when data (not rubbish) starts to arrive.
B.1.2.7 Data stream reconstructor (sctokrec.sch, sctokrem.M)
The data stream reconstructor has two two-wire interface inputs: one from scinshift for bypassed tokens, and one from scoshift for packed data and start codes. Switching between the two sources is only allowed when the current token (from either source) has been completed (a low extension bit has arrived).
B.1.2.8 Start value to start number conversion (scdromhw.sch, schrom.M)
The conversion of start values into tokens is done in two stages. This block deals mainly with the coding-standard-dependent issues, reducing the more than 520 possible codes to 16 coding-standard-independent indices.
As mentioned earlier, start values (including JPEG ones) are distinguished from all other data by a flag (value_not_token). If v_not_t is high, this block converts the 4-bit or 8-bit value, depending on CODING_STANDARD, into a 4-bit standard-independent start number, and flags any unrecognized start codes.
The start numbers are as follows:
Table B.1.2 Start code numbers (indices)
Start/marker code | Index (start_number) | Resulting token
not_a_start_code | 0 |
sequence_start_code | 1 | SEQUENCE_START
group_start_code | 2 | GROUP_START
picture_start_code | 3 | PICTURE_START
slice_start_code | 4 | SLICE_START
user_data_start_code | 5 | USER_DATA
extension_start_code | 6 | EXTENSION_DATA
sequence_end_code | 7 | SEQUENCE_END
JPEG markers
DHT | 8 | DHT
DQT | 9 | DQT
DNL | 10 | DNL
DRI | 11 | DRI
JPEG markers that can be mapped onto MPEG/H.261 tokens
SOS | picture_start_code | PICTURE_START
SOI | sequence_start_code | SEQUENCE_START
EOI | sequence_end_code | SEQUENCE_END
SOF0 | group_start_code | GROUP_START
JPEG markers that generate extension or user data
JPG | extension_start_code | EXTENSION_DATA
JPGn | extension_start_code | EXTENSION_DATA
APPn | user_data_start_code | USER_DATA
COM | user_data_start_code | USER_DATA
Note: All unrecognized JPEG markers generate an extn_start_code index.
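For the MPEG rows of Table B.1.2, the first conversion stage amounts to a lookup on the byte that follows the start code prefix. The sketch below shows one way to express it. The byte values are those defined by the MPEG video syntax; the enum and function names are hypothetical, and returning index 0 for values outside the table is an assumption, since the text only says that unrecognized start codes are flagged.

```c
#include <stdint.h>

/* Start numbers (indices) from Table B.1.2, MPEG rows only. */
enum start_number {
    NOT_A_START_CODE     = 0,
    SEQUENCE_START_CODE  = 1,
    GROUP_START_CODE     = 2,
    PICTURE_START_CODE   = 3,
    SLICE_START_CODE     = 4,
    USER_DATA_START_CODE = 5,
    EXTENSION_START_CODE = 6,
    SEQUENCE_END_CODE    = 7
};

/* Map an MPEG start code value (the byte after the 0x000001 prefix) to the
 * standard-independent index. */
enum start_number mpeg_start_number(uint8_t value)
{
    if (value == 0x00)
        return PICTURE_START_CODE;
    if (value >= 0x01 && value <= 0xAF)
        return SLICE_START_CODE;
    switch (value) {
    case 0xB2: return USER_DATA_START_CODE;
    case 0xB3: return SEQUENCE_START_CODE;  /* sequence_header_code */
    case 0xB5: return EXTENSION_START_CODE;
    case 0xB7: return SEQUENCE_END_CODE;
    case 0xB8: return GROUP_START_CODE;
    default:   return NOT_A_START_CODE;     /* reserved or system codes */
    }
}
```

A JPEG or H.261 variant would use a different lookup but produce the same small set of indices, which is what lets the second stage stay independent of the coding standard.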
B.1.2.9 Conversion of start number to token (sconvert.sch, sconverm.M)
The second step of the conversion is to convert the start number (or index) described above into a token. Where appropriate, this block also handles token extensions, the discarding of extension and user data, and search modes. A search mode is a means of entering the data stream at an arbitrary point. The search mode can be set to one of the following eight values:
0: Normal operation; find the next start code.
1/2: System-level searches; not implemented on the Spatial Decoder.
3: Search for sequence or higher.
4: Search for group or higher.
5: Search for picture or higher.
6: Search for slice or higher.
7: Search for the next start code.
Any non-zero search mode causes data to be discarded until the required start code (or one higher in the syntax) is detected.
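The search-mode rule above can be sketched as a small predicate. This is only an illustration under stated assumptions: the function name search_satisfied is hypothetical, "higher in the syntax" is taken to mean a smaller start number in the range 1 to 4 of Table B.1.2, other start numbers (user data, extension, sequence end, JPEG markers) are assumed not to end a search in modes 3 to 6, and modes 1 and 2 are treated as never satisfied because the text says they are not implemented.

```c
#include <assert.h>
#include <stdbool.h>

/* Start numbers taken from Table B.1.2. */
enum {
    SEQUENCE_START_NUM = 1,
    GROUP_START_NUM    = 2,
    PICTURE_START_NUM  = 3,
    SLICE_START_NUM    = 4
};

/* Hypothetical helper: returns true if a start code with the given start
 * number ends the current search, so that data is no longer discarded.
 * Assumption: mode m (3..6) accepts start numbers 1 .. m - 2. */
bool search_satisfied(int search_mode, int start_number)
{
    assert(search_mode >= 0 && search_mode <= 7);

    if (search_mode == 0)
        return true;                 /* normal operation: nothing is discarded */
    if (start_number == 0)
        return false;                /* not_a_start_code never ends a search */
    if (search_mode == 7)
        return true;                 /* any start code will do */
    if (search_mode >= 3)            /* 3: sequence ... 6: slice, or higher */
        return start_number >= SEQUENCE_START_NUM &&
               start_number <= search_mode - 2;
    return false;                    /* modes 1 and 2: not implemented (assumed) */
}
```

With this sketch, a picture search (mode 5) ends on a picture, group or sequence start, but not on a slice start.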
This block also adds token extensions to the PICTURE_START and SLICE_START tokens:
PICTURE_START is extended with PICTURE_NUMBER, a four-bit count of pictures.
SLICE_START is extended with SVP (slice vertical position). This is the start code "value" minus 1 (MPEG, H.261), or the start code "value" minus 0xD0 (JPEG).
B.1.2.10 Data stream formatting (scinsert.sch, scinserx.M)
In the present invention, data stream formatting covers the conditional insertion of the PICTURE_END, FLUSH, CODING_STANDARD and SEQUENCE_START tokens and the generation of the STOP_AFTER_PICTURE event. Its function is most simply described by the following pseudo-code:
switch (input_data)
{
case (FLUSH):
    1. if (in_picture)
           output = PICTURE_END
    2. output = FLUSH
    3. if (in_picture & stop_after_picture)
           sap_error = HIGH
           in_picture = FALSE
    4. in_picture = FALSE
    break

case (SEQUENCE_START):
    1. if (in_picture)
           output = PICTURE_END
    2. if (in_picture & stop_after_picture)
           2a. output = FLUSH
           2b. sap_error = HIGH
               in_picture = FALSE
    3. output = CODING_STANDARD
    4. output = standard
    5. output = SEQUENCE_START
    6. in_picture = FALSE
    break

case (SEQUENCE_END):
case (GROUP_START):
    1. if (in_picture)
           output = PICTURE_END
    2. if (in_picture & stop_after_picture)
           2a. output = FLUSH
           2b. sap_error = HIGH
               in_picture = FALSE
    3. output = SEQUENCE_END or GROUP_START
    4. in_picture = FALSE
    break

case (PICTURE_END):
    1. output = PICTURE_END
    2. if (stop_after_picture)
           2a. output = FLUSH
           2b. sap_error = HIGH
    3. in_picture = FALSE
    break

case (PICTURE_START):
    1. if (in_picture)
           output = PICTURE_END
    2. if (in_picture & stop_after_picture)
           2a. output = FLUSH
           2b. sap_error = HIGH
    3. if (insert_sequence_start)
           3a. output = CODING_STANDARD
           3b. output = standard
           3c. output = SEQUENCE_START
               insert_sequence_start = FALSE
    4. output = PICTURE_START
       in_picture = TRUE
    break

default:
    Just pass it through
}
B.2 Huffman decoder and Parser
B.2.1 Introduction
This section describes the Huffman decoder and Parser circuitry according to the present invention.
Figure 118 is a high-level block diagram of the Huffman decoder and Parser. For clarity, many signals and buses have been omitted from the figure, particularly in the several places where data is fed backwards (within the large loop illustrated). In essence, the Huffman decoder and Parser of the present invention consist of a number of dedicated processing blocks (shown along the bottom of the figure), which are controlled by a programmable state machine.
Data is received from the coded data buffer by the "input shifter" block. Two kinds of information are mainly encountered here: coded data and start codes. Coded data is carried by DATA tokens. Start codes have already been replaced by their respective tokens by the start code detector. Other tokens may also be encountered, but all tokens other than the DATA token are treated in the same way. Tokens (start codes) are treated as special cases; the vast majority of the data remains encoded (in H.261, JPEG or MPEG).
In the present invention, all data carried by DATA tokens is passed to the Huffman decoder in serial form (one bit at a time). This data naturally includes many fields that are not Huffman coded but use fixed-length codes; these are nevertheless passed to the Huffman decoder serially. For Huffman-coded data, the Huffman decoder performs only the first step of decoding, in which the actual Huffman code is replaced by an index number. If there are N different Huffman codes in the code table being decoded, this "Huffman index" lies in the range 0 to N-1. In addition, the Huffman decoder has a "noop" (no operation) mode, which allows it to pass data or token information on to the next stage without processing it.
The Index to Data Unit is a relatively simple block of circuitry that performs table look-up operations. It takes its name from the second stage of the Huffman decoding process, in which the index number obtained from the Huffman decoder is converted into the actual decoded data by a simple look-up table. The Index to Data Unit and the Huffman decoder work together as a single logical unit.
The next block is the ALU, which performs further transformations on the decoded data. Whereas the Index to Data Unit suits relatively arbitrary mappings, the ALU is used where arithmetic is more appropriate. The ALU includes a register file, which it can manipulate to implement various parts of the decoding algorithm. In particular, the registers that hold the vector predictions and DC predictions are contained in this block. The ALU is based on a simple adder with operand selection logic. It also includes dedicated circuitry for sign-extension operations. Shift operations are likely to be provided, but they would be performed serially; there is no barrel shifter.
According to the present invention, the token formatter is the last part of the video Parser. Its task is to assemble the decoded data into tokens, which are then passed to the rest of the decoder. At this point, all the tokens that the decoder will use for this particular picture are present.
The Parser state machine, which is 18 bits wide and uses the two-wire interface, has the task of coordinating the operation of the other blocks. In essence, it is a very simple state machine that produces a very wide "microcode" control word, which is passed to the other blocks. Figure 118 shows the instruction word being passed from block to block alongside the data. This is indeed the case, and it is important to understand that transfers between the different blocks are controlled by two-wire interfaces.
In the present invention, there is a two-wire interface between each of the blocks in the video Parser. In addition, the Huffman decoder works both with serial data (the input shifter, or inshifter, supplies data one bit at a time) and with control tokens. Accordingly, there are two modes of operation. If data is entering the Huffman decoder by way of a DATA token, it passes through the shifter one bit at a time. Again, there is a two-wire interface between the input shifter and the Huffman decoder. Other tokens, however, are not shifted in serially (one bit at a time) but are presented in parallel, header included. If a DATA token is input, the header containing the address information is deleted and the data following the address is shifted in one bit at a time. If the input is not a DATA token, the entire token, header and all, is presented to the Huffman decoder at once.
In the present invention, it is important to understand that the two-wire interface of the video Parser is unusual in that it has two valid lines: one is "serial valid" and the other is "token valid". The two lines are never asserted at the same time. One or the other may be asserted or, if no valid data is available, neither is asserted. It should be understood that there is only one accept line in the other direction. This is not a problem, however. The Huffman decoder knows whether it needs serial data or token information, depending on what the current syntax requires next. The valid and accept signals are set accordingly: an Accept signal is sent from the Huffman decoder to the input shifter, and if the appropriate data or token is present, the input shifter asserts the corresponding valid signal.
For example, a typical instruction might decode a Huffman code, transform it in the Index to Data Unit, modify the result in the ALU, and then form this result into a token word. A single microcode instruction word is produced that contains all of the information needed to do this. The command is passed directly to the Huffman decoder, which requests data bits one at a time from the input shifter until it has decoded a complete symbol. Control tokens are input in parallel. Once the symbol has been decoded, the decoded index value is passed, together with the original microcode word, to the Index to Data Unit. Note that the Huffman decoder will need several cycles to complete this operation; in fact, the number of cycles is determined by the data being decoded. The Index to Data Unit then maps this value using a table that is identified in the microcode instruction word. The resulting value is passed to the next block, the ALU, again together with the original microcode word. Once the ALU has completed the appropriate operation (the number of cycles may again depend on the data), it passes the appropriate data to the token formatter along with the microcode word, which controls the way in which the token word is formed.
The ALU has a number of status lines, or "condition codes", which are passed back to the Parser state machine. This allows the state machine to execute conditional jump instructions. In fact, all instructions are conditional jump instructions. One of the conditions that may be selected is hard-wired to "False"; by selecting this condition, a "no jump" instruction can be constructed.
According to the present invention, the token formatter has two inputs: a data field from the ALU and/or an emit field from the Parser state machine. In addition, an instruction tells the token formatter how many bits to take from one source, the remaining bits needed to make up a total of 8 being taken from the other source. For example, HORIZONTAL_SIZE has an 8-bit field that is a constant address identifying it as a HORIZONTAL_SIZE token. In this case, all 8 bits come from the emit field and no data comes from the ALU. For a DATA token, however, there would probably be 6 bits from the emit field, with the low two bits, which indicate the color component, coming from the ALU. The token formatter takes this information and places it in a token for use by the rest of the system. Note that the number of bits from each source in the above example is for illustration only; one of ordinary skill in the art will appreciate that the number of bits taken from either source can vary.
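The way the token formatter merges its two sources can be sketched in a few lines of C. This is a minimal sketch, not the circuit itself: the function name format_token_word and its parameters are hypothetical, and it assumes (as the DATA token example suggests) that the emit field supplies the upper bits and the ALU the lower bits of the 8-bit word.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: build one 8-bit token word.
 * The top emit_bits bits are taken from the emit field (Parser state
 * machine); the remaining low 8 - emit_bits bits come from the ALU. */
uint8_t format_token_word(uint8_t emit_field, uint8_t alu_data,
                          unsigned emit_bits)
{
    assert(emit_bits <= 8);

    /* Mask selecting the low bits that the ALU supplies. */
    uint8_t alu_mask = (uint8_t)(0xFFu >> emit_bits);

    return (uint8_t)((emit_field & (uint8_t)~alu_mask) |
                     (alu_data & alu_mask));
}
```

With emit_bits set to 8 the word comes entirely from the emit field (the HORIZONTAL_SIZE case); with emit_bits set to 6 the low two bits come from the ALU (the DATA token case). The field values used to exercise the sketch are arbitrary, not real token addresses.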
The ALU has a bank of counters, which are used to count through the structure of the picture. The dimensions of the picture are programmed into registers associated with the counters, which appear to the "microprogrammer" as part of the register file. Several condition codes are output from the counter bank, allowing conditional jumps based on conditions such as "start of picture" and "start of macroblock".
Note that the Parser state machine is also referred to as the "Demultiplex State Machine". Both terms are used in this document.
Input shifter
In the present invention, the input shifter is a very simple piece of circuitry. It consists of a two-stage pipelined datapath ("hfidp") and controlling Zcells ("hfi").
In the first pipeline stage, token decoding takes place. Only the DATA token is recognized at this stage. The data contained in a DATA token is shifted into the Huffman decoder one bit at a time. The second pipeline stage is the shift register. A special coding is used in the very last word of a DATA token so that an arbitrary number of bits can be sent through the coded data buffer. All the possible patterns of the last data word are shown below.
Table B.2.1 Possible patterns in the last data word
  E  D  C  B  A  9  8  7  6  5  4  3  2  1  0   No. of Bits
  0  1  1  1  1  1  1  1  1  1  1  1  1  1  1   None
  x  0  1  1  1  1  1  1  1  1  1  1  1  1  1   1
  x  x  0  1  1  1  1  1  1  1  1  1  1  1  1   2
  x  x  x  0  1  1  1  1  1  1  1  1  1  1  1   3
  x  x  x  x  0  1  1  1  1  1  1  1  1  1  1   4
  x  x  x  x  x  0  1  1  1  1  1  1  1  1  1   5
  x  x  x  x  x  x  0  1  1  1  1  1  1  1  1   6
  x  x  x  x  x  x  x  0  1  1  1  1  1  1  1   7
  x  x  x  x  x  x  x  x  0  1  1  1  1  1  1   8
  x  x  x  x  x  x  x  x  x  0  1  1  1  1  1   9
  x  x  x  x  x  x  x  x  x  x  0  1  1  1  1   10
  x  x  x  x  x  x  x  x  x  x  x  0  1  1  1   11
  x  x  x  x  x  x  x  x  x  x  x  x  0  1  1   12
  x  x  x  x  x  x  x  x  x  x  x  x  x  0  1   13
  x  x  x  x  x  x  x  x  x  x  x  x  x  x  0   14
As the data bits are shifted left in the shift register, one bit at a time, the bit pattern "0 followed by all ones" (the padding) is looked for. This pattern indicates that the remaining bits in the shift register are invalid, and they are discarded. Note that this only happens in the last word of a DATA token.
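The padding rule of Table B.2.1 can be expressed compactly in software. The sketch below is an illustration only; the function name last_word_valid_bits is hypothetical, and it inspects the whole 15-bit word at once rather than shifting bit by bit as the hardware does. It counts the trailing ones, checks for the terminating zero, and reports how many of the leading bits (the "x" positions) are valid data.

```c
#include <assert.h>
#include <stdint.h>

#define LAST_WORD_BITS 15   /* bit positions E..0 in Table B.2.1 */

/* Hypothetical helper: returns the number of valid data bits (0..14) in
 * the last word of a DATA token, or -1 if the word holds no terminating
 * zero and so matches none of the table's patterns. */
int last_word_valid_bits(uint16_t word)
{
    int ones = 0;

    assert((word & 0x8000u) == 0);          /* only 15 bits are meaningful */

    /* Count the run of ones at the least significant end (the padding). */
    while (ones < LAST_WORD_BITS && ((word >> ones) & 1u))
        ones++;

    if (ones == LAST_WORD_BITS)
        return -1;                          /* all ones: no terminator */

    /* The zero sits at bit position `ones`; everything above it is data. */
    return LAST_WORD_BITS - 1 - ones;
}
```

For example, the word 0x3FFF (a zero in position E followed by fourteen ones) carries no data bits, while a word whose bit 0 is zero carries fourteen.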
As described previously, all other tokens are passed to the Huffman decoder in parallel. They are still loaded into the second pipeline stage, but no shifting takes place. Note that the DATA header is discarded and is not passed to the Huffman decoder at all. Two "valid" lines (out_valid and serial_valid) are provided. Only one of them is asserted at any given time, indicating which type of data is being presented at that moment.
B.2.2 Huffman decoder
The Huffman decoder has a number of modes of operation. The most obvious is that it can decode Huffman codes, turning them into Huffman index numbers. In addition, it can decode fixed-length codes whose length (in bits) is determined by the instruction word. The Huffman decoder also accepts tokens from the input shifter (inshift) block.
The Huffman decoder contains a very small state machine, which is used when decoding block-level information. This is because the Parser state machine would take too long to make decisions (it must wait for the data to flow through the Index to Data Unit and the ALU before it can make a decision about that data and issue a new command). When this state machine is in use, the Huffman decoder itself issues commands to the Index to Data Unit and the ALU. The Huffman decoder state machine cannot control all of the microcode instruction bits, and therefore cannot issue the full range of commands to the other blocks.
B.2.2.1 Theory of operation
When decoding Huffman codes, the Huffman decoder of the present invention uses an arithmetic method to decode the incoming code into a Huffman index number. This number lies between 0 and N-1 (for a code table with N entries). Bits are accepted one at a time from the input shifter.
To control the operation of this mechanism, a number of tables are needed. These tables specify, for each possible number of bits in a code (1 to 16), how many codes there are of that length. As might be expected, this information is not in general sufficient to specify a Huffman code. However, in MPEG, H.261 and JPEG, the Huffman codes are chosen such that this information alone can specify the Huffman code table. There is unfortunately just one exception: the transform coefficient (Tcoefficient) table from H.261, which is also used in MPEG. This requires an additional table, which is described elsewhere. (The exception was deliberately introduced in H.261 to avoid start code emulation.)
It is important to understand that the tables used in the Huffman decoder are exactly the same as those transmitted in JPEG. This allows these tables to be used directly, whereas other Huffman decoder designs would require internal tables to be generated from the transmitted ones. That conversion could require extra memory and extra processing. Because the tables in MPEG and H.261 (with the exception noted above) can be described in the same way, a multi-standard decoder becomes practical.
The following fragment of "C" describes the decoding process:
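    /* Supplied by the surrounding decoder. */
    extern int  next_bit(void);             /* next input bit (0 or 1) */
    extern void fail(const char *message);  /* report a decoding error */
    extern int  codes_per_bit[];            /* [n]: number of codes of length n+1 */
    extern int  max_bits;                   /* longest code length in the table */

    int huff_decode(void)
    {
        int total = 0;              /* codes accounted for by shorter lengths */
        int s = 0;                  /* first code value beyond those lengths */
        int bit = 0;                /* number of bits read so far */
        unsigned long code = 0;     /* bits read so far, as a number */
        int index = 0;              /* candidate Huffman index */

        while (index >= total)
        {
            if (bit >= max_bits)
                fail("huff_decode: ran off end of huff table\n");
            code = (code << 1) | (unsigned long)next_bit();
            index = (int)code - s + total;
            total += codes_per_bit[bit];
            s = (s + codes_per_bit[bit]) << 1;
            bit++;
        }
        return index;
    }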
In general, this process translates directly into the silicon implementation, although advantage is also taken of the fact that certain intermediate values can be calculated a clock phase before they are needed.
From the code fragment we see that:
EQ 1. $\mathit{total}_{n+1} = \mathit{total}_n + \mathit{cpb}_n$
EQ 2. $s_{n+1} = 2(s_n + \mathit{cpb}_n)$
EQ 3. $\mathit{code}_{n+1} = 2\,\mathit{code}_n + \mathit{bit}_n$
EQ 4. $\mathit{index}_{n+1} = 2\,\mathit{code}_n + \mathit{bit}_n + \mathit{total}_n - s_n$
However, a modified set of equations proves easier to use in hardware. In this set, the variable "shifted" is used in place of the variable "s":
EQ 5. $\mathit{shifted}_{n+1} = 2\,\mathit{shifted}_n + \mathit{cpb}_n$
It follows that:
EQ 6. $s_n = 2\,\mathit{shifted}_n$
Substituting this back into Equation 4 gives:
EQ 7. $\mathit{index}_{n+1} = 2(\mathit{code}_n - \mathit{shifted}_n) + \mathit{total}_n + \mathit{bit}_n$
In addition to calculating successive values of "index", it is necessary to know when the calculation is complete. From the "C" code fragment, we are done when:
EQ 8. $\mathit{index}_{n+1} < \mathit{total}_{n+1}$
Substituting from Equations 7 and 1, we are done when:
EQ 9. $2(\mathit{code}_n - \mathit{shifted}_n) + \mathit{bit}_n - \mathit{cpb}_n < 0$
In the hardware implementation of the present invention, the term common to Equations 7 and 9, $(\mathit{code}_n - \mathit{shifted}_n)$, is calculated one phase before the remainder of these equations is evaluated to give the final result and the indication that the calculation is "done".
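The hardware-oriented recurrences can be checked in software. The following is a minimal sketch of Equations 1, 3, 5, 7 and 9, not the silicon implementation: the function name huff_decode_shifted, the array-based bit input and the return convention are assumptions made for illustration. The shared term is computed first, mirroring the way the hardware evaluates it one phase early.

```c
#include <assert.h>

/* Hypothetical helper: decode one Huffman code using the "shifted"
 * formulation.
 *   cpb[n] : number of codes that are n + 1 bits long
 *   bits[] : input bits (0 or 1), first received bit first
 * Returns the Huffman index, or -1 if no code completes within max_bits
 * or within the nbits supplied. On success *used is the code length. */
int huff_decode_shifted(const int *cpb, int max_bits,
                        const int *bits, int nbits, int *used)
{
    int total = 0, shifted = 0, code = 0;

    assert(max_bits >= 1 && max_bits <= 16);

    for (int n = 0; n < max_bits && n < nbits; n++) {
        int common = 2 * (code - shifted);       /* shared by EQ 7 and EQ 9 */
        int index  = common + total + bits[n];   /* EQ 7 */

        if (common + bits[n] - cpb[n] < 0) {     /* EQ 9: decoding is done */
            *used = n + 1;
            return index;
        }
        total  += cpb[n];                        /* EQ 1 */
        shifted = 2 * shifted + cpb[n];          /* EQ 5 */
        code    = 2 * code + bits[n];            /* EQ 3 */
    }
    return -1;
}
```

As an illustrative table, take one 2-bit code, five 3-bit codes and one code each of 4 to 9 bits. The 2-bit code 00 then decodes to index 0, the 3-bit code 110 to index 5, and the 4-bit code 1110 to index 6, which agrees with the original "C" fragment.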
A word of warning: in various pieces of "C" code, notably the behavioral compiled-code Huffman decoder and the sm4code projects, the "C" fragment is used almost directly, but the variable "s" is actually called "shifted". There are therefore two different variables called "shifted": one in the "C" code and the other in the hardware implementation. These two variables differ by a factor of two.
B.2.2.1.1 Inverting the data bits
One further piece of knowledge is needed to decode Huffman codes correctly: the polarity of the coded data. H.261 and JPEG use opposite conventions. This reflects the fact that the start codes in H.261 consist of zero bits, whereas the marker bytes in JPEG consist of one bits.
To handle both conventions, the coded data bits must be inverted as they are read into the Huffman decoder in order to decode H.261-style Huffman codes. This is done in the obvious way, with an exclusive-OR gate. Note that only Huffman codes are inverted; the data is not inverted when fixed-length codes are decoded.
MPEG uses a mixture of the two conventions. Those aspects inherited from H.261 use the H.261 convention. Those inherited from JPEG (the decoding of DC intra coefficients) use the JPEG convention.
B.2.2.1.2 Transform coefficient table
There are a number of anomalies in the use of the transform coefficient table in H.261 and MPEG. First, the table in MPEG is a superset of the H.261 table. In the hardware implementation of the present invention, the two standards share everything; this means that an H.261 stream containing codes from the extended part of the table (that is, MPEG codes) will be decoded "correctly". Of course, other aspects of the compression standard may well be broken. For example, these extension codes can cause start code emulation in H.261.
Second, the transform coefficient table has an anomaly, occurring at a code length of six bits, which means that it cannot be described by the normal codes-per-bit (codes_per_bit) table. These six-bit code words are systematically replaced by alternative code words. In an encoder, the correct result is first obtained by encoding in the normal way. Then, for all codes of six bits or longer, the first six bits are replaced by another six bits in a simple table look-up operation. In the decoder according to the present invention, the decoding process is interrupted just before the sixth bit is decoded, the code word is replaced using a look-up, and decoding then continues.
In this case only ten six-bit codes are possible, so the look-up table required is very small. Conveniently, the upper two bits of the code are unchanged by the operation. A true look-up table is therefore unnecessary, and the required mapping can be implemented with a small number of gates. The module that does this is called "hftcfrng". This type of code substitution is described here as a "ring", because each code in the set of possible codes is replaced by another code from the same set (no new codes are introduced and no old codes are removed).
In addition, a unique implementation is used for the first coefficient in a block. Because the end-of-block code cannot occur at this position, the transform coefficient table is modified so that a very frequently occurring symbol can use the code that would otherwise be interpreted as end of block. This saves one bit. It turns out that this modification is easy to make with the decoding architecture of the present invention. In brief, for the first bit of the first coefficient, decoding is considered "done" if "index" is zero. Furthermore, after a single bit has been decoded, "index" has only two possible values, 0 or 1, so only one bit needs to be tested.
B.2.2.1.3 Register and adder sizes
Huffman decoder of the present invention can be processed the Huffman code that may reach 16. Yet decoder only has 8 bit widths. This may accomplish, because the maximum of the known Huffman index that solves is 255. In fact, this just occurs in the JPEG of expansion; In current application, greatest limit is than this slightly low (but greater than 128, therefore still disliking not enough for 7).
The result shows that for all legal Huffman codes, the not only end value of " index ", and all medians is also all in 0 to 255 scope. But, for improper code, namely to attempt to a code in current code table when not decoding (this yard may factor data make mistakes due to), index value just may exceed 255. Because our usefulness is eight machines, might in decoding at the end be dropped because inform each higher significance bit of makeing mistakes, the final value of " index " does not exceed 255. If therefore whenever exponential quantity the wrong decoding of also abandoning just occurs greater than 255 (that is: the adder carries of formation index) in decoding.
" code " 12 are saved. For translating the Huffman code, this is also unnecessary, and eight bit register should be enough. Needing these high positions is in order to read the fixed-length code of 12 of as many as. B.2.2.1.4 for the operation of fixed-length code
With regard to the code of regular length, the value of " by bit code " is forced to zero. In other words, " total " and " shifted " remains zero in whole operating process, thereby " index " is identical with code. In fact, adder etc. standard is that " index " produces one 8 value. Therefore, when fixed-length code was decoded, each high position of output word was directly taken from " code " register. When separating the Huffman code, these high positions are forced zero.
Infer whether read enough figure places from input in obvious mode. Comparator is compared desirable figure place with " position " counter. B.2.2.2 to the decoding of coefficient data
According to the present invention, the Parser state machine is generally used only for fairly high-level decoding. The lowest-level decoding, within an 8 by 8 block of data, is not handled directly by this state machine. The Parser state machine issues the Huffman decoder a command of the form "decode a block". Under the control of a dedicated state machine (which is in fact located within the Huffman decoder), the Huffman decoder, the Index to Data Unit and the ALU work together. This arrangement allows high-performance decoding of the entropy-coded coefficient data. In this mode of operation, additional feedback paths also come into play. For example, in JPEG decoding, where decoding the VLC yields SIZE and RUN information, the SIZE information is fed back directly from the output of the Index to Data Unit to the Huffman decoder to tell it how many FLC bits to read. There are also a number of accelerators. Using the same example, all VLC values that yield a SIZE of zero are trapped by inspecting the Huffman index value before the Index to Data stage. In other words, provided SIZE is non-zero, the Huffman decoder can go ahead and read one bit of the FLC before the actual value of SIZE is known. No clock cycles are wasted, because reading this first FLC bit overlaps with the single clock cycle required to perform the look-up in the Index to Data Unit.
B.2.2.2.1 MPEG and H.261 AC coefficient data
Figure 127 illustrates the decoding of AC coefficients in MPEG and H.261. A flow chart showing the operation of the Huffman decoder in detail is given in Figure 119.
The process starts by reading a VLC code. In the normal course of events, the Huffman index is mapped directly to values representing the six-bit RUN and the absolute value of the coefficient. A one-bit FLC is then read to give the sign of the coefficient. The ALU combines this sign bit with the absolute value of the coefficient to give the final value of the coefficient.
Note that the data format here is sign-magnitude, so this operation presents no difficulty. The RUN value is passed on a six-bit auxiliary bus, while the coefficient value (LEVEL) is passed on the normal data bus.
There are two special cases, both of which are trapped by inspecting the decoded index value before the index-to-data operation: end of block (EOB) and Escape coded data. In the case of EOB, the fact that it has occurred is passed through the Index to Data Unit and the ALU so that the token formatter can correctly close the open DATA token.
Escape coded data is more complicated. First, the six bits of RUN are read, passed directly through the Index to Data Unit and stored in the ALU. Then one bit of FLC is read. This is the most significant bit of the eight-bit Escape code described in MPEG and H.261, and it gives the sign of the level. The sign is read explicitly in this implementation because different commands must be issued to the ALU for negative and positive values, allowing the ALU to convert the two's complement value in the bit stream into sign-magnitude. In either case, the remaining seven bits of FLC are then read. If these have the value zero, a further eight bits must be read.
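The Escape read sequence can be sketched as follows. This is an illustration under stated assumptions rather than the hardware's command sequence: the BitSource type and the function names are hypothetical, and the interpretation of the extra eight bits (a positive level taken directly, a negative level formed by subtracting 256) is assumed to follow the MPEG long Escape form; the text above only states that eight more bits are read.

```c
#include <assert.h>

/* Hypothetical bit source: one bit (0 or 1) per array element. */
typedef struct {
    const int *bits;
    int pos;
    int len;
} BitSource;

/* Read an n-bit fixed-length code, most significant bit first. */
static unsigned read_flc(BitSource *s, int n)
{
    unsigned v = 0;
    assert(s->pos + n <= s->len);
    while (n-- > 0)
        v = (v << 1) | (unsigned)s->bits[s->pos++];
    return v;
}

/* Decode the fields that follow the Escape VLC and return the level in
 * sign-magnitude form, as the ALU would present it. */
void decode_escape(BitSource *s, int *run, int *negative, unsigned *magnitude)
{
    int value;                                /* two's complement level */

    *run      = (int)read_flc(s, 6);          /* six-bit RUN */
    *negative = (int)read_flc(s, 1);          /* MSB of the 8-bit code: sign */
    unsigned low7 = read_flc(s, 7);           /* remaining seven bits */

    if (low7 == 0) {                          /* long form: eight more bits */
        unsigned ext = read_flc(s, 8);
        value = *negative ? (int)ext - 256 : (int)ext;   /* assumed mapping */
    } else {
        value = *negative ? (int)low7 - 128 : (int)low7;
    }
    *magnitude = (unsigned)(*negative ? -value : value);
}
```

For instance, an eight-bit code of all ones is the two's complement value -1, so the sketch reports a negative sign with magnitude 1.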
In the present invention, the internal state machine of the Huffman decoder is responsible for generating the commands that control both the Huffman decoder itself and the Index to Data Unit, the ALU and the token formatter. As shown in Figure 124, the Huffman decoder's command comes from one of three sources: the Parser state machine, the Huffman state machine, or an instruction stored in a register, which was received from the Parser state machine earlier. In brief, the original instruction from the Parser state machine (which causes the Huffman state machine to take over control and read coefficient values) is retained in a register and is used each time a new VLC is required. All the other instructions used in decoding are supplied by the Huffman state machine.
B.2.2.2.2 MPEG DC coefficient data
This data is handled in the same way as JPEG DC coefficient data. The same (loadable) tables are used, and it is the responsibility of the controlling microprocessor to ensure that their contents are correct. The only real difference from the MPEG standard is that the predictors are reset to zero (as in JPEG); the correction for this is made in the inverse quantizer.
B.2.2.2.3 JPEG coefficient data
Figure 120 is a block diagram of the hardware of the present invention for decoding JPEG AC coefficients. Since the DC coefficient process is essentially a simplification of the JPEG AC process, this block diagram applies to both AC and DC coefficients. The only real addition to the MPEG AC coefficient block diagram described earlier is that the "SSSS" field is fed back and may be used as part of the Huffman decoder command to specify the number of FLC bits to be read. The rest of the command is supplied by the Huffman state machine.
Figure 121 shows a flow chart of the Huffman decoding of AC and DC coefficients.
The decoding process for AC coefficients is dealt with first. The process starts by reading a VLC value using the appropriate table (there are two AC tables). In the Index to Data Unit, the Huffman index is converted into the RUN and SIZE values. Two values are trapped at the Huffman index stage, corresponding to EOB and ZRL. These are the only two values for which no FLC bits are read. If the decoded index is not one of these two values, the Huffman decoder immediately reads one bit of FLC while it waits for the Index to Data Unit to complete the look-up that determines how many bits are actually needed. In the case of EOB, the Huffman state machine in the Huffman decoder does no further processing, and another command is read from the Parser state machine.
In the case of ZRL, no FLC bits are required, but the block has not ended. The Huffman decoder therefore immediately begins decoding the next VLC (using the same table as before).
There is a particular problem in detecting the index values associated with ZRL and EOB. This is because (unlike in H.261 and MPEG) the Huffman tables are downloadable. For each of the two JPEG AC tables, two registers are provided (one for ZRL, one for EOB). These registers are loaded when the table is downloaded. They hold the index values associated with the respective symbols.
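The trap registers for a downloadable AC table can be modeled as below. This is a sketch only: the type and function names are hypothetical, and deriving the register contents by scanning the downloaded symbol list is an assumption made for illustration (the text says only that the registers are loaded when the table is downloaded). The symbol values 0x00 for EOB and 0xF0 for ZRL are the standard JPEG run/size bytes.

```c
#include <assert.h>

typedef enum { SYM_EOB, SYM_ZRL, SYM_COEFF } AcSymbolKind;

/* Hypothetical model of the two registers provided per JPEG AC table. */
typedef struct {
    int eob_index;      /* Huffman index of the EOB symbol, -1 if absent */
    int zrl_index;      /* Huffman index of the ZRL symbol, -1 if absent */
} AcTrapRegs;

/* Assumed loading rule: scan the downloaded symbol list, where symbols[i]
 * is the run/size byte that Huffman index i maps to. */
void load_trap_regs(AcTrapRegs *r, const unsigned char *symbols, int count)
{
    assert(count >= 0 && count <= 256);
    r->eob_index = -1;
    r->zrl_index = -1;
    for (int i = 0; i < count; i++) {
        if (symbols[i] == 0x00)
            r->eob_index = i;           /* RUN = 0,  SIZE = 0: end of block */
        if (symbols[i] == 0xF0)
            r->zrl_index = i;           /* RUN = 15, SIZE = 0: zero run */
    }
}

/* Trap EOB and ZRL at the Huffman index stage, before any table look-up. */
AcSymbolKind classify_ac_index(const AcTrapRegs *r, int index)
{
    assert(index >= 0 && index <= 255);
    if (index == r->eob_index)
        return SYM_EOB;
    if (index == r->zrl_index)
        return SYM_ZRL;
    return SYM_COEFF;                   /* FLC bits follow: read one at once */
}
```

Because the comparison uses only the index, the decision to start reading FLC bits does not have to wait for the Index to Data look-up.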
The ALU must convert the SIZE-bit FLC code into the appropriate sign-magnitude value. This can be done by first sign-extending the value with the wrong sign. If the resulting sign bit is set, the remaining bits are then inverted (one's complement).
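The conversion just described can be written out in C. This is a minimal sketch, with a hypothetical function name and interface: "sign-extending with the wrong sign" is modeled by taking the sign as the inverse of the leading FLC bit, and the magnitude is the bitwise inverse of the FLC bits when that sign is negative.

```c
#include <assert.h>

/* Hypothetical helper: convert `size` bits of JPEG FLC data into sign and
 * magnitude. The sign is the inverse of the leading bit (the "wrong" sign
 * extension); for a negative value the remaining bits are inverted. */
void flc_to_sign_magnitude(unsigned flc, int size,
                           int *negative, unsigned *magnitude)
{
    assert(size >= 1 && size <= 15);

    unsigned mask = (1u << size) - 1u;           /* the SIZE valid bits */
    unsigned msb  = (flc >> (size - 1)) & 1u;    /* leading bit of the FLC */

    *negative  = !msb;
    *magnitude = (*negative ? ~flc : flc) & mask;
}
```

With SIZE equal to 3, the bits 101 give +5 and the bits 010 give -5, matching the usual JPEG amplitude coding.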
In the case of DC coefficients, the decision at the Huffman decoding stage is somewhat easier, because there is nothing equivalent to the ZRL field. The only symbol that causes zero FLC bits to be read is the one indicating a DC difference of zero. This is again trapped at the Huffman index stage, since a register is provided for each (downloadable) JPEG DC table to hold this index.
The ALU of the present invention has the task of keeping a copy of the last DC coefficient value, which is used to form the final decoded DC coefficient (this is usually called prediction). Four predictors are needed, one for each of the four active color components. When a DC difference has been decoded, the ALU adds the appropriate predictor to form the decoded value. This value is stored again as the predictor for the next DC difference of that color component. Because DC coefficients can be positive or negative (owing to the DC offset), a conversion from two's complement to sign-magnitude is required. The value is then output with an accompanying RUN of zero. In fact, the instructions for these last few steps are not supplied by the Huffman state machine; they are simply executed by the Parser state machine.
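A small model of the prediction step may help. It is only a sketch under stated assumptions: the class name, the dictionary output, and the reset value of zero for each predictor are hypothetical, and the DC difference is taken to be an ordinary signed integer that has already been decoded.

```python
class DcPredictor:
    """Toy model of per-component DC prediction (names are illustrative)."""

    def __init__(self, components=4):
        # One predictor per active color component.
        # Assumption: predictors start at zero; the text does not state a reset value.
        self.pred = [0] * components

    def decode(self, compid, difference):
        """Add the predictor to a signed DC difference and store the result back."""
        value = self.pred[compid] + difference
        self.pred[compid] = value                 # becomes the next predictor
        sign = 1 if value < 0 else 0              # two's complement -> sign-magnitude
        return {"run": 0, "sign": sign, "magnitude": abs(value)}
```

Calling `decode(0, 5)` and then `decode(0, -8)` on a fresh instance leaves the component 0 predictor at -3, and the second call reports sign 1 with magnitude 3.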
As in the AC coefficient case, the ALU must form the DC difference from the SIZE bits of FLC. In this case, however, a two's complement value is needed so that it can be added to the predictor. This can be formed by first sign-extending with the wrong sign, as before. If the result is negative, 1 must then be added to produce the correct value. This can, of course, be done by feeding a carry into the adder, so that the 1 is added at the same time as the predictor.
B.2.2.3 Error Handling
Error handling deserves some mention. There are effectively four sources of error:
Running off the end of a table.
Serial data when a token was expected.
A token when serial data was expected.
Too many coefficients in a block.
The first type of error can occur in two situations. If the bit counter reaches 16 (legal values are 0 to 15), an error has occurred, because the longest legal Huffman code is 16 bits. If any intermediate value of the index exceeds 255, the error described in section B.2.2.1.3 has occurred.
The second type of error occurs when a token is expected but serial data is encountered. The third type occurs in the opposite situation.
The last type of error occurs when there are too many coefficients in a block. This is actually detected in the Index to Data unit.
Whichever of these occurs, the error is recorded in the Huffman error register and the Parser state machine is interrupted. It is the responsibility of the Parser state machine to deal with the error and to issue the necessary recovery commands.
To ensure correct operation, the Huffman state machine must be coordinated with the Parser state machine at the moment of interruption. When the Huffman decoder interrupts the Parser state machine, a new command may already be waiting at the output of the Parser state machine to be accepted. The Huffman decoder will not accept this command for two whole cycles after it interrupts the Parser state machine. This allows the Parser state machine to remove the command that was there (which should not now be executed) and replace it with an appropriate one. After these two cycles have elapsed, the Huffman decoder is enabled again. If a valid command is present, it is accepted. Otherwise nothing happens until the Parser state machine supplies a valid command.
Whichever error occurs, the "Huffman error" event bit is set. If the mask bit is set, the block stops and the controlling microprocessor is interrupted in the normal way.
A complication arises because, on some occasions, what looks like an error is in fact not an error. The most important such occasion is when a macroblock address is being read. In the syntax of MPEG, H.261 and JPEG, it is legal for a token to occur where a macroblock address is expected. If this occurs legally, the Huffman error register is loaded with zero (meaning no error), but the Parser state machine is still interrupted. The Parser state machine code must recognize this "no error" condition and respond accordingly. In this case the "Huffman error" event bit is not set and the block does not stop processing.
Several situations must be handled. In the first, the token occurs immediately, with no preceding serial bits. Ordinarily this would give a "token when serial data was expected" error. Instead, the "no error" error just described occurs.
In the second, the token is preceded by a few serial bits. In this case a decision is made. If all the bits preceding the token are ones (remember that in H.261 and MPEG the coded data is inverted, so these bits are all zeros in the coded data file), no error occurs. If any of them is zero, however, they are not valid padding bits, and an error occurs: the "token when serial data was expected" error.
In the third, the token is preceded by many bits. In this case the same decision is made. If all 16 bits are ones, they are treated as padding bits and the "no error" error occurs. If any of them is zero, the "ran off the end of the Huffman table" error occurs.
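The three situations can be summarized as a small decision function. This is a sketch only: the numeric codes other than zero ("no error") are hypothetical, the input is assumed to be the serial bits already inverted as described, and "many bits" is taken to mean that the 16-bit limit of the bit counter has been reached.

```python
# Hypothetical error codes; only "zero means no error" comes from the text.
NO_ERROR = 0
ERR_TOKEN_WHEN_SERIAL = 1
ERR_RAN_OFF_TABLE = 2

def classify_token_at_macroblock_address(bits_before_token):
    """Classify a token found where a macroblock address was expected.

    bits_before_token: serial bits (after inversion) seen before the token.
    """
    window = bits_before_token[:16]          # the bit counter stops at 16
    if all(bit == 1 for bit in window):      # also true when no bits precede the token
        return NO_ERROR                      # padding only: the "no error" error
    if len(window) < 16:
        return ERR_TOKEN_WHEN_SERIAL         # a few bits, not all valid padding
    return ERR_RAN_OFF_TABLE                 # 16 bits consumed without a valid code
```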
Another place where a token can occur unexpectedly is in JPEG. When Huffman tables or quantizer tables are being processed, any number of tables can occur in the same marker segment. The Huffman decoder does not know how many. After completing each table, it therefore reads another 4-bit FLC, assuming it to be the number of a new table. If a new marker segment starts instead, however, a token is encountered rather than the 4-bit FLC. Because this cannot be predicted, an "Ignore Errors" command bit has been added.
B.2.2.4 Huffman Commands
The bits that the Parser state machine uses to control the Huffman decoder block are listed here together with their definitions. Note that some command bits belonging to the Index to Data unit are also included in this table. From the microprogrammer's point of view, the Huffman decoder and the Index to Data unit behave as a single, tightly coupled logical block.
Table B.2.2 Huffman Decoder Commands
Bit      Name             Function
11       Ignore Errors    Disables errors that occur in certain situations.
10       Download         Either nominates a table for download or downloads data into that table.
9        Alutab           Uses information from the ALU registers to specify the table number (or the number of FLC bits).
8        Bypass           Bypasses the Index to Data unit.
7        Token            Decodes a token (rather than an FLC or VLC).
6        First Coeff      Selects the first-coefficient trick, used for the Tcoeff table and other special modes.
5        Special          If this bit is set, the Huffman state machine takes control.
4        VLC (not FLC)    Specifies VLC or FLC.
3 to 0   Table[3:0]       Specifies the table to use for a VLC, or the number of FLC bits to read.
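The layout of the command word can be written down as bit masks. This is an illustrative sketch: the bit positions come from Table B.2.2, while the Python names and the two helper functions are hypothetical and are not part of the described hardware.

```python
# Bit positions taken from Table B.2.2; the Python names are illustrative.
IGNORE_ERRORS = 1 << 11
DOWNLOAD      = 1 << 10
ALUTAB        = 1 << 9
BYPASS        = 1 << 8
TOKEN         = 1 << 7
FIRST_COEFF   = 1 << 6
SPECIAL       = 1 << 5
VLC           = 1 << 4
TABLE_MASK    = 0xF          # Table[3:0]

def make_command(flags, table):
    """Pack flag bits and a 4-bit table number into one 12-bit command word."""
    if not 0 <= table <= TABLE_MASK:
        raise ValueError("Table[3:0] must fit in four bits")
    return (flags & ~TABLE_MASK) | table

def split_command(command):
    """Return (flags, table) for a 12-bit command word."""
    return command & 0xFF0, command & TABLE_MASK
```

For instance, `make_command(BYPASS, 8)` gives the word 0x108, which `split_command` separates back into the Bypass flag and the number 8.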
B.2.2.4.1 Reading FLC
In this mode, Ignore Errors, Download, Alutab, Token, First Coeff, Special and VLC are zero, and Bypass is set because no index-to-data conversion is needed.
The binary number in Table[3:0] specifies how many bits are to be read.
The numbers 0 to 12 are legal. The value zero does indeed read zero bits (as would be expected), and this instruction is the Huffman decoder NOP instruction. The values 13, 14 and 15 do not work as bit counts; the value 15 is used when the Huffman state machine indicates that "SSSS" is to be used as the number of FLC bits to read.
B.2.2.4.2 Reading VLC
In this mode, Ignore Errors, Download, Alutab, Token, First Coeff and Special are zero, and VLC is one. Bypass is normally zero, so the index-to-data conversion takes place.
The binary number in Table[3:0] indicates which table is to be used, as follows:
Table B.2.3 Huffman Tables
Table[3:0]   VLC table used
0000         TCoefficient (MPEG and H.261)
0001         Coded Block Pattern
0010         Macroblock Address
0011         Motion Vector Data
0100         Intra M Type
0101         Predicted M Type
0110         Interpolated M Type
0111         H.261 M Type
10x0         JPEG (MPEG) DC Table 0
10x1         JPEG (MPEG) DC Table 1
11x0         JPEG AC Table 0
11x1         JPEG AC Table 1
Note that for the tables held in RAM (i.e., the JPEG tables), bit 1 is not used, so each table selection appears twice. If a non-baseline JPEG decoder were built, there would be four DC tables and four AC tables, and Table[1] would be needed.
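The table selection, including the unused bit 1 for the RAM-based tables, can be sketched as a lookup. The function name is hypothetical and the strings are simply English renderings of the table names; the hardware selects a table rather than returning text.

```python
# Fixed (ROM) tables for Table[3] = 0, indexed by Table[2:0].
FIXED_TABLES = [
    "TCoefficient (MPEG and H.261)",
    "Coded Block Pattern",
    "Macroblock Address",
    "Motion Vector Data",
    "Intra M Type",
    "Predicted M Type",
    "Interpolated M Type",
    "H.261 M Type",
]

def select_vlc_table(table):
    """Map Table[3:0] to the VLC table it selects (baseline JPEG: bit 1 ignored)."""
    if not 0 <= table <= 15:
        raise ValueError("Table[3:0] is a 4-bit field")
    if table & 0b1000 == 0:
        return FIXED_TABLES[table & 0b0111]
    number = table & 0b0001                      # bit 1 is a don't-care
    if table & 0b0100:
        return f"JPEG AC Table {number}"
    return f"JPEG (MPEG) DC Table {number}"
```

Both 1000 and 1010 select DC table 0, which is the "each selection appears twice" behavior noted above.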
If Table[3] is zero, the input data is inverted, as is needed to read the tables correctly in, for example, the H.261 format. If Table[3:0] = 0, the appropriate ring modification is also applied.
B.2.2.4.3 NOP Instruction
As noted earlier, the action of reading an FLC of zero bits is used as a No Operation instruction. No data (neither token nor serial data) is read from the input port, and the Huffman decoder outputs data of value zero together with the instruction word.
B.2.2.4.4 First Coefficient of TCoefficient
The H.261 and MPEG TCoefficient table has a special non-Huffman code that is used for the first coefficient in a block. To decode a TCoefficient at the start of a block, the First Coefficient bit can be set together with a VLC instruction that uses table zero (Table[3:0] = 0). One of the many effects of the First Coefficient bit is to allow this code to be decoded.
Note that, in normal operation, the "simple" command to read a TCoefficient VLC is seldom issued. This is because control is usually handed to the Huffman decoder by setting the Special bit.
B.2.2.4.5 Reading Token Words
To read a token word, the Token bit must be set to one. The Special and First Coefficient bits must be zero. The VLC bit must also be set if the Table[0] bit is to work correctly.
In this mode, the Table[0] and Table[1] bits modify the behavior of token reading as follows:
Bit        Meaning
Table[0]   Discard padding bits of serial data
Table[1]   Discard all serial data
If both Table[0] and Table[1] are zero, the presence of serial data before the token is considered an error and is flagged as such.
If Table[1] is set, all serial data is discarded until a token word is encountered. The presence of this serial data does not cause an error.
If Table[0] is set, padding bits are discarded. It is, of course, necessary to know the polarity of the padding bits. This is determined by Table[3], exactly as when reading VLC data. If Table[3] is zero, the input data is first inverted and then any bits that are "one" are discarded. If Table[3] is set to one, the input data is not inverted, but "one" bits are still discarded. Since the action of inverting the data according to the Table[3] bit depends on the VLC bit, that bit must be set to one. If a bit is encountered that is not a padding bit (i.e., a "one" bit in H.261 and MPEG data), an error is reported.
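The three token-reading behaviors can be modelled as below. This is a sketch under stated assumptions: the function name and the (discarded, error) return value are hypothetical, the VLC bit is assumed to be set so that Table[3] controls inversion, and Table[1] is assumed to take priority when both low bits are set, which the text does not specify.

```python
def read_token_word(serial_bits, table):
    """Model the serial data seen before a token word.

    serial_bits: raw serial bits preceding the token.
    table: the Table[3:0] field of the command.
    Returns (number_of_bits_discarded, error_flag).
    """
    discard_all = bool(table & 0b0010)        # Table[1]
    discard_padding = bool(table & 0b0001)    # Table[0]
    invert = (table & 0b1000) == 0            # Table[3] = 0: invert input first

    if discard_all:                           # assumption: Table[1] wins over Table[0]
        return len(serial_bits), False
    if not discard_padding:
        return 0, bool(serial_bits)           # any serial data is an error
    for count, bit in enumerate(serial_bits):
        if (bit ^ 1 if invert else bit) != 1:
            return count, True                # not a padding bit: report an error
    return len(serial_bits), False
```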
Note that these instructions read only a single token word. The state of the extension bit is ignored; it is the responsibility of the Demux to test it and act accordingly. The present invention also provides instructions that read multiple words; see the section on special instructions.
B.2.2.4.6 ALU Registers Specify Table
If the "Alutab" bit is set, registers in the ALU register file can be used to determine the actual table number to use. The table number supplied in the command, together with the VLC bit, determines which ALU register is used.
Table B.2.4 Selection of ALU Register
VLC   Table[3:0]   ALU register
0     X0XX         fwd_r_size
0     X1XX         bwd_r_size
1     X0XX         dc_huff[compid]
1     X1XX         ac_huff[compid]
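Table B.2.4 amounts to a two-input selection on the VLC bit and Table[2]. The sketch below returns the register name as a string purely for illustration; the function name and the `compid` argument default are hypothetical.

```python
def select_alu_register(vlc, table, compid=0):
    """Return the name of the ALU register chosen when Alutab is set.

    Only the VLC bit and Table[2] take part in the selection (Table B.2.4).
    """
    bit2 = (table >> 2) & 1
    if vlc:
        prefix = "ac_huff" if bit2 else "dc_huff"
        return f"{prefix}[{compid}]"
    return "bwd_r_size" if bit2 else "fwd_r_size"
```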
In the case of fixed-length codes, the value read from the register is the number of bits used to decode each vector. If r_size is zero, a NOP instruction results.
In the case of Huffman codes, the table number produced has Table[3] set to one, so the resulting number refers to one of the JPEG tables.
B.2.2.4.7 Special Instructions
The instructions (or modes of operation) described so far are all considered "simple" instructions. For each command received, the appropriate amount of input data (serial data or token data) is read and the resulting data is output. If no error is detected, each command produces exactly one output.
In the present invention, special instructions have the property that a single instruction can produce more than one output word. To achieve this, the Huffman decoder's internal state machine takes control and issues the necessary instructions to itself until it decides that the instruction requested by the Parser is complete.
In every special instruction, the first real instruction of the sequence to be executed is issued with the Special bit set to one. This means that every sequence must have a unique first instruction. The advantage of this scheme is that the first real instruction of the sequence is available without performing a lookup based on the command received from the Parser.
There are four recognized special instructions:
·TCoefficient
·JPEG DC
·JPEG AC
·Token
The first of these instructions reads the transform coefficients of H.261, MPEG and so on, until the symbol that terminates the block is read. If the block is a non-intra block, this command reads the entire block. In this case the "First Coefficient" bit should be set so that the first-coefficient trick is used. If the block is an intra block, the DC term will already have been read, so the "First Coefficient" bit should be zero.
For an intra block in H.261, the DC term is read with a "simple" instruction that reads an 8-bit FLC value. In MPEG, the "JPEG DC" special instruction described below is used.
The "JPEG DC" command is used to read a JPEG-style DC term (including the SSSS bits of FLC indicated by the VLC). This command is also used in MPEG. The First Coefficient bit must be set so that the counter in the Index to Data unit (which counts coefficients) is reset.
The "JPEG AC" command is used to read the remainder of a block after the DC term has been read, until EOB is encountered or the 64th coefficient has been read.
The "Token" command is used to read an entire token. Token words are read until the extension bit is clear. This is a convenient way of dealing with unrecognized tokens.
B.2.2.4.8 Downloading Tables
In the present invention, the Huffman decoder tables can be downloaded using the "Download" bit. The first step is to indicate which table is to be downloaded. This is done by issuing a command to read an FLC with both the Download bit and the First Coefficient bit set. The command is treated as a NOP, so no bits are actually read, but the table number is stored in a register and is used during the subsequent download to identify which table is being loaded.
Table B.2.5 JPEG Tables
Table[3:0]   Table nominated
10XX         JPEG DC Codes per bit
11XX         JPEG AC Codes per bit
00XX         JPEG DC Index to Data
01XX         JPEG AC Index to Data
As the table above shows, either an AC table or a DC table can be loaded, and Table[3] determines whether it is the codes-per-bit table or the Index to Data table that is loaded.
Once a table has been nominated, data can be downloaded into it. This is done by issuing a command to read the required number of FLC bits (always 8) with the Download bit set (and the First Coefficient bit zero). The decoded data is then written into the nominated table. The data is written at the current address, and the address counter is then incremented. The address counter is reset to zero whenever a table is nominated.
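The nominate-then-write sequence can be modelled with a small class. This is a behavioral sketch only: the class and method names are hypothetical, tables are keyed by the raw Table[3:0] value, and the storage is a Python dictionary rather than the RAM of the actual design.

```python
class TableDownloader:
    """Toy model of the table download sequence (names are illustrative)."""

    def __init__(self):
        self.tables = {}        # Table[3:0] value -> {address: byte}
        self.nominated = None
        self.address = 0

    def nominate(self, table):
        """Download + First Coefficient set: remember the table, reset the address."""
        self.nominated = table
        self.address = 0
        self.tables.setdefault(table, {})

    def download(self, byte):
        """Download set, First Coefficient clear: write one 8-bit value."""
        if self.nominated is None:
            raise RuntimeError("no table has been nominated")
        self.tables[self.nominated][self.address] = byte & 0xFF
        self.address += 1       # the address counter advances after each write
```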
When the Index to Data tables are being downloaded, the data and addresses are monitored. Note that the address is the Huffman index number and that the data loaded at that address is the final decoded symbol. This information is used to load automatically the registers that hold the Huffman index numbers of the symbols of interest. Thus, in a JPEG AC table, when a value corresponding to ZRL is recognized, the current address is written into register CED_H_KEY_ZRL_INDEX0 or CED_H_KEY_ZRL_INDEX1, as indicated by the table number.
Because decoded data is written into the codes-per-bit table in the phase after it is decoded, data cannot be read from the table during that phase. An instruction that attempts to read a VLC immediately after a table download will therefore fail. There is no reason for such a sequence to occur in any practical application (i.e., when performing JPEG decoding). It is, however, possible to construct simulation tests that do this.
B.2.2.5 Huffman State Machine
According to the present invention, the Huffman state machine supplies commands to the Huffman decoder; in some cases these commands are generated internally. Every command that the internal state machine can generate can also be supplied to the Huffman decoder by the Demux.
The basic structure of the state machine is as follows. When a command is issued to the Huffman decoder, it is also stored in a set of auxiliary latches so that it can be reused later. The command is executed by the Huffman decoder and is also analyzed by the Huffman state machine. If the command is recognized as one of the known command sequences and the Special bit is set, the Huffman decoder state machine takes over control of the Huffman decoder from the Parser state machine.
At this point, there are three sources of instructions for the Huffman decoder:
1) The Parser state machine. This choice is made when the special instruction has finished (for example, when EOB has been decoded), and the next Demux command is accepted.
2) The Huffman state machine. The Huffman state machine can supply an arbitrary command to itself.
3) The original instruction issued by the Parser state machine to start the instruction sequence.
In case (2), the table number may be supplied by feedback from the Index to Data unit, in which case it replaces the corresponding field in the Huffman state machine ROM.
In case (1), on some occasions the table number is supplied by values obtained from the ALU register file (for example, AC and DC table numbers and F-table numbers). These values are stored in the auxiliary command store, so that when the command is later reused, the table number is the one that was stored. The table number is not fetched from the ALU again because, in general, the counters will have advanced to point to the next block.
Because the choice of the next instruction depends on the decoded data, the decision can only be made late in the cycle. The general structure is therefore that all possible instructions are prepared in parallel and multiplexed together, and the actual instruction is selected late in the cycle.
Note that, in each case, besides determining the instruction to be used by the Huffman decoder in the next cycle, the state machine ROM also determines the instruction that accompanies the current data as it passes to the Index to Data unit and then on to the ALU. In exactly the same way, all three of these instructions are prepared in parallel and the choice is made late in the cycle.
Again, there are three choices for this part of the instruction, corresponding to the three choices for the next Huffman decoder instruction:
1) A constant instruction that is appropriate for end of block.
2) The Huffman state machine. The Huffman state machine can supply an arbitrary instruction for the Index to Data unit.
3) The original instruction issued by the Parser to start the instruction sequence.
B.2.2.5.1 EOB Comparator
The output of the EOB comparator effectively forces selection of the constant instruction, which is presented to the Index to Data unit, and also causes the next Huffman instruction to be the next instruction from the Parser. The exact function of the comparator is controlled by bits in the Huffman state machine ROM.
Behind the comparator are four registers, which hold the index of the EOB symbol in each of the AC and DC JPEG tables. The DC tables, of course, have no end-of-block symbol, but they do have a zero-size symbol, which is produced by a DC difference of zero. Since this causes zero FLC bits to be read, exactly as the EOB symbol does, the two are treated identically.
In addition to the four index values held in registers, the constant value 1 can also be used. This is the index of the EOB symbol in H.261 and MPEG.
B.2.2.5.2 ZRL Comparator
In the present invention, this is the more general-purpose comparator. It is used to select either the Huffman state machine instruction or the original instruction for use by the Index to Data unit.
Behind the ZRL comparator are four values. Two are held in registers and contain the index of the ZRL code in the AC tables. The other two are constants: one is zero, and the other is 12 (the index of ESCAPE in MPEG and H.261).
The constant zero is used in the FLC case. The constant 12 is used when the table number is less than 8 (and VLC is set). If the table number is greater than 7 (and VLC is set), one of the two registers is used, as determined by the low-order bit of the table number.
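The choice of the value against which the ZRL comparator compares can be sketched as follows. The function and argument names are hypothetical, and the reading that the constant zero applies to the FLC case is an interpretation of the text rather than something it states outright.

```python
ESCAPE_INDEX = 12   # index of ESCAPE in the MPEG and H.261 table, per the text

def zrl_reference(vlc, table, zrl_index_regs):
    """Return the value the ZRL comparator compares the decoded index against.

    zrl_index_regs: the two downloaded ZRL index registers (JPEG AC tables 0 and 1).
    """
    if not vlc:
        return 0                          # FLC case (assumed use of the constant zero)
    if table < 8:
        return ESCAPE_INDEX               # fixed MPEG / H.261 tables
    return zrl_index_regs[table & 1]      # low-order bit of the table number
```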
One bit in the state machine ROM enables the comparator, and another inverts its function.
If the Token bit in the instruction is set, the comparator output is ignored and is replaced by the extension (extn) bit. This allows operation to continue until the end of the token.
B.2.2.5.3 Huffman State Machine ROM
The instruction fields in the Huffman state machine ROM are as follows:
nxtstate[4:0]
This is the address to be used in the next cycle. It can be modified.
statectl
Permits modification of the next-state address. If zero, the state machine address is not modified; otherwise the least significant bit of the address is replaced by the value of one of the two comparators, as follows:
nxtstate[0]   Action
0             Replace the least significant bit with the EOB match
1             Replace the least significant bit with the ZRL match
Note: in all cases, if the next Huffman instruction is chosen to be "re-run original command", the state goes to location 0, 1, 2 or 3, as appropriate to the command.
eobctl[1:0]
This controls the selection of the next Huffman instruction according to the EOB comparator and the extension bit, as follows:
eobctl[1:0]   Action
00            No effect; see zrlctl[1:0]
01            Take a new (Parser) command if EOB
10            Take a new (Parser) command if extn is low
11            Unconditional Demux instruction
zrlctl[1:0]
This controls the selection of the next Huffman instruction according to the ZRL comparator. If the condition is met, the state machine instruction is taken; otherwise the original instruction is re-run. In all cases, if an eobctl[1:0] condition takes a Demux instruction, that eobctl[1:0] condition takes priority:
zrlctl[1:0]   Action
00            Never take the SM command (always re-run)
01            Always take the SM command
10            Take the SM (state machine) command if ZRL matches
11            Take the SM (state machine) command if ZRL does not match
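Taken together, the eobctl and zrlctl fields describe a three-way choice for the next Huffman instruction. The following sketch encodes the two tables directly; the function name and the string results are hypothetical, and the real design prepares all three candidates in parallel and selects among them late in the cycle rather than branching.

```python
def next_huffman_instruction(eobctl, zrlctl, eob_match, zrl_match, extn):
    """Choose the source of the next Huffman instruction from the ROM fields.

    Returns "parser", "state_machine" or "rerun_original".
    """
    # eobctl conditions take priority over zrlctl.
    take_parser = (
        eobctl == 0b11
        or (eobctl == 0b01 and eob_match)
        or (eobctl == 0b10 and not extn)
    )
    if take_parser:
        return "parser"
    take_sm = (
        zrlctl == 0b01
        or (zrlctl == 0b10 and zrl_match)
        or (zrlctl == 0b11 and not zrl_match)
    )
    return "state_machine" if take_sm else "rerun_original"
```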
smtab[3:0]
In the present invention, if the selected instruction is the state machine instruction, this is the table number that the Huffman decoder will use. If the ZRL comparator matches, however, the zrltab[3:0] field is used in preference.
If different table numbers are not needed depending on whether ZRL matches, smtab[3:0] and zrltab[3:0] can hold the same value. Note, however, that this can cause some odd modelling problems in Lsim. In MPEG there is no obvious requirement to load the registers that hold the Huffman index number of ZRL (a JPEG-only construct). These registers are still selected, however, and the output of the ZRL comparator becomes "unknown". Even though smtab[3:0] and zrltab[3:0] hold the same value in every case in which the ZRL comparator may be "unknown" (so that it does not matter which is selected), the next state is still "unknown".
zrltab[3:0]
If the selected instruction is the state machine instruction and the ZRL comparator matches, this is the table number that the Huffman decoder will use.
smvlc
If the selected instruction is the state machine instruction, this is the VLC bit that the Huffman decoder will use.
aluzrl[1:0]
This field controls the selection of the instruction passed to the ALU. The instruction is either the command from the Parser state machine (stored when the command sequence started) or the command from the state machine:
aluzrl[1:0]   Action
00            Always take the stored Parser state machine command
01            Always take the Huffman state machine command
10            Take the Huffman SM command if not EOB
11            Take the Huffman SM command if not ZRL
alueob
This bit controls modification of the instruction passed to the ALU according to the EOB comparator. It simply forces the ALU output mode to "zinput". This is an arbitrary choice; any output mode will do as long as it is not "none". Its purpose is to ensure that the end-of-block command word is passed on to the Token Formatter block, where it controls the correct formatting of DATA Tokens:
alueob   Action
0        Do not modify the ALU outsrc field
1        If EOB matches, force "zinput" into outsrc
The remaining fields are the ALU instruction fields. These are properly documented in the description of the ALU.
B.2.2.5.4 Modification of the Huffman State Machine
In one particular embodiment of the state machine, the Index to Data unit needs to "know" when the RUN part of an escape-coded TCoefficient is being passed to it. Although this could be done with an appropriate bit in the control ROM, an alternative was used to avoid changing the ROM: the address entering the ROM is monitored and the address value 5 is detected. This is the location in the ROM that handles the RUN field. Obviously, some other address value could have been chosen when the ROM was programmed. Alternatively, the method mentioned earlier, using a bit in the control ROM, could have been used.
B.2.2.6 Guide to the Schematics
In the present invention, the Huffman decoder is called "hd". Logically, "hd" actually includes the Index to Data unit (this is required by limitations in the generation of compiled code). "hd" therefore contains the following major blocks:
Table B.2.6 Huffman Modules
Module name   Description
hddp          Huffman decoder (arithmetic) datapath
hdstdp        Huffman state machine datapath
hfitod        Index to Data unit
The following description of the Huffman modules is given by explaining, in general terms, the various subsystem areas. These subsystems are shown in more detail in the figures, which will be readily understood by a person of ordinary skill in the art.
B.2.2.6.1 Description of "hd"
The logic for two-wire interface control generally covers the three ports controlled by the two-wire interface: data input, data output and command. In addition, there are two "valid" lines from the input shifter: token_valid, which indicates that a token is present on in_data[7:0], and serial_valid, which indicates that data is being transferred serially.
Among the signals generated, the most important are the enable signals sent to the latches, and of these the most important are the enables for the ph1 latches. Most ph0 latches are not enabled; only two are, namely eo, which is associated with serial data, and eot, which is associated with token data.
In the present invention, the signals associated with "done" (done, notdone, and their ph0 derivatives done0 and notdone0) indicate when a primitive Huffman command has completed.
When a Huffman state machine command is being executed, "done" is asserted as each primitive command that makes up the whole state machine command completes. The notnew signal prevents a new command from the Parser state machine from being accepted until the entire Huffman state machine command has completed.
As for the control of information received from the Index to Data unit, the control logic feeds the "size" field back to the Huffman decoder during the decoding of JPEG coefficients. Two cases actually arise. If the size is exactly 1, the feedback is signalled by the special signal notfbone0. Otherwise the size is fed back from the output of the Index to Data unit (out_data[3:0]), and the signal fbvalid1 indicates that this is happening. The signal muxsize is generated to control the multiplexing of the feedback data into the command register (see sheet 10).
There is also feedback indicating that exactly 64 coefficients have been decoded. Because in JPEG the EOB is not coded in this case, the signal forceeob is generated. There are actually two ways in which this is done, analogous to the two size-feedback signals described above: either jpegeob (a ph1 signal) is used, or jpegeob0 is used. Note that in the normal feedback case (jpegeob), latch i-971 is loaded with the feedback data and is not cleared until a new Parser state machine command is accepted. The signal forceeob is actually generated only after a Huffman code has been decoded. Fixed-length codes (i.e., the size bits) are therefore unaffected, but the next Huffman-coded information is replaced by the forced end of block. When the size is 1 and jpegeob0 is used, only one bit is read, so i_1255 and i_1256 delay the signal to the correct time. Note that a size of zero cannot occur in this situation, because the only symbols with size zero are EOB and ZRL.
The decoding of the command that produces tcoeff_tab0 (Huffman decoding using the Tcoeff table), mba_tab0 (Huffman decoding using the MBA table) and nop (no operation) is fairly arbitrary logic. There are several reasons for generating nop. The first is a fixed-length code of size zero; the second is the forceeob signal (because, even though EOB is being signalled on the output, no data should be read from the input shifter); and the third is the nomination of a table for download.
notfrczero (produced by an FLC of size zero or by a NOP) ensures that the result is zero when a NOP instruction is used. In addition, invert indicates when the serial data should be inverted before Huffman decoding (see section B.2.2.1.1), and ring indicates when the transform coefficient ring should be applied (see section B.2.2.1.2).
Decoding is also provided for addressing the Codes_Per_bit ROMs, which are built as a number of small datapath ROMs. The signals are duplicated (e.g. csha and csla) purely to obtain sufficient drive, which is achieved by splitting each ROM into two halves. The address is taken either from the bit counter (bit[3:0]) or from the MPI address (key_addr[3:0]), depending on whether UPI access to that part is selected.
Additional decoding handles UPI reads of certain registers, such as those holding the Huffman index values (EOB, ZRL, etc.) for the JPEG tables. It also covers control of the tri-state drivers associated with these registers and UPI reads of the Codes_Per_bit RAMs.
Decoding is also provided for some important terms in the arithmetic datapath. first_bit is used in connection with the Tcoeff first-coefficient technique, and bit_five concerns the use of the ring in the Tcoeff table. Note the use of forceeob, which simulates the action of the EOB comparator matching the decoded index value.
As for the extension bit: if a token is read from the input shifter, the associated extension bit is read along with it. Otherwise the last value of the extension bit is retained. This allows the microcode program to test the extension (extn) bit at any time after a token has been read.
When zerodat is asserted, the upper four bits of the Huffman output data are forced to zero. Because these bits hold valid values only when a fixed-length code is decoded, they are zeroed when a VLC or a token is decoded, or when a NOP instruction is executed for any reason.
Further circuitry detects when each command is complete and generates the "done" signal. The reasons for being "done" fall into two classes: normal reasons and exceptional reasons. Each class is handled by one of two three-way multiplexers.
The lower multiplexer (i-1275) handles the normal reasons. In the case of an FLC, the ndnflc signal is used; this is the output of the comparator that compares the bit counter with the table number. In the case of a VLC, the ndnvlc signal is used; this is an output of the arithmetic datapath and directly reflects Equation 9. In the case of a NOP instruction or a token, only one cycle is required, so the system is unconditionally "done".
In the present invention, the upper multiplexer (i-1274) handles the exceptional reasons. If the decoder is expecting size feedback (fbexpctdO) in JPEG decoding and the size is one (notfboneo), the decoder is done, because only one bit is required. If the decoder is processing the first bit of the first coefficient using the Tcoeff table, decoding is done if bit 0 of the current index is zero (see section B.2.2.1.2). If neither condition is met, there is no exceptional reason for being done.
The NOR gate (i-1293) finally resolves the "done" condition. The condition produced by i_570 (i.e. data not valid) forces "done". This may seem a little strange; it is used mainly just after reset, to force the machine into its "done" state in preparation for the first command ("done" resets all counters, registers and so on). Note that any error condition also forces "done".
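The way the two multiplexers and the final gate combine can be summarised behaviourally. The following Python sketch is illustrative only: the function and argument names are hypothetical, the active-low signals (ndnflc, ndnvlc and so on) are modelled as plain booleans that are true when the condition holds, and gate-level polarity and timing are ignored.

```python
def command_done(kind, flc_bits_read=False, vlc_resolved=False,
                 size_feedback_is_one=False,
                 first_tcoeff_bit_index0_zero=False,
                 data_invalid=False, error=False):
    """Behavioural model of the "done" decision (all names are hypothetical).

    kind is one of "FLC", "VLC", "NOP" or "TOKEN".
    """
    # Normal reasons (the lower multiplexer in the text).
    if kind == "FLC":
        normal = flc_bits_read      # comparator on the bit counter
    elif kind == "VLC":
        normal = vlc_resolved       # output of the arithmetic datapath
    elif kind in ("NOP", "TOKEN"):
        normal = True               # a single cycle is always enough
    else:
        raise ValueError(f"unknown command kind: {kind}")

    # Exceptional reasons (the upper multiplexer in the text).
    exceptional = size_feedback_is_one or first_tcoeff_bit_index0_zero

    # Final resolution: invalid data and any error also force "done".
    return normal or exceptional or data_invalid or error
```

In this model a NOP is always done, while an FLC is done only once its comparator fires or when one of the forcing conditions applies.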
The notdonex signal is needed for the detection of errors. The normal "done" signals cannot be used, because "done" is itself forced when an error is detected; using them could create a combinatorial feedback loop.
Error detection and handling are performed by circuitry that detects all of the possible error conditions. These conditions are ORed together in i_1190. i_1193, i_585 and i_584 form the three-bit Huffman error register. Note i_1253 and i-1254, which disable the error in those cases where there is no "real" error (see section B.2.2.3).
In addition, i_580, i_579 and their associated circuitry form a simple state machine that controls acceptance of the first command after an error has been detected.
As described earlier, the control signals are delayed to match the pipeline delays in the data directory unit and the ALU.
Itod_bypass is the actual bypass signal passed to the data directory unit. It is modified when a fixed-length code is being decoded while the Huffman state machine is in control, in which case bypass is forced.
Aluinstr[32] is the bit that causes the ALU to feed back (with the condition codes) to the Parser state machine. When the Huffman state machine is in control, it is important that this signal is asserted only once, rather than each time one of the commands making up the original command completes.
Aluinstr[36] is the bit that allows the ALU to step the block counter (provided other ALU instruction bits also specify an increment). This, too, must be asserted only once.
Furthermore, these bits must be asserted only for ALU instructions that output data to the token formatter. Otherwise the counters could be incremented before the first output to the token formatter, giving an incorrect "CC" value in a DATA token.
In the illustrated embodiment of the present invention, either alunode[1] or alunode[0] is low whenever the ALU outputs to the token formatter.
Figure 118, which is similar to Figure 27, shows the Huffman state machine datapath, referred to as "hdstdp". There is also UPI decoding for reading the output of the Huffman state machine ROM.
Multiplexing is provided to handle the case in which the table number is specified by the ALU register file (see B.2.4.6).
Aluinstr[3:2] is modified in order to force the outsrc field of the ALU instruction to a value other than none (non_none); see the description of alueob in section B.2.2.5.3.
In the command register for the Huffman decoder block, each bit of the command has an associated multiplexer that selects among the possible sources of the command. Four control signals govern this selection:
Selhold, which causes the register to hold its current state.
Selnew, which loads a new command from the Parser state machine. It also enables loading of the registers that retain the original Parser state machine command for later use.
Selold, which reloads the original Parser state machine command from those registers.
Selsm, which loads a command from the Huffman state machine ROM.
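The four-way selection above can be sketched as a small behavioural model. This is an illustration under stated assumptions rather than the circuit itself: the function name and arguments are hypothetical, and the sketch assumes that exactly one of the four select signals is active at a time, which the text does not state explicitly.

```python
def next_command(current, saved, new_cmd, rom_cmd,
                 selhold=False, selnew=False, selold=False, selsm=False):
    """Return (next command register value, next saved original command).

    Hypothetical model: 'saved' stands for the registers that keep the
    original Parser state machine command for later use.
    """
    if [selhold, selnew, selold, selsm].count(True) != 1:
        raise ValueError("exactly one select signal is assumed to be active")
    if selhold:
        return current, saved       # keep the present state
    if selnew:
        return new_cmd, new_cmd     # load and also remember the new command
    if selold:
        return saved, saved         # restore the original Parser command
    return rom_cmd, saved           # selsm: take the state machine ROM command
```

A typical sequence in this model would be selnew (accept a Parser command), one or more selsm steps (the Huffman state machine substitutes its own commands), then selold (return to the original command).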
In the case of the table number, the situation is a little more complicated, because the table number may also be loaded from the output data of the data directory unit (selholdt and mux size). Latches hold the current address in the Huffman state machine ROM. Logic detects which of the four possible commands is being executed. In the case of a new command, these signals are combined to form the lowest two bits of the start address.
Logic also detects when the output of the state machine ROM is meaningless (usually because the command is a "simple" command). The signal notignorerom effectively disables operation of the state machine, in particular preventing any modification of the instruction passed to the ALU.
The circuit that generates fixstateO controls the limited jumping capability of this state machine.
Decoding is also provided to drive the signals into the Huffman state machine ROM. This ROM is a combined ROM in datapath form.
The generation of escape_run is described in section B.2.2.5.4.
Decoding is also provided for the registers that hold the index numbers of symbols such as ZRL and EOB. These registers can be loaded from the UPI or from the datapath. The decoding in the centre (es[4:0] and ZS[3:0]) generates select signals for the multiplexers that choose which register or constant value is compared with the decoded Huffman index.
Control logic for the Huffman state machine is also provided. Here, the "instruction" bits from the Huffman state machine ROM are combined with various conditions to determine what to do next and how to modify the instruction word destined for the ALU.
In the present invention, the signals notnew, notsm and notold are used on sheet 10 to control the operation of the Huffman decoder command register. They are generated from the states of certain control bits in the state machine ROM (see the description in section B.2.2.5.3) together with the Huffman index comparators (neobmatch and nzrlmatch).
The source of the instruction passed to the ALU is also selected. The actual multiplexing is done in the Huffman state machine datapath "hfstdp". Four control signals are generated.
When end of block has not been encountered, one of aluseldmx (selecting the Parser state machine instruction) or aluselsm (selecting the Huffman state machine instruction) is generated.
When end of block has been encountered, one of aluseleobd (selecting the Parser state machine instruction) or aluseleobs (selecting the Huffman state machine instruction) is generated. In addition, the "outsrc" field of the ALU instruction is modified to force it to "zinput".
During table download, a register holds the number of the nominated table. Decoding is provided for the Codes_Per_bit RAMs. Additional decoding recognizes when symbols such as EOB and ZRL are downloaded, so that the Huffman index registers can be loaded automatically.
As for the bit counter, a comparator detects when the correct number of bits has been read for an FLC.
B.2.2.6.2 description of "hddp"
Comparators detect specific values of the Huffman index. Registers hold these values for the downloadable tables. Multiplexers (meob[7:0] and mzr[7:0]) select the value to be used, and gating together with some XOR gates forms the comparators.
The adders and registers directly compute the equations described in section B.2.2.1; no further explanation is needed here. An XOR gate (i_807) is used to invert the data, as described in section B.2.2.1.1.
The "code" register is 12 bits wide. A multiplexing arrangement implements the "ring" substitution described in section B.2.2.1.2.
Further circuitry covers the multiplexing between the decoded serial data (index[7:0]) and the token data (ntokenO[7:0]), the associated pipeline delays, and the decoding of the Huffman index value for the ZRL and EOB symbols.
The Codes_Per_bit ROMs and their multiplexing determine which table is used. This structure is used because the table-selection information arrives late: all the tables are accessed, and the correct one is then selected.
The Codes_Per_bit RAMs, and the final multiplexing between the Codes_Per_bit ROM and Codes_Per_bit RAM outputs, are in the "hdepbram" block.
B.2.2.6.3 description of "hdstdp"
In the present invention, "Hdstdp" comprises two modules. "hdstdel" delays the control bits from the Parser state machine to the appropriate pipeline stage, for example for when they are applied to the ALU and the token formatter. It handles only about half of the instruction word sent to the ALU; the remainder is handled by the other module, "hdstmod".
"Hdstmod" contains the Huffman state machine ROM. Some bits of its instruction are used by the Huffman state machine control logic. The remaining bits are used to replace the part of the ALU instruction word (from the Parser state machine) that is not handled in "hdstdel".
"Hdstdel" is self-explanatory and needs no description; it contains only pipeline delay registers.
"Hdstmod" is also quite simple: it consists of a ROM and some multiplexers used to modify the ALU instruction. The rest of the circuitry provides UPI read access to the halves of the Huffman state machine ROM output. Buffers are also used for a number of control signals.
B.2.3 token formatter
According to the present invention, the Huffman decoder token formatter sits at the end of the Huffman block. As its name suggests, its function is to format the data from the Huffman decoder into the appropriate token structure. The input data is multiplexed with data in the microinstruction word under the control of the microinstruction word's command field. The block has two modes of operation: DATA_WORD and DATA_TOKEN.
B.2.3.1 microinstruction word
Table B.2.7: the microinstruction word consists of seven fields
Field name            Bits
Token                 0:7
Mask                  8:11
Block type (Bt)       12:13
External Extn (Ee)    14
Demux Extn (De)       15
End of block (Eb)     16
Command (Cmd)         17
Bit position:   17     16     15     14     12     8      0
Field:          Cmd    Eb     De     Ee     Bt     Mask   Token
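The field layout in Table B.2.7 can be exercised with a small pack/unpack helper. The sketch below is only an illustration of the bit positions listed in the table; the dictionary and function names are hypothetical and do not appear in the original design.

```python
# Field positions taken from Table B.2.7: name -> (lowest bit, width).
FIELDS = {
    "token": (0, 8),
    "mask": (8, 4),
    "bt": (12, 2),
    "ee": (14, 1),
    "de": (15, 1),
    "eb": (16, 1),
    "cmd": (17, 1),
}


def pack(**values):
    """Build an 18-bit microinstruction word; unspecified fields are zero."""
    word = 0
    for name, value in values.items():
        low, width = FIELDS[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bit(s)")
        word |= value << low
    return word


def unpack(word):
    """Split a microinstruction word back into its seven fields."""
    return {name: (word >> low) & ((1 << width) - 1)
            for name, (low, width) in FIELDS.items()}
```

For example, `pack(token=0x14, mask=2, bt=3, cmd=1)` sets bit 17 for Data_Token mode and places the block type in bits 13:12.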
The microinstruction word is governed by the same accept signal as the data word.
B.2.3.2 operating modes
Table B.2.8: bit allocation
Cmd     Mode
0       Data_Word
1       Data_Token
B.2.3.2.1 data word
In this mode, the upper bits of the input (in_data[16:8]) are passed straight to the output. The lowest eight bits are either the input, the Token field of the microinstruction word, or a mixture of the two, depending on the Mask field. The mask gives the number of input bits in the mixture, that is:
out_data[16:8] = in_data[16:8]
out_data[7:0] = (Token[7:0] & (0xff << mask)) | in_data[7:0]
When the mask is set to 0x8 or greater, the output data equals the input data. This mode is used to output the words of non-DATA tokens. When the mask is set to 0, out_data[7:0] is the Token field of the microinstruction word. This mode is used to output token headers that contain no data. When a token header does contain data, the number of data bits is given by the Mask field.
If External Extn (Ee) is set, out_extn = in_extn; otherwise out_extn = De.
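The data-word mode can be sketched in a few lines. The following Python is a hedged illustration, not the circuit: the function name is hypothetical, the token term and the input term are assumed to be combined with a bitwise OR, and the shifted mask is truncated to eight bits. Under that assumption, a mask of 0 yields the pure Token field only when the low input byte is zero, and the input is expected to carry no bits above the mask width.

```python
def format_data_word(in_data, in_extn, token, mask, ee, de):
    """Model of DATA_WORD mode for a 17-bit data word (hypothetical helper)."""
    # Token bits that survive the mask, limited to the low byte.
    kept_token = token & (0xFF << mask) & 0xFF
    # Assumption: the two terms are combined with a bitwise OR.
    low = kept_token | (in_data & 0xFF)
    out_data = (in_data & 0x1FF00) | low   # bits 16:8 pass straight through
    out_extn = in_extn if ee else de       # Ee selects the extension source
    return out_data, out_extn
```

With `mask=2`, for instance, the two lowest bits of the output come from the input and the upper six bits of the low byte come from the Token field.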
Bt and Eb are "don't care".
B.2.3.2.2 data token
This mode is used to format DATA tokens. It has two functions, depending on the signal first_coefficient, which is set at reset. When the first data coefficient arrives together with a microinstruction word in which Cmd is set to 1, out_data[16:2] is set to 0x1 and out_data[1:0] takes the value of the Bt field of the microinstruction word. This is the header of the DATA token. When this word has been accepted, the coefficient that accompanied the command is loaded into register RL, and first_coefficient takes the value of Eb. When the next coefficient arrives, out_data[16:0] takes the coefficient stored in RL; RL and first_coefficient are then updated. This ensures that when end of block is encountered and Eb is set, first_coefficient is set in readiness for the next DATA token, that is:
    if (first_coefficient)
    {
        out_data[16:2] = 0x1
        out_data[1:0] = Bt[1:0]
        RL[16:0] = in_data[16:0]
    }
    else
    {
        out_data[16:0] = RL[16:0]
        RL[16:0] = in_data[16:0]
    }
    out_extn = ~Eb
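The one-word delay introduced by register RL is easier to see when the pseudocode is stepped through. The Python model below is a sketch under stated assumptions: the class and method names are hypothetical, every call to step() is assumed to carry Cmd = 1, and out_extn is taken to be the inverse of Eb.

```python
class DataTokenFormatter:
    """Behavioural model of DATA_TOKEN mode (hypothetical names)."""

    def __init__(self):
        self.first_coefficient = True   # set at reset
        self.rl = 0                     # register RL

    def step(self, in_data, bt, eb):
        """Accept one coefficient; return (out_data, out_extn)."""
        if self.first_coefficient:
            # DATA token header: 0x1 in bits 16:2, block type in bits 1:0.
            out_data = (0x1 << 2) | (bt & 0x3)
        else:
            # Emit the coefficient held back from the previous step.
            out_data = self.rl
        self.rl = in_data & 0x1FFFF
        # first_coefficient takes the value of Eb for the next word.
        self.first_coefficient = bool(eb)
        out_extn = 0 if eb else 1       # inverse of Eb
        return out_data, out_extn
```

Feeding the coefficients 5 and 7 followed by a word with Eb set produces the header, then 5, then 7 with the extension bit low; the word that arrives with Eb is held in RL and, in this model, is overwritten when the next token starts.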
B.2.3.3 explanatory discussion
According to the present invention, most of the instruction bits are supplied in the normal way by the Parser state machine. Two of the fields, however, are actually supplied by other circuitry. The "Bt" field is connected directly to an output of the ALU block. This two-bit field supplies the current value of "CC", the colour component. Thus, when a DATA token header is constructed, its lowest two bits take the colour component directly from the ALU counters. Secondly, the "Eb" bit is asserted in the Huffman decoder whenever an End_of_block symbol is decoded (or, in the case of JPEG, when one is assumed because the last coefficient of the block has been coded).
The in_extn signal is derived in the Huffman decoder. It is meaningful only for tokens, where the extension bit is supplied along with the token word in the normal way.
B.2.4 Parser state machine
The Parser state machine of the present invention is actually a very simple circuit. The complicated part is the programming of the microcode ROM, which is discussed in section B.2.5.
In essence, the machine consists of a register that holds the current address. This address is looked up in the microcode ROM to produce the microcode word. The address is also incremented in a simple incrementer, and the incremented address is one of two candidate addresses for the next state. The other candidate is a field of the microcode word itself. Every instruction is therefore potentially a jump instruction and can jump to a location specified in the program. If no jump is taken, control passes to the next location in the ROM.
A set of 16 condition code bits is provided. Any one of these conditions can be selected (by a field in the microcode ROM). In addition, the selected condition can be inverted (by a further bit in the microcode ROM). The resulting signal selects between the incremented address and the jump address from the microcode ROM. One of the conditions is hard-wired to evaluate as "False". If this condition is selected, no jump occurs. If, on the other hand, this condition is selected and then inverted, a jump always occurs, giving an unconditional jump.
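The next-address selection described above amounts to one multiplexer driven by a selected and optionally inverted condition. The Python sketch below illustrates it; the function and parameter names are hypothetical, the index of the hard-wired False condition (15) follows the condition code table, and address width and wrap-around are not modelled.

```python
FALSE_CONDITION = 15  # condition hard-wired to False (see the condition table)


def next_address(address, jump_address, cond_select, invert, conditions):
    """Choose the next microcode address (hypothetical helper).

    conditions is a sequence of 16 booleans, one per condition code bit.
    """
    if len(conditions) != 16:
        raise ValueError("expected 16 condition code bits")
    # Select one condition, then optionally invert it.
    taken = bool(conditions[cond_select]) ^ bool(invert)
    # A taken jump uses the address field; otherwise fall through.
    return jump_address if taken else address + 1
```

Selecting `FALSE_CONDITION` without inversion always falls through to the next location, and selecting it with inversion always jumps, which is how an unconditional jump is obtained.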
Table B.2.9: condition code bits
Bit    Name          Description
0      user[0]       Bits 0 to 3 are connected to a register that the user can program from the MPI. They allow "user-defined" condition codes that can be tested at very little cost. Two of them are defined to control non-standard "Coded Block Pattern" processing, for experimenting with macroblock structures of 4 and 8 blocks.
1      user[1]
2      cbp_eight
3      cbp_special
4      he[0]         Bits 4 to 6 connect directly to the Huffman error register of the Huffman decoder.
5      he[1]
6      he[2]
7      Extn          Extension bit (for tokens)
8      Blkdtn        Block pattern shifter
9      MBstart       At macroblock start
10     Picstart      At picture start
11     Restart       At restart interval start
12     Chngdet       "Unnatural" change detect bit
13     Zero          ALU zero condition
14     Sign          ALU sign condition
15     False         Hard-wired to False
B.2.4.1 two-wire interface control
According to the present invention, two-wire interface control is a little unusual in this block. There is a two-wire interface between the Parser state machine and the Huffman decoder, which is used to control the progress of commands. The Parser state machine waits until a given command has been accepted; it then proceeds to read the next command from the ROM. In addition, the condition codes are fed back from the ALU via a single wire.
Each command has a bit in the microcode ROM that allows it to specify that it should wait for feedback. If this is specified, then once the instruction has been accepted by the Huffman decoder, no new command is presented until the feedback wire from the ALU is asserted. This wire, fb_valid, indicates that the condition codes currently being supplied by the ALU are valid, in the sense that they reflect the data associated with the command that requested the wait for feedback.
According to the present invention, this feature is used to construct conditional jump instructions that decide the next state to jump to according to the result of decoding (or processing) a particular piece of data. Without this facility it would be impossible to test any condition that depends on data in the pipeline, because two-wire interface control means that the time at which a given command reaches a given processing block (in this case the ALU) is uncertain.
Not all instructions are passed to the Huffman decoder. Some instructions can be executed without involving the data pipeline; these tend to be jump instructions. A bit in the microcode ROM selects whether the instruction is presented to the Huffman decoder. If it is not, the Huffman decoder never sees the instruction, so execution can continue in these cases even if the pipeline is stalled.
B.2.4.2 event handling
There are two event bits in the Parser state machine. One is the Huffman event and the other is the Parser event.
The Parser event is the simpler of the two. The "condition" monitored by this event is simply a bit in the microcode ROM. An instruction can therefore raise a Parser event by setting this bit. Typically, the instruction also writes a suitable constant into the rom_control register so that the interrupt service routine can determine the cause of the interrupt.
After the Parser event has been serviced (or immediately, if the event is masked), control resumes at the point where it left off. If the instruction that raised the event has a jump (whose condition evaluates to true), the jump is taken in the normal way. It is therefore possible to jump to an error handler after servicing, by coding the jump.
The Huffman event is slightly different. The condition being monitored is the OR of the three Huffman error bits. In practice this condition is handled in much the same way as the Parser event. When an error occurs, however, an additional wire from the Huffman decoder, huffintrpt, is asserted. This causes control to jump to an error handler in the microcode program.
Thus, when a Huffman error occurs, the sequence involves generating an interrupt and stopping the block. After servicing, control passes to the error handler. There is no "call" mechanism; unlike a conventional interrupt, control cannot return to the point in the microcode where the error occurred once the error handler has run.
It is possible for huffintrpt to be asserted without a Huffman error being generated. This happens in the "no error" error case discussed in section B.2.2.3. In that case no interrupt is sent (to the MPI), but control still passes to the error handler (in the microcode). Because the Huffman error register is clear in this case, the microcode error handler can determine that this is the situation and respond accordingly.
B.2.4.3 special locations
There are several special locations in the microcode ROM. Four locations in the ROM are entry points into the main program. At reset, control passes to one of these four locations, chosen according to the coding standard selected in the ALU register coding_std. Because that register is itself reset to zero by a true reset, control then passes to location zero. The Parser state machine can, however, also be reset on its own using the UPI register bit CED_H_TRACE_RST in CED_H_TRACE. In that case the coding_std register is not reset, and control passes to the appropriate one of the four locations.
A second group of four locations (0x004 to 0x007) is used when a Huffman interrupt occurs. Typically, each of these four locations holds an instruction that jumps to the actual error handler. Here too, the location is selected according to the coding standard.
B.2.4.4 trace
A trace mechanism is provided as a diagnostic aid. It allows the microcode to be single-stepped. The CED_H_TRACE_EVENT and CED_H_TRACE_MASK bits in the CED_H_TRACE register control this operation. As their names suggest, they operate in much the same way as the normal event bits. Because there are some differences (in particular, no UPI interrupt is ever generated), however, they are not grouped with the other event bits.
Tracing is enabled when CED_H_TRACE_MASK is set to 1. Each microcode instruction is read from the ROM, but before it is presented to the Huffman decoder a trace event occurs and CED_H_TRACE_EVENT becomes 1. This bit must be polled, because no interrupt is generated. The whole microcode word is available in the registers CED_H_KEY_DMX_WORD_0 to CED_H_KEY_DMX_WORD_9, and the instruction can be modified at this point if required. Writing 1 to CED_H_TRACE_EVENT causes the instruction to be executed and clears CED_H_TRACE_EVENT. Shortly afterwards, when the next microcode word to be executed has been read from the ROM, a new trace event occurs.
B.2.5 microcode
The microcode is programmed using an assembler, "hpp", which is a very simple tool; much of the abstraction is achieved with a macro preprocessor. The standard "C" preprocessor, "cpp", can be used for this purpose.
The code is organised as follows:
ucode.u is the master file. It first includes tokens.h, which defines the tokens. Next, regfile.h defines the register map of the ALU. fields.u defines all the fields of the microcode word, giving a list of defined symbols corresponding to each possible bit pattern of each field. Next, the labels used in the code are defined. After this, instr.u defines the many "cpp" macros that define the basic instructions. Then errors.h defines the numbers used to identify each Parser event. Finally, unword.u defines the order in which the fields are laid out in the microcode word.
The remainder of ucode.u is the microcode program itself.
B.2.5.1 instructions
This section describes the instructions defined in ucode.u. Not all of them are discussed here, because in many cases they are minor variations on a theme (the ALU instructions in particular).
B.2.5.1.1 Huffman and data directory instructions
In the present invention, the H_NOP instruction is used by the Huffman decoder. It is the no-operation instruction: the Huffman decoder does nothing, in the sense that no data is decoded. The data produced by this instruction is always zero. The associated instruction is nevertheless passed on to the ALU.
The next group of instructions are the token instructions: H_TOKSRCH, H_TOKSKIP_PAD, H_TOKSKIP_JPAD, H_TOKPASS and H_TOKREAD. Each reads one or more token words from the input shifter and passes them on to the rest of the machine. H_TOKREAD reads a single token word. H_TOKPASS reads a complete token, up to and including the word with a zero extension bit; the associated command is repeated for each token word. H_TOKSRCH discards all serial data preceding a token and then reads one token word. H_TOKSKIP_PAD skips any padding bits (H.261 and MPEG) and then reads one token word. H_TOKSKIP_JPAD does the same for JPEG padding.
H_FLC(NB) reads a fixed-length code of "NB" bits.
H_VLC(TBL) reads a VLC using the indicated table (the table name is passed as a mnemonic, e.g. H_VLC(tcoeff)).
H_FLC_IE(NB) is like H_FLC, but with the "ignore errors" bit set.
H_TEST_VLC(TBL) is like H_VLC, but with the bypass bit set, so that the Huffman index passes through the data directory unit unchanged.
H_FWD_R and H_BWD_R read an FLC whose width is given by the ALU registers r_fwd_r_size and r_bwd_r_size respectively.
H_DCJ reads a JPEG-style DC coefficient; the table number comes from the ALU.
H_DCH reads an H.261 DC term.
H_TCOEFF and H_DCTCOEFF read transform coefficients. In H_DCTCOEFF the first-coefficient bit is set, for use with non-intra blocks, whereas H_TCOEFF is used for intra blocks after the DC term has been read.
H_NOMINATE(TBL) nominates a table for subsequent download.
H_DNL(NB) reads NB bits and downloads them into the nominated table.
B.2.5.1.2 ALU instructions
The ALU instructions are too numerous to describe in full. The basic method by which the mnemonics are constructed is discussed instead, which should make the instructions readable. These mnemonics should also be readily understood by a person of ordinary skill in the art.
Most ALU instructions concern moving data from one place to another, and a general "load" instruction is therefore used. In the mnemonic A_LDxy, the contents of y are loaded into x; that is, the destination is listed first and the source second.
Table B.2.10: letters used to denote possible data sources and destinations
Letter    Meaning
A         A register
R         Run register
I         Data input
O         Data output
F         ALU register file
C         Constant
Z         Constant zero
For example, LDAI loads the A register with data from the ALU input port. When the ALU register file is specified, the mnemonic takes an address, so LDAF(RA) loads A with the contents of register file location RA.
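The destination-first naming rule can be illustrated with a tiny decoder for the simple LDxy form. This is a sketch only: the function name is hypothetical, it accepts the mnemonic with or without the A_ prefix and ignores any address argument in parentheses, and it does not handle the compound forms that carry an arithmetic operation or several destinations.

```python
# Letters from Table B.2.10.
LETTERS = {
    "A": "A register",
    "R": "Run register",
    "I": "data input",
    "O": "data output",
    "F": "ALU register file",
    "C": "constant",
    "Z": "constant zero",
}


def describe_load(mnemonic):
    """Describe a simple LDxy mnemonic (destination x, then source y)."""
    name = mnemonic.split("(")[0].strip()   # drop an address such as (RA)
    if name.startswith("A_"):
        name = name[2:]
    if len(name) != 4 or not name.startswith("LD"):
        raise ValueError(f"not a simple LDxy mnemonic: {mnemonic}")
    dest, src = name[2], name[3]
    return f"load {LETTERS[dest]} from {LETTERS[src]}"
```

For instance, `describe_load("LDAI")` reports a load of the A register from the data input, matching the example in the text.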
The ALU can modify the data as it moves from source to destination. In that case the arithmetic operation is expressed as part of the source. Thus the mnemonic LDA_AADDF(RA) adds the contents of the specified register file location to the current contents of the A register and loads the result into A. Another example is LDA_ISGXR, which takes the input data, sign-extends it as indicated by the RUN register, and stores the result in the A register.
In many cases more than one destination is specified for the same result. For example, LDF_LFA_ASUBC(RA) subtracts a constant from A and loads the result into both the A register and the register file.
Other mnemonics denote specific actions. For example, "CLRA" clears the A register and "RMBC" resets the macroblock counter. These mnemonics are fairly self-explanatory and are documented in the comments in instr.u.
One irregularity is the use of the suffix "_O", which indicates that, in addition to the normal action, the result of the operation is also output to the token formatter. Thus LDFI_O(RA) stores the input data and also passes it to the token formatter. This could alternatively have been written LDF_LDO_I(RA).
B.2.5.1.3 token formatter instructions
T_NOP is the "no-operation" instruction. The name is something of a misnomer, because a true no-operation instruction cannot be constructed. It is used, however, whenever the token formatter instruction is irrelevant because the ALU produces no output.
T_TOK outputs a token word.
T_DAT outputs a DATA token word (used only with certain Huffman state machine instructions).
T_GENT8 generates a token word based on the 8-bit constant field.
T_GENT8E is like T_GENT8, but with the extension bit set to 1.
T_OPD(NB) outputs NB bits of data in the lowest NB bits of the output; the remaining bits come from the constant field.
T_OPDE(NB) is like T_OPD, but with the extension bit high.
T_OPD8 is shorthand for T_OPD(8).
T_OPD8E is shorthand for T_OPDE(8).
B.2.5.1.4 Parser state machine instructions
D_NOP is the no-operation instruction: the address is incremented as normal and the Parser state machine does nothing further. The rest of the instruction is passed to the data pipeline. No wait occurs.
D_WAIT is like D_NOP, but waits for feedback to arrive.
Simple jump instructions. Mnemonics such as D_JMP(ADDR) and D_JNX(ADDR) jump if the condition is met. The instruction is not output to the Huffman decoder.
External jump instructions, with mnemonics such as D_XJMP(ADDR) and D_XJNX(ADDR). These are like their simple counterparts above, but the instruction is output to the Huffman decoder.
Jump-and-wait instructions, with mnemonics such as D_WJNZ(ADDR). These instructions are output to the Huffman decoder, and the Parser waits for feedback from the ALU before evaluating the condition.
The following mnemonics are used for the conditions themselves.
Table B.2.11: mnemonics used to denote conditions
Mnemonic            Meaning
JMP                 Jump unconditionally
JXT, JNX            Jump if extn = 1 (extn = 0)
JHE0, JNHE0         Jump if Huffman error bit 0 is set (clear)
JHE1, JNHE1         Jump if Huffman error bit 1 is set (clear)
JHE2, JNHE2         Jump if Huffman error bit 2 is set (clear)
JPTN                Jump if the least significant bit of the pattern shifter is set
JPICST, JNPICST     Jump if at (not at) picture start
JRSTST, JNRSTST     Jump if at (not at) restart interval start
JNCPBS              Jump if not using special CPB coding
JNCPB8              Jump if the macroblock does not have 8 blocks (i.e. if it has 4)
JMI, JPL            Jump if negative (positive)
JZE, JNZ            Jump if zero (non-zero)
JCHNG, JNCHNG       Jump if the change detect bit is set (clear)
JMBST, JNMBST       Jump if at (not at) macroblock start
D_EVENT - causes an event to be generated.
D_DFLT - used to build a default instruction. It causes an event and then jumps to the location labelled "dflt". This instruction should never be executed; it is used to fill the ROM so that a jump to a garbage location is trapped.
D_ERROR - causes an event and then jumps to the label "srch_dispatch" in an attempt to recover from the error.
B.3 Huffman decoder ALU
B.3.1 Introduction
According to the present invention, the Huffman decoder ALU subcomponent provides general arithmetic and logic functions for the Huffman decoder component. It can perform various additions and subtractions and various kinds of sign extension, and it can format the input data into run-sign-level triples. It also has a flexible structure: its exact operation and format are specified by a microinstruction word that arrives at the ALU synchronously with the input data, i.e. under the control of the two-wire interface.
In addition to the 36-bit instruction and 12-bit data input ports, the ALU has a 6-bit run port and an 8-bit constant port (the latter actually resides on the token bus). Except for the microinstruction word, all of these ports pass through the ALU data path to drive buses of their respective widths. The output is a 17-bit run-sign-level word (out_data), with the extension bit represented separately and output together with it. There is a two-wire interface at each end of the ALU data path. A set of condition codes is output together with their valid signal, cc_valid. There is also a register file, which other Huffman decoder subcomponents can access through the ALU and which the microprocessor interface can also access.
B.3.2.2 Basic structure
The basic structure of the Huffman ALU is shown in Figure 126. It comprises the following parts:
Input block 400
IOB 401
Condition code block 402
"A" register 403, with a multiplexed source
Run register (6 bits) 404, with a multiplexed source
Adder/subtracter 405, with multiplexed sources
Sign-extend logic 406, with a multiplexed source
Register file 407
Each of these blocks (except the IOB) drives its output onto a bus that runs through the data path, and these buses in turn serve as the multiplexed inputs to the block sources. For example, the adder has its own data path bus, which is one of the possible inputs to the A register. Similarly, the A register has its own bus, which is one of the possible inputs to the adder. Only a subset of all the possibilities is provided; the details are given in section B.3.7 (Microinstruction word).
In a single cycle it is possible to execute either an add-based instruction or a sign-extend-based instruction. Both may also be executed in the same cycle, provided that their operation is strictly parallel. In other words, an add followed by a sign extend, or a sign extend followed by an add, is not allowed as a sequence within one instruction. Within a single cycle the register file can be either read or written, but not both.
The output data has three fields:
run - 6 bits
sign - 1 bit
level - 10 bits
If data is to pass straight through the ALU, the least significant 11 bits of the input data register are latched into the sign and level fields.
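As an illustration only, the three output fields can be pictured as a single 17-bit word. The Python sketch below assumes, hypothetically, that run occupies the top 6 bits, sign the next bit and level the low 10 bits; the text does not state the bit ordering, and the function names are invented for this example.

```python
RUN_BITS, SIGN_BITS, LEVEL_BITS = 6, 1, 10   # field widths given in the text

def pack_rsl(run, sign, level):
    """Pack run/sign/level into one 17-bit integer (field order is assumed)."""
    if not 0 <= run < (1 << RUN_BITS):
        raise ValueError("run must fit in 6 bits")
    if sign not in (0, 1):
        raise ValueError("sign must be 0 or 1")
    if not 0 <= level < (1 << LEVEL_BITS):
        raise ValueError("level must fit in 10 bits")
    return (run << (SIGN_BITS + LEVEL_BITS)) | (sign << LEVEL_BITS) | level

def unpack_rsl(word):
    """Inverse of pack_rsl: return (run, sign, level)."""
    level = word & ((1 << LEVEL_BITS) - 1)
    sign = (word >> LEVEL_BITS) & 1
    run = (word >> (SIGN_BITS + LEVEL_BITS)) & ((1 << RUN_BITS) - 1)
    return run, sign, level

def pass_through(input_data):
    """Straight-through case: the low 11 input bits become sign and level."""
    low11 = input_data & 0x7FF
    return (low11 >> LEVEL_BITS) & 1, low11 & 0x3FF
```

For example, `pass_through(0xFFF)` discards the top input bit and yields a sign of 1 with a level of 0x3FF.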
Limited multi-cycle operation of the ALU can be programmed. The required number of cycles is given by the contents of a register file location whose address is specified in the microinstruction, and the same operation is repeated while the iteration count counts down to 1. This facility is typically used to implement left shifts: the adder adds the contents of the A register to itself, and the result is stored back in the A register.
B.3.3 Adder/subtracter sub-block
This is a 12-bit-wide adder with optional inversion of its input2 and an optional carry-in bit. The output is a 12-bit sum; the carry-out is not used. There are seven modes of operation:
ADD: straight add, carry-in zero: input1+input2
ADC: straight add, carry-in 1: input1+input2+1
SBC: input2 inverted, carry-in zero: input1-input2-1
SUB: input2 inverted, carry-in 1: input1-input2
TCI: if input2<0 use SUB, otherwise use ADD. With input1 set to zero, this mode is used to obtain the magnitude of a two's complement value.
DCD (DC difference): if input2<0 use ADC, otherwise use ADD.
VRA (vector residual add): if input1<0 use ADC, otherwise use SBC.
B.3.4 Sign-extend sub-block
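The seven adder/subtracter modes of section B.3.3 can be sketched in Python as follows. This is a behavioural model under stated assumptions, not the hardware: values are held as 12-bit two's complement integers, "negative" is taken to mean that bit 11 is set, and the function names are hypothetical.

```python
MASK12 = 0xFFF  # the adder is 12 bits wide

def is_negative(value):
    """Treat bit 11 as the two's complement sign bit (assumption)."""
    return bool(value & 0x800)

def adder(mode, input1, input2):
    """Return the 12-bit sum for one of the seven modes; carry-out is dropped."""
    # The three conditional modes first resolve to one of the four basic modes.
    if mode == "TCI":
        mode = "SUB" if is_negative(input2) else "ADD"
    elif mode == "DCD":
        mode = "ADC" if is_negative(input2) else "ADD"
    elif mode == "VRA":
        mode = "ADC" if is_negative(input1) else "SBC"
    # (invert input2?, carry-in) for each basic mode, as listed in the text.
    invert, carry_in = {
        "ADD": (False, 0),
        "ADC": (False, 1),
        "SBC": (True, 0),
        "SUB": (True, 1),
    }[mode]
    operand2 = (input2 ^ MASK12) if invert else input2
    return (input1 + operand2 + carry_in) & MASK12

def shift_left(a_register, count):
    """Left shift by repeated self-addition, as the multi-cycle note describes."""
    for _ in range(count):
        a_register = adder("ADD", a_register, a_register)
    return a_register
```

With input1 set to zero, `adder("TCI", 0, 0xFF9)` returns 7, the magnitude of the two's complement value -7.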
This is a 12-bit unit that sign-extends the input data in different ways according to the size input. Size is a 4-bit value from 0 to 11 (0 corresponds to the least significant bit and 11 to the most significant bit). The output is the modified 12-bit data value plus a "sign" bit.
In SGXMODE=NORMAL mode, all bits from the size bit upward (including the size bit) take the value of the size bit. All bits below it remain unchanged. The sign takes the value of the size bit. For example:
data=1010 1010 1010
size=2
output=0000 0000 0010, sign=0
In SGXMODE=INVERSE mode, all bits from the size bit upward (including the size bit) take the inverse of the size bit, and all bits below it remain unchanged. The sign takes the inverse of the size bit. For example:
data=1010 1010 1010
size=0
output=1111 1111 1111, sign=1
In SGXMODE=DIFMAG mode, if the size bit is zero, all bits from the size bit downward (including the size bit) are inverted and all bits above it remain unchanged. If the size bit is 1, all bits remain unchanged. In both cases the sign takes the inverse of the size bit. This mode is used to obtain the magnitude of AC difference values. For example:
data=0000 1010 1010
size=2
output=0000 1010 1101, sign=1
data=0000 1010 1010
size=1
output=0000 1010 1010, sign=0
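The sign-extension modes of this sub-block can be sketched in Python as shown below. The sketch covers NORMAL, INVERSE and DIFMAG as described above, together with DIFCOMP, which is described in the next paragraph. It is a behavioural model only; the function name and the string mode names are assumptions made for this example.

```python
MASK12 = 0xFFF  # the unit is 12 bits wide

def sign_extend(mode, data, size):
    """Return (output, sign) for a 12-bit data value and a size of 0..11."""
    if not 0 <= size <= 11:
        raise ValueError("size must be in the range 0..11")
    data &= MASK12
    bit = (data >> size) & 1               # the "size bit"
    below = (1 << size) - 1                # bits strictly below the size bit
    up_to = (1 << (size + 1)) - 1          # bits up to and including the size bit
    if mode == "NORMAL":
        # Size bit and everything above take the value of the size bit.
        upper = (MASK12 & ~below) if bit else 0
        return (data & below) | upper, bit
    if mode == "INVERSE":
        # Size bit and everything above take the inverse of the size bit.
        upper = 0 if bit else (MASK12 & ~below)
        return (data & below) | upper, bit ^ 1
    if mode == "DIFMAG":
        # If the size bit is 0, invert it and everything below it.
        return (data if bit else data ^ up_to), bit ^ 1
    if mode == "DIFCOMP":
        # Bits strictly above the size bit take the inverse of the size bit.
        upper = 0 if bit else (MASK12 & ~up_to)
        return (data & up_to) | upper, bit ^ 1
    raise ValueError("unknown mode: " + mode)
```

The worked examples given in the text for each mode can be used directly as checks, for instance `sign_extend("DIFMAG", 0b000010101010, 2)` gives `(0b000010101101, 1)`.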
In SGXMODE=DIFCOMP mode, all bits above the size bit (not including the size bit) take the inverse of the size bit, and all bits from the size bit downward (including the size bit) remain unchanged. The sign takes the inverse of the size bit. This mode is used to obtain the two's complement value of DC difference values. For example:
data=1010 1010 1010
size=0
output=1111 1111 1110, sign=1
B.3.5 Condition codes
The Huffman block uses a two-byte (16-bit) condition code word, some bits of which are generated by the ALU/register file. These are the Sign condition code, the Zero condition code, the Extension condition code and the Change Detect bit. The last two are not true condition codes, because the Parser uses them rather differently from the others.
The Sign, Zero and Extension condition codes are updated when the Parser issues an update instruction; for each such instruction the condition code valid signal is pulsed high.
The Sign condition code is simply the latched sign output of the sign extender, and the Zero condition code is set to 1 if the A register input is zero. The Extension condition code is the latched input extension bit, irrespective of OUTSRC.
The condition codes can be used to evaluate several types of condition:
Result equals constant - use subtraction and the Zero condition
Result equals register value - use subtraction and the Zero condition
Register equals constant - use subtraction and the Zero condition
Register bit set - use sign extension and the Sign condition
Result bit set - use sign extension and the Sign condition
Note that when the combination of sign extension and the Sign condition code is used, only a single specified bit can be evaluated, not several bits at once as with a conventional logical AND.
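The two evaluation styles listed above can be illustrated with a small Python sketch. It is a simplified model under stated assumptions: values are 12 bits wide, the Zero condition is computed from the subtraction result as if that result were routed to the A register input, and the Sign condition is taken from a NORMAL-mode sign extension. The function names are hypothetical.

```python
MASK12 = 0xFFF  # 12-bit data path

def zero_condition_after_subtract(value, reference):
    """Equality test: subtract, then check whether the 12-bit result is zero."""
    difference = (value - reference) & MASK12
    return difference == 0

def sign_condition_for_bit(value, bit_index):
    """Single-bit test: in NORMAL mode the sign output equals the selected bit."""
    if not 0 <= bit_index <= 11:
        raise ValueError("bit_index must be in the range 0..11")
    return ((value >> bit_index) & 1) == 1
```

Because the sign output reflects exactly one bit position, testing several bits requires one sign-extend operation per bit, which is the limitation noted above.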
In the present invention, the change detect bit is generated by the same logic as the Zero condition code, but it has no accompanying valid signal. The change detect bit is updated, when a bit in the microinstruction so indicates, if the value currently being written to the register file differs from the value already present (this means that two clock cycles are needed: the first with REGMODE set to READ and the second with REGMODE set to WRITE). If a changed value is detected, a microprocessor interrupt is subsequently initiated. The change detect bit is reset by the same means normally used to enable it, but with REGMODE set to READ.
The hard-wired macroblock counter structure (part of the register file; see below) also generates the following condition codes: Mb_Start, Pattern_Code, Restart and Pic_Start.
B.3.6 Register file
The address map of the register file is shown below. It uses a seven-bit address space shared by the ALU data path and the UPI. Some addresses are not accessed by the ALU, because they are normally counters in the hard-wired macroblock structure or registers inside the ALU itself. The latter have dedicated access, but are still part of the UPI address map. Some multi-byte locations (marked "O" in the table) have a single ALU address but several UPI addresses. Similarly, several groups of registers indexed by the component count CC (marked "I" in the table) are treated by the ALU as a single location. This is convenient for microprogramming initialization, resetting and macroblock operations.
Except for some dedicated ALU registers (read-only from the UPI), all locations are read/write, and all counters are reset to zero by a single bit in the instruction word. The pattern code register can shift right, and its least significant bit forms the Pattern_Code condition bit. All registers in the hard-wired macroblock structure are marked "M" in the table, and those that double as n-bit counters are marked Cn.
In the present invention, the contents of certain locations are hard-wired to other parts of the Huffman subsystem: the coding standard, the two r-size locations, and a single location (2 words) for each of the ac huff and dc huff tables, which go to the Huffman decoder.
Addresses shown in bold can be accessed by both the ALU and the UPI; the others can be accessed only by the UPI. A group of registers indexed by CC has a single ALU address, specified in the instruction word, and CC selects which physical location is accessed. The ALU address may be any address in the group, although the first should normally be used. The same applies to multi-byte locations: the lowest address of the pair should be used, although in practice either address will do. Note that locations 2E and 2F can be accessed in the top-level address map (marked "T"), i.e. not only through the keyhole registers. These two locations are also reset to zero.
To improve access speed, the register file is physically divided into four banks, but this has no effect on addressing. The main table gives the address assignment for MPEG; the two tables that follow, which partly repeat it, give the differences for JPEG and H.261 respectively.
Table B.3.1: Huffman register file address map
Flags  Address  Location              Notes
       00       A register1
       01       A register0
       02       run
       10       horiz pels1
       11       horiz pels0
       12       vert pels1
       13       vert pels0
       14       buff size1
       15       buff size0
       16       pel asp.ratio
       17       bit rate2
       18       bit rate1
       19       bit rate0
       1A       pic rate
       1B       constrained
       1C       picture type
       1D       H261 picture type
       1E       broken closed
       1F       pred mode
       20       vbv delay1
       21       vbv delay0
       22       full pel fwd
       23       full pel bwd
       24       horiz mb copy
       25       pic number
       26       max h
       27       max v
       28       -
       29       -
       2A       -
       2B       -
       2C       first group
       2D       in picture
T,R    2E       rom control
T,R    2F       rom revision
I,H    30       dc huff0
I      31       dc huff1
I      32       dc huff2
I      33       dc huff3
I,H    34       ac huff0
I      35       ac huff1
I      36       ac huff2
I      37       ac huff3
I      38       tq0
I      39       tq1
I      3A       tq2
I      3B       tq3
I      3C       c0
I      3D       c1
I      3E       c2
I      3F       c3
I,O    40       dc pred_01
I,O    41       dc pred_00
I,O    42       dc pred_11
I,O    43       dc pred_10
I,O    44       dc pred_21
I,O    45       dc pred_20
I,O    46       dc pred_31
I,O    47       dc pred_30
O      50       prev mhf1
O      51       prev mhf0
O      52       prev mvf1
O      53       prev mvf0
O      54       prev mhb1
O      55       prev mhb0
O      56       prev mvb1
O      57       prev mvb0
M      60       mb horiz cnt1         C:3
M      61       mb horiz cnt0         -
M      62       mb vert cnt1          C:3
M      63       mb vert cnt0          -
M      64       horiz mb1
M      65       horiz mb0
M      66       vert mb1
M      67       vert mb0
M      68       restart count1        C:16
M      69       restart count0        -
M      6A       restart gap1
M      6B       restart gap0
M      6C       horiz blk count       C2
M      6D       vert blk count        C2
H,M    6E       comp id               C2
M      6F       max comp id
H,R    70       coding std
M,H    71       pattern code          SR8
H      72       fwd r size
H      73       bwd r size
M,I    78       h0
M,I    79       h1
M,I    7A       h2
M,I    7B       h3
M,I    7C       v0
M,I    7D       v1
M,I    7E       v2
M,I    7F       v3
Table B.3.2: JPEG differences
Address  Location
10       horiz pels1
11       horiz pels0
12       vert pels1
13       vert pels0
14       buff size1
15       buff size0
16       pel asp.ratio
17       bit rate2
18       bit rate1
19       bit rate0
1A       pic rate
1B       constrained
1C       picture type
1D       H261 picture type
1E       broken closed
1F       pred mode
20       vbv delay1
21       vbv delay0
22       pending frame ch
23       restart index
24       horiz mb copy
25       pic number
26       max h
27       max v
28       -
29       -
2A       -
2B       -
2C       first scan
2D       in picture
2E       rom control
2F       rom revision
Table B.3.3: H.261 differences
Address  Location
10       horiz pels1
11       horiz pels0
12       vert pels1
13       vert pels0
14       buff size1
15       buff size0
16       pel asp.ratio
17       bit rate2
18       bit rate1
19       bit rate0
1A       pic rate
1B       constrained
1C       picture type
1D       H261 picture type
1E       broken closed
1F       pred mode
20       vbv delay1
21       vbv delay0
22       full pel fwd
23       full pel bwd
24       horiz mb copy
25       pic number
26       max h
27       max v
28       -
29       -
2A       -
2B       in gob
2C       first group
2D       in picture
2E       rom control
2F       rom revision
B.3.7 Microinstruction word
According to the present invention, the ALU microinstruction word is divided into several fields, each controlling a different aspect of the structure described above. The total number of bits used in the instruction word is 36 (plus 1 for the extension bit input). Minimal encoding across fields has been adopted, so that maximum flexibility of the hardware configuration is retained. The division of the instruction word is detailed below. Default field values, i.e. those that do not change the state of the ALU or the register file, are given in italics.
Table B.3.4: Huffman ALU microinstruction fields
Field      Value     Description                                                    Bits
OUTSRC (specifies the sources for the run, sign and level outputs)
           RSA6      run, sign, A register as 6 bits                                0000
           ZZA       zero, zero, A register                                         0001
           ZZA8      zero, zero, least significant 8 bits of A register             0010
           ZZADDU4   zero, zero, most significant 4 bits of adder output            0011
           ZINPUT    zero, input data                                               0100
           RSSGX     run, sign, sign-extend output                                  0111
           RSADD     run, sign, adder output                                        1000
           RZADD     run, zero, adder output                                        1001
           RIADD     input run, zero, adder output
           ZSADD     zero, sign, adder output                                       1010
           ZZADD     zero, zero, adder output                                       1011
           NONE      no valid output; out_valid set to zero                         11XX
REGADDR    00-7F     register file address for ALU access                           7 bits
REGSRC     ADD       drive adder output onto register file input                    0
           SGX       drive sign-extend output onto register file input              1
REGMODE    READ      read from register file                                        0
           WRITE     write to register file                                         1
CNGDET (change detect)
           TEST      if REGMODE is WRITE, update change detect                      0
           HOLD      do not update the change detect bit                            1
           CLEAR     if REGMODE is READ, reset change detect                        0
RUNSRC (run source)
           RUNIN     drive run input onto run register input                        0
           ADD       drive adder output onto run register input                     1
RUNMODE    LOAD      update run register                                            0
           HOLD      do not update run register                                     1
ASRC (A register source)
           ADD       drive adder output onto A register input                       00
           INPUT     drive input data onto A register input                         01
           SGX       drive sign-extend output onto A register input                 10
           REG       drive register file output onto A register input               11
AMODE      LOAD      update A register                                              0
           HOLD      do not update A register                                       1
SGXMODE (sign-extend mode, see section 3.4)
           NORMAL    sign extend with the value                                     00
           INVERT    sign extend with the inverted value                            01
           DIFMAG    invert the low-order bits if the sign bit is 0                 10
           DIFCOM    sign extend with the inverted value from the next bit up       11
SIZESRC (sign-extend size source)
           CONST     drive constant input onto sign-extend size input               00
           A         drive A register onto sign-extend size input                   01
           REG       drive register file output onto sign-extend size input         10
           RUN       drive run register onto sign-extend size input                 11
SGXSRC (sgx input)
           INPUT     drive input data onto sign-extend input                        0
           A         drive A register onto sign-extend input                        1
ADDMODE (adder mode, see section 3.3)
           ADD       input1 + input2                                                000
           ADC       input1 + input2 + 1                                            001
           SBC       input1 - input2 - 1                                            010
           SUB       input1 - input2                                                011
           TCI       if input2<0 then SUB, otherwise ADD (two's complement)         100
           DCD       if input2<0 then ADC, otherwise ADD (DC difference)            101
           VRA       if input1<0 then ADC, otherwise SBC (vector residual add)      110
ADDSRC1 (adder input 1 source, non-inverting)
           A         drive A register onto adder input 1                            00
           REG       drive register file output onto adder input 1                  01
           INPUT     drive input data onto adder input 1                            10
           ZERO      drive zero onto adder input 1                                  11
ADDSRC2 (adder input 2 source, inverting)
           CONST     drive constant onto adder input 2                              00
           A         drive A register onto adder input 2                            01
           INPUT     drive input data onto adder input 2                            10
           REG       drive register file output onto adder input 2                  11
CNDCMODE (condition codes)
           TEST      update the condition codes                                     0
           HOLD      do not update the condition codes                              1
CNTMODE (macroblock structure counter mode)
           NOCOUNT   counters do not increment                                      X00
           BCINCR    increment block counter and ripple                             001
           CCINCR    force increment of component count                             010
           RESET     reset all counters in the macroblock structure                 011
           DISABLE   disable all counters                                           1XX
INSTMODE   MULTI     iterate the current instruction                                0
           SINGLE    single-cycle instruction only                                  1
B.4 Buffer manager
B.4.1 Introduction
According to the present invention, this section describes the purpose, operation and implementation of the buffer manager (bman).
B.4.2 Overview
The buffer manager supplies four addresses to the DRAM interface. These are page addresses in the DRAM. The DRAM interface maintains two FIFOs in DRAM: the coded data buffer and the token data buffer. Hence the four addresses: a read address and a write address for each buffer.
B.4.3 Interfaces
The buffer manager is connected only to the DRAM interface and the microprocessor. The microprocessor is used only to set up the "initialization registers" (see Table B.4.4). The interface to the DRAM interface consists of four 18-bit addresses, each controlled by a REQuest/ACKnowledge protocol. (Because the buffer manager is not in the data path, it has no two-wire interface.)
In addition, the buffer manager operates off the DRAM interface clock generator and is on the DRAM interface scan chain.
B.4.4 Address calculation
The read and write addresses of each buffer are generated from nine 18-bit registers:
Initialization registers (read/write from the microprocessor)
BASECB - base address of the coded data buffer
LENGTHCB - maximum size of the coded data buffer (in pages)
BASETB - base address of the token data buffer
LENGTHTB - maximum size of the token data buffer (in pages)
LIMIT - size of the DRAM (in pages)
Dynamic registers (read-only from the microprocessor)
READCB - coded data buffer read pointer, relative to BASECB
NUMBERCB - coded data buffer write pointer, relative to READCB
READTB - token data buffer read pointer, relative to BASETB
NUMBERTB - token data buffer write pointer, relative to READTB
The addresses are calculated as follows:
readaddr (read address) = (BASE + READ) mod LIMIT
writeaddr (write address) = (((READ + NUMBER) mod LENGTH) + BASE) mod LIMIT
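The two formulas can be written out directly in Python. This is a sketch of the arithmetic only, using a true modulo operator; the function and parameter names are illustrative, and all quantities are taken to be page counts or page addresses as in the register list above.

```python
def read_address(base, read, limit):
    """Page address of the next read: (BASE + READ) mod LIMIT."""
    return (base + read) % limit

def write_address(base, read, number, length, limit):
    """Page address of the next write:
    (((READ + NUMBER) mod LENGTH) + BASE) mod LIMIT."""
    return (((read + number) % length) + base) % limit

# Example: a 16-page buffer based at page 100 in a 1024-page DRAM.
# The read pointer is 14 pages in and 5 pages are currently held,
# so the write pointer has wrapped to 3 pages past the base.
example_read = read_address(base=100, read=14, limit=1024)
example_write = write_address(base=100, read=14, number=5,
                              length=16, limit=1024)
```

In this example the read address is page 114 and the write address is page 103.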
The "mod LIMIT" term is used because a buffer may wrap around the DRAM.
B.4.5 Block description
In the present invention, as shown in Figure 127, the buffer manager consists of three top-level modules connected in a ring, with snoopers monitoring the connection to the DRAM interface. The modules are bmprtize (prioritize), bminstr (instruction) and bmrecalc (recalculate), arranged in the ring in that order, with bmsnoop (snoopers) on each address output. The bmprtize module handles the REQ/ACK protocol and the FULL/EMPTY flags of each buffer, and it holds the state of each address (i.e. "is this a valid address?"). From this information it tells bminstr which address (if any) should be recalculated. It also manages the BUF_CSR (status) microprocessor register, which indicates the FULL/EMPTY flags, and the buf_access microprocessor register, which controls microprocessor write access to the buffer manager registers.
Once bmprtize has told it which address to calculate, the bminstr module issues six instructions (one every two cycles) to control bmrecalc in calculating that address.
The bmrecalc module recalculates addresses under instruction from bminstr. It executes one instruction every two cycles. It contains all the initialization registers and dynamic registers, together with a simple ALU that can add, subtract and perform a modulo operation. When it finishes an address calculation, or when it detects a FULL/EMPTY condition, it informs bmprtize.
B.4.6 Block implementation
B.4.6.1 Bmprtize
At reset, the buf_access microprocessor register is set to 1 to allow the initialization registers to be set up. While buf_access reads back as 1, no address calculations are started, because they would be meaningless without valid initialization registers.
Once buf_access is de-asserted (zero is written to it), bmprtize sets about making all the addresses valid (by recalculating them), since its purpose is to keep all four addresses valid. At this point the buffer manager is "starting up" (i.e. not all addresses have been calculated yet), so no requests are acknowledged. Once all addresses have become valid, start-up ends and all requests are acknowledged. From then on, whenever an address becomes invalid (because it has been used and acknowledged), it is recalculated.
No priority between the addresses is ever needed, because the DRAM interface can use at most one address every 17 cycles, whereas the buffer manager can recalculate an address every 12 cycles. After start-up, therefore, only one address is ever invalid at a time, and bmprtize simply recalculates any invalid address that is not currently being calculated.
In the present invention, start-up is re-entered whenever buf_access is asserted, so no addresses are supplied to the DRAM interface during microprocessor accesses.
B.4.6.2 Bminstr
The bminstr module contains a modulo-12 cycle counter (12 being the number of cycles needed to generate an address). Note that even cycles start an instruction and odd cycles end one. The top 3 bits of the counter, together with whether this is a read or a write calculation, are decoded into the following instructions for bmrecalc:
For read addresses:
Table B.4.1: Read address calculation
Cycle   Operation   A bus    B bus    Result    Meaning of result sign
0-1     ADD         READ     BASE
2-3     MOD         Accum    LIMIT    Address
4-5     ADD         READ     "1"
6-7     MOD         Accum    LENGTH   READ
8-9     SUB         NUMBER   "1"      NUMBER
10-11   MOD         "0"      Accum              SET_EMPTY (NUMBER>=C)
For write addresses:
Table B.4.2: Write address calculation
Cycle   Operation   A bus    B bus    Result    Meaning of result sign
0-1     ADD         NUMBER   READ
2-3     MOD         Accum    LIMIT
4-5     ADD         Accum    BASE
6-7     MOD         Accum    LIMIT    Address
8-9     ADD         NUMBER   "1"      NUMBER
10-11   MOD         Accum    LENGTH             SET_FULL (NUMBER>=LENGTH)
Note: the result of the last operation is always held in the accumulator.
When no address needs to be recalculated, the cycle counter idles at zero, which produces an instruction that writes to no register. This has no effect.
B.4.6.3 Bmrecalc
The bmrecalc module performs one operation every two clock cycles. On an even counter cycle (start_alu_cyc) it latches the instruction from bminstr (together with which buffer is concerned and whether the address is for input or output), and on an odd counter cycle (end_alu_cyc) it latches the result of the operation. As well as being stored in the register specified by the instruction, the result is always stored in a register named "Accum". In addition, during the end_alu_cyc cycle, bmrecalc tells bmprtize whether use of the address just calculated will make the buffer full or empty, and when the address and the full/empty status have been successfully calculated (load_addr).
Full/empty is calculated from the sign bit of the operation result.
The MOD operation is not a true modulo operation. A mod B is implemented as:
(A>B?(A-B):A)
However, this gives a wrong result only when:
A>(2B-1)
This never occurs.
B.4.6.4 Bmsnoop
The bmsnoop module consists of four 18-bit super snoopers, which monitor each address supplied to the DRAM interface. The snoopers must be "super" (i.e. accessible while the clocks are running) to allow on-chip testing of the external DRAM. These snoopers must operate in REQ/ACK (request/acknowledge) mode, so they differ from the other snoopers used on the device.
REQ/ACK is used on this interface, rather than the two-wire protocol, because information (i.e. the acknowledge) must be transmitted back to the sender, which an accept does not do. This keeps a strict check on each FIFO pointer.
B.4.7 Registers
To gain microprocessor write access to the initialization registers, write 1 to buf_access; access is granted when buf_access reads back as 1. Conversely, to give up microprocessor write access, write zero to buf_access; access is relinquished when buf_access reads back as zero. Note that buf_access is reset to 1.
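The handshake can be pictured as a write followed by polling. The Python sketch below is purely illustrative: `FakeBufferManager` is a hypothetical stand-in for the device that changes the read-back value after a fixed number of polls, and the helper names and the polling limit are assumptions rather than part of the described hardware.

```python
class FakeBufferManager:
    """Hypothetical model of buf_access: the read-back value follows the
    written value only after a few polls. Resets to 1, as the text states."""

    def __init__(self, latency=2):
        self._requested = 1
        self._visible = 1
        self._latency = latency
        self._countdown = 0

    def write_buf_access(self, value):
        self._requested = value & 1
        self._countdown = self._latency

    def read_buf_access(self):
        if self._countdown > 0:
            self._countdown -= 1          # request not yet honoured
        else:
            self._visible = self._requested
        return self._visible

def set_access(device, value, max_polls=100):
    """Write value to buf_access and poll until it reads back the same."""
    device.write_buf_access(value)
    for _ in range(max_polls):
        if device.read_buf_access() == value:
            return True
    return False

def acquire_write_access(device):
    return set_access(device, 1)          # write 1, wait to read back 1

def release_write_access(device):
    return set_access(device, 0)          # write 0, wait to read back 0
```

A typical sequence under these assumptions is to call `acquire_write_access`, set up the initialization registers, and then call `release_write_access` so that address calculation can start.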
The dynamic registers and initialization registers of the present invention can be read at any time. However, to ensure that the dynamic registers are not changing while the microprocessor reads them, write access must be obtained.
The initialization registers are intended to be written only once. Rewriting them may cause the buffers to operate incorrectly. It is envisaged, however, that the buffer length could be increased on-the-fly, with the buffer manager using the new length when appropriate.
No checks are ever made on the values in the initialization registers to ensure that they are sensible (for example, that the buffers do not overlap). This is the responsibility of the user.
Table B.4.3: Buffer manager non-keyhole registers
  Register name          Usage      Address
  CED_BUF_ACCESS         xxxxxxxD   0x24
  CED_BUF_KEYHOLE_ADDR   xxDDDDDD   0x25
  CED_BUF_KEYHOLE        DDDDDDDD   0x26
  CED_BUF_CB_WR_SNP_2    xxxxxxDD   0x54
  CED_BUF_CB_WR_SNP_1    DDDDDDDD   0x55
  CED_BUF_CB_WR_SNP_0    DDDDDDDD   0x56
  CED_BUF_CB_RD_SNP_2    xxxxxxDD   0x57
  CED_BUF_CB_RD_SNP_1    DDDDDDDD   0x58
  CED_BUF_CB_RD_SNP_0    DDDDDDDD   0x59
  CED_BUF_TB_WR_SNP_2    xxxxxxDD   0x5a
  CED_BUF_TB_WR_SNP_1    DDDDDDDD   0x5b
  CED_BUF_TB_WR_SNP_0    DDDDDDDD   0x5c
  CED_BUF_TB_RD_SNP_2    xxxxxxDD   0x5d
  CED_BUF_TB_RD_SNP_1    DDDDDDDD   0x5e
  CED_BUF_TB_RD_SNP_0    DDDDDDDD   0x5f
In the table, D denotes a register bit and x denotes a position that is not a register bit.
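Each snooper value is 18 bits wide and is spread over three byte-wide registers (2 + 8 + 8 register bits). The Python sketch below shows one way to reassemble it. It assumes, from the register naming only, that the _2 register holds the two most significant bits and the _0 register the least significant byte; `read_register` is a hypothetical callback standing in for a microprocessor bus read.

```python
# First (most significant) register address of each snooper, from Table B.4.3.
SNOOP_BASE = {"CB_WR": 0x54, "CB_RD": 0x57, "TB_WR": 0x5A, "TB_RD": 0x5D}

def combine_snoop_bytes(byte2, byte1, byte0):
    """Build an 18-bit value from the _2, _1 and _0 registers.
    Only the low two bits of byte2 are register bits (usage xxxxxxDD)."""
    return ((byte2 & 0x03) << 16) | ((byte1 & 0xFF) << 8) | (byte0 & 0xFF)

def read_snooper(read_register, name):
    """Read the three consecutive registers of one snooper and combine them."""
    base = SNOOP_BASE[name]
    return combine_snoop_bytes(read_register(base),
                               read_register(base + 1),
                               read_register(base + 2))
```

For instance, register contents of 0x01, 0x23 and 0x45 at addresses 0x54 to 0x56 would combine to the page address 0x12345 under these assumptions.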
Table is the register in the buffer manager lockhole B.4.4
The lockhole register nameUsageThe lockhole address
  CED_BUF_CB_BASE_3   xxxxxxxx   0x00
  CED_BUF_CB_BASE_2   xxxxxxDD   0x01
  CED_BUF_CB_BASE_1   DDDDDDDD   0x02
  CED_BUF_CB_BASE_0   DDDDDDDD   0x03
  CED_BUF_CB_LENGTH_3   xxxxxxxx   0x04
  CED_BUF_CB_LENGTH_2   xxxxxxDD   0x05
  CED_BUF_CB_LENGTH_1   DDDDDDDD   0x06
  CED_BUF_CB_LENGTH_0   DDDDDDDD   0x07
  CED_BUF_CB_READ_3   xxxxxxxx   0x08
  CED_BUF_CB_READ_2   xxxxxxDD   0x09
  CED_BUF_CB_READ_1   DDDDDDDD   0x0a
  CED_BUF_CB_READ_0   DDDDDDDD   0x0b
  CED_BUF_CB_NUMBER_3   xxxxxxxx   0x0c
B.4.8 verification
Verification is carried out in Lsim, is added to some small-sized FIFO on (dummy) DRAM interface of simulation, uses the C code as the part of top layer chip simulation. B.4.9 test
The test that covers bman is to carry out by each mouse (snoopers) in the bmsnoop, each dynamic register (seeing B.4.4 joint) and with scan chain. This scan chain is the part of DRAM interface scans chain. B.5 reverse model device (Inverse Modeler) foreword B.5.1
Purposes, effect and the specific implementation of presents illustration of reverse moulding device (imodel) and token formatter (hsppk) according to the present invention.
The lockhole register nameUsageThe lockhole address
  CED_BUF_CB_NUMBER_2   xxxxxxDD   0x0d
  CED_BUF_CB_NUMBER_1   DDDDDDDD   0x0e
  CED_BUF_CB_NUMBER_0   DDDDDDDD   0x0f
  CED_BUF_TB_BASE_3   xxxxxxxx   0x10
  CED_BUF_TB_BASE_2   xxxxxxDD   0x11
  CED_BUF_TB_BASE_1   DDDDDDDD   0x12
  CED_BUF_TB_BASE_0   DDDDDDDD   0x13
  CED_BUF_TB_LENGTH_3   xxxxxxxx   0x14
  CED_BUF_TB_LENGTH_2   xxxxxxDD   0x15
  CED_BUF_TB_LENGTH_1   DDDDDDDD   0x16
  CED_BUF_TB_LENGTH_0   DDDDDDDD   0x17
  CED_BUF_TB_READ_3   xxxxxxxx   0x18
  CED_BUF_TB_READ_2   xxxxxxDD   0x19
  CED_BUF_TB_READ_1   DDDDDDDD   0x1a
  CED_BUF_TB_READ_0   DDDDDDDD   0x1b
  CED_BUF_TB_NUMBER_3   xxxxxxxx   0x1c
  CED_BUF_TB_NUMBER_2   xxxxxxDD   0x1d
  CED_BUF_TB_NUMBER_1   DDDDDDDD   0x1e
  CED_BUF_TB_NUMBER_0   DDDDDDDD   0x1f
  CED_BUF_LIMIT_3   xxxxxxxx   0x20
  CED_BUF_LIMIT_2   xxxxxxDD   0x21
  CED_BUF_LIMIT_1   DDDDDDDD   0x22
  CED_BUF_LIMIT_0   DDDDDDDD   0x23
  CED_BUF_CSR   xxxxDDDD   0x24
Table B.4.4 Registers in the buffer manager keyhole (continued)
Note: hsppk is hierarchically part of the Huffman decoder but functionally part of the Inverse Modeler, so it is better discussed in this section.
B.5.2 Overview
The token buffer, which lies between imodel and hsppk, can hold a large amount of data, all of it in off-chip DRAM. To use this memory efficiently, the data must be in a 16-bit format. The formatter packs the data from the Huffman decoder into this format for the token buffer. The Inverse Modeler subsequently unpacks the data from the token buffer format.
However, the main function of the Inverse Modeler is to expand each run/level code into a run of zero data followed by a level. In addition, the Inverse Modeler ensures that each DATA token has at least 64 coefficients, and it provides a "gate" for stopping streams that have not met their start-up criteria.
B.5.3 Interfaces
B.5.3.1 Hsppk
In the present invention, Hsppk has the Huffman decoder as its input and the token buffer as its output. Both interfaces are of the two-wire type; the input is a 17-bit token port and the output is 16 bits of "packed data" plus a FLUSH signal. In addition, Hsppk is clocked from the Huffman clock generator and is therefore connected to the Huffman scan chain.
B.5.3.2 Imodel
Imodel has the token buffer and the buffer start-up output gate logic (bsogl) as inputs and the Inverse Quantizer as its output. The input from the token buffer is 16 bits of "packed data" plus a block_end signal; the input from bsogl is a one-wire stream_enable signal. The output is an 11-bit token port. All interfaces are controlled by the two-wire interface protocol. Imodel has its own clock generator and scan chain.
Both blocks, Imodel and Hsppk, have microprocessor access only to the snoopers at their outputs.
B.5.4 Block Description
B.5.4.1 Hsppk
Hsppk takes in 17-bit data from the Huffman decoder and outputs 16-bit data to the token buffer. It does this in two steps: first the input data is truncated or split into 12-bit words, and then these words are packed into a 16-bit format.
B.5.4.1.1 Splitting
Hsppk receives 17-bit data from the Huffman decoder. This data is converted to 12 bits using the formats below,
where F = format bit; E = extension bit; R = run bit; L = level bit (in sign-magnitude form) or non-data token bit; x = don't care.
           FLLLLLLLLLLL   Format 0
           ELLLLLLLLLLL   Format 0a
           FRRRRRR00000   Format 1
A normal token occupies only the least significant 12 bits, in the form:
           ExxxxxxLLLLLLLLLLL
This is truncated to format 0a. A data token, however, carries a run and a level in each word, in the form:
           ERRRRRRLLLLLLLLLLL
This is split into the following formats:
      ERRRRRRLLLLLLLLLLL -> FRRRRRR00000   Format 1
                            ELLLLLLLLLLL   Format 0a
or, if the run is zero, format 0 is used:
      E000000LLLLLLLLLLL -> FLLLLLLLLLLL   Format 0
Note that the extension bit is lost in format 0, where it is assumed to be one. Format 0 therefore cannot be used when the extension bit is zero; in that case format 1 is used unconditionally.
B.5.4.1.2 Packing
After splitting, all data words are 12 bits wide. Every four 12-bit words are packed into three 16-bit words:
Table B.5.1 Packing method
    Input words      Output words
    000000000000     0000000000001111
    111111111111     1111111122222222
    222222222222     2222333333333333
    333333333333
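The bit layout of Table B.5.1 can be written down directly in C. The following is a minimal sketch, assuming most-significant-bit-first ordering as the table suggests; the function names `pack4x12` and `unpack3x16` are illustrative and do not come from the design. The second function is the inverse mapping that the unpacker in the Inverse Modeler has to perform.

```c
#include <stdint.h>

/* Pack four 12-bit words into three 16-bit words, following Table B.5.1:
 *   out[0] = word0[11:0] . word1[11:8]
 *   out[1] = word1[7:0]  . word2[11:4]
 *   out[2] = word2[3:0]  . word3[11:0]                                   */
void pack4x12(const uint16_t in[4], uint16_t out[3])
{
    out[0] = (uint16_t)(((in[0] & 0xFFF) << 4)  | ((in[1] & 0xFFF) >> 8));
    out[1] = (uint16_t)(((in[1] & 0x0FF) << 8)  | ((in[2] & 0xFFF) >> 4));
    out[2] = (uint16_t)(((in[2] & 0x00F) << 12) |  (in[3] & 0xFFF));
}

/* Inverse mapping: recover four 12-bit words from three 16-bit words. */
void unpack3x16(const uint16_t in[3], uint16_t out[4])
{
    out[0] = (uint16_t)(in[0] >> 4);
    out[1] = (uint16_t)(((in[0] & 0x000F) << 8) | (in[1] >> 8));
    out[2] = (uint16_t)(((in[1] & 0x00FF) << 4) | (in[2] >> 12));
    out[3] = (uint16_t)(in[2] & 0x0FFF);
}
```

Packing the words 0xABC, 0xDEF, 0x123, 0x456 gives 0xABCD, 0xEF12, 0x3456, which makes the nibble boundaries of the table easy to see.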
B.5.4.1.3 Flushing of the buffer
The DRAM interface of the present invention collects a block of 32 16-bit packed words before writing it to the buffer. This means that data can become stuck in the DRAM interface at the end of a stream if the block is only partially complete. A flushing mechanism is therefore required: Hsppk signals the DRAM interface to write out its current, partially complete block unconditionally.
B.5.4.2.1 Imup (unpacker)
Imup performs three functions:
1) Unpacks data from its 16-bit format into 12-bit words.
Table B.5.2 Unpacking method
    Input words          Output words
    0000000000001111     000000000000
    1111111122222222     111111111111
    2222333333333333     222222222222
                         333333333333
2) Maintains correct data during flushing of the token buffer.
When the DRAM interface flushes by unconditionally writing out the current partially complete block, rubbish data remains in the block. Imup must delete this rubbish data, that is, all data from the FLUSH token to the end of the block.
3) Holds back data until the start-up criteria are met.
Data is output from this block only when a "valid" signal (stream_enable) has been received from Buffer Start-up, one for each different stream. The 12-bit data is then output to imex.
B.5.4.2.2 Imex (expander)
In the present invention, Imex expands all run-length codes into a run of zeros followed by a level.
B.5.4.2.3 Impad (padder)
Impad ensures that every DATA token body contains 64 or more words. It does this by padding the end of the token with zeros. DATA tokens are not checked for having more than 64 words in the body.
B.5.5 Block Implementation
B.5.5.1 Hsppk
Typically, splitting and packing are both done in a single cycle.
B.5.5.1.1 Splitting
First, the format must be determined:
   IF (datatoken)
      IF (lastformat == 1) use format 0a;
      ELSE IF (run == 0) use format 0;
      ELSE use format 1;
   ELSE use format 0a;
Next, the format bit must be determined:
   format 0:   format bit = 0;
   format 0a:  format bit = extension bit;
   format 1:   format bit = 1;
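The two decisions above can be sketched as a pair of C functions. This is an illustration under stated assumptions, not the circuit itself: the enum and function names are invented here, and the extra test on the extension bit is taken from the prose of B.5.4.1.1 (format 0 cannot be used when the extension bit is zero), which the pseudocode does not spell out.

```c
#include <stdbool.h>

enum hsppk_format { FORMAT_0, FORMAT_0A, FORMAT_1 };

/* Choose the 12-bit output format for the current word. */
enum hsppk_format choose_format(bool is_data_token,
                                enum hsppk_format last_format,
                                unsigned run, bool extension)
{
    if (!is_data_token)
        return FORMAT_0A;          /* normal tokens are simply truncated  */
    if (last_format == FORMAT_1)
        return FORMAT_0A;          /* level word that follows a run word  */
    if (run == 0 && extension)
        return FORMAT_0;           /* zero run: level only, E assumed 1   */
    return FORMAT_1;               /* run word; the level follows it      */
}

/* Format bit that accompanies each format. */
bool format_bit(enum hsppk_format f, bool extension)
{
    switch (f) {
    case FORMAT_0:  return false;
    case FORMAT_1:  return true;
    default:        return extension;  /* format 0a carries the E bit */
    }
}
```

A data word with a zero run and a clear extension bit falls through to format 1, so that the following format 0a word can carry the extension bit explicitly.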
If format 1 is used, no new data should be accepted in the next cycle, because the level of the code still has to be output.
B.5.5.1.2 Packing
The packing procedure cycles once every four valid data inputs. The 16-bit output word is formed from the last valid word, which has been held, and the word that follows it. If this combination is not valid, the output is not valid either. The procedure is:
Table B.5.3 Packing procedure
                 Held word     Following word  Combined word     Output
Valid cycle 0    xxxxxxxxxxxx  000000000000    xxxxxxxxxxxxxxxx  No output
Valid cycle 1    000000000000  111111111111    0000000000001111  Output
Valid cycle 2    111111111111  222222222222    1111111122222222  Output
Valid cycle 3    222222222222  333333333333    2222333333333333  Output
x denotes undefined bits in the table.
During valid cycle 0 no word is output, because the combined word is not valid.
The valid cycle number is maintained by a ring counter. It is incremented by valid data from the splitter and an accepted output.
When a FLUSH (or PICTURE_END) token has been received and the token itself is ready for output, a FLUSH signal is also output to the DRAM interface, and the valid cycle is reset to zero. If the FLUSH token arrives on any cycle other than cycle 3, the FLUSH signal is delayed by one cycle to ensure that the token itself is output.
B.5.5.2 Imodel
B.5.5.2.1 Imup (unpacker)
As with the packer, the last valid input is stored and combined with the next input, which allows unpacking.
Table B.5.4 Unpacking procedure
                 Following word    Held word         Unpacked word  Input
Valid cycle 0    0000000000001111  xxxxxxxxxxxxxxxx  000000000000   Input
Valid cycle 1    1111111122222222  0000000000001111  111111111111   Input
Valid cycle 2    2222333333333333  1111111122222222  222222222222   Input
Valid cycle 3    2222333333333333  1111111122222222  333333333333   No input
Here x denotes undefined bits.
The valid cycle is maintained by a ring counter. The unpacked data carries the token data, from which FLUSH and PICTURE_END are decoded. In addition, the format and extension bit are decoded from the unpacked data:
formatbit_is_extn = (lastformat == 1) || databody
format = databody && (formatbit && lastformatbit)
These are used for token decoding and are passed on to imex.
When a FLUSH (or PICTURE_END) token has been unpacked and output to imex, all data is deleted (Valid is forced low) until the block_end signal is received from the DRAM interface.
B.5.5.2.2 Imex (expander)
According to the present invention, imex is a four-state machine that expands run/level codes. The states are:
State 0: load the run count from the run code.
State 1: decrement the run count, outputting zeros.
State 2: input data and output the level; this is the default state.
State 3: illegal state.
B.5.5.2.3 Impad (padder)
Impad is informed of DATA token headers by imex. It then counts the number of coefficients in the token body. If the token ends before there are 64 coefficients, zero-valued coefficients are inserted at the end of the token to bring it up to 64 coefficients. For example, an unextended DATA header has 64 zero coefficients inserted after it. DATA tokens with 64 or more coefficients are not affected by impad.
B.5.6 Registers
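The run/level expansion performed by imex (B.5.5.2.2) and the padding performed by impad (B.5.5.2.3) can be sketched together in C. This is a behavioral sketch only, not the hardware state machine: the names `run_level`, `expand_and_pad` and `BLOCK_COEFFS` are assumptions made for the example, and token headers and the two-wire handshake are left out.

```c
#include <stddef.h>
#include <stdint.h>

#define BLOCK_COEFFS 64   /* minimum number of coefficients in a DATA body */

struct run_level {
    uint8_t run;          /* number of zero coefficients before the level */
    int16_t level;        /* the coefficient value itself                  */
};

/* Expand run/level pairs into a flat coefficient list, then pad with zeros
 * up to 64 coefficients. Returns the number of coefficients written, or 0
 * if the output array (capacity `cap`) is too small. */
size_t expand_and_pad(const struct run_level *codes, size_t n_codes,
                      int16_t *out, size_t cap)
{
    size_t n = 0;

    for (size_t i = 0; i < n_codes; i++) {
        if (n + codes[i].run + 1 > cap)
            return 0;
        for (unsigned r = 0; r < codes[i].run; r++)
            out[n++] = 0;             /* state 1: count the run down   */
        out[n++] = codes[i].level;    /* state 2: output the level     */
    }
    while (n < BLOCK_COEFFS) {        /* impad: fill short tokens      */
        if (n >= cap)
            return 0;
        out[n++] = 0;
    }
    return n;                         /* bodies of 64 or more are left alone */
}
```

With no codes at all the function returns 64 zeros, which corresponds to the unextended DATA header case mentioned above.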
Imodel and hsppk of the present invention have no microprocessor registers, with the exception of their snoopers.
Table B.5.5 Imodel and hsppk registers
   Register name  Usage        Address
   CED_H_SNP_2    VAxxxxxx     0x49
   CED_H_SNP_1    DDDDDDDD     0x4a
   CED_H_SNP_0    DDDDDDDD     0x4b
   CED_IM_SNP_1   VAExxDDD     0x4a
   CED_IM_SNP_0   DDDDDDDD     0x4d
Here V = valid bit; A = accept bit; E = extension bit; D = data bit.
B.5.7 Verification
Selected streams were run through Lsim simulations.
B.5.8 Test
Test coverage of imodel is through the token buffer output snooper at its input and imodel's own snooper at its output. The logic is covered by imodel's own scan chain.
The output of hsppk is accessible through the Huffman output snooper. The logic is visible through the Huffman scan chain.
B.6 Buffer Start-up
B.6.1 Introduction
This section describes the method and implementation of buffer start-up according to the present invention.
B.6.2 Overview
To ensure that a video stream can be displayed smoothly and continuously, a certain amount of data must be gathered before decoding can begin. This is called the start-up condition. The coding standards specify a VBV delay, which can be translated approximately into the amount of data that needs to be gathered. The purpose of Buffer Start-up is to ensure that each stream meets its start-up condition before its data is allowed to proceed from the token buffer for decoding. A stream is held in the buffers by a notional gate (the output gate). This gate is located at the output of the token buffer (that is, in the Inverse Modeler), and it is opened for a stream only once the start-up condition for that stream has been met.
B.6.3 Interfaces
Bscntbit (Buffer Start-up bit counter) is in the data path and communicates through two-wire interfaces. It is connected to the microprocessor, and it also connects by a two-wire interface to bsogl (Buffer Start-up Output Gate Logic). Bsogl controls imup (Inverse Modeler UnPacker), which implements the output gate, through a two-wire interface.
B.6.4 Block Structure
As shown in Figure 130, Bscntbit lies in the data path between the start code detector and the coded data buffer. This single-cycle block counts the valid words of data leaving the block and compares the count with the start-up condition (or target), which must be loaded by the microprocessor. Once the target has been met, bsogl is informed. The data itself is not affected by bscntbit.
Bsogl lies between bscntbit and imup (in the Inverse Modeler). In effect it is a queue of indicators that streams have met their targets. The queue is advanced by a stream leaving the buffers (that is, by a FLUSH token received in the data stream at imup), whereupon another "indicator" is accepted by imup. If the queue is empty (that is, there are no streams in the buffers that have met their start-up target), the stream is stalled at imup.
The queue has only a finite depth. However, the queue can be broken in bsogl and the microprocessor allowed to monitor it, and in this way the depth can be extended indefinitely. These two queue mechanisms are referred to as the internal queue and the external queue respectively.
B.6.5 Block Implementation
B.6.5.1 Bsbitcnt (Buffer Start-up bit counter)
Bscntbit counts all valid words input to Buffer Start-up. The counter (bsctr) is a programmable counter, 16 to 24 bits wide. In addition, bsctr has carry look-ahead circuitry to give it sufficient speed. The width of bsctr is programmed through ced_bs_prescale. This is done by forcing bits 8 to 16 high, so that they always pass a carry; in effect they are not used. Only the upper eight bits of bsctr are compared with the target (ced_bs_target).
The comparison (ced_bs_count >= ced_bs_target) is performed by bscmp.
The target is derived from the stream while the stream is in the Huffman decoder, and is computed by the microprocessor. It is therefore set only some time after the stream has started. Before start-up, target_valid is set low. A write to ced_bs_target sets target_valid high and allows the comparison to take place in bscmp. When the comparison shows ced_bs_count >= ced_bs_target, target_valid is set low: the target has been met.
When the target is met, the count is reset. Note that it is not reset at the end of the stream. In addition, counting is disabled once the target has been met, if this happens before the end of the stream. The count saturates at 255.
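The counting and comparison rules above can be modeled in a few lines of C. This is a rough behavioral sketch under several assumptions: the struct and function names are invented for the example, the effective counter width (16 to 24) is passed in directly because the exact encoding of ced_bs_prescale is not given in this text, and the comparison is assumed to be re-evaluated both when a word is counted and when the target is written.

```c
#include <stdbool.h>
#include <stdint.h>

struct bs_counter {
    uint32_t words;         /* raw count of valid words (24 bits used)     */
    unsigned width;         /* effective width of bsctr, 16..24 (assumed)  */
    uint8_t  target;        /* models ced_bs_target                        */
    bool     target_valid;  /* set by a write to the target                */
    bool     counting;      /* cleared once the target has been met        */
};

/* Upper eight bits of the counter, saturating at 255 (ced_bs_count). */
uint8_t bs_count(const struct bs_counter *c)
{
    uint32_t top = c->words >> (c->width - 8);
    return top > 255 ? 255 : (uint8_t)top;
}

/* bscmp: returns true when the target has just been met. */
static bool bs_compare(struct bs_counter *c)
{
    if (!c->target_valid || bs_count(c) < c->target)
        return false;
    c->target_valid = false;   /* target met                              */
    c->words = 0;              /* the count is reset when the target is met */
    c->counting = false;       /* no more counting until the stream ends  */
    return true;
}

bool bs_write_target(struct bs_counter *c, uint8_t target)
{
    c->target = target;
    c->target_valid = true;
    return bs_compare(c);
}

/* Account for one valid data word passing through the block. */
bool bs_count_word(struct bs_counter *c)
{
    if (c->counting && c->words < 0xFFFFFFu)
        c->words++;
    return bs_compare(c);
}

/* End of stream (FLUSH): counting resumes; the count is not reset here. */
void bs_flush(struct bs_counter *c)
{
    c->counting = true;
}
```

In this model a target of zero is met as soon as it is written, since the count can never be below it.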
When the end of a stream (that is, a FLUSH) is detected in bsbitcnt, a bs_flush_event is generated. If the stream ends before the target is met, a further event, bs_flush_before_target_met_event, is also generated. When either of these events occurs, the block is stalled. This allows the user to restart the target search for the next stream or, in the case of a bs_flush_before_target_met_event, either:
1) write a target of zero, which forces a target_met,
or:
2) note that the target was not met and allow the next stream to proceed until it and the previous stream together reach the target. The target for this next stream can, and should, be adjusted accordingly.
B.6.5.2 Bsogl (Buffer Start-up Output Gate Logic)
As described above, bsogl is a queue of indicators that streams have met their targets. The queue type is set with ced_bs_queue (0 for the internal queue, 1 for the external queue). The internal queue is selected on reset. The queue depth determines the maximum number of streams that can have met their targets; these streams may be in the coded data buffer, the Huffman decoder or the token buffer. When this number is reached (the queue is full), bsogl forces the data path to stall at bsbitcnt.
Using the internal queue requires no action from the microprocessor. If the queue depth needs to be increased, however, an external queue can be set up. To do this, set ced_bs_access to gain access to ced_bs_queue, set ced_bs_queue, enable target_met_event and stream_end_event, and then relinquish access.
The external queue (a count maintained by the microprocessor) is inserted into the internal queue. It is maintained through two events, target_met_event and stream_end_event, and one register, ced_bs_enable_nxt_stream. The two events can be thought of simply as service_queue_input and service_queue_output respectively. In effect, target_met_event is the upstream end of the internal queue, which supplies the external queue. Similarly, ced_bs_enable_nxt_stream is the downstream end of the internal queue, which consumes from the external queue, and stream_end_event is a request to supply the downstream queue; stream_end_event resets ced_bs_enable_nxt_stream.
The two events should be serviced as follows:
    /* TARGET_MET_EVENT */
    j = micro_read(CED_BS_ENABLE_NXT_STM);
    if (j == 0)    /* Is next stream enabled? */
    {              /* no, enable it */
        micro_write(CED_BS_ENABLE_NXT_STM, 1);
        printf(" enable next stream (queue=0x%x)\n", context->queue);
    }
    else           /* yes, increment the queue of "target_met" streams */
    {
        context->queue++;
        printf(" stream already enabled (queue=0x%x)\n", context->queue);
    }

    /* STREAM_END_EVENT */
    if (context->queue > 0)    /* are there any "target_mets" left? */
    {              /* yes, decrement the queue and enable another stream */
        context->queue--;
        micro_write(CED_BS_ENABLE_NXT_STM, 1);
        printf(" enable next stream (queue=0x%x)\n", context->queue);
    }
    else
        printf(" queue empty, cannot enable next stream (queue=0x%x)\n",
               context->queue);
    micro_write(CED_EVENT_1, 1 << BS_STREAM_END_EVENT);    /* clear event */
The queue type can be changed from internal to external at any time (by the method above), but it can be changed from external to internal only when the external queue is empty ("queue == 0" in the code above). To make this change, set ced_bs_access to gain access to ced_bs_queue, reset ced_bs_queue to 0 (internal), mask target_met_event and stream_end_event, and then relinquish access.
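The switching sequence can be sketched as follows. The register addresses are those listed in the register tables of B.6.6, and the polling of ced_bs_access follows the access rule stated there. Everything else is an assumption made for the example: `sim_regs` is a plain array standing in for the chip, `micro_read` and `micro_write` are simple stand-ins for the microprocessor interface routines, and a mask bit of 1 is assumed to mean "event enabled".

```c
#include <stdint.h>

/* Addresses from Tables B.6.1 and B.6.2. */
#define EVENT_MASK_REG       0x03
#define CED_BS_ACCESS        0x10
#define CED_BS_QUEUE         0x14
#define TARGET_MET_MASK_BIT  (1u << 4)   /* rrrDrrrr */
#define STREAM_END_MASK_BIT  (1u << 5)   /* rrDrrrrr */

/* Stand-in for the device: a simple register array (assumption). */
uint8_t sim_regs[256];

uint8_t micro_read(uint8_t addr)             { return sim_regs[addr]; }
void    micro_write(uint8_t addr, uint8_t v) { sim_regs[addr] = v; }

/* Select the external (non-zero) or internal (zero) queue.
 * The caller must make sure the external queue is empty before
 * switching back to the internal queue. */
void bs_select_queue(int external)
{
    uint8_t mask;

    micro_write(CED_BS_ACCESS, 1);
    while (micro_read(CED_BS_ACCESS) != 1)
        ;                                /* poll until access is granted */

    micro_write(CED_BS_QUEUE, external ? 1 : 0);

    mask = micro_read(EVENT_MASK_REG);
    if (external)
        mask |= (uint8_t)(TARGET_MET_MASK_BIT | STREAM_END_MASK_BIT);
    else
        mask &= (uint8_t)~(TARGET_MET_MASK_BIT | STREAM_END_MASK_BIT);
    micro_write(EVENT_MASK_REG, mask);

    micro_write(CED_BS_ACCESS, 0);       /* relinquish access */
}
```

On real hardware the polling loop would wait for the device to grant access; with the array stand-in it completes immediately.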
Alternatively, checking of stream start-up conditions can be disabled by setting ced_bs_queue (external), masking target_met_event and stream_end_event, and then setting ced_bs_enable_nxt_stream. All streams will then always be enabled.
B.6.6 Microprocessor registers
 Register name  Usage  Address
 CED_BS_ACCESS  xxxxxxxD  0x10
 CED_BS_PRESCALE*  xxxxxDDD  0x11
 CED_BS_TARGET*  DDDDDDDD  0x12
 CED_BS_COUNT*  DDDDDDDD  0x13
 BS_FLUSH_EVENT  rrrrrDrr  0x02
 BS_FLUSH_MASK  rrrrrDrr  0x03
 BS_FLUSH_BEFORE_TARGET_MET_EVENT  rrrrDrrr  0x02
 BS_FLUSH_BEFORE_TARGET_MET_MASK  rrrrDrrr  0x03
Table B.6.1 Bscntbit registers
 Register name   Usage   Address
 TARGET_MET_EVENT   rrrDrrrr 0x02
 TARGET_MET_MASK   rrrDrrrr 0x03
 STREAM_END_EVENT   rrDrrrrr 0x02
 STREAM_END_MASK   rrDrrrrr 0x03
 CED_BS_QUEUE*   xxxxxxxD 0x14
 CED_BS_ENABLE_NXT_STM*   xxxxxxxD 0x15
Table B.6.2 Bsogl registers
where
D is a register bit
x is a non-existent register bit
r is a reserved register bit
To gain access to these registers, ced_bs_access must be set to 1 and polled until it reads back 1, except within an interrupt service routine. Writing zero to ced_bs_access relinquishes access.
B.7 DRAM Interface
B.7.1 Overview
In the present invention, the Spatial Decoder, Temporal Decoder and Video Formatter each contain a DRAM interface block for that particular chip. In all three devices, the function of the DRAM interface is to transfer data from the chip to the external DRAM and from the external DRAM into the chip, at block addresses supplied by the address generator.
Typically, the DRAM interface operates from a clock that is asynchronous both to the address generator and to the clocks of the various blocks through which data is passed. This asynchronism is readily managed, however, because the clocks all run at approximately the same frequency.
Data is usually transferred between the DRAM interface and the rest of the chip in blocks of 64 bytes (the only exception being prediction data in the Temporal Decoder). Transfers take place through a device known as a "swing buffer". This is essentially a pair of RAMs operated in a double-buffered configuration: while the DRAM interface fills or empties one RAM, another part of the chip empties or fills the other. A separate bus, which carries an address from the address generator, is associated with each swing buffer.
Each chip has four swing buffers, but their function differs from chip to chip. In the Spatial Decoder, one swing buffer is used to transfer coded data to the DRAM, another to read coded data from the DRAM, the third to transfer tokenized data to the DRAM and the fourth to read tokenized data from the DRAM. In the Temporal Decoder, one swing buffer is used to write intra or predicted picture data to the DRAM, the second to read intra or predicted data from the DRAM, and the other two to read forward and backward prediction data. In the Video Formatter, one swing buffer is used to transfer data to the DRAM, and the other three are used to read data from the DRAM, one each for the luminance (Y) data and the red and blue color difference data (Cr and Cb respectively).
The following section describes the operation of a DRAM interface according to the present invention that has one write swing buffer and one read swing buffer, which is essentially the same as the operation of the Spatial Decoder DRAM interface. It is illustrated in Figure 131, "DRAM Interface".
B.7.2 A generic DRAM interface
Referring to Figure 131, the interfaces to the address generator 420 and to the blocks that supply and take the data are all two-wire interfaces. The address generator 420 may either generate addresses as a result of receiving control tokens, or it may merely generate a fixed sequence of addresses. The DRAM interface 421 treats its two-wire interfaces with the address generator in a special way. Instead of holding the accept line high when it is ready to receive an address, it waits for the address generator to supply a valid address, processes that address, and then sets the accept line high for one clock period. It therefore implements a request/acknowledge (REQ/ACK) protocol.
A distinctive feature of the DRAM interface is its ability to communicate with the address generator and with the blocks that supply or accept the data completely independently. For example, the address generator may generate an address associated with the data in the write swing buffer, but no action is taken until the write swing buffer signals that a block of data is ready to be written to the external DRAM 422. Similarly, the write swing buffer may hold a block of data ready to be written, but no action is taken until the address generator supplies an address on the appropriate bus. Further, once one of the RAMs in the write swing buffer has been filled with data, the other may be completely filled and "swung" to the DRAM interface side before the data input is stalled (the accept signal of the two-wire interface set low).
In understanding the operation of the DRAM interface of the present invention, it is important to note that in a correctly configured system the DRAM interface can transfer data between the swing buffers and the external DRAM at least as fast as the sum of all the average data rates between the swing buffers and the rest of the chip.
Each DRAM interface contains a method of deciding which swing buffer it will service next. In general, this is either a "round robin" or a priority encoder. In the round-robin method, the swing buffer serviced is the next available swing buffer that has least recently had a turn. In the priority encoder method, some swing buffers have a higher priority than others. In both cases, an additional request comes from a refresh request generator, and this request has a higher priority than all the others. The refresh request is generated by a refresh counter, which can be programmed through the microprocessor interface (MPI).
B.7.2.1 The swing buffers
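The round-robin selection with an overriding refresh request, as described for the generic DRAM interface in B.7.2, can be sketched in C. This is a simplified illustration: a rotating pointer is used as an approximation of "least recently serviced", and the names `arbitrate_round_robin`, `SERVICE_REFRESH` and `SERVICE_NONE` are assumptions made for the example.

```c
#define NUM_SWING_BUFFERS 4
#define SERVICE_REFRESH   (-1)
#define SERVICE_NONE      (-2)

/* Decide what the DRAM interface services next.
 *   requests        bit i is set when swing buffer i is ready for service
 *   refresh_request non-zero when the refresh counter has expired
 *   last            in/out: the swing buffer serviced most recently
 * A refresh request always wins; otherwise the search starts just after
 * the buffer that was serviced last, so every buffer gets its turn. */
int arbitrate_round_robin(unsigned requests, int refresh_request, int *last)
{
    if (refresh_request)
        return SERVICE_REFRESH;

    for (int step = 1; step <= NUM_SWING_BUFFERS; step++) {
        int candidate = (*last + step) % NUM_SWING_BUFFERS;
        if (requests & (1u << candidate)) {
            *last = candidate;
            return candidate;
        }
    }
    return SERVICE_NONE;
}
```

A priority encoder would differ only in always starting the search from buffer 0 instead of from the buffer after the one last serviced.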
Figure 132 shows a write swing buffer. It operates as follows:
1) Valid data is presented at the input 430 (data in). As each piece of data is accepted, it is written into RAM1 and the address is incremented.
2) When RAM1 is full, the input side gives up control and sends a signal to the read side to indicate that RAM1 is now ready to be read. This signal passes between two asynchronous clock regimes, and so passes through three synchronizing flip-flops.
3) The next item of data to arrive at the input side is written into RAM2, which is still empty.
4) When the round-robin or priority encoder indicates that it is the turn of this swing buffer to be read, the DRAM interface reads the contents of RAM1 and writes them to the external DRAM. A signal is then sent back across the asynchronous interface, as in (2), to indicate that RAM1 is now ready to be filled again.
5) If the DRAM interface empties RAM1 and "swings" it before the input side has filled RAM2, data can be accepted by the swing buffer continuously. Otherwise, once RAM2 is full, the swing buffer sets its accept signal low until RAM1 has been "swung" back for use by the input side.
6) This process is repeated indefinitely.
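The six steps above amount to a double-buffering scheme, which can be modeled in C as follows. This is a single-threaded sketch under stated assumptions: the clock-domain crossing and its synchronizing flip-flops are not modeled, each RAM is assumed to hold one 64-byte block as sixteen 32-bit words, and all names are illustrative.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SWING_WORDS 16   /* assumed: one 64-byte block of 32-bit words */

struct swing_buffer {
    uint32_t ram[2][SWING_WORDS];
    bool     full[2];    /* RAM has been handed over to the DRAM side   */
    int      fill;       /* RAM currently owned by the input side       */
    int      drain;      /* RAM the DRAM side will empty next           */
    int      addr;       /* write address within the RAM being filled   */
};

/* Input side. Returns false (accept low) when neither RAM is free. */
bool swing_write(struct swing_buffer *sb, uint32_t word)
{
    if (sb->full[sb->fill])
        return false;
    sb->ram[sb->fill][sb->addr++] = word;
    if (sb->addr == SWING_WORDS) {      /* RAM full: swing it across */
        sb->full[sb->fill] = true;
        sb->fill ^= 1;
        sb->addr = 0;
    }
    return true;
}

/* DRAM side. Copies one full RAM out and swings it back to the input
 * side; returns false when no RAM is ready to be emptied. */
bool swing_drain(struct swing_buffer *sb, uint32_t out[SWING_WORDS])
{
    if (!sb->full[sb->drain])
        return false;
    memcpy(out, sb->ram[sb->drain], sizeof sb->ram[sb->drain]);
    sb->full[sb->drain] = false;
    sb->drain ^= 1;
    return true;
}
```

As long as `swing_drain` is called at least once per sixteen writes, `swing_write` never stalls, which is the continuous-acceptance case of step 5.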
A read swing buffer operates in a similar way, but with the input and output data buses reversed.
B.7.2.2 Addressing of the external DRAM and swing buffers
The DRAM interface is designed to maximize the available memory bandwidth. It is therefore arranged so that each 8 x 8 block of data is stored in the same DRAM page. In this way, full use can be made of DRAM fast page access modes, in which one row address is supplied followed by many column addresses. In addition, a facility is provided that allows the data bus to the external DRAM to be 8, 16 or 32 bits wide, so that the amount of DRAM used can be matched to the size and bandwidth requirements of the particular application.
In this example (which is exactly how the DRAM interface on the Spatial Decoder works), the address generator supplies the DRAM interface with a block address for each of the read and write swing buffers. This address is used as the row address for the DRAM. The six bits of column address are supplied by the DRAM interface itself, and these bits are also used as the address for the swing buffer RAM. The data bus to the swing buffers is 32 bits wide, so if the bus to the external DRAM is less than 32 bits wide, two or four external DRAM accesses must be made before the next word is read from a write swing buffer or the next word is written to a read swing buffer (read and write refer to the direction of transfer relative to the external DRAM).
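The relationship between the external bus width and the number of DRAM accesses follows directly from the figures given above and can be worked through in C. The function names are illustrative; the block size of 64 bytes and the 32-bit swing buffer bus are taken from the text.

```c
#include <assert.h>

#define SWING_WORD_BITS 32   /* width of the bus to each swing buffer */
#define BLOCK_BYTES     64   /* one 8 x 8 block of data               */

/* External DRAM accesses needed to move one 32-bit swing buffer word. */
unsigned accesses_per_word(unsigned dram_bus_bits)
{
    assert(dram_bus_bits == 8 || dram_bus_bits == 16 || dram_bus_bits == 32);
    return SWING_WORD_BITS / dram_bus_bits;
}

/* Page-mode column accesses needed to move one 64-byte block. */
unsigned accesses_per_block(unsigned dram_bus_bits)
{
    unsigned words_per_block = BLOCK_BYTES * 8 / SWING_WORD_BITS;
    return words_per_block * accesses_per_word(dram_bus_bits);
}
```

With an 8-bit bus a block needs 64 column accesses, which matches the six bits of column address generated by the DRAM interface; with a 32-bit bus only 16 are needed.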
The situation is more complex for the Temporal Decoder and the Video Formatter. These are covered separately below.
B.7.3 DRAM interface timing
In the present invention, the DRAM interface timing block uses timing chains to place the edges of the DRAM signals to a precision of a quarter of the system clock period. Two quadrature clocks from the phase-locked loop are used. These are combined to form a notional 2x clock. Any one chain is then made from two shift registers in parallel, operating on opposite phases of the "2x clock".
First, one chain is used for the page-start cycle and another for the read/write/refresh cycles. The length of each cycle is programmable through the microprocessor interface (MPI), after which the page-start chain has a fixed length and the cycle chain changes length as appropriate during a page start.
On reset, the chains are cleared and a pulse is created. The pulse travels along the chains, directed by state information from the DRAM interface. The DRAM interface clock is generated from this pulse. Each DRAM interface clock period corresponds to one DRAM cycle. Because DRAM cycles have different lengths, the DRAM interface clock does not run at a constant rate.
Further timing chains combine the pulse from the above chains with information from the DRAM interface to generate the output strobes and enables (notcas, notras, notwe, notoe).
B.8 Inverse Quantizer
B.8.1 Introduction
This document describes the purpose, function and implementation of the Inverse Quantizer (iq) according to the present invention.
B.8.2 Overview
The Inverse Quantizer reconstructs coefficients from quantized coefficients, quantization weights and step sizes, all of which are transmitted in the data stream.
B.8.3 Interfaces
The iq lies between the Inverse Modeler and the inverse DCT in the data path, and is connected to the microprocessor. The data path connections are two-wire interfaces. The input data is 10 bits wide and the output is 11 bits wide.
B.8.4 Mathematics of inverse quantization
B.8.4.1 H.261 equations
For blocks coded in intra mode:
$$C'_i = 8\,Q_i, \qquad i = 0$$
(equation for $0 < i < 64$: see Figure A9510324604641)
$$C_i = \min(\max(C'_i,\,-2048),\,2047)$$
For blocks coded in all other modes:
(equation: see Figure A9510324604642)
$$C_i = \min(\max(C'_i,\,-2048),\,2047)$$
B.8.4.2 JPEG equations
$$C'_i = W_{i,j}\,Q_i + 1024, \qquad i = 0$$
$$C'_i = W_{i,j}\,Q_i, \qquad 0 < i < 64$$
$$C_i = \min(\max(C'_i,\,-2048),\,2047)$$
$$j = \mathrm{jpeg\_table\_indirection}(c)$$
B.8.4.3 MPEG equations
For blocks coded in intra mode:
$$C'_i = W_{i,j}\,Q_i + 1024, \qquad i = 0$$
(equation for $0 < i < 64$: see Figure A9510324604651)
$$C_i = \min(\max(C'_i,\,-2048),\,2047)$$
In the intra DC case, 1024 is added to compensate for the predictors in the Huffman decoder, which are reset to zero.
For all other coded blocks:
(equation: see Figure A9510324604652)
$$C_i = \min(\max(C'_i,\,-2048),\,2047)$$
B.8.4.4 JPEG variation equations
$$C'_i = \mathrm{floor}\!\left(\frac{2\;\mathrm{iq\_quant\_scale}\;W_{i,j}\,Q_i}{16}\right) + 1024, \qquad i = 0$$
$$C'_i = \mathrm{floor}\!\left(\frac{2\;\mathrm{iq\_quant\_scale}\;W_{i,j}\,Q_i}{16}\right), \qquad 0 < i < 64$$
$$C_i = \min(\max(C'_i,\,-2048),\,2047)$$
$$j = \mathrm{jpeg\_table\_indirection}(c)$$
B.8.4.5 All other tokens
All tokens other than DATA tokens must pass through the iq unquantized (that is, unchanged).
where:
Figure A9510324604661
Figure A9510324604662
Figure A9510324604663
floor(a) returns an integer such that
      (a-1) < floor(a) <= a    a >= 0
      a <= floor(a) < (a+1)    a <= 0
$Q_i$ is the quantized coefficient. $C_i$ is the reconstructed coefficient. $W_{i,j}$ is the value in the quantization table matrix; $i$ is the coefficient index along the zigzag; $j$ is the quantization table matrix number ($0 \le j \le 3$).
B.8.4.6 Multi-standard combination
It can be seen that all the above standards and their variations (including control data, which must not be changed by the iq) can be mapped onto a single equation:
$$\mathrm{OUTPUT} = \frac{(2\,\mathrm{INPUT} + k)\,(x\,y)}{16}$$
with the following post-inverse-quantization functions:
Add 1024
Signed magnitude converted to 2 complement representation
Be rounded to hithermost odd number toward all even numbers of 0 steering handle.
Making the result saturated is+2407 or-2048
The values of k, x and y, and the functions used, for each standard and its variations are shown in Table B.8.1.
Table B.8.1 Control decoding
Standard          X (weight)  Y (scale)       K  Add 1024  Round even  Saturate result  Convert to 2's complement
H.261  Intra DC   8           8               0  No        No          Yes              Yes
H.261  Intra      16          iq_quant_scale  1  No        Yes         Yes              Yes
H.261  Other      16          iq_quant_scale  1  No        Yes         Yes              Yes
JPEG   DC         Wij         8               0  Yes       No          Yes              Yes
JPEG   Other      Wij         8               0  No        No          Yes              Yes
MPEG   Intra DC   Wij         8               0  Yes       No          Yes              Yes
MPEG   Intra      Wij         iq_quant_scale  0  No        No          Yes              Yes
MPEG   Other      Wij         iq_quant_scale  1  No        Yes         Yes              Yes
XXX    DC         Wij         iq_quant_scale  0  Yes       No          Yes              Yes
XXX    Other      Wij         iq_quant_scale  0  No        No          Yes              Yes
Other tokens      1           8               0  No        No          No               No
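The single equation of B.8.4.6 and the control columns of Table B.8.1 can be sketched in C as follows. This is a minimal model under stated assumptions, not the circuit itself: the type `iq_ctrl`, the function `iq_reconstruct` and all field names are hypothetical, the input is taken as a separate magnitude and sign, the order in which the post-quantization steps are applied is one plausible choice, and a zero input is assumed to give zero (plus 1024 where selected), following the "zero in, zero out" rule of B.8.6.2.

```c
#include <assert.h>

/* One row of Table B.8.1. Field names are illustrative only. */
typedef struct {
    int x;          /* X weight: 8, 16, 1 or a quantization table value Wij */
    int y;          /* Y scale: 8 or iq_quant_scale */
    int k;          /* 0 or 1 */
    int add_1024;   /* add 1024 to the result */
    int round_even; /* move even results one step towards zero */
    int saturate;   /* clamp to the range -2048..+2047 */
    int to_twos;    /* apply the sign (signed magnitude to 2's complement) */
} iq_ctrl;

/* magnitude and negative together form the signed-magnitude input word. */
int iq_reconstruct(int magnitude, int negative, const iq_ctrl *c)
{
    long mag = 0;
    long v;

    assert(magnitude >= 0 && c != 0);

    /* OUTPUT = (2 * INPUT + k) * (x * y) / 16; zero input bypasses the arithmetic */
    if (magnitude != 0)
        mag = ((2L * magnitude + c->k) * c->x * c->y) / 16;

    /* even, non-zero magnitudes become the next odd value towards zero */
    if (c->round_even && mag != 0 && (mag % 2) == 0)
        mag -= 1;

    v = (c->to_twos && negative) ? -mag : mag;

    if (c->add_1024)
        v += 1024;

    if (c->saturate) {
        if (v > 2047)  v = 2047;
        if (v < -2048) v = -2048;
    }
    return (int)v;
}
```

With the "Other tokens" row (x = 1, y = 8, k = 0, no post-processing) the expression reduces to OUTPUT = INPUT, which is how the same datapath can carry control words unchanged.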
B.8.5 Block structure
From B.8.4.6 and Table B.8.1, it can be seen that a multi-standard inverse quantizer can be implemented with a single structure. Its arithmetic block diagram is shown in Figure 133, "Arithmetic block".
Control of the arithmetic block can be divided functionally into two parts:
Decoding tokens to load the status registers or the quantization tables.
Decoding the status registers into control signals.
Tokens are decoded in iqca, which controls the next cycle, i.e. the register bank of iqcb. It also controls access to the four quantization tables in iqram. The arithmetic, namely two multipliers and the post-quantization functions, is in iqarith. The complete block diagram of the iq is shown in Figure 134.
B.8.6 Block implementation
B.8.6.1 iqca
In the present invention, iqca is a state machine that decodes tokens into control signals for iqram and for the registers in iqcb. It is better regarded as one state machine per token, because it is reset by each new token. For example:
The code for QUANT_SCALE (see B.8.7.4, "QUANT_SCALE") and QUANT_TABLE (see B.8.7.6, "QUANT_TABLE") is as follows:
if (tokenheader == QUANT_SCALE) {
    sprintf(preport, "QUANT_SCALE");
    reg_addr = ADDR_IQ_QUANT_SCALE;
    rnotw = WRITE;
    enable = 1;
}
if (tokenheader == QUANT_TABLE) {   /* QUANT_TABLE token */
    switch (substate) {
    case 0:   /* quantisation table header */
        sprintf(preport, "QUANT_TABLE_%s_s0",
                (headerextn ? "(full)" : "(empty)"));
        nextsubstate = 1;
        insertnext = (headerextn ? 0 : 1);
        reg_addr = ADDR_IQ_COMPONENT;
        rnotw = WRITE;
        enable = 1;
        break;
    case 1:   /* quantisation table body */
        sprintf(preport, "QUANT_TABLE_%s_s1",
                (headerextn ? "(full)" : "(empty)"));
        nextsubstate = 1;
        insertnext = (headerextn ? 0 : (qtm_addr_63 == 0));
        reg_addr = USE_QTM;
        rnotw = (headerextn ? WRITE : READ);
        enable = 1;
        break;
    default:
        sprintf(preport,
                "ERROR in iq quantisation table tokendecoder (substate %x)\n",
                substate);
        break;
    }
}
Here the substate is a state within a token. For example, QUANT_SCALE has only one substate, whereas QUANT_TABLE has two: one for the header and a second for the token body.
The state machine is implemented as a PLA. An unrecognized token raises no word line, so the PLA outputs default (harmless) controls.
In addition, iqca supplies addresses to iqram from a BodyWord counter and inserts words into the stream, for example for a QUANT_TABLE without extension (see B.8.7.6). This is done by stalling the input while holding the output valid. A succeeding block (iqcb or iqarith) can then fill these words with the correct data.
iqca is a single stage in the datapath, controlled by a two-wire interface.
B.8.6.2 iqcb
In the present invention, iqcb holds the iq status registers. Under the control of iqca, their values are loaded from, or unloaded onto, the datapath.
The status register values are decoded (see Table B.8.1) into control lines for iqarith, which control the XY multiplier terms and the post-quantization functions.
The sign bit of the datapath is separated off here and sent to the post-quantization functions. Zero-valued words on the datapath are also detected here; the arithmetic is then ignored and zero is multiplexed onto the datapath. This is the simplest way to comply with the rule "zero in to the iq, zero out".
The status registers can be accessed from the microprocessor only when the register iq_access has been set to 1 and reads back 1. In this state iqcb has halted the datapath, which guarantees that the registers hold stable values and that no data is corrupted in the datapath.
iqcb is a single stage in the datapath, controlled by a two-wire interface.
B.8.6.3 iqram
iqram must hold up to four quantization table matrices (QTMs), each of 64 × 8 bits. It is therefore a 256 × 8-bit six-transistor RAM, capable of one read or one write per cycle. The RAM is enclosed in two-wire interface logic and receives its control and write data from iqca. Its read data goes to iqarith. iqram occupies the same cycle in the datapath as iqcb.
The RAM can be read and written from the microprocessor when iq_access reads back 1. The RAM is accessed through a keyhole register, iq_qtm_keyhole, which is addressed by iq_qtm_keyhole_addr. Each access to iq_qtm_keyhole increments the address held in iq_qtm_keyhole_addr. iq_qtm_keyhole_addr can also be written directly.
B.8.6.4 iqarith
Note that iqarith is a three-stage pipeline, taking three cycles. Its functions are discussed below (see Figure 133).
B.8.6.4.1 XY multiplier
This is a 5-bit (X) by 8-bit (Y) carry-save unsigned multiplier, feeding the datapath multiplier. The multiplier and multiplicand are selected by control lines from iqcb. The multiplication takes place in the first cycle and the resolving adder operates in the second.
At the input of the multiplier, data from iqram can be multiplexed onto the datapath in order to read a QUANT_TABLE onto the datapath.
B.8.6.4.2 (XY) × datapath multiplier
This 13-bit (XY) by 12-bit (datapath) carry-save unsigned multiplier is spread over the three cycles of the block. Three partial products are formed in the first cycle, seven in the second and the remaining two in the third.
Because all outputs from the multiplier are either less than 2047 (non-coefficient data) or saturated to +2047/-2048, the top 12 bits never need to be resolved. Accordingly, the resolving adder is only two bits wide. On the remaining high-order bits of the product, a zero detect is sufficient as a saturation signal.
B.8.6.4.3 Post-quantization functions
The post-quantization functions are:
Add 1024
Convert signed magnitude to 2's complement representation
Round all even numbers to the nearest odd number towards zero
Saturate the result to +2047 or -2048
Set the output to zero (see B.8.6.2)
The first three functions are implemented in a 12-bit adder (pipelined over the second and third cycles). Table B.8.2 shows what each function requires; these requirements are then combined in a single adder.
Table B.8.2 Post-quantization adder functions
Function                       if datapath > 0   if datapath < 0
Convert to 2's complement      Do nothing        Invert and add 1
Round all even numbers         Subtract 1        Add 1
Add 1024                       Add 1024          Add 1024
As one of ordinary skill in the art will appreciate, care is needed when rearranging the order of these functions, because they interact strongly when combined.
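The way the three functions of Table B.8.2 collapse into one addition can be illustrated with a small C model. It is only a sketch under assumptions: the function name `post_quant_add` and its parameters are hypothetical, the datapath value arrives as a magnitude plus a sign flag, and a zero magnitude is treated as not needing rounding.

```c
#include <assert.h>

/* Fold "convert to 2's complement", "round even" and "add 1024"
   into one operand plus one constant, as in Table B.8.2. */
int post_quant_add(int magnitude, int negative, int round_even, int add_1024)
{
    int is_even, operand, constant = 0;

    assert(magnitude >= 0 && magnitude <= 2047);

    is_even = (magnitude != 0) && ((magnitude & 1) == 0);

    /* negative values enter the adder inverted; positive ones unchanged */
    operand = negative ? ~magnitude : magnitude;

    if (negative)
        constant += 1;                    /* completes the 2's complement negation */
    if (round_even && is_even)
        constant += negative ? 1 : -1;    /* one step towards zero */
    if (add_1024)
        constant += 1024;

    return operand + constant;            /* the single addition */
}
```

For example, a magnitude of 28 with rounding enabled gives 27 when positive and -27 when negative; in both cases only one adder pass is needed because the per-function corrections have been summed into `constant` first.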
The saturation values, 0 and 0+1024 are multiplexed onto the datapath at the end of the third cycle.
B.8.7 Inverse quantizer tokens
The following describes how the inverse quantizer responds to each token. In all cases the token is also passed to the output of the inverse quantizer. In most cases the token is not modified by the inverse quantizer; the exceptions are noted below. All unrecognized tokens are passed to the output of the inverse quantizer unchanged.
B.8.7.1 SEQUENCE_START
This token resets the registers iq_prediction_mode[1:0] and iq_mpeg_indirection[1:0] to 0.
B.8.7.2 CODING_STANDARD
This token loads iq_standard[1:0] with a value appropriate to the standard currently being decoded (MPEG, JPEG or H.261).
B.8.7.3 PREDICTION_MODE
This token loads iq_prediction_mode[1:0]. Although PREDICTION_MODE carries more than two bits, the inverse quantizer needs access only to the two least significant bits. These determine whether or not a block is intra coded.
B.8.7.4 QUANT_SCALE
This token loads iq_quant_scale[4:0].
B.8.7.5 DATA
In the present invention, this token carries the actual quantized coefficients. The token header contains two bits identifying the colour component, and these are loaded into iq_component[1:0]. The following 64 token words contain the quantized coefficients. They are modified by the inverse quantizer and replaced with the reconstructed coefficients.
If the token does not contain exactly 64 extension words, the behaviour of the inverse quantizer is undefined.
DATA tokens at the input of the inverse quantizer carry quantized coefficients. These are represented in 11 bits (10 bits plus a sign bit) in signed-magnitude format. The value "negative zero" should not be used, but will be correctly interpreted as 0.
DATA tokens at the output of the inverse quantizer carry reconstructed coefficients. These are represented in 12 bits (11 bits plus a sign bit) in 2's complement format. A DATA token at the output has the same number of token extension words as it had at the input of the inverse quantizer.
B.8.7.6 QUANT_TABLE
This token can be used either to load a new quantization table or to read an existing one. Typically, in an inverse quantizer, the token is used to load a new table that has been decoded from the bitstream. Reading an existing table is useful in the forward quantizer of an encoder, if that table is to be encoded into the bitstream.
The token header contains two bits identifying the table number to be used. These are placed in iq_component[1:0]. Note that this register now holds a "table number" rather than a colour component.
If the extension bit of the token header is 1, the inverse quantizer expects exactly 64 extension token words. Each is treated as a quantization table value and placed in successive locations of the appropriate table, starting at location 0. The ninth bit of each extension token word is ignored. The token is also passed unmodified to the output of the inverse quantizer in the normal way.
If the extension bit of the token header is 0, the inverse quantizer reads successive locations of the appropriate table, starting at location 0. Each location becomes an extension token word (the ninth bit is 0). At the end of this operation the token contains exactly 64 extension token words.
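The two forms of the QUANT_TABLE token can be modelled in a few lines of C. The sketch below is an illustration only: the function `quant_table_token`, the array `iq_quant_table` as a plain C array and the use of -1 as a return value are assumptions made for the example; the hardware is described only as leaving other word counts undefined.

```c
#include <assert.h>
#include <stdint.h>

#define QTM_COUNT 4
#define QTM_SIZE  64

/* Software stand-in for the four quantization tables held in iqram. */
static uint8_t iq_quant_table[QTM_COUNT][QTM_SIZE];

/* words:  the token's extension words (9-bit values held in 16 bits)
   n_in:   number of extension words present at the input
   Returns the number of extension words at the output, or -1 for the
   cases the text leaves undefined. */
int quant_table_token(unsigned table, int header_extn,
                      uint16_t words[QTM_SIZE], int n_in)
{
    int i;

    assert(table < QTM_COUNT);

    if (header_extn) {
        if (n_in != QTM_SIZE)
            return -1;
        for (i = 0; i < QTM_SIZE; i++)   /* load: ninth bit is ignored */
            iq_quant_table[table][i] = (uint8_t)(words[i] & 0xFF);
        return QTM_SIZE;                 /* token passes through unmodified */
    }

    if (n_in != 0)
        return -1;
    for (i = 0; i < QTM_SIZE; i++)       /* read: ninth bit is 0 */
        words[i] = iq_quant_table[table][i];
    return QTM_SIZE;
}
```

A non-extended token therefore leaves the block carrying the 64 stored values, which is the behaviour an encoder's forward quantizer would use to place a table in the bitstream.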
The behaviour of the inverse quantizer for this token is undefined for all numbers of extension words other than 0 and 64.
B.8.7.7 JPEG_TABLE_SELECT
This token is used to load or unload the translation from colour component to table number, to or from iq_jpeg_indirection. These translations are used for JPEG and other standards.
The token header contains two bits identifying the colour component of current interest. These are placed in iq_component[1:0].
If the extension bit of the token header is 1, the token should contain one extension word, whose two least significant bits are written to location iq_jpeg_indirection[2*iq_component[1:0]+1 : 2*iq_component[1:0]]. If the extension bit is 0, that location is read instead: the value just read becomes a token extension word (the upper seven bits are zero), and at the end of this operation the token contains exactly one token extension word.
Table B.8.3 Effect of JPEG_TABLE_SELECT
Colour component in header    Bits of iq_jpeg_indirection accessed
    0                             [1:0]
    1                             [3:2]
    2                             [5:4]
    3                             [7:6]
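The bit-field addressing of Table B.8.3 amounts to a shift and mask on an eight-bit register. The following C sketch shows one way to express it; the function names and the use of a file-scope variable to stand in for the register are assumptions of the example.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the register: four 2-bit table numbers, one per component. */
static uint8_t iq_jpeg_indirection;

/* Extended form of the token: write the table number for a component. */
void jpeg_table_select_write(unsigned component, unsigned table)
{
    unsigned shift;

    assert(component < 4);
    shift = 2u * component;   /* field is bits [2c+1 : 2c] */
    iq_jpeg_indirection = (uint8_t)((iq_jpeg_indirection & ~(3u << shift))
                                    | ((table & 3u) << shift));
}

/* Non-extended form of the token: read the table number back. */
unsigned jpeg_table_select_read(unsigned component)
{
    assert(component < 4);
    return (iq_jpeg_indirection >> (2u * component)) & 3u;
}
```

Writing table 1 for component 0, table 3 for component 1 and table 2 for component 3 leaves the register holding binary 10 00 11 01, with the untouched field for component 2 still zero.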
B.8.7.8 MPEG_TABLE_SELECT
When processing to the MPEG standard, this token is used to select between the default and the user-defined quantization tables. The token header contains two bits. Bit 0 of the header determines which bit of iq_mpeg_indirection is written. Bit 1 is the value written to that location.
Because the iq_mpeg_indirection[1:0] register is cleared by the SEQUENCE_START token, this token needs to be used only when a user-defined quantization table has been sent in the bitstream.
B.8.8 Microprocessor registers
B.8.8.1 iq_access
For the microprocessor to access any iq register, iq_access must be set to 1 and then polled until it reads back 1 (see B.8.6.2). If this is not done, the registers being read are still under datapath control and are therefore unstable. In the case of iqram, access is locked out and reads return 0.
Writing 0 to iq_access returns control to the datapath.
B.8.8.2 iq_coding_standard[1:0]
This register holds the coding standard, i.e. the standard implemented by the inverse quantizer.
Table B.8.4 Coding standard values
    iq_coding_standard    Coding standard
    0                     H.261
    1                     JPEG
    2                     MPEG
    3                     XXX
This register is loaded by the CODING_STANDARD token.
Although this is a two-bit register, it currently occupies eight bits in the memory map, so that more than the above standards can be handled in future.
B.8.8.3 iq_mpeg_indirection[1:0]
During MPEG decoding, this two-bit register records which quantization tables are to be used.
iq_mpeg_indirection[0] controls the table used for intra-coded blocks. If it is 0, quantization table 0 is used, which is expected to contain the default quantization table. If it is 1, quantization table 2 is used, which is expected to contain the user-defined quantization table for intra-coded blocks.
This register is loaded by the MPEG_TABLE_SELECT token and reset to 0 by the SEQUENCE_START token.
B.8.8.4 iq_jpeg_indirection[7:0]
This eight-bit register determines, for each of the four possible colour components that can occur in a JPEG scan, which of the four quantization tables is to be used.
Bits [1:0] hold the table number to be used for component 0.
Bits [3:2] hold the table number to be used for component 1.
Bits [5:4] hold the table number to be used for component 2.
Bits [7:6] hold the table number to be used for component 3.
This register is affected by the JPEG_TABLE_SELECT token.
B.8.8.5 iq_quant_scale[4:0]
This register holds the current value of the quantization scale factor. It is loaded by the QUANT_SCALE token.
B.8.8.6 iq_component[1:0]
This register normally holds a value that is translated into a quantization table matrix (QTM) number. It is loaded by several tokens.
The DATA token header loads this register with the colour component of the block being processed. This information is used only in JPEG and JPEG variations, where the QTM number is determined by accessing iq_jpeg_indirection[7:0]. In the other standards iq_component[1:0] is ignored.
The JPEG_TABLE_SELECT token loads this register with a colour component. It is then used as an index into iq_jpeg_indirection[7:0], which is accessed by the token body.
The QUANT_TABLE token loads this register with a QTM number. That table is then either loaded from the token (if the extended form is used) or read out to form a properly extended token.
B.8.8.7 iq_prediction_mode[1:0]
This two-bit register holds the prediction mode to be used for subsequent blocks. The inverse quantizer uses this information only to decide whether or not intra coding is in use. If both bits of the register are 0, subsequent blocks are intra coded.
This register is loaded by the PREDICTION_MODE token and reset to 0 by the SEQUENCE_START token.
In JPEG and JPEG variation modes, iq_prediction_mode[1:0] has no effect on operation.
B.8.8.8 iq_jpeg_indirection[7:0]
iq_jpeg_indirection is used as a lookup table that translates a colour component into a QTM number. Accordingly, iq_component is used as an index into iq_jpeg_indirection, as shown in Table B.8.3.
With the extended form of the JPEG_TABLE_SELECT token, this register location is written directly.
With the non-extended form of the JPEG_TABLE_SELECT token, this register location is read directly.
B.8.8.9 iq_quant_table[3:0][63:0][7:0]
There are four quantization tables, each with 64 locations. Each location is an 8-bit value. The value 0 should not be used in any location.
These registers are implemented as the RAM described in B.8.6.3, "iqram".
These tables can be loaded with the QUANT_TABLE token.
Note that the data in these tables is stored in zig-zag scan order. Many documents represent a quantization table as an 8 × 8 matrix of numbers, usually with the DC term at the top left, horizontal frequency increasing from left to right and vertical frequency increasing from top to bottom. Such tables must be read along the zig-zag scan path as the numbers are placed into the quantization table in order of increasing "i".
B.8.9 Microprocessor register map
Table B.8.5 Memory map
Register                    Address   Direction   Reset state
 iq_access                  0x30      R/W         0
 iq_coding_standard[1:0]    0x31      R/W         0
 iq_quant_scale[4:0]        0x32      R/W         ?
 iq_component[1:0]          0x33      R/W         ?
 iq_prediction_mode[1:0]    0x34      R/W         0
 iq_jpeg_indirection[7:0]   0x35      R/W         ?
 iq_mpeg_indirection[1:0]   0x36      R/W         0
 iq_qtm_keyhole_addr[7:0]   0x38      R/W         0
 iq_qtm_keyhole[7:0]        0x39      R/W         ?
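The access sequence of B.8.8.1 and the keyhole addressing of B.8.6.3 can be exercised against a small software model using the addresses of Table B.8.5. Everything below is an assumption made for illustration: the `iq_model` structure, the `mpi_write`, `mpi_read` and `load_qtm` functions, the immediate granting of access, the choice not to advance the address while locked out, and the placement of table t at RAM addresses 64t to 64t+63.

```c
#include <assert.h>
#include <stdint.h>

enum { IQ_ACCESS = 0x30, IQ_QTM_KEYHOLE_ADDR = 0x38, IQ_QTM_KEYHOLE = 0x39 };

/* Simulated device state; real hardware would sit behind the MPI bus. */
typedef struct {
    uint8_t access;        /* 1 when the datapath is halted for the MPI */
    uint8_t keyhole_addr;  /* wraps naturally at 256 */
    uint8_t ram[256];      /* four 64 x 8-bit quantization tables */
} iq_model;

void mpi_write(iq_model *iq, unsigned addr, uint8_t value)
{
    switch (addr) {
    case IQ_ACCESS:           iq->access = value & 1u;  break;
    case IQ_QTM_KEYHOLE_ADDR: iq->keyhole_addr = value; break;
    case IQ_QTM_KEYHOLE:      /* write, then auto-increment the address */
        if (iq->access) iq->ram[iq->keyhole_addr++] = value;
        break;
    default: break;
    }
}

uint8_t mpi_read(iq_model *iq, unsigned addr)
{
    switch (addr) {
    case IQ_ACCESS:           return iq->access;
    case IQ_QTM_KEYHOLE_ADDR: return iq->keyhole_addr;
    case IQ_QTM_KEYHOLE:      /* locked out reads return 0 */
        return iq->access ? iq->ram[iq->keyhole_addr++] : 0;
    default:                  return 0;
    }
}

/* Load one 64-entry table through the keyhole. */
void load_qtm(iq_model *iq, unsigned table, const uint8_t values[64])
{
    int i;

    assert(table < 4);
    mpi_write(iq, IQ_ACCESS, 1);
    while (mpi_read(iq, IQ_ACCESS) != 1) { }   /* poll until granted */
    mpi_write(iq, IQ_QTM_KEYHOLE_ADDR, (uint8_t)(table * 64u));
    for (i = 0; i < 64; i++)
        mpi_write(iq, IQ_QTM_KEYHOLE, values[i]);
    mpi_write(iq, IQ_ACCESS, 0);               /* hand control back */
}
```

Because the address register advances on every keyhole access, a whole table needs only one address write followed by 64 data writes.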
B.8.10 Test
Test coverage of the inverse quantizer is provided at its input by the snooper at the output of the inverse modeller, and at its output by the inverse quantizer's own snooper. The logic is covered by the inverse quantizer's own scan chain.
If the ramtest signal is asserted, iqram can be accessed without going through iq_access.
B.9 IDCT
B.9.1 Introduction
The purpose of this description of the inverse discrete cosine transform (IDCT) block is to provide a source of engineering information on the IDCT. It includes the following information:
The purpose and main features of the IDCT
How it was designed and verified
Its structure
This description is intended to give one of ordinary skill in the art sufficient information to facilitate or assist with the following tasks:
Understanding the IDCT as a "silicon macro function"
Integrating the IDCT into another device
Developing test programs for IDCT silicon
Modifying, redesigning or maintaining the IDCT
Developing a forward DCT block.
B.9.2 Overview
A discrete cosine transform/zig-zag (DCT/ZZ) block transforms blocks of pixels. Each block represents a screen area 8 pixels high by 8 pixels wide. The purpose of the transform is to represent the pixel block in the frequency domain, sorted by frequency. Because the eye is sensitive to the DC component of an image but much less sensitive to high-frequency components, the frequency data allows the magnitude of each component to be reduced separately, according to the eye's sensitivity. This reduction of magnitude is called quantization. Quantization reduces the information contained in the image; that is, it is lossy. Lossy processes achieve overall data compression by discarding some information. The frequency data is sorted so that the high frequencies occur together. The high frequencies are the most likely to be quantized to zero. These runs of zeros mean that run-length coding of the quantized data yields further compression, although run-length coding is not itself normally a lossy process.
The IDCT block (which in fact comprises an inverse zig-zag RAM, or IZZ, and an IDCT) converts the sorted frequency-domain data into spatial data. This inverse sorting is the function of the IZZ.
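The reordering performed by the IZZ can be illustrated in software. The sketch below assumes the conventional 8 × 8 zig-zag path used by JPEG, MPEG and H.261, which this text does not reproduce; the function names are hypothetical, and the RAM-based implementation, the vertical-scan option and the extra row reordering of the hardware block are not modelled.

```c
#include <assert.h>

/* Fill scan[i] with the raster index (row * 8 + column) of the i-th
   coefficient along the conventional 8 x 8 zig-zag path. */
void build_zigzag(int scan[64])
{
    int i = 0, s, t;

    for (s = 0; s < 15; s++) {          /* s = row + column of one anti-diagonal */
        int lo = s < 8 ? 0 : s - 7;     /* smallest row on this diagonal */
        int hi = s < 8 ? s : 7;         /* largest row on this diagonal */
        for (t = lo; t <= hi; t++) {
            /* even diagonals run upwards (row decreasing), odd ones downwards */
            int row = (s % 2 == 0) ? (hi + lo - t) : t;
            scan[i++] = row * 8 + (s - row);
        }
    }
}

/* Inverse zig-zag: place the i-th word of the token at its raster position. */
void inverse_zigzag(const int in[64], int out[64])
{
    int scan[64], i;

    build_zigzag(scan);
    for (i = 0; i < 64; i++)
        out[scan[i]] = in[i];
}
```

The first few entries of the generated path are 0, 1, 8, 16, 9, 2, so the third word of a DATA token lands in the first column of the second row.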
The image decompression system represents pixels as integers, and the IDCT block is part of that system. The IDCT block must therefore accept and produce integer values. However, because the IDCT function is not integer based, the internal number representation uses fractional bits to preserve internal accuracy. Full floating-point arithmetic would be preferable, but the implementation described here uses fixed-point arithmetic. Fixed-point arithmetic involves some loss of accuracy, but the accuracy of this implementation exceeds that specified by H.261 and the IEEE.
B.9.3 Design objectives
According to this invention, the main design objective was a functionally correct IDCT block that uses a minimum of silicon area. The design was also required to run at a clock speed of 30 MHz under the specified operating conditions. The design also allows for future needs: higher clock rates will be required, and wherever possible the architecture of the design allows for this.
B.9.4 IDCT interface description
The IDCT block has the following interfaces:
A 12-bit wide token data input port
A 9-bit wide token data output port
A microprocessor interface (MPI) port
A system services input port
A test interface
Resynchronization signals
Both token data ports are of the standard two-wire interface type described previously. The widths given are the number of bits in the data representation, not the total number of wires in the port. In addition, a clock and a reset signal are associated with the input token data port; the reset signal is used to resynchronize with the output of the preceding block. Two resynchronizing clocks are also associated with the output token data port, for use by the succeeding block.
The MPI is standard, with a four-bit address. There are also three externally decoded select inputs, which select the address spaces of the events, the internal registers and the test registers respectively. This mechanism gives the flexibility to map the IDCT address space to different locations in different chips. There is also a single event output, idctevent, and two I/O signals, n_derrd and n_serrd. These are the event tri-state data wires, to be connected, external to the IDCT, to the appropriate bits of the microprocessor notdata bus.
The system services port comprises the standard clock and reset input signals, together with the two-phase override clocks and the associated override clock mode select input.
The JTAG test interface comprises clock and reset signals, scan-path data and control signals, and the ramtest and chiptest input signals.
In normal operation the microprocessor port is inactive, because the IDCT does not require any microprocessor access to perform its specified function. Similarly, the test interface is active only when testing or verification is required.
B.9.5 Mathematical basis of the discrete cosine transform
In video bandwidth compression, the input data represents a rectangular area of an image, so the transform applied must be two-dimensional. Two-dimensional transforms are difficult to compute efficiently, but the two-dimensional DCT has the property of separability: the transform can be computed along each dimension independently of the other. This implementation uses a one-dimensional IDCT algorithm. The algorithm is designed specifically for mapping onto hardware and is not appropriate for a software model. The one-dimensional algorithm is applied successively to obtain a two-dimensional result.
For an N × N block of pixels, the two-dimensional DCT is defined mathematically as follows:
Formula 10, forward DCT:
$$Y(j,k) = \frac{2}{N}\, c(j)\, c(k) \sum_{m=0}^{N-1} \sum_{n=0}^{N-1} X(m,n) \cos\left[\frac{(2m+1)j\pi}{2N}\right] \cos\left[\frac{(2n+1)k\pi}{2N}\right]$$
Formula 11, inverse DCT:
$$X(m,n) = \frac{2}{N} \sum_{j=0}^{N-1} \sum_{k=0}^{N-1} c(j)\, c(k)\, Y(j,k) \cos\left[\frac{(2m+1)j\pi}{2N}\right] \cos\left[\frac{(2n+1)k\pi}{2N}\right]$$
where
        j, k = 0, 1, ..., N-1
The above definition is mathematically equivalent to multiplying two N × N matrices, twice over, with a matrix transposition between the two multiplications. A one-dimensional DCT is mathematically equivalent to multiplying two N × N matrices. The two-dimensional case is, mathematically:
$$Y = [XC]^{T} C$$
where C is the matrix of cosine terms.
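The separability property can be checked numerically with a short floating-point model: Formula 10 evaluated directly, and Formula 11 evaluated as two passes of a one-dimensional inverse transform. This is a reference-style sketch under assumptions, not the fast fixed-point algorithm of the hardware. It assumes N = 8 and the usual normalization c(0) = 1/√2, c(j) = 1 otherwise, which this text does not spell out; the function names are hypothetical, and the cosines come from a small constant table so that the example needs no maths library.

```c
#include <assert.h>

#define N 8

/* cos(a * pi / 16) for a = 0..8 */
static const double COS16[9] = {
    1.0, 0.9807852804032304, 0.9238795325112867, 0.8314696123025452,
    0.7071067811865476, 0.5555702330196022, 0.3826834323650898,
    0.1950903220161283, 0.0
};

/* cos((2m+1) * j * pi / 16), folded onto the quarter-wave table */
static double cos_term(int m, int j)
{
    int a = ((2 * m + 1) * j) % 32;
    if (a <= 8)  return  COS16[a];
    if (a <= 16) return -COS16[16 - a];
    if (a <= 24) return -COS16[a - 16];
    return COS16[32 - a];
}

/* Assumed normalization factor: 1/sqrt(2) for the DC term, 1 otherwise */
static double c(int j) { return j == 0 ? COS16[4] : 1.0; }

/* Formula 10, evaluated directly */
void dct2d_direct(double X[N][N], double Y[N][N])
{
    int j, k, m, n;
    for (j = 0; j < N; j++)
        for (k = 0; k < N; k++) {
            double sum = 0.0;
            for (m = 0; m < N; m++)
                for (n = 0; n < N; n++)
                    sum += X[m][n] * cos_term(m, j) * cos_term(n, k);
            Y[j][k] = (2.0 / N) * c(j) * c(k) * sum;
        }
}

/* One-dimensional inverse; the factor 0.5 is sqrt(2/N) for N = 8 */
static void idct1d(const double y[N], double x[N])
{
    int m, j;
    for (m = 0; m < N; m++) {
        double sum = 0.0;
        for (j = 0; j < N; j++)
            sum += c(j) * y[j] * cos_term(m, j);
        x[m] = 0.5 * sum;
    }
}

/* Formula 11 by separability: transform every row, then every column */
void idct2d_separable(double Y[N][N], double X[N][N])
{
    double tmp[N][N], col[N], res[N];
    int j, m, n;
    for (j = 0; j < N; j++)
        idct1d(Y[j], tmp[j]);
    for (n = 0; n < N; n++) {
        for (j = 0; j < N; j++) col[j] = tmp[j][n];
        idct1d(col, res);
        for (m = 0; m < N; m++) X[m][n] = res[m];
    }
}
```

The two one-dimensional passes each contribute a factor of 1/2, which together give the 2/N factor of Formula 11 for N = 8; a forward transform followed by this inverse should return the original block to within floating-point rounding.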
The DCT is therefore sometimes described in terms of matrix operations. The matrix description is convenient for mathematical simplification of the transform, but it must be emphasized that it only makes the notation easier. Note that the 2/N term governs the DC level; the constants c(j) and c(k) are called normalization factors.
B.9.6 IDCT transform algorithm
As described in more detail below, the algorithm used to compute the actual IDCT should be a "fast" algorithm. The algorithm used is optimized for an efficient hardware architecture and implementation. The main features of the algorithm are the use of the scaling ratio given in Figure A9510324604851 to remove one multiplication, and a transformation of the algorithm that makes the upper and lower halves more symmetrical. The result of this symmetry is that many of the most expensive arithmetic elements are efficiently reused.
In the diagram of the algorithm (Figure 136), the symmetry between the upper and lower halves is evident in the middle of the diagram. The adders and subtracters in the final column are also symmetrical, and can be merged into adder/subtracters at relatively little cost (four adder/subtracters are considerably smaller than the four adders plus four subtracters shown).
Note that all outputs of the one-dimensional transform are scaled up by this ratio. This means that the final two-dimensional result is scaled up by 2. This is easily corrected by shifting in the final rounding and saturation stage.
The algorithm shown was coded in double-precision floating-point C, and its results were compared with a reference IDCT (using direct matrix multiplication). A bit-accurate integer version of the algorithm (without timing information) was then developed in C. It can be used to check the performance and accuracy of the algorithm as implemented in silicon. The allowable errors of the transform are specified in the H.261 standard. This method was used to exercise the bit-accurate model and to measure the delivered accuracy.
Figure 137 shows the overall IDCT architecture, illustrating the commonality between the upper and lower sections and showing the points at which intermediate results need to be stored. The circuit is time multiplexed, so that the upper and lower sections are computed separately.
B.9.7 IDCT transform architecture
As described earlier, the IDCT algorithm was optimized for an efficient architecture. The key features of the resulting architecture are as follows:
Expensive arithmetic operations are efficiently reused
A small number of multipliers. The multipliers are all constant-coefficient rather than general purpose (reducing multiplier size and eliminating separate coefficient storage)
A small number of latches, no more than are needed to pipeline the architecture
Operations are arranged so that each pipeline stage requires only a single resolving operation
Results can be arranged to emerge in natural order
No complex crossbar switching or extensive multiplexing (both of which are expensive in the final implementation)
Resolved results are exploited to remove two carry-save operations (one addition, one subtraction)
The architecture allows each stage four clock cycles, removing the need for very fast (large) arithmetic operations
The architecture will support operation faster than the current 30 MHz pixel-clock rate. This simply requires changing the resolving operations from small, slow ripple-carry forms to larger, faster carry-lookahead forms. The resolving operations take the largest part of the time needed in each stage, so speeding up only these operations has a large effect on overall speed, while the overall size of the transform increases only slightly. Increasing the pipeline depth can increase the speed further.
Control of the transform data flow is very simple and efficient.
The diagram of the one-dimensional transform micro-architecture (Figure 141) shows how the algorithm is mapped onto a small set of hardware resources and then pipelined so that the necessary performance constraints are met. The architecture is controlled by a "control shift register" matched to the data-flow pipeline. This control is simple to design and efficient in silicon layout.
The control signals named in Figure 141 (latch, sel_byp, etc.) are the various enable signals that control the latches, and hence the signal flow. The clock signals to the latches are not shown.
Several implementation details are significant in allowing the transform architecture to meet the required accuracy standard while keeping the transform size to a minimum. The techniques used generally fall into two main categories.
Maximum dynamic range is retained within a fixed word width at each intermediate state by individual control of the fixed-point position.
The statistical definition of the accuracy requirement is exploited to achieve accuracy through selective arithmetic adjustments (rather than by simply increasing the word width of the whole transform).
The straightforward way to design a transform is to perform simple fixed-point arithmetic with a fixed word width large enough to achieve the accuracy. Unfortunately, this approach leads to an excessive word width and therefore a larger transform. The approach adopted in this invention allows the fixed-point position to vary throughout the transform, in such a way that the available dynamic range is used to the full for each particular intermediate value, giving the greatest possible accuracy.
Because the permitted results are defined statistically, the truncation of any intermediate value can be selectively adjusted to improve overall accuracy. The adjustments chosen are simple manipulations of LSB calculations, which cost little or nothing. The alternative to this technique is to increase the word width, at significant cost. The adjustments effectively "weight" the final results in a given direction where previous results were found to be biased in the opposite direction. Adjusting the fractional part of the results in this way effectively shifts their overall mean.
B.9.8 IDCT block diagram description
The block diagram of the IDCT, Figure 138, shows all the blocks relevant to processing the token stream. It does not show the details of clocking, test and microprocessor access, or the event mechanism. The snooper blocks used to provide test access are also not shown.
B.9.8.1 Data error checker
The first block is the data error checker and corrector, called "decheck". It takes in and produces a 12-bit-wide token stream, parses this stream and checks the data tokens. All other tokens are ignored and passed straight through. The check looks for data tokens whose number of extensions is not equal to 64. The possible errors are termed "deficient" (<64 extensions), signalled as idct_too_few_event, and "supernumerary" (>64 extensions), signalled as idct_too_many_event. These errors are signalled through the standard event mechanism, but the block also attempts simple error recovery by manipulating the token stream. On a deficient error, the data token is padded with "0"-valued extensions to make up the correct 64 extensions (input is not accepted while the insertion is carried out). On a supernumerary error, the extension bit of the 64th extension is forced to "0" and all the extra extensions are removed from the token stream.
B.9.8.2 Inverse zig-zag
The next block in Figure 138 is the inverse zig-zag RAM 441, "izz", which also takes in and produces a 12-bit-wide token stream. As in all the other blocks, the stream is parsed, but only data tokens are recognized. All other tokens pass through unchanged. Data tokens also pass through, but the order of their extensions is changed. This block relies on the data tokens being correct (that is, having exactly 64 extensions). If they are not, the operation is unspecified. The reordering follows the standard inverse zig-zag pattern and, by default, is arranged to give horizontally scanned data at the IDCT output. The ordering can also be changed to give vertically scanned output. In addition to the standard IZZ ordering, this block performs an extra reordering of each row of 8 words. This is done to meet the specific requirements of the IDCT one-dimensional transform block, and results in each row being output in the order (1, 3, 5, 7, 0, 2, 4, 6) rather than (0, 1, 2, 3, 4, 5, 6, 7).
B.9.8.3 Input formatter
The next block in Figure 138 is the input formatter 442, "ip_fmt", which formats the input data for the first dimension of the IDCT transform. The input to this block is a 12-bit-wide token stream and the output is a 22-bit-wide token stream. Data tokens are shifted left so that the integer part moves to the correct significance within the standard 22-bit-wide word of the IDCT transform, and the fractional part is set to zero. This means that there are 10 fraction bits at this point. All other tokens are not shifted, and their extra unused bits are simply set to zero.
B.9.8.4 One-dimensional transform: first dimension
The next block shown in Figure 138 is the first one-dimensional IDCT transform block 443, "oned". This block takes in and produces 22-bit-wide token streams. As usual, the stream is parsed and data tokens are recognized. Other tokens pass through unchanged. Data tokens pass through a pipelined datapath that performs one dimension of the 8 × 8 inverse discrete cosine transform. At the output of the first dimension there are 7 fraction bits in the data word. All other tokens run through a datapath that consists only of a shift register. This path exists simply to match the latency of the data transform. The tokens are recombined into a single token stream before output.
B.9.8.5 Transpose RAM
The transpose RAM 444, "tram", is similar in many ways to the inverse zig-zag RAM 441 in how it handles the token stream. Apart from the width of the tokens handled (22 bits) and the reordering sequence performed, the two work in the same way. In fact, they share most of their control logic. Each row is again additionally reordered, as required by the next dimension of the IDCT, on top of the basic exchange of rows and columns.
B.9.8.6 One-dimensional transform: second dimension
The next block shown is another instance of the one-dimensional IDCT transform. It is identical in every respect to the first dimension. The output of this dimension has 4 fraction bits.
B.9.8.7 Round and saturate
The round and saturate block 446 in Figure 138, "ras", takes in a 22-bit-wide token stream containing data extensions in a 22-bit fixed-point format. It outputs a 9-bit-wide token stream in which the data extensions have been rounded (toward positive infinity) to integers and saturated to a 9-bit two's complement representation. All other tokens are passed straight through.
B.9.9 Hardware description of the blocks
All the blocks that process the token stream share a standard notional structure, shown in Figure 139. This structure separates the two-wire interface latches from the section that manipulates the token stream. Variations on the structure can include additional internal blocks (for example, a RAM core). In some of the blocks the structure is not obvious in the schematics (although it really is still there), because of the need to group all the "datapath" logic together and to separate it from all the standard cell logic. In very simple blocks, such as "ras", the latched out_accept can be taken straight into the input two-wire latch without any logical manipulation.
B.9.9.2 "Decheck": data error check and recovery
As stated in the block diagram overview, the first block 440 in the token stream performs data checking and correction. Detected errors are handled with the standard event mechanism, which means that the events can be masked. Depending on the state of the event mask bits, the block either continues with its recovery procedure when an error is detected or is stopped. The IDCT should never see incorrect data tokens, so the recovery this block attempts is only a fairly simple attempt to contain what may be a serious problem.
This block has a pipeline depth of two stages and is implemented entirely in zcells. The input two-wire interface latches are of the "front" type, meaning that all inputs arrive on transistor gates, which allows safe operation when this block (at the front of the IDCT) is on a separate power supply regime from the block preceding it. The block works by parsing the token stream and passing non-data tokens straight through. When a data token is found, a count is started of the number of extensions found after the token header. If the extension bit is found to be "0" when the count is not equal to 63, an error signal is generated (and sent to the event logic). Depending on the state of the mask bit for that event, "decheck" is either stopped (that is, it no longer accepts input or produces output) or begins error recovery. The recovery mechanism for a "deficient" error uses the counter to control the insertion of the correct number of extensions into the token stream (the value inserted is always "0"). Input is, of course, not accepted while the insertion is in progress. When the extension bit of the 64th extension is found not to be "0", a "supernumerary" error is generated. The data token is completed by forcing the extension bit of the 64th extension to "0". All subsequent words with the extension bit set are then deleted from the token stream by continuing to accept data while invalidating the output.
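The recovery behavior described above can be modelled in a few lines. This is only a behavioral sketch: the representation of a token as a list of (extension_bit, value) words following the header is an assumption made for illustration, and the pipelining, the two-wire handshake and the option of stopping the block through the event mask are not modelled.

```python
BLOCK_SIZE = 64  # extensions expected in a correct data token


def decheck(words):
    """Check and repair the extension words of one data token.

    `words` is a hypothetical model: a list of (extension_bit, value)
    pairs that follow the token header. Returns the repaired list and
    the name of the event raised (or None).
    """
    # The token ends at the first word whose extension bit is 0.
    values = []
    for ext, value in words:
        values.append(value)
        if ext == 0:
            break

    event = None
    if len(values) < BLOCK_SIZE:
        event = "idct_too_few_event"           # "deficient": pad with zeros
        values += [0] * (BLOCK_SIZE - len(values))
    elif len(values) > BLOCK_SIZE:
        event = "idct_too_many_event"          # "supernumerary": drop the surplus
        values = values[:BLOCK_SIZE]

    # Extension bit is 1 on every word except the 64th, which ends the token.
    repaired = [(1 if i < BLOCK_SIZE - 1 else 0, v) for i, v in enumerate(values)]
    return repaired, event


short_token = [(1, 5), (0, 6)]
fixed, event = decheck(short_token)
print(len(fixed), event)
```

Either way the token that leaves the model has exactly 64 extensions, which is the property the following blocks rely on.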
Note: the two error signals are not persistent (unless the block has been stopped). An error signal remains active only from the moment the error is detected until recovery is complete. This lasts for a minimum of one complete cycle, and can persist forever in the case of an infinitely supernumerary data token.
B.9.9.3 "Izz" and "tram": reordering RAMs
The "izz" 441 (inverse zig-zag RAM) and the "tram" 444 (transpose RAM) are considered together here. Both implement variations on the same function, and they have more similarities than differences. Both blocks take in a token stream, reorder the extensions of the data tokens, and pass all other tokens through unchanged. The width of the extensions handled and the reordering sequences differ, but most of the control logic for each RAM is identical and is in fact organized as a "common control" block, which is instanced in the schematic of each RAM. The difference in width has no effect on this control section. Each RAM therefore needs only its own "address sequence generator", in addition to the RAM core and two-wire interface blocks of the correct width.
The overall behavior of each RAM is essentially that of a FIFO (first in, first out). This is strictly true at the token level, but the output order of the extension words of a data token is modified in a specific way. The FIFO is 128 stages deep. This depth is needed to meet the system requirement of sustaining 30 MHz throughput, because the output of the FIFO is held up once the start of a data token output is detected. The hold-up occurs because the reordering sequences used require a complete block of 64 extensions to be gathered in the FIFO before the reordered output can begin. More precisely, the minimum number required differs between the inverse zig-zag and transpose sequences, and is slightly less than 64 in both cases. However, controlling a FIFO whose length is not a power of 2 is complicated, so the small saving in RAM core would be outweighed by the extra complexity of the control logic required.
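One reason a power-of-two depth keeps the control simple can be sketched as follows. The pointer scheme shown (wrap by bit masking, with one extra pointer bit to tell full from empty) is a common FIFO technique and is an assumption here, not a description of the control logic actually used; the class name is hypothetical.

```python
DEPTH = 128                 # FIFO depth quoted in the text
ADDR_MASK = DEPTH - 1       # wrap by masking works only for a power-of-two depth
PTR_MASK = 2 * DEPTH - 1    # one extra pointer bit tells "full" from "empty"


class Fifo128:
    """Ring buffer whose pointers wrap by bit masking (illustrative only)."""

    def __init__(self):
        self.ram = [None] * DEPTH
        self.wr = 0
        self.rd = 0

    def count(self):
        # Pointer difference modulo 2 * DEPTH gives the occupancy directly.
        return (self.wr - self.rd) & PTR_MASK

    def push(self, word):
        if self.count() == DEPTH:
            raise OverflowError("FIFO full")
        self.ram[self.wr & ADDR_MASK] = word
        self.wr = (self.wr + 1) & PTR_MASK

    def pop(self):
        if self.count() == 0:
            raise IndexError("FIFO empty")
        word = self.ram[self.rd & ADDR_MASK]
        self.rd = (self.rd + 1) & PTR_MASK
        return word
```

With a depth that is not a power of two, each pointer update would need a compare-and-reset instead of a mask, and the occupancy calculation would need an explicit modulo.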
The RAM core is designed to allow one read and one write (to the same address or to different addresses) in a single 30 MHz cycle. This means that the RAM is effectively operating with an internal 60 MHz cycle time.
The reordering operation is performed by generating the read addresses in a specific sequence within the range 0 → 63 ("sequence address generation") rather than in natural order. The sequences required are specified either by the standard zig-zag sequence (for horizontal or vertical scanning) or by the sequence needed for normal matrix transposition. These standard sequences are then further reordered to suit the one-dimensional IDCT transform block that follows. The extra reordering outputs each row in odd/even format, that is, (1, 3, 5, 7, 0, 2, 4, 6) rather than (0, 1, 2, 3, 4, 5, 6, 7).
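A sketch of the inverse zig-zag read-address sequence, including the odd/even row order, is given below. It assumes that the 64 extensions are written to addresses 0 to 63 in arrival (zig-zag) order; the function names are hypothetical, and the interpretation of vertical scanning as a swap of the row and column roles is an assumption.

```python
ROW_ORDER = (1, 3, 5, 7, 0, 2, 4, 6)  # odd/even order required by the transform


def zigzag_order(n=8):
    """Raster index (row * n + col) visited at each zig-zag position."""
    order = []
    for s in range(2 * n - 1):                    # anti-diagonals: row + col == s
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)                 # even diagonals run upward
        order.extend(r * n + (s - r) for r in rows)
    return order


def izz_read_addresses(vscan=False):
    """Read addresses that undo the zig-zag and apply the odd/even row order."""
    order = zigzag_order()
    position = [0] * 64                           # raster index -> write address
    for write_addr, raster in enumerate(order):
        position[raster] = write_addr
    addrs = []
    for outer in range(8):
        for inner in ROW_ORDER:
            # Assumed meaning of vertical scan: swap the row and column roles.
            row, col = (inner, outer) if vscan else (outer, inner)
            addrs.append(position[row * 8 + col])
    return addrs


print(izz_read_addresses()[:8])
```

The `position` table plays the part of a small lookup ROM holding 64 six-bit addresses, so a different scan pattern only needs different table contents.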
Generating the transpose address sequence is algorithmically very simple. A straight transpose sequence requires only that the row and column addresses be generated separately, and both are implemented with counters. The row reordering requirement simply means that the row addresses are generated by a simple dedicated state machine rather than by a natural counter.
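The counter-plus-state-machine arrangement can be sketched like this. The sketch assumes that the block was written into the RAM in natural raster order (address = row × 8 + column); the actual write order is not stated here, and the names are hypothetical.

```python
# Next-state table for the row address: steps through 1, 3, 5, 7, 0, 2, 4, 6
# and then returns to 1, replacing a natural 0..7 counter.
NEXT_ROW = {1: 3, 3: 5, 5: 7, 7: 0, 0: 2, 2: 4, 4: 6, 6: 1}


def tram_read_addresses():
    """Read addresses for a transposed read-out with odd/even ordering."""
    addrs = []
    row = 1                               # state machine start state
    for col in range(8):                  # natural column counter
        for _ in range(8):
            addrs.append(row * 8 + col)   # stored column becomes output row
            row = NEXT_ROW[row]
    return addrs


print(tram_read_addresses()[:8])
```

Because the state machine has a cycle of length 8, it is back in its start state at the beginning of every output row without any extra reset logic.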
Generating the inverse zig-zag sequence is algorithmically less simple. For this reason, a small ROM is used to hold all 64 six-bit address values. The ROM is addressed by a row counter and a column counter. These counters can be swapped in order to change between horizontal and vertical scan modes. A ROM-based generator is very quick to design. It has the further advantage that a forward zig-zag (by reprogramming the ROM) or other possible sequences can be added in future at very little cost.
B.9.9.4 "Oned": one-dimensional IDCT transform
This block has a pipeline 20 stages deep, and the pipeline is rigid when stalled. This rigidity greatly simplifies the design and should not unduly affect the overall dynamic performance, because the pipeline is not very deep and both one-dimensional transform blocks come after a RAM, which provides a certain amount of buffering.
The block follows the standard structure, but internally has separate paths for data token extensions (which are to be processed) and for all other items (which should pass through unchanged). Note that the schematic is drawn in a particular way: first, because all the datapath logic must be grouped together, and second, to allow automatic generation of compiled code (this explains the control logic at the top level).
Tokens are parsed as usual, and the data extensions and the other values are then routed through two different parallel paths before being recombined by a multiplexer, which sits in front of the output two-wire interface latch block. The parallel paths are needed because values cannot pass through the transform datapath unaltered. The rest of the token stream is handled by a simple shift register that matches the latency of the transform datapath.
The control section of "oned" needs to parse the token stream and control the splitting and recombining of the tokens. Its other major part controls the transform datapath. The main mechanism for controlling this datapath is a control shift register. It matches the datapath pipeline and is tapped to provide the necessary control signals for each stage of that pipeline.
The "oned" block has the requirement that it can only start operating on complete rows of data extensions, that is, on groups of 8. It cannot handle invalid data ("gaps") in the middle of a row, although in practice the operation of "izz" and "tram" guarantees that complete data blocks are output as an uninterrupted sequence of 64 valid extension values.
B.9.9.4.1 Transform datapath
The microarchitecture of the transform datapath, "t_dp", is shown in Figure 141. Note that some details (for example clocks and shifts) are not shown. The figure does, however, illustrate how the datapath operates on four values simultaneously at any stage of the pipeline. The basic substructure of the datapath, namely its three main sections (pre-common, common and post-common), can also be seen, as can the arithmetic and latch resources required. The named control signals are the enables for the pipeline latches (and the add/subtract selectors). They are sequenced by decoding the state of the control shift register. Note that each pipeline stage is actually four clock cycles long.
The transform datapath contains a number of latch stages. They are used to gather the input, to store intermediate results in the pipeline and to serialize the output. Some of the latches are of the muxing type, that is, they can be loaded conditionally from more than one source. All the latches are of the enabled type, that is, they have separate clock and enable inputs. This makes it easy to generate enable signals with the correct timing, without having to consider the skew problems that would arise if a generated-clock scheme were adopted.
The main arithmetic elements required are as follows:
A number of fixed-coefficient multipliers (carry-save output)
Carry-save adders
Carry-save subtractors
Resolving adders
Resolving adder/subtractors
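The difference between the carry-save and resolving elements in the list above can be pictured with a small bit-level model. This is a generic illustration of carry-save arithmetic under the assumption of a 22-bit two's complement word; the function names are hypothetical and the code does not describe the actual datapath cells.

```python
WIDTH = 22
MASK = (1 << WIDTH) - 1


def carry_save_add(a, b, c):
    """Reduce three words to a (sum, carry) pair without propagating carries."""
    s = (a ^ b ^ c) & MASK                                # per-bit sum
    carry = (((a & b) | (a & c) | (b & c)) << 1) & MASK   # per-bit carry, shifted up
    return s, carry


def resolve(s, carry):
    """Resolving add: the one carry-propagating addition, modulo 2**WIDTH."""
    return (s + carry) & MASK


def to_signed(word):
    """Interpret a WIDTH-bit word as a two's complement value."""
    return word - (1 << WIDTH) if word & (1 << (WIDTH - 1)) else word


s, c = carry_save_add(-5 & MASK, 3, -100 & MASK)
print(to_signed(resolve(s, c)))
```

Each carry-save step costs only one gate level per bit regardless of word width, which is why it pays to postpone the slow resolving addition until a value has to be stored.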
All arithmetic is performed in two's complement representation. Values can be held either in normal (resolved) form or in carry-save form (that is, as two numbers whose sum represents the true value). All numbers are resolved before being stored, and only one resolving operation is performed per pipeline stage, because this is the most time-consuming operation. The resolving operations here all use simple ripple carry. This means that the resolvers are quite small but slow. Because resolution dominates the total time in each stage, there is clearly an opportunity to speed up the whole transform by using fast resolving arithmetic units.
B.9.9.5 "Ras": round and saturate
In this invention, the task of the "ras" block is to take the 22-bit fixed-point numbers output by the second-dimension "oned" block and turn them into the correctly rounded and saturated 9-bit signed integer results required. This block also performs the necessary division by 4 inherent in the scheme (the 2/N term) and a further division by 2 to compensate for the √2 pre-scaling performed in each of the two dimensions. This division by 8 means that the fixed point is treated as lying 3 bits further left than expected; that is, the result is regarded as having a 15-bit integer part and 7 fraction bits (rather than 4 fraction bits). The rounding mode implemented is "round to positive infinity", that is, 1 is added for fractions of exactly 0.5. This mode was chosen mainly because it is the simplest rounding mode to implement. After rounding (a conditional increment of the integer part) is complete, the result is inspected to see whether the 9-bit signed result needs to be saturated to the maximum or minimum value of its range. This is done by examining the carry out of the increment together with the upper bits of the original integer value.
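The arithmetic of this step can be sketched as follows, assuming the 22-bit word is interpreted as 15 integer bits and 7 fraction bits as described above. The function name is hypothetical, and the sketch uses a plain clamp for saturation, whereas the hardware is described as inspecting the increment carry out and the upper integer bits.

```python
FRACTION_BITS = 7          # fraction bits after the implied division by 8
OUT_MIN, OUT_MAX = -256, 255   # range of a 9-bit two's complement integer


def round_and_saturate(word22):
    """Round a 22-bit fixed-point word to an integer and clamp it to 9 bits."""
    # Interpret the 22-bit pattern as a signed two's complement value.
    x = word22 - (1 << 22) if word22 & (1 << 21) else word22
    # Add one half, then floor: fractions of exactly 0.5 round toward +infinity.
    rounded = (x + (1 << (FRACTION_BITS - 1))) >> FRACTION_BITS
    return max(OUT_MIN, min(OUT_MAX, rounded))


print(round_and_saturate(192))               # 1.5 in this format
print(round_and_saturate(-192 & 0x3FFFFF))   # -1.5 in this format
```

In this model 1.5 rounds to 2 and -1.5 rounds to -1, which is the asymmetry implied by rounding halves toward positive infinity.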
As usual, the token stream is parsed, and the round and saturate operation is applied only to the extension values of data tokens. The block has a pipeline depth of two stages and is implemented entirely in zcells.
B.9.9.6 "Idctsels": IDCT register select decoder
This block is a simple decoder that decodes the 4 MPI address lines and the "sel_test" input into select lines for test access to the individual blocks (snoopers and RAMs). The block contains only combinational zcell logic. The decoded selects are shown in Table B.9.1.
Table B.9.1 IDCT test address space
Address (hex)   Bits    Register name
0x0             7..1    not used
                0       TRAM keyhole address
0x1             7..0
0x2             7..0    TRAM keyhole data
0x3             7..0    TRAM keyhole data (a)
0x4             7..0    IZZ keyhole address
0x5             7..0    IZZ keyhole data
0x6             7..3    not used
                2       ipfsnoop test select
                1       ipfsnoop valid
                0       ipfsnoop accept
0x7             7..5    not used
                5..0    ipfsnoop bits [21:16]
0x8             7..0    ipfsnoop bits [15:8]
0x9             7..0    ipfsnoop bits [7:0]
0xA             7..3    not used
                2       d2snoop test select
                1       d2snoop valid
                0       d2snoop accept
0xB             7..6    not used
                5..0    d2snoop bits [21:16]
0xC             7..0    d2snoop bits [15:8]
0xD             7..0    d2snoop bits [7:0]
0xE             7       outsnoop test select
                6       outsnoop valid
                5       outsnoop accept
                4..2    not used
0xE             1..0    outsnoop data [9:8]
0xF             7..0    outsnoop data [7:0]
a. Repeated address.
B.9.9.7 "Idctregs": IDCT control register and events
This block of the present invention contains instances of the standard event logic block to handle the data deficient and supernumerary errors, together with a single memory-mapped bit, "vscan", which is used to change the "izz" reordering so that the IDCT output is vertically scanned. This bit is reset to "0", that is, the default mode is horizontally scanned output. The two possible events are ORed together to form an idctevent signal, which can be used as an interrupt. The addresses and bit positions of the register and events are given in section B.9.10.
B.9.9.8 Clock generators
Two clock generators of the "standard" type ("clkgen") are used in the IDCT, so that there can be two separate scan paths. The clock generators are called "idctcga" and "idctcgb". The only functional difference is that "idctcgb" does not need to generate the "not-rstl" signal. In both clock generators, the amount of buffering on each clock and reset output is tailored individually to the actual load driven by that clock or reset. The loads matched were measured from the gate and track capacitances of the final layout.
When the place and route of the IDCT top-level block was carried out, the interactive global routing feature was used to increase the width of the first sections of the clock distribution trees for the more heavily loaded clocks (Pho-b and Phi-b), because these tracks carry larger currents.
B.9.9.9 JTAG control blocks
Because the IDCT has two separate scan chains and two clock generators, there are two instances of the standard JTAG control block "jspctle". These blocks form the interface between the test port and the two scan paths.
B.9.10 Events and control registers
The IDCT can generate two events and has a single control bit. The two events are idct_too_few_event and idct_too_many_event. They are generated by the "decheck" block at the front of the IDCT if incorrect data tokens are detected. The single control bit is "vscan", which is set to 1 when vertically scanned IDCT output is required. This bit therefore controls the "izz" block. All the event logic and the memory-mapped control bit are located in the "idctregs" block.
From the point of view of the IDCT, these registers are located at the following addresses. The tristate i/o lines n_derrd and n_serrd are used to read and write these locations as appropriate.
Table B.9.2 IDCT control register address space
Address (hex)   Bits    Register name
0x0             7..1    not used
                0       vscan
Table B.9.3 IDCT event address space
Address (hex)   Bit name    Register name
0x0             n_derrd     idct_too_few_event
                n_serrd     idct_too_many_event
0x1             n_derrd     idct_too_few_mask
                n_serrd     idct_too_many_mask
B.9.11 Implementation issues
B.9.11.1 Logic design approach
In accordance with the present invention, the design of all the IDCT blocks attempted to follow a unified, simple logic design strategy, so that a "safe" design could be produced quickly and straightforwardly. For the majority of the control logic, a simple scheme using only master-slave devices was adopted. Asynchronous set/reset inputs were connected only to the correct system resets. Although it is often possible to devise clever non-standard circuit configurations that perform the same function more efficiently, this scheme has the following advantages:
Conceptually simple
Easy to design
Speed of operation is fairly obvious (compare latch → logic → latch → logic style design) and amenable to automatic analysis
Glitches are not a problem (compare SR latches)
Using only the system reset for initialization allows the scan paths to work correctly
Allows automatic generation of compiled code
Transparent D-type latches were used in a number of places, which are listed below.
B.9.11.1.1 Two-wire interface latches
The standard block structure uses latches for the input and output two-wire interfaces. There is no logic between an output two-wire latch and the following input two-wire latch.
B.9.11.1.2 ROM interface
Because of the timing requirements of the ROM circuit, latches are used at the output of the ROM in the IZZ sequence generator.
B.9.11.1.3 Transform datapath and control shift register
Every pipeline storage stage could be implemented as a full master-slave device, but given the amount of storage required, using latches gives a significant saving. This scheme does, however, require several factors to be taken into account.
The control shift register must now produce control signals of both phases for use as enables (that is, latches must be used in this shift register)
Timing analysis is more complicated when latches are used
"t-postc" will no longer produce compiled code automatically, because one latch outputs to another latch of the same phase (this is not a problem for the circuit, because of the timing of the enables)
Nevertheless, the area saved by using latches makes it worthwhile to accept these factors in this invention.
B.9.11.1.4 MPI
Because of the nature of this interface, latches (and resynchronizers) are required in the event and register block "idctregs" and in the keyhole logic for the RAM cores.
B.9.11.1.5 JTAG test control
These standard blocks make use of latches.
B.9.11.2 Circuit design issues
Apart from the work done in designing the library cells used in the IDCT design (standard cells, datapath library, RAMs, ROMs, etc.), no transistor-level circuit design is required in the IDCT. Circuit simulations (using Hspice) were performed on some of the known critical paths in the transform datapath. Hspice was also used to verify the results of the Critical Path Analysis (CPA) tool for paths close to the maximum allowed length.
Note that in normal operation the IDCT is fully static (that is, the system clocks can be stopped indefinitely), but the scannable latches contain dynamic nodes, which will decay when the test clocks are stopped (or run very slowly). Because some nodes are not restored and exhibit a Vt drop (for example, mux outputs), the IDCT will not be "micro-power" when static.
B.9.11.3 Layout approach
The overall approach to the layout implementation of this invention was to use BPR (with some manual intervention) to lay out a complete IDCT consisting of many zcells and a small number of macro blocks. These macro blocks were either hand-edited layout (for example, RAMs, ROM, clock generators, datapaths) or, in the case of the "oned" block, were themselves built using BPR from further zcells and datapaths.
The datapaths were constructed from kdplib cells. In addition, locally defined layout variations of kdplib cells were created and used where they were judged to give a worthwhile size benefit. The datapath used in each "oned" block, "oned_d", is by far the largest single element in the design, and considerable effort went into optimizing the size (height) of this datapath.
The organization of the transform datapath, "t_dp", is rather critical, because the precise ordering of the elements within the datapath affects how the interconnect is handled. It is important to minimize the number of "overs" (vertical wires that do not connect to a sub-block) at the most congested point, because there is a maximum allowed value (ideally 8; 10 is also possible but very inconvenient). The datapath is logically divided into three main subsections, and the datapath layout was done in the same way. Within each subsection there are actually four parallel data flows (which are combined at various points), so there are many ways of organizing the data flows (and hence the positions of all the elements) within each subsection. The ordering of the blocks within each subsection, and the allocation of logical buses to physical bus pitches, were worked out carefully before layout began, so that a layout that could be connected correctly was achievable.
B.9.12 Verification
The IDCT was verified at many levels, from top-level verification of the algorithm down to the final layout checks.
The initial work on the transform architecture was done in C. Both full-precision and bit-accurate integer models were developed. Various tests were run on the bit-accurate model to demonstrate conformance with the H.261 accuracy specification and to measure the dynamic ranges of the various calculations within the transform architecture.
In many cases the design proceeded by writing an M behavioral description of a sub-block (for example, for the datapath and RAM control). Such M behavioral descriptions were simulated in Lsim before moving on to the schematic description of the block. In some cases (for example, RAMs and clock generators) the behavioral descriptions were still used for the top-level simulation.
The strategy for logic simulation was to simulate the schematics of everything that could be adequately simulated at that level. The low-level library cells (zcells and kdplib) were mainly simulated using their behavioral descriptions, because this gives much smaller and faster simulations. In addition, the behavioral library cells provide timing check features, which can highlight some circuit configuration problems. As a confidence check, some simulations were performed using the transistor descriptions of the library cells. All logic simulations were run in zero-delay mode and were therefore intended to verify functional behavior. The real timing behavior was verified using other techniques.
Lsim switch-level simulations (using the RC_ timing mode) were done as a partial verification of the timing behavior, but they also provide checks for some other potential transistor-level problems (for example, glitch-sensitive circuits).
The main technique for checking timing problems was the CPA tool, using the "path" option of "datechk". This was used to identify the longest signal paths (some of which were already known). Hspice was used to verify the CPA analysis in some critical cases.
Most of the Lsim simulations were performed with the standard source → block → sink methodology, because most of the IDCT behavior is exercised by the flow of tokens through the device. Additional simulations are also necessary. These test the features accessed through the MPI (configuration, event and test logic) and the test features accessed through JTAG/scan.
Compiled-code simulations of the entire IDCT can readily be performed by a person of ordinary skill in the art, again using the standard source → block → sink method and many of the same token streams used for the Lsim verification.
B.9.13 Testing and test support
This section deals with the mechanisms provided for testing and analyzes how each block can be tested.
The three mechanisms provided for test access are as follows:
Microprocessor access to the RAM cores
Microprocessor access to the snooper blocks
Scan path access to the control and datapath logic
There are two "snooper" blocks and one "super snooper" block in the IDCT. Figure 140 shows the positions of the snooper blocks and the other microprocessor test access.
Using these blocks together with the two RAM blocks, each of the main blocks can be isolated in order to test its behavior with respect to the token stream. Using microprocessor access, the token input to any block can be controlled and the token port output of that block can then be observed in isolation. In addition, there are two separate scan paths, which run through (almost) all the flip-flops and latches in the control section of each block and, in the case of the "oned" transform datapath pipeline, also through some of the datapath latches. The two scan paths are denoted "a" and "b". The former runs from the "decheck" block to the "ip_fmt" block, and the latter from the first "oned" block to the "ras" block.
The snoopers can be accessed in the normal way by addressing the appropriate memory-mapped locations. The same applies to the RAMs (with the "ramtest" input set appropriately). The scan paths are normally accessed through the JTAG port.
Each block and the test issues it raises are now discussed.

B.9.13.1 "Decheck"
This block has the standard structure (see Figure 139), in which a processing block is surrounded by two latches for the input and output two-wire interfaces. As usual, no scan is provided for the two-wire latches, because they simply pass data when enabled and have no depth of logic to test. In this block the "control" section consists of a one-stage pipeline of zcells, which are on scan path "a". The control logic is fairly simple; the most complex path is probably the generation of the data extension count, where a 6-bit incrementer is used.

B.9.13.2 "IZZ"
This block is a variant of the standard structure, comprising a RAM block and a control section added to the two-wire interface latches. The control section is implemented with zcells, and the address sequence is generated with a small ROM. All zcells are on scan path "a", and the ROM address and data are accessed through zcell latches. There is also more logic here, for example in number generation, which can increment or decrement. In addition, a 7-bit full adder is used for read address generation. The RAM can be accessed from the MPI through the keyhole registers; see Table B.9.1.

B.9.13.3 "Ip_fmt"
This block also has the standard format. The control logic is implemented with some fairly simple zcell logic (all on scan path "a"). The latching and shifting/multiplexing of the data, however, is done in a datapath with no direct access, because the logic there is very simple.

B.9.13.4 "Oned"
This block also follows the standard structure. It is divided into random logic and datapath sections. The zcell logic is fairly simple, and all zcells are on scan path "a". The control signals for the transform pipeline datapath are derived from a long shift register made of zcell latches, which are on the scan path. In addition, because there is considerable logic depth (for example multipliers and adders) between some pipeline stages, some of the pipeline latches are placed on the scan path. Non-data tokens are passed along a shift register that is implemented as a datapath and has no test access to any stage.

B.9.13.5 "Tram"
This block is very similar to the "izz" block. In this case, however, no ROM is used to generate the address sequence; it is generated algorithmically. All zcell control state is on scan path "b".

B.9.13.6 "Ras"
This block follows the standard structure and is implemented entirely in zcells. The most complex logic function is the 8-bit incrementer used for rounding. All other logic is fairly simple. All state is on scan path "b".

B.9.13.7 Other top-level blocks
There are several other blocks at the top level of the IDCT. The snoopers are obviously part of the test access logic, as are the JTAG control blocks. There are also two clock generators, which have no special test access (although they support various test features). The "idcts-els" block is combinational zcell logic used to decode microprocessor addresses, and the "idctregs" block contains the microprocessor-accessible event and control bits associated with the IDCT.

B.10 Introduction

B.10.1 Temporal decoder overview
According to this invention, the internal structure of temporal decoder is shown in Figure 142.
All data flow between the blocks of the chip (and a large part of the data flow within blocks) is controlled by two-wire interfaces (see the Technical Reference and the detailed sections). Each arrow in Figure B.10.1 represents a two-wire interface. The incoming token stream passes through the input interface, which synchronizes the data from the external system clock to the internal clock derived from the phase-locked loop (ph0/ph1). The token stream is then split into two paths by the Top Fork: one goes to the address generator and the other to a 256-word FIFO. The FIFO buffers the data while data from previous I or P frames is fetched from the DRAM. Meanwhile, the data fetched from previous I or P frames is first processed in the prediction filters (for P and B frames) before being added, in the Prediction Adder, to the incoming error data from the Spatial Decoder. During MPEG decoding, the reordered I and P frame data must also be fetched so that frames are output in the correct order. The reordered data is inserted into the stream in the Read Rudder block.
The address generator produces separate addresses for forward and backward prediction, reorder reads and write-backs. The data to be written back is split off from the stream in the Write Rudder block. Finally, the data is resynchronized to the external clock in the output interface block.
All the major blocks in the temporal decoder are connected to the internal microprocessor interface (UPI) bus. This is derived from the external microprocessor interface (MPI) bus in the MPI block, which contains the address decoders for each of the blocks in the chip. The event logic is also associated with the MPI.
The remaining logic in the temporal decoder is mainly concerned with test. First, the IEEE 1149.1 (JTAG) interface 460 provides an interface not only to the internal scan paths but also to the JTAG boundary-scan features. Second, there are two-wire interface stages that allow intrusive access to the data flow through the MPI in test mode; these are included at strategic points in the pipeline structure.

B.11 Clocks, test and related issues

B.11.1 Clock regimes
Before considering the individual functional blocks of the chip, it is helpful to understand the clock regimes within the chip and how they relate to one another.
In normal operation, most blocks of the chip run synchronously to the signal pllsysclk from the phase-locked loop (PLL) block. The DRAM interface is an exception; its timing is governed by the need to be synchronous with the iftime sub-block, which generates the DRAM control signals (notwe, notoe, notcas, notras). The core of this block is clocked by the two-phase non-overlapping clocks clk0 and clk1. These are derived from two two-phase clocks 90° apart in phase, which the PLL supplies independently as cki0, cki1 and ckq0, ckq1.
Because the clk0/clk1 DRAM interface clocks are asynchronous to the clocks in the rest of the chip, metastable behavior is possible at the interfaces between the DRAM interface and the rest of the chip. Measures have been taken to eliminate this possibility (as far as is practicable). Synchronization takes place in two areas: at the output interfaces of the address generator (addrgen/predread/psgsync, addrgen/ip_wrt2/sync18 and addrgen/ip_rd2/sync18), and in the blocks that control the "swinging" of the swing-buffer RAMs in the DRAM interface (see the section on the DRAM interface). In each case, synchronization is performed by three metastable-hard flip-flops in series. Note that this means clk0/clk1 are used in the output stages of the address generator.
In addition to these fully asynchronous clock regimes, there are a number of separate clock generators, which generate two-phase non-overlapping clocks (ph0, ph1) from pllsysclk. The address generator, the prediction filters and the DRAM interface each have their own clock generator; the rest of the chip runs from a common clock generator. There are two reasons for this. First, it reduces the capacitive load on each clock generator, allowing smaller clock drivers and narrower clock tracks. Second, each scan path is controlled by one clock generator, so increasing the number of clock generators allows shorter scan paths.
Signals that cross clock regime boundaries must be resynchronized, because minor skews between the non-overlapping clocks derived from different clock generators could mean underlap at the interfaces. Circuitry built into each "snooper" block (see section B.11.4) ensures that this does not happen, and snooper blocks have been placed on the boundaries between all clock regimes. The one exception is at the front of the address generator, where resynchronization is performed in the token decode block.

B.11.2 Control of clocks
Each standard clock generator produces a number of different clocks, which allow operation in normal mode and in scan-test mode. Control of the clocks in scan-test mode is described in detail elsewhere. It should be noted, however, that some of the clocks produced by a clock generator (tph0, tph1, tckm, tcks) do not usually appear connected to any primitive symbols in the schematics. This is because the scan paths are generated automatically by a preprocessor, which connects these clocks correctly. From a functional point of view, the fact that the preprocessor has connected the clocks differently from the schematics can be ignored; the behavior is the same.
In normal operation, the master clocks can be derived in a number of different ways. Table B.11.1 shows how the various modes are selected according to the state of the pllselect and override pins.
Table B.11.1 Clock control modes

pllselect  override  Mode
0          0         pllsysclk is connected directly to the external sysclk, bypassing the PLL. The DRAM interface clocks (cki0, cki1, ckq0, ckq1) are controlled directly from the ti and tq pins.
0          1         Override mode. The ph0 and ph1 clocks are controlled directly from the tph0ish and tph1ish pins. The DRAM interface clocks (cki0, cki1, ckq0, ckq1) are controlled directly from the ti and tq pins.
1          0         Normal operation. pllsysclk is the clock generated by the PLL; the DRAM interface clocks are generated by the PLL.
1          1         External resistors connected to ti and tq are used in place of the internal resistors (debug).
B.11.3 Two-wire interfaces
The general operation of two-wire interfaces is described in detail in the Technical Reference. Within the temporal decoder, however, two-wire interfaces are used for all block-to-block communication. Most blocks contain a number of pipeline stages, each of which is itself a two-wire interface stage. An understanding of the internal implementation of the two-wire interface is therefore important for interpreting many of the schematics. The general structure of these internal pipeline stages is shown in Figure 143.
Figure 143 shows a latch-logic-latch representation, because this is the structure normally used. When many stages are put together, however, it is equally valid to think of a "stage" as latch-latch-logic (a model more familiar to many engineers). The latch-logic-latch structure allows all inter-block communication to be latch to latch, with no intervening logic in either the sending or the receiving block.
Referring again to Figure 143, a simple two-wire interface FIFO stage can be formed by removing the logic block. The data and valid signals are then connected directly between the latches, and the latched in_valid signal is connected directly to the NOR gate at the input of the in_accept latch, just as the out_valid signal is gated by the out_accept signal. Data and valid signals propagate when the corresponding accept signal is high. In this way in_valid is ORed with out_accept_reg, so that if in_valid is low, data is accepted even when out_accept_reg is low. Gaps (data with the valid bit low) are therefore removed from the pipeline whenever a stall (accept signal low) occurs.
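The gap-removal behavior of a chain of such FIFO stages can be illustrated with a small behavioral model. This is only a sketch under simplifying assumptions: the function name `step` and the use of `None` to stand for a gap are hypothetical, and the two-phase latching of the accept signal (out_accept_reg) is collapsed into a single combinational accept chain per cycle.

```python
from typing import List, Optional, Tuple

Item = Optional[str]  # None models a gap (valid bit low); hypothetical encoding


def step(stages: List[Item], in_item: Item,
         out_accept: bool) -> Tuple[List[Item], bool, Item]:
    """Advance a chain of two-wire FIFO stages by one cycle.

    stages[0] is nearest the input. Returns (new_stages, in_accept, out_item).
    Assumption: accept is modeled combinationally, not as a latched signal.
    """
    n = len(stages)
    # accept[i] is the accept signal seen by whatever feeds stage i.
    accept = [False] * (n + 1)
    accept[n] = out_accept
    for i in range(n - 1, -1, -1):
        # A stage accepts if it holds a gap or if its own data can move on.
        accept[i] = stages[i] is None or accept[i + 1]

    out_item = stages[-1] if out_accept else None
    new_stages = list(stages)
    for i in range(n):
        if accept[i]:
            new_stages[i] = stages[i - 1] if i > 0 else in_item
    return new_stages, accept[0], out_item
```

With the output stalled, a chain holding `["A", None, "B"]` still accepts a new item and closes up the gap; once every stage holds valid data, the stall propagates back to the input.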
With the logic block inserted, as shown in Figure 143, in_accept and out_valid may also depend on the data or on the state of the block. In the configuration shown, it is standard for any state held in the block to be kept in a master-slave arrangement, with the master enabled by ph1 and the slave enabled by ph0.

B.11.4 Snooper blocks
Snooper blocks provide access, through the MPI, to the data flow at various points in the chip. There are two types of snooper block. Ordinary snooper blocks can be accessed only in test mode, where the clocks can be controlled directly. "Super snoopers" can be accessed while the clocks are running; they contain circuitry that synchronizes the asynchronous data from the microprocessor bus to the internal chip clocks. Table B.11.2 lists the positions and types of all the snoopers in the temporal decoder.
Table B.11.2 Snoopers in the temporal decoder

Position                                  Type
addrgen/vec_pipe/snoopz31                 Snooper
addrgen/cnt_pipe/endsnp                   Snooper
addrtgen/cnt_pipe/endsnp                  Snooper
addrgen/predread/snoopz44                 Snooper
addrgen/ip-wrt2/superz10                  Super snooper
addrgen/ip_rd2/superz10                   Super snooper
dramx/dramif/itsnoops/snoopz15 (fsnp)     Snooper
dramx/dramif/ifsnoops/snoopz15 (bsnp)     Snooper
dramx/dramif/ifsnoops/superz9             Super snooper
wrudder/superz9                           Super snooper
pflts/fwdflt/dimbuff/snoopk13             Snooper
pflts/bwdflt.dimbuff/snoopk13             Snooper
pflts/snoopz9                             Snooper
Details of the use of both kinds of snooper are given in the test section. Details of the operation of the JTAG interface are given in the JTAG document.

B.12 Functional blocks

B.12.1 Top Fork
According to the present invention, the Top Fork serves two purposes. First, it forks the data stream into two separate streams: one to the address generator and the other to the FIFO. Second, it provides the means of starting and stopping the chip so that the chip can be configured.
The fork part of the block is very simple. The same data is presented to both the address generator and the FIFO, and it must be accepted by both before an accept signal is sent back to the previous stage. The valid signal of each branch of the fork therefore depends on the accept signal of the other branch. If the chip is in the stopped state, the valid signals to both branches are held low.
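The cross-coupling of valid and accept signals in the fork can be written out as three Boolean expressions. The sketch below is illustrative only: the names `top_fork`, `ForkSignals` and the `stopped` argument are hypothetical, and holding in_accept low in the stopped state is an assumption drawn from the start-up and stop behavior described in this section.

```python
from typing import NamedTuple


class ForkSignals(NamedTuple):
    in_accept: bool      # accept returned to the previous stage
    addrgen_valid: bool  # valid presented to the address generator
    fifo_valid: bool     # valid presented to the FIFO


def top_fork(in_valid: bool, addrgen_accept: bool, fifo_accept: bool,
             stopped: bool = False) -> ForkSignals:
    """Combinational sketch of the fork; all names are illustrative."""
    if stopped:
        # Stopped state: valid to both branches held low, nothing accepted.
        return ForkSignals(False, False, False)
    return ForkSignals(
        # Data is accepted upstream only when both branches accept it.
        in_accept=addrgen_accept and fifo_accept,
        # Each branch's valid depends on the other branch's accept.
        addrgen_valid=in_valid and fifo_accept,
        fifo_valid=in_valid and addrgen_accept,
    )
```

When only one branch is ready, the ready branch sees valid low and the branch that sees valid high is not accepting, so no transfer takes place on either side and the data is not consumed twice.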
The chip starts up with in_accept held low until the configuration bit is set high. This ensures that no data is accepted before the user has configured the chip. If the user needs to configure the chip at any other time, he must set the configuration bit and wait until the chip has finished the current stream. The stopping sequence is as follows:
1) If the configuration bit is set, no more data is accepted once the Top Fork has detected a FLUSH token.
2) When the FLUSH token reaches the Read Rudder, the chip has finished processing the stream. This causes the signal seq_done to go high.
3) When seq_done goes high, an event bit is set, which can be read by the microprocessor. The event signal can be masked by the event block.

B.12.2 Address generator
In the present invention, the address generator (addrgen) is responsible for counting the blocks within a frame and for generating the correct sequence of addresses for DRAM data transfers. The input to the address generator is the token stream from the token input port (via the Top Fork). Its output, to the DRAM interface, consists of addresses and other information, controlled by a request/acknowledge protocol.
The major parts of the address generator are:
Token decode
Block counting and DRAM block address generation
Conversion of motion vector data into address offsets
Request and address generator for prediction transfers
Reorder read address generator
Write address generator

B.12.2.1 Token decode (tokdec)
In the token decoder, tokens relating to the coding standard, to frame and block information and to motion vectors are decoded. The information extracted from the stream is stored in a set of registers, which can also be accessed through the upi. Detection of a DATA token header is signaled to the subsequent blocks to enable block counting and address generation. Nothing happens when running JPEG.
The tokens decoded are as follows:
  ·CODING_STANDARD
  ·DATA
  ·DEFINE_MAX_SAMPLING
  ·DEFINE_SAMPLING
  ·HORIZONTAL_MBS
  ·MVD_BACKWARDS
  ·MVD_FORWARDS
  ·PICTURE_START
  ·PICTURE_TYPE
  ·PREDICTION_MODE
This block also combines information from the individual request generators to control the toggling of the frame pointers and to stall the input stream. The stream is stalled when a new frame appears at the input (in the form of a PICTURE_START token) but the write-backs or reorder reads associated with the previous frame have not yet completed.

B.12.2.2 Macroblock counter (mblkcntr)
The macroblock counter of the present invention consists of four basic counters, which point to the horizontal and vertical position of the macroblock within the frame and to the horizontal and vertical position of the block within the macroblock. At start-up and at each PICTURE_START, all the counters are reset to zero. As DATA token headers arrive, the counters are incremented and reset according to the color component number in the token header and the frame structure. The frame structure is described by the sampling registers in the token decoder.
For a given color component, counting proceeds as follows. For each new DATA token of the same component, the horizontal block count is incremented until it reaches the width of the macroblock, at which point it is reset. This reset increments the vertical block count, until it reaches the height of the macroblock and is itself reset. When this happens, the next color component is expected. The process is thus repeated for each component in the macroblock; the horizontal and vertical macroblock sizes may differ for each component. If fewer blocks than expected are received for any component, counting proceeds to the next component without error.
When the color component of a DATA token is lower than the expected value, the horizontal macroblock count is incremented. (Note that this also happens when more blocks than expected appear for a given color component, because the counters will by then be expecting a higher component index.) When the count reaches the picture width in macroblocks, the horizontal count is reset. This reset increments the vertical macroblock count.
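The counting rules above can be sketched as a small state machine. This is an illustrative model only: the class name `MacroblockCounter`, the `sampling` dictionary (component index mapped to macroblock width and height in blocks) and the choice to return the position before incrementing are all assumptions, and the H.261 group-of-blocks level is not modeled.

```python
from typing import Dict, Tuple


class MacroblockCounter:
    """Illustrative model of the four counters; names are hypothetical."""

    def __init__(self, sampling: Dict[int, Tuple[int, int]], width_mbs: int):
        self.sampling = sampling    # component -> (blocks wide, blocks high)
        self.width_mbs = width_mbs  # picture width in macroblocks
        self.picture_start()

    def picture_start(self) -> None:
        # All counters are reset at start-up and at each PICTURE_START.
        self.mb_x = self.mb_y = 0
        self.blk_x = self.blk_y = 0
        self.expected = 0

    def data_token(self, comp: int) -> Tuple[int, int, int, int, int]:
        """Return (mb_x, mb_y, comp, blk_x, blk_y) for this DATA token."""
        if comp < self.expected:
            # Lower component index than expected: a new macroblock begins.
            self.mb_x += 1
            if self.mb_x == self.width_mbs:
                self.mb_x = 0
                self.mb_y += 1
        if comp != self.expected:
            # Too few (or too many) blocks: restart counting for this component.
            self.blk_x = self.blk_y = 0
            self.expected = comp
        position = (self.mb_x, self.mb_y, comp, self.blk_x, self.blk_y)
        width, height = self.sampling[comp]
        self.blk_x += 1
        if self.blk_x == width:
            self.blk_x = 0
            self.blk_y += 1
            if self.blk_y == height:
                self.blk_y = 0
                self.expected = comp + 1  # next component now expected
        return position
```

For example, with a hypothetical 4:2:0 layout `{0: (2, 2), 1: (1, 1), 2: (1, 1)}`, six DATA tokens with components 0, 0, 0, 0, 1, 2 fill one macroblock, and the next component-0 token advances the horizontal macroblock count.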
For the H.261 CIF format there is a further level of macroblock counting. In this case there is an extra level, called the group of blocks, between the macroblock and the picture. A group is 11 macroblocks wide and three macroblocks deep, and the picture is always two groups wide. The token decoder extracts the CIF bit from the PICTURE_TYPE token and passes it to the macroblock counter, instructing it to count groups of blocks. Too many or too few blocks in a component provoke the same reactions as described above.

B.12.2.3 Block calculation (blkcalc)
The block calculation converts the coordinates of the block within the macroblock and of the macroblock within the picture into the coordinates of the block within the picture, i.e. it flattens the hierarchy. This must, of course, take into account the sampling ratios of the different color components.

B.12.2.4 Base block address (bsblkadr)
The information from blkcalc is used, together with the color component offset, to calculate the block address in the linear DRAM address space. In essence, for a given color component, the linear block address is the number of blocks down multiplied by the picture width in blocks, plus the number of blocks along. This is added to the color component offset to form the base block address.

B.12.2.5 Vector offset (vec_pipe)
The motion vector information supplied by the token decoder takes the form of horizontal and vertical pixel offset coordinates; that is, there is an (x, y) pair for each of the forward and backward vectors. The (x, y) pair gives the displacement from the block being formed to the block from which it is predicted. Displacements are expressed in half-pixel units and may be positive or negative. They are first scaled according to the color component and sampling, and are then used to form a block coordinate and a new pixel offset.
In Figure 145, the shaded area represents the block being formed, and the dotted outline represents the block from which it is predicted. The large arrow represents the block offset, that is, the horizontal and vertical vector to the DRAM block that contains the origin of the prediction block, in this case (1,4). The small arrow represents the new pixel offset, that is, the position of the prediction block origin within that DRAM block. Because a DRAM block is 8 × 8 bytes, the pixel offset is seen to be (7,2).
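The split of a displacement into a block offset and a pixel offset can be sketched as follows. This is a minimal illustration, not the vec_pipe logic itself: the function name `split_displacement` is hypothetical, component and sampling scaling is omitted, and the floor-style treatment of negative displacements is an assumption. The inputs 30 and 68 half-pixels are chosen only because they reproduce the block offset (1,4) and pixel offset (7,2) quoted above.

```python
from typing import Tuple

BLOCK_SIZE = 8  # a DRAM block is 8 x 8 bytes


def split_displacement(half_pels: int) -> Tuple[int, int, int]:
    """Split one displacement coordinate, given in half-pixel units.

    Returns (block_offset, pixel_offset, half_pel_bit).
    Assumption: negative values round towards minus infinity, so the
    pixel offset is always in the range 0..7.
    """
    whole_pels = half_pels >> 1   # integer part in whole pixels
    half_pel_bit = half_pels & 1  # 1 if a half-pixel remains
    block_offset, pixel_offset = divmod(whole_pels, BLOCK_SIZE)
    return block_offset, pixel_offset, half_pel_bit


# Hypothetical inputs that reproduce the values read from Figure 145.
x = split_displacement(30)  # 15 whole pixels horizontally
y = split_displacement(68)  # 34 whole pixels vertically
```

Here `x` is `(1, 7, 0)` and `y` is `(4, 2, 0)`, i.e. block offset (1,4) and pixel offset (7,2).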
The multiplier array Vmarrla then converts the block vector offset into a linear vector offset. The pixel information is passed to the prediction request generator as an (x, y) coordinate (pix_info).

B.12.2.6 Prediction requests
The frame pointer, base block address and vector offset are added together to form the address of the block to be fetched from the DRAM (Inblkad3). If the pixel offset is zero, only one request is generated. If there is an offset in either the x or the y dimension, two requests are generated: the original block address and the one immediately to the right or immediately below. If there is an offset in both x and y, four requests are generated.
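The address arithmetic of sections B.12.2.4 and B.12.2.6 can be put together in a short sketch. The function names and the numeric values in the tests are hypothetical. Stepping to the right-hand neighbor by adding 1 and to the neighbor below by adding the picture width in blocks is an inference from the linear block address formula, not something stated explicitly in the text, and picture-edge handling is not modeled.

```python
from typing import List


def base_block_address(blocks_down: int, blocks_along: int,
                       width_in_blocks: int, component_offset: int) -> int:
    """Linear block address for one color component plus its offset."""
    return blocks_down * width_in_blocks + blocks_along + component_offset


def prediction_requests(frame_pointer: int, base_address: int,
                        vector_offset: int, pix_x: int, pix_y: int,
                        width_in_blocks: int) -> List[int]:
    """Block addresses requested for one prediction block (1, 2 or 4)."""
    first = frame_pointer + base_address + vector_offset
    addresses = [first]
    if pix_x:
        addresses.append(first + 1)                    # block to the right
    if pix_y:
        addresses.append(first + width_in_blocks)      # block below
    if pix_x and pix_y:
        addresses.append(first + width_in_blocks + 1)  # diagonal neighbor
    return addresses
```

A zero pixel offset yields a single request; an offset in one dimension yields two, and an offset in both dimensions yields four, matching the rule given above.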
The first synchronization between the chip clock regime and the DRAM interface clock regime takes place between the adder (Inblkad3) and the state machine that generates the appropriate requests. The state machine (psgstate) is therefore clocked by the DRAM interface clocks, and its scan elements form part of the DRAM interface scan chain.

B.12.2.7 Reorder read requests and write requests
Because no pixel offsets are involved here, each address is formed by adding the base block address to the relevant frame pointer. Reorder reads use the same frame store as predictions, and data is written back to the other frame store. Each block contains a short FIFO for holding addresses, because the read and write data transfers are likely to lag behind the prediction data transfers at the corresponding addresses (this is because the read and write data interact with the stream further along the chip data flow than the prediction data does). Each block also contains the synchronization between the chip clock and the DRAM interface clock.

B.12.2.8 Offsets
The DRAM is configured as two frame stores, each containing up to three color components. The frame store pointers and the color component offsets within each frame must be programmed through the upi.

B.12.2.9 Snoopers
In the present invention, snoopers are placed at the following positions:
Between blkcalc and bsblkadr: this interface carries the horizontal and vertical block coordinates, the appropriate color component offset and the picture width in blocks (for that component).
After bsblkadr: the base block address.
After vec_pipe: the linear block offset, the pixel offset within the block, and information on prediction mode, color component and H.261 operation.
After Inblkad3: the physical block address, as described under "Prediction requests".
Super snoopers are placed in the reorder read and write request generators for use when testing the external DRAM. For full details, see the DRAM interface section.

B.12.2.10 Scan
The addrgen block has its own scan chain, whose clocking is controlled by the block's own clock generator (adclkgen). Note that the request generators at the back end of the block belong to the DRAM interface clock regime.

B.12.3 Prediction filters
According to the present invention, the overall structure of the prediction filters is shown in Figure 146. The forward and backward filters are identical, and they filter the MPEG forward and backward prediction blocks respectively. Only the forward filter is used in H.261 mode (the h261_on input of the backward filter should be permanently low, because H.261 streams contain no backward predictions). The whole prediction filter block is made up of a pipeline of two-wire interface stages.

B.12.3.1 Prediction filter
Each prediction filter operates entirely independently of the other, processing data as soon as valid data appears at its input. It can be seen from Figure 147 that a prediction filter consists of four separate blocks, two of which are identical. It is best to describe the operation of these blocks separately for MPEG and H.261. H.261, being the more complex, is described first.

B.12.3.1.1 H.261 operation
The one-dimensional filter equation used is as follows:

$$F_i = \frac{x_{i+1} + 2x_i + x_{i-1}}{4} \quad (1 \le i \le 6)$$

$$F_i = x_i \quad (\text{otherwise})$$
This formula is applied by the x prediction filter to each row of the 8 × 8 block and by the y filter to each column. The mechanism that implements it is shown in Figure 148, which is essentially a reproduction of the pfltldd schematic. The filter consists of three two-wire interface pipeline stages. For the first and last pixels in a row, registers A and C are reset and the data passes unchanged through registers B, D and F (the contents of B and D are added to zero). The B×2 mux control is set so that the output of register B is shifted left by one. This shift is in addition to the one-place left shift that is always applied to everything, so all values are multiplied by 4 (more on this below). For all other pixels, x_{i+1} is loaded into register C, x_i into register B and x_{i-1} into register A. It can be seen from Figure 148 that the H.261 filter equation is thereby implemented. Because vertical filtering is performed on horizontal groups of three (see the notes on the dimension buffer below), the first and last pixels in a row need no special treatment there. Control and counting of the pixels within a row is performed by the control logic associated with each 1-D filter. Note that the result has not yet been divided by 4. The division by 16 (shift right by 4) is performed at the input of the prediction filter adder (section B.12.4.2), after both horizontal and vertical filtering have been completed, so that no arithmetic precision is lost. Registers DA, DD and DF pass control information down the pipeline; this includes h261_on and last_byte.
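The arithmetic described above, with each 1-D pass left scaled by 4 and a single shift right by 4 at the end, can be checked with a short sketch. It models only the numerical result, not the register pipeline of Figure 148. The function names are hypothetical, and plain truncation on the final shift is an assumption, since the text does not say how that division is rounded.

```python
from typing import List, Sequence


def h261_1d_x4(x: Sequence[int]) -> List[int]:
    """One 1-D pass of the H.261 filter, result left scaled by 4."""
    last = len(x) - 1
    return [
        x[i - 1] + 2 * x[i] + x[i + 1] if 0 < i < last else 4 * x[i]
        for i in range(len(x))
    ]


def h261_filter_block(block: Sequence[Sequence[int]]) -> List[List[int]]:
    """Filter rows, then columns, then divide by 16 (shift right by 4)."""
    rows = [h261_1d_x4(row) for row in block]
    cols = [h261_1d_x4(col) for col in zip(*rows)]   # filter each column
    # Transpose back to row order and remove the combined scale of 16.
    # Assumption: truncating shift; rounding is not modeled.
    return [[value >> 4 for value in row] for row in zip(*cols)]
```

A flat block passes through unchanged, and a single interior impulse of 16 is spread into the familiar 1-2-1 by 1-2-1 pattern with a center value of 4.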
Of the other blocks in the prediction filter, the formatter merely ensures that data is presented to the x-filter in the correct order. This requires only a three-stage shift register, with the first stage feeding the input of register C, the second feeding register B and the third feeding register A.
Between the x and y filters, the dimension buffer buffers the data so that groups of three vertical pixels are presented to the y-filter. These groups of three are still processed horizontally, however, so no transposition takes place within the prediction filter. Referring to Figure 149, the order in which pixels are output from the dimension buffer is shown in Table B.12.1.
Table B.12.1 H.261 dimension buffer sequence

Clock  Input pixel  Output pixel     Clock  Input pixel  Output pixel
1      0            55 [a]           17     16           7
2      1            56               18     17           F(0,8,16) [b]
3      2            57               19     18           F(1,9,17)
4      3            58               20     19           F(2,10,18)
5      4            59               21     20           F(3,11,19)
6      5            60               22     21           F(4,12,20)
7      6            61               23     22           F(5,13,21)
8      7            62               24     23           F(6,14,22)
9      8            63               25     24           F(7,15,23)
10     9            0                26     25           F(8,16,24)
11     10           1                27     26           F(9,17,25)
12     11           2                28     27           F(10,18,26)
13     12           3                29     28           F(11,19,27)
14     13           4                30     29           F(12,20,28)
15     14           5                31     30           F(13,21,29)
16     15           6                32     31           F(14,22,30)

[a] The last row of pixels from the previous block, or invalid data if there was no previous block (or if there was a long gap between blocks).
[b] F(x) denotes the function in the H.261 filter equation.

B.12.3.1.2 MPEG operation
In MPEG operation, the prediction filter performs a simple half-pel interpolation:

$$F_i = \frac{x_i + x_{i+1}}{2} \quad (0 \le i \le 6,\ \text{half pel})$$

$$F_i = x_i \quad (0 \le i \le 7,\ \text{integer pel})$$
This is the default filter operation unless the h261_on input is high. If the dim signal into a 1-D filter is low, integer-pel interpolation is performed. Accordingly, if h261_on is low and xdim and ydim are both low, all pixels pass straight through unfiltered. It is an obvious requirement that when the dim signal into a 1-D filter is high, the row (or column) will be 9 pixels wide (or high). This is summarized in Table B.12.2. Referring to Figure 148, "1-D prediction filter", the operation of the 1-D filter on MPEG integer pixels is the same as on the first and last pixels of a row in H.261. For MPEG half-pel operation, register A is permanently reset and the output of register C is shifted left by one (the output of register B is always shifted left by one in any case). Thus, after two clocks, register F contains (2B + 2C), four times the required result. This is taken care of at the input of the prediction filter adder where, having passed through both the x and y filters, the number is shifted right by 4.
Table B.12.2 1-D filter operations

h261_on  xdim  ydim  Function
0        0     0     F_i = x_i
0        0     1     MPEG 8 × 9 block
0        1     0     MPEG 9 × 8 block
0        1     1     MPEG 9 × 9 block
1        0     0     H.261 low-pass filter
1        0     1     Illegal
1        1     0     Illegal
1        1     1     Illegal
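The MPEG rows of Table B.12.2 can be exercised with a sketch along the same lines as the H.261 arithmetic: each 1-D pass leaves its result scaled by 4, and the combined scale of 16 is removed at the end. The function names are hypothetical, a block is given as a list of rows, and the truncating final shift is an assumption because the rounding applied there is not described.

```python
from typing import List, Sequence


def mpeg_1d_x4(x: Sequence[int], dim: bool) -> List[int]:
    """One 1-D pass, scaled by 4. dim high selects half-pel interpolation."""
    if dim:
        # (2B + 2C): n + 1 input pixels produce n output pixels.
        return [2 * a + 2 * b for a, b in zip(x, x[1:])]
    return [4 * a for a in x]  # integer pel: pass straight through


def mpeg_filter_block(block: Sequence[Sequence[int]],
                      xdim: bool, ydim: bool) -> List[List[int]]:
    """Apply the x pass to rows, the y pass to columns, then shift right 4."""
    rows = [mpeg_1d_x4(row, xdim) for row in block]
    cols = [mpeg_1d_x4(col, ydim) for col in zip(*rows)]
    # Assumption: truncating shift; rounding is not modeled.
    return [[value >> 4 for value in row] for row in zip(*cols)]
```

With xdim and ydim both high, a 9 × 9 input yields an 8 × 8 output; with both low, an 8 × 8 input passes through unchanged.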
The formatter and dimension buffer are also simpler in MPEG operation. The formatter must collect two valid pixels before it can pass them to the x-filter for half-pel interpolation; the dimension buffer need only buffer one row. Note that after the data has passed through the x-filter there can only ever be 8 pixels in a row, because the filtering operation converts 9-pixel rows into 8-pixel rows. The "lost" pixels are replaced by gaps in the data stream. When performing half-pel interpolation, the x-filter inserts a gap at the end of each row (after every 8 pixels), and the y-filter inserts 8 gaps at the end of the block. This is significant because the group of 8 or 9 gaps at the end of a block lines up with the DATA token headers and other tokens that occur between DATA tokens in the stream coming from the FIFO. This minimizes the worst-case throughput penalty of the chip, which occurs when 9 × 9 blocks are being filtered.

B.12.3.2 Prediction filter adder
In MPEG operation, predictions may be formed from an earlier picture, a later picture, or the average of the two. Predictions formed from an earlier frame are called forward predictions, and those formed from a later frame are called backward predictions. The function of the prediction filter adder (pfadd) is to determine which filtered prediction values are in use (forward, backward or both), and either to pass through the forward or backward filtered predictions or to pass the average of the two (rounded towards positive infinity).
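The selection and averaging performed by pfadd can be illustrated as follows. This is a sketch of the arithmetic only: the function name `combine_predictions` and the use of `None` for an absent prediction are hypothetical, and the inputs are assumed to be values that have already had the filter scaling removed. Adding 1 before halving gives the rounding towards positive infinity mentioned above.

```python
from typing import List, Optional, Sequence


def combine_predictions(forward: Optional[Sequence[int]],
                        backward: Optional[Sequence[int]]) -> List[int]:
    """Pass one prediction through, or average a bidirectional pair."""
    if forward is not None and backward is not None:
        # Average of both, rounded towards positive infinity.
        return [(f + b + 1) >> 1 for f, b in zip(forward, backward)]
    if forward is not None:
        return list(forward)
    if backward is not None:
        return list(backward)
    raise ValueError("no prediction data supplied")
```

For instance, averaging 3 and 4 gives 4 rather than 3, while a forward-only or backward-only prediction is passed through unchanged.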
Prediction mode can only change between piece, namely when starting (power-up) or fwd _ 1st_byte and/or bwd_1st_byte signal effectively after, show it is the last byte of current prediction piece. If current block is a forward prediction, then only check fwd_1st_byte. If it is a back forecast, then only check bwd_1st_byte. If it be one bi-directional predicted, then fwd_1st_byte and bwd_1st_byte both will check.
Which predicted value signal fwd-on and bwd-on determine to use. At any time, these signals may two all effectively or two all invalid. During startup, or when the input of piece effective (valid) data and when a gap occurring not, piece enters all invalid states of two signals.
Be that next piece determines prediction mode with two criterions: signal fwd_ima_twin and bwd_ima_twin, they show forward block or backward consists of a bi-directional predicted right part, and bus fwd_p_num[1:0] and bwd_p_num[1:0]. These buses comprise numeral, and these numerals are for each prediction piece or predict that newly piece increases by 1 to (pair). These pieces are necessary, because, for example, if any two forward prediction pieces, and then one bi-directional predicted of back, the DRAM interface is enough backward of getting bi-directional predicted at a distance in front, so that before second forward prediction piece, this backward input that reaches the predictive filter adder. Similarly, other backward and forward prediction sequence can be taken out from sequence in the input of predictive filter adder. Therefore, prediction mode is following below is determined:
1) If valid forward data is present and fwd_ima_twin is high, the block stalls until valid backward data arrives with bwd_ima_twin set. The block then averages each pair of prediction values.
2) If valid backward data is present and bwd_ima_twin is high, the block stalls until valid forward data arrives with fwd_ima_twin set, and then proceeds as above. If the forward and backward data become valid together, there is no stall.
3) If valid forward data is present but fwd_ima_twin is not set, fwd_p_num is examined. If it equals the number from the previous prediction plus one (stored in pred_num), the prediction mode is set to forward.
4) If valid backward data is present but bwd_ima_twin is not set, bwd_p_num is examined. If it equals the number from the previous prediction plus one (stored in pred_num), the prediction mode is set to backward.
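The four rules above can be sketched as a small decision function. This is a hedged illustration, not the circuit: the function name and the returned strings are hypothetical, pred_num is assumed to hold the number of the previous prediction, the comparison is assumed to wrap modulo 4 because the p_num buses are two bits wide, and any case the rules do not cover is assumed to stall.

```python
def next_prediction_mode(fwd_valid, fwd_ima_twin, fwd_p_num,
                         bwd_valid, bwd_ima_twin, bwd_p_num, pred_num):
    """Return 'average', 'forward', 'backward' or 'stall' for the next block."""
    expected = (pred_num + 1) & 0b11  # 2-bit buses, so assume wrap-around

    # Rules 1 and 2: both halves of a bidirectional pair are present.
    if fwd_valid and fwd_ima_twin and bwd_valid and bwd_ima_twin:
        return "average"
    # Rule 3: a lone forward block whose number is next in sequence.
    if fwd_valid and not fwd_ima_twin and fwd_p_num == expected:
        return "forward"
    # Rule 4: a lone backward block whose number is next in sequence.
    if bwd_valid and not bwd_ima_twin and bwd_p_num == expected:
        return "backward"
    # Otherwise wait: a twin is missing or the data has arrived out of order.
    return "stall"
```

Checking the lone-block rules before stalling on an unmatched twin lets the out-of-order case described above proceed: a backward twin that arrives early simply waits while the second forward block is processed.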
Note that an "early-valid" signal from one stage back in the pipeline is used, so that the predictive filter adder mode can be set up before the first data from a new block arrives. This ensures that no stalls are introduced into the pipeline.
The ima_twin and pred_num signals are not passed along the forward and backward predictive filter pipelines with the filtered data. This is because:
1) These signals are only examined when fwd_1st_byte and/or bwd_1st_byte are valid. This saves about 25 three-bit pipeline stages in each predictive filter.
2) The signals remain valid throughout the block, so they are valid when fwd_1st_byte and/or bwd_1st_byte reach the predictive filter adder.
3) The signals are examined one clock cycle before the data arrives.
B.12.4 Prediction adder and FIFO
The prediction adder (padder) forms the predicted frame by adding the data from the predictive filter to the error data. To compensate for the delay from the input through the address generator, the DRAM interface and the predictive filter, the error data passes through a 256-word FIFO (sfifo) before reaching the padder.
The CODING_STANDARD, PREDICTION_MODE and DATA tokens are decoded to determine when a predicted block is being formed. Within a DATA token, the 8-bit prediction data is added to the 9-bit two's complement error data. The result is limited to the range 0 to 255 and passed to the next block. Note that this data limiting also applies to all intra-coded data, including JPEG.
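As a minimal sketch of the arithmetic just described, the following Python adds an 8-bit prediction to a 9-bit two's complement error value and limits the result to 0..255. The function names are hypothetical and the error value is assumed to arrive as a raw 9-bit field.

```python
def sign_extend_9bit(raw):
    """Interpret the low 9 bits of raw as a two's complement number (-256..255)."""
    raw &= 0x1FF
    return raw - 0x200 if raw & 0x100 else raw

def predict_add(prediction, error_raw):
    """Add 8-bit prediction data to 9-bit error data and clamp to 0..255."""
    total = (prediction & 0xFF) + sign_extend_9bit(error_raw)
    return max(0, min(255, total))
```

For example, a prediction of 200 with an error of +100 saturates at 255, and a prediction of 5 with an error of -256 (raw 0x100) saturates at 0.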
The prediction adder of the present invention also includes a mechanism for detecting a mismatch between the data arriving from the FIFO and the data arriving from the predictive filter. In theory, the amount of data from the filter should correspond exactly to the number of DATA tokens from the FIFO that involve prediction. If a serious fault occurs, the padder attempts to recover.
The ends of the data blocks from the FIFO and from the filter are identified by the in_extn and fl_last inputs respectively. If the end of the filter data is detected before the end of the DATA token, the remainder of the token continues to the output unchanged. If, on the other hand, the filter block is longer than the DATA token, the input is stalled until all the surplus filter data has been accepted and discarded.
There is no snooper in either the FIFO or the prediction adder, because the chip can be configured so that data is passed directly to these blocks from the token input port and their outputs are passed directly to the token output port.
B.12.5 Write pointer and read pointer
B.12.5.1 Write pointer (wrudder)
The write pointer passes all tokens from the prediction adder to the read pointer. It also passes all data blocks of I or P pictures in MPEG, and all data blocks in H.261, to the DRAM interface so that they can be written to the external frame store under the control of the address generator. All the main functions are contained within a single two-wire interface stage, although the write-back data passes through a snooper on its way to the DRAM interface.
The write pointer decodes the following tokens:
Table B.12.3 Tokens decoded by the write pointer
    Token name         Function in the write pointer
    CODING_STANDARD    Write-back is inhibited for JPEG streams
    PICTURE_TYPE       Write-back occurs only for I and P frames, not for B frames
    DATA               Only the data in DATA tokens is written back
After the DATA token header has been detected, all data bytes are output to the DRAM interface. The end of the DATA token is detected by in_extn going low, which causes a flush signal to be sent to the alternate buffer of the DRAM interface. In normal operation this coincides with the moment at which the alternate buffer would swap anyway; if the DATA token does not contain 64 bytes of data, it provides a means of recovery (although the next few output pictures may be incorrect).
B.12.5.2 Read pointer (rrudder)
The read pointer of the present invention has three functions, the two main ones relating to the reordering of picture sequences in MPEG:
1) To insert the data read back from the external frame store into the token stream at the correct position.
2) To reorder the picture header information in I and P pictures.
3) To detect the end of the token stream by detecting the FLUSH token (see section B.12.1, "Top-level fork").
The structure of the read pointer is shown in Figure 150. All parts are built using standard two-wire interface techniques. Tokens in the input interface latch are decoded, and these decodes determine the operation of the block:
Table B.12.4 Tokens decoded by the read pointer
    Token name            Function in the read pointer
    FLUSH                 Signals to the top-level fork
    CODING_STANDARD       Reordering is inhibited if the coding standard is not MPEG
    SEQUENCE_START        Read-back data is invalid for the first picture of a reordered sequence
    PICTURE_START         Signals that the current output FIFO must be swapped (I or P pictures). The first of the picture header tokens
    PICTURE_END           All tokens above the picture layer are allowed through
    TEMPORAL_REFERENCE    The second picture header token
    PICTURE_TYPE          The third picture header token
    DATA                  When reordering, the contents of DATA tokens are replaced with reordered data
The reordering function is enabled through the MPI, but it is inhibited if the coding standard is not MPEG, regardless of the state of the register. The same MPI register controls whether the address generator produces reordering addresses. The reorder signal is therefore output from this block. To understand how the read pointer works, consider the input and output logic separately, bearing in mind that the token sequence is as follows:
·CODING_STANDARD
·SEQUENCE_START
·PICTURE_START
·TEMPORAL_REFERENCE
·PICTURE_TYPE
·Picture containing DATA Tokens and other tokens
·PICTURE_END
·...
·PICTURE_START
·...
B.12.5.2.1 Input control logic
At start-up, all tokens enter FIFO1 (called the current input FIFO) until the first PICTURE_TYPE token of an I or P picture is encountered. FIFO2 then becomes the current input FIFO and all input is directed to it until the PICTURE_TYPE token of the next I or P picture is encountered, when FIFO1 becomes the current input FIFO again. Within I and P pictures, all tokens between PICTURE_TYPE and PICTURE_END are discarded except DATA tokens. This prevents motion vectors and the like from being associated with the wrong picture in the reordered stream, where they would be meaningless.
A three-bit code is put into the FIFO along with the token stream to indicate the presence of certain token headers. This saves having to perform token decoding at the output of each FIFO.
B.12.5.2.2 Output control logic
At start-up, tokens are read from FIFO1 (called the current output FIFO) until a picture start code is encountered, after which FIFO2 becomes the current output FIFO. Referring back to section B.12.5.2.1, it can be seen that at this stage the three picture header tokens PICTURE_START, TEMPORAL_REFERENCE and PICTURE_TYPE are retained in FIFO1. The current output FIFO is swapped each time a picture start code is encountered in an I or P frame. Accordingly, the three picture header tokens are stored until the next I or P frame, at which point they become associated with the correctly reordered data. B pictures are not reordered and therefore pass through without any tokens being discarded. In the first picture, all tokens, including PICTURE_END, are discarded.
During I and P pictures, the data contained in DATA tokens in the token stream is replaced by reordered data from the DRAM interface. During the first picture, "reordered" data is still present at the reorder data input, because the address generator still requests the DRAM interface to fetch it. This is treated as meaningless and discarded.
B.13 DRAM interface
B.13.1 Overview
In the present invention, the spatial decoder, the temporal decoder and the video formatter each contain a DRAM interface block specific to that chip. In all three devices, the function of the DRAM interface is to transfer data from the chip to external DRAM and from external DRAM to the chip. Data is transferred using block addresses supplied by the address generator.
The DRAM interface typically runs from its own clock, which is asynchronous both to the address generator and to the clocks of the blocks through which the data passes. This asynchrony is easily handled, however, because the clocks run at approximately the same frequency.
Data is usually transferred between the DRAM interface and the rest of the chip in blocks of 64 bytes (the only exception being prediction data in the temporal decoder). Transfers take place by means of a device called an "alternate buffer". This is essentially a pair of RAMs operated in a double-buffered configuration: while the DRAM interface fills or empties one RAM, the rest of the chip empties or fills the other. A separate bus carrying the address from the address generator is associated with each alternate buffer.
Each chip has four alternate buffers, but their functions differ in each case. In the spatial decoder, one alternate buffer transfers coded data to DRAM, another reads coded data from DRAM, the third transfers token data to DRAM and the fourth reads token data from DRAM. In the temporal decoder, one alternate buffer writes intra or predicted picture data to DRAM, the second reads intra or predicted data from DRAM, and the other two read forward and backward prediction data. In the video formatter, one alternate buffer transfers data to DRAM and the other three read data from DRAM: one each for luminance (Y) and the red and blue color-difference data (Cr and Cb respectively).
The general features of DRAM interface operation are described in the spatial decoder document. The following section describes the features specific to the temporal decoder.
B.13.2 Temporal decoder DRAM interface
As described in section B.13.1, the temporal decoder has four alternate buffers: two are used to read and write decoded intra and predicted (I and P) picture data. These buffers operate as described previously. The other two are used to fetch prediction data.
In general, prediction data is offset from the position of the block being processed, as specified by motion vectors in x and y. The block of data to be fetched therefore does not generally correspond to the block boundaries of the data as it was encoded (and written into DRAM). This is illustrated in Figure 151 and Figure 25, where the shaded area represents the block being formed and the dotted outline represents the block from which it is being predicted. The address generator converts the address specified by the motion vectors into a block offset (a whole number of blocks), shown by the large arrow, and a pixel offset, shown by the small arrow.
In the address generator, the frame pointer, base address and vector offset are added to form the address of the block to be fetched from DRAM. If the pixel offset is zero, only one request is generated. If there is an offset in either the x or the y dimension, two requests are generated: the original block address and the one immediately to the right or immediately below. If there is an offset in both x and y, four requests are generated. For each block to be fetched, the address generator calculates start and stop address parameters and passes them to the DRAM interface. The use of these start and stop addresses is best illustrated by an example, outlined below:
Consider a pixel offset of (1,1), as shown by the shaded area in Figure 152 and Figure 26. The address generator makes four requests, labeled A to D in the figure. The problem to be solved is how to supply the required sequence of row addresses quickly. The solution is a "start/stop" technique, described below:
Consider block A in Figure 152. Reading must start at position (1,1) and end at position (7,7). Assume for the moment that one byte is read at a time (i.e. an 8-bit DRAM interface). The x value of the coordinate pair forms the three least significant bits of the address, and the y value forms the three most significant bits. The x and y start values are both 1, giving address 9. Data is read from this address and the x value is incremented. This is repeated until the x value reaches its stop value. At that point the y value is incremented by 1 and the x start value is reloaded, giving address 17. As each byte of data is read, the x value is again incremented until it reaches its stop value. The process repeats until both the x and y values have reached their stop values. The address sequence 9, 10, 11, 12, 13, 14, 15, 17, ..., 23, 25, ..., 31, 33, ..., 57, ..., 63 is therefore generated.
Similarly, the start and stop coordinates for block B are (1,0) and (7,0), for block C they are (0,1) and (0,7), and for block D they are (0,0) and (0,0).
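The start/stop technique can be sketched in Python as follows. This is an illustration under stated assumptions rather than the actual address generator: the function names are hypothetical, the coordinate pairs are assumed to be written as (y, x) (which is what the block B addresses 8 and 16 quoted in the text imply), and the general rule for deriving the four requests from an arbitrary pixel offset is extrapolated from the single (1,1) example.

```python
def dram_addresses(start, stop):
    """Addresses read for one request; start and stop are (y, x) pairs.

    y forms the three most significant bits and x the three least
    significant bits of the 6-bit address within a 64-byte block.
    """
    (y0, x0), (y1, x1) = start, stop
    return [(y << 3) | x
            for y in range(y0, y1 + 1)      # y steps when x reaches its stop value
            for x in range(x0, x1 + 1)]     # x restarts from its start value

def block_requests(y_off, x_off):
    """Hypothetical helper: start/stop pairs for the 1, 2 or 4 requests."""
    requests = {"A": ((y_off, x_off), (7, 7))}
    if x_off:
        requests["B"] = ((y_off, 0), (7, x_off - 1))
    if y_off:
        requests["C"] = ((0, x_off), (y_off - 1, 7))
    if x_off and y_off:
        requests["D"] = ((0, 0), (y_off - 1, x_off - 1))
    return requests
```

For the (1,1) offset this reproduces the four start/stop pairs listed above, and the four requests together fetch exactly 64 bytes (49 + 7 + 7 + 1).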
The next problem is where this data should be written. Looking first at block A, the data read from address 9 should clearly be written to address 0 in the alternate buffer, the data from address 10 to address 1, and so on. Similarly, the data read from address 8 of block B should be written to address 7 of the alternate buffer, and the data from address 16 to address 15. This turns out to have a very simple implementation, outlined below.
Consider block A. At the start of the read, the alternate buffer address register is loaded with the inverse of the stop values: the inverted y stop value forms the three most significant bits (MSBs) and the inverted x stop value forms the three least significant bits. In this case, while the DRAM interface is reading address 9 in external DRAM, the alternate buffer address is 0. The alternate buffer address register is then incremented whenever the external DRAM address register is incremented, as shown in Table B.13.1:
Table B.13.1 Illustration of prediction addressing
    External DRAM access     Alternate buffer address               External DRAM address (binary)   Alternate buffer address (binary)
    9 = y-start, x-start     0 = inverse y-stop, inverse x-stop     001001                           000000
    10                       1                                      001010                           000001
    11                       2                                      001011                           000010
    15                       6                                      001111                           000110
    17 = y+1, x-start        8 = y+1, x-stop                        010001                           001000
    18                       9                                      010010                           001001
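The pairing of external DRAM addresses with buffer addresses in the table can be reproduced with a short Python sketch. The function name is hypothetical, coordinates are assumed to be (y, x) pairs, and the buffer address is modeled as separate y and x counters that step in lockstep with the DRAM counters, which is one way of reading "incremented whenever the external DRAM address register is incremented".

```python
def buffer_write_addresses(start, stop):
    """Return (dram_address, buffer_address) pairs for one request.

    The buffer counters start from the bitwise inverse of the 3-bit
    stop values and then step together with the DRAM counters.
    """
    (y0, x0), (y1, x1) = start, stop
    pairs = []
    buf_y = ~y1 & 7                      # inverted y stop value (3 bits)
    for y in range(y0, y1 + 1):
        buf_x = ~x1 & 7                  # inverted x stop value, reloaded per row
        for x in range(x0, x1 + 1):
            pairs.append(((y << 3) | x, (buf_y << 3) | buf_x))
            buf_x += 1
        buf_y += 1
    return pairs
```

Applied to the four requests of the (1,1) example, the buffer addresses produced for blocks A to D do not overlap and together fill all 64 locations of the buffer.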
The discussion so far has concentrated on an 8-bit DRAM interface. For 16-bit or 32-bit interfaces, a few minor modifications are needed. First, the pixel offset vector must be "clipped" so that it points to a 16-bit or 32-bit boundary. In the example used so far, for block A the first DRAM read then points to address 0, and the data at addresses 0 to 3 is read. Second, the unwanted data must be discarded. This is achieved by writing all the data into the alternate buffer and reading it out with an offset (the alternate buffer must now be physically larger than is necessary in the 8-bit case). For MPEG half-pixel interpolation, 9 bytes must be read from the DRAM interface in the x and/or y direction. In this case the address generator supplies the appropriate start and stop addresses and some additional logic is used in the DRAM interface, but there is no fundamental change in the way the DRAM interface operates.
A final point to note about the temporal decoder DRAM interface is that additional information must be supplied to the predictive filter to indicate what processing is required on the data. This consists of the following:
A "last byte" signal indicating the last byte of a transfer (of 64, 72 or 81 bytes)
An H.261 flag
A bidirectional prediction flag
Two bits indicating the block dimensions (8 or 9 bytes in the x and y directions)
A two-bit number indicating the order of the blocks
The last-byte flag can be generated as the data is read out of the alternate buffer. The other signals are derived from the address generator and are piped through the DRAM interface so that they are associated with the correct block of data as it is read out of the alternate buffer by the predictive filter block.
B.14 UPI documentation
B.14.1 Introduction
This document is intended to give the reader an understanding of the operation of the MPI of the present invention. The UPI is essentially the same in the spatial decoder and the temporal decoder, the only difference being the number of address lines.
The logic described here is purely the microprocessor internal logic. The relevant schematics are:
  UPI
  UPI101
  UPI102
  DINLOGIC
  DINCELL
  UPIN
  TDET
  NONOVRLP
  WRTGEN
  READGEN
  VREFCKT
The circuits UPI, UPI101 and UPI102 differ in that UPI101 has a 7-bit address input, with its eighth bit hardwired to ground, whereas UPI and UPI102 have an 8-bit address input.
Input/output signal
The signals described here are a list of all the inputs and outputs of the UPI module (inputs and outputs are defined relative to the UPI). The source or destination of each signal is given in the list:
NOTRST (Input): Global chip reset, active low. From the pad input driver.
E1 (Input): Enable signal 1, active low. From the pad input driver (Schmitt).
E2 (Input): Enable signal 2, active low. From the pad input driver (Schmitt).
RNOTW (Input): Read-not-write signal. From the pad input driver (Schmitt).
ADDRIN[7:0] (Input): Address bus signals. From the pad input drivers (Schmitt).
NOTDIN[7:0] (Input): Input data bus. From the pad input drivers of the microprocessor bidirectional data pins (TTLin).
INT_RNOTW (Output): Internal read-not-write signal, to the internal circuitry accessed by the microprocessor interface (see memory map).
INT_ADDR[7:0] (Output): Internal address bus, to all circuitry accessed by the MPI (see memory map).
INTDBUS[7:0] (Input/Output): Internal data bus, to all circuitry accessed by the MPI (see memory map) and to the microprocessor data output pads. Data on the internal data bus is the inverse of the data on the chip pins.
READ_STR (Output): An internal timing signal indicating a read of a location in the device memory map.
WRITE_STR (Output): An internal signal indicating a write to a location in the internal memory map.
TRISTATEDPAD (Output): An internal signal connected to the microprocessor data output pads, indicating that they should be tristated.
General notes:
The UPI schematic comprises six smaller modules: NONOVRLP, UPIN, DINLOGIC, VREFCKT, READGEN and WRTGEN. Note from the signal list that, apart from the microprocessor bus timing signals, there are no clock signals associated with the MPI. All other clock signals on the chip are asynchronous to the bus timing signals. There is therefore no timing relationship between microprocessor operations and the rest of the device, other than any timing imposed by external control. For example, the external system clock may be stopped while a test system accesses the MPI.
A further implication of the absence of a clock in the UPI is that some internal timing is self-timed. In other words, the delays of certain signals are controlled internally by the UPI block.
The general function of the UPI is to take the address, data, enable and read/write signals from the outside world and format them so that they can correctly drive the internal circuitry. The internal signals defined for access to the memory map are INT_RNOTW, INT_ADDR[...], INTDBUS[...], READ_STR and WRITE_STR. The timing relationships of these signals are shown below for a read cycle and a write cycle. Note that although the data sheet definition and the following diagrams generally show chip enable cycles, the circuit operates such that the enable signals can be held low while the address is cycled to perform successive read or write operations. This is possible because of the address transition circuitry.
In addition, the presence of the INT_RNOTW, READ_STR and WRITE_STR signals does reflect some redundancy. It allows internal circuits to use either separate READ_STR and WRITE_STR strobes (ignoring INT_RNOTW), or INT_RNOTW together with a single strobe (formed by ORing READ_STR and WRITE_STR).
During a read cycle, the internal data bus is precharged high. It also has pull-up resistors, so that whenever the internal data bus is not driven during an extended period it defaults to the 0xFF state. Because the internal data bus carries the inverse of the data on the pins, this appears as 0x00 on the external pins when they are enabled. This means that if any external cycle accesses a register, or a bit of a register, that is a hole in the memory map, the output data is defined and is low.
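The inversion between the internal data bus and the pins can be shown with a one-line Python sketch. The function name is hypothetical; the only behavior modeled is the bitwise inversion of an 8-bit value stated above.

```python
def pin_data(internal_bus):
    """Value seen on the 8 external data pins for a given internal bus value."""
    return ~internal_bus & 0xFF   # the internal bus carries inverted data

# An undriven, precharged internal bus (0xFF) reads back as 0x00 on the pins.
undriven_readback = pin_data(0xFF)
```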
Circuit details:
UPIN-
This circuit is the overall transition detection block. It contains a sub-circuit called TDET, which is a single-bit transition detection circuit. UPIN has one TDET module for each address bit, for the rnotw signal and for each enable signal. UPIN also contains some combinational logic that gates together the outputs of the transition detection circuits. This gating logic produces the following signals:
TRAN - indicates a transition on an input signal.
UPD_DONE - indicates that the transitions have completed and a cycle can be performed.
CHIP_EN - indicates that the chip is selected.
TDET-
This is the single-bit transition detection circuit. It contains two latches and two XOR gates. The first latch is clocked by the signal SAMPLE and the second by the signal UPDATE. These two non-overlapping signals come from the module NONOVRLP. In general operation, an input transition causes CHANGE, which in turn causes SAMPLE. While SAMPLE is high, any further input changes are accepted. When the input changes stop, CHANGE goes low and SAMPLE goes low, which causes UPDATE to go high. The data is then transferred to the output latch and UPD_DONE is asserted.
NONOVRLP-
This circuit is basically a non-overlapping clock generator that takes TRAN as its input and generates SAMPLE and UPDATE. External gating on the UPDATE output prevents UPDATE from going high until a write pulse has completed.
DINLOGIC-
This module contains eight instances of the data input circuit DINCELL and some gates that drive the TRISTATEDPAD signal. The output data port is driven only when Enable1 is low, Enable2 is low, RnotW is high and the internal read_str is high.
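The drive condition for the output data port can be written as a simple Boolean sketch. The function names are hypothetical, and the polarity of TRISTATEDPAD (1 meaning "pads tristated") is an assumption; the text only says the signal indicates that the pads should be tristated.

```python
def output_port_driven(enable1, enable2, rnotw, read_str):
    """True only when both enables are low, RnotW is high and read_str is high."""
    return enable1 == 0 and enable2 == 0 and rnotw == 1 and read_str == 1

def tristatedpad(enable1, enable2, rnotw, read_str):
    """Assumed polarity: 1 tristates the data pads, 0 lets them drive."""
    return 0 if output_port_driven(enable1, enable2, rnotw, read_str) else 1
```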
DINCELL-
This circuit contains the data input latch and a tristate driver that drives the internal data bus. Data from the input pad is latched when the signal DATAHOLD is high and Enable1 and Enable2 are both low. The tristate driver drives the internal data bus whenever the internal signal INT_RNOTW is low. The transistors that precharge the bus and the internal data bus pull-up elements are also contained in this module.
WRTGEN-
This module generates WRITE_STR and the latch signal DATAHOLD, which is used by the data latches. The write strobe is a self-timed signal, although the self-timing delay is determined in VREFCKT. The timing circuit output RESETWRITE is used to terminate the WRITE_STR signal. Note that the actual write pulse that writes a register is only generated after an access cycle has completed. This is because data input to the chip is only sampled on the trailing edge of the cycle. The data is therefore only valid after a normal access cycle has completed.
READGEN-
This circuit, as its name implies, generates the READ_STR signal. It also generates the PRECH signal, which is used to precharge the internal data bus. PRECH is also a self-timed signal whose duration is determined by VREFCKT and also depends on the voltage on the internal data bus. READ_STR is not self-timed; it begins when the precharge period ends and lasts until the end of the cycle. The precharge sensing circuit uses inverters whose transfer characteristics are skewed so that they require a voltage of almost 75% of the supply before they switch. This circuit guarantees that the internal bus is correctly precharged before READ_STR begins. To prevent the PRECH pulse from tending to zero width when the internal bus is already precharged, the timing circuit guarantees a minimum width by means of the signal RESETREAD.
VREFCKT-
VREFCKT is the only circuit that controls the self-timing of the interface. The two delays, (1) the width of WRITE_STR and (2) the width of PRECH, are both controlled by the current through a P transistor. This P transistor is controlled by the signal VREF, a voltage set by a 25 kΩ diffused resistor.
C.1 Overview
C.1.1 Introduction
The structure of the pixel formatter according to the present invention is shown in Figure 155. It has two address generators, one for writing and one for reading; a buffer manager that supervises the two address generators and provides frame-rate conversion; a data processing pipeline comprising vertical and horizontal upsamplers, color-space conversion and gamma correction; and a final control block that regulates the output of the processing pipeline.
C.1.2 Buffer manager
Tokens input to the image formatter are buffered in a first-in, first-out (FIFO) memory and then passed to the buffer manager. This block detects the arrival of new pictures and determines the availability of a buffer in which to store each one. If a buffer is available, it is allocated to the arriving picture and its pointer is passed to the write address generator. If no buffer is available, the incoming picture is stalled until one becomes free. All tokens are passed on to the write address generator.
Whenever the read address generator receives a VSYNC signal from the display system, it makes a request to the buffer manager for a new display buffer pointer. If a buffer contains complete picture data and that picture is considered ready for display, its pointer is passed to the display address generator; otherwise the buffer manager returns the pointer of the buffer that was displayed last. At start-up, zero is passed as the pointer until the first buffer is full.
A picture is considered ready for display if its picture number (calculated as each picture is input) is greater than or equal to the picture number expected at the display (the presentation number), given the encoded frame rate. The expected number is determined by counting picture clock pulses, where the picture clock is either generated locally by a clock divider or supplied externally. This technique allows frame-rate conversion (such as 2-3 pull-down).
External DRAM is used for the buffers, of which there may be two or three; three are required if frame-rate conversion is to be performed.
C.1.3 Write address generator
The write address generator receives tokens from the buffer manager and detects the arrival of each new DATA token. As each one arrives, it calculates a new address for the DRAM interface at which to store the arriving block. The raw data is then passed to the DRAM interface, where it is written into an alternate buffer. Note that DRAM addresses are block addresses, and that pictures are organized in DRAM as rasters of blocks. Incoming picture data, however, is actually organized as a sequence of macroblocks, so the address generation algorithm must take account of line-width (in blocks) offsets for the lower rows of blocks within a macroblock.
The arrival buffer pointer supplied by the buffer manager is used as an address offset for the whole picture being stored, and each component is stored in a separate area within the specified buffer, so component offsets are also used in the calculation.
C.1.4 Read address generator
The read address generator (dispaddr) neither accepts nor generates tokens; it only generates addresses. In response to VSYNC it may, depending on field_info, read_start, Sync_mode and lsb_invert, request a buffer pointer from the buffer manager. On receiving a pointer, it generates three sets of addresses, one for each component, in order to read the current picture in raster order. Different set-ups allow for interlaced/progressive display and/or data, vertical upsampling, and field synchronization (for interlaced display). At a lower level, the read address generator converts base addresses into a sequence of block addresses and byte counts for each of the three components, compatible with the page structure of the DRAM. The addresses supplied to the DRAM interface are page and line addresses, together with block start and block end counts.
C.1.5 Display pipeline
Data from the DRAM interface feeds the display pipeline. The three component streams are first interpolated vertically, then horizontally. After interpolation the three components are at the same ratio (4:4:4) and are passed through the color-space converter and the color look-up table/gamma correction. The output interface can hold the data stream at this point until HSYNC arrives and display begins. The output controller then steers the three components onto one, two or three 8-bit buses as required.
C.1.6 Timing modes
There are basically two timing regimes associated with the image formatter. The first is the system clock, which provides timing for the front end of the chip (address generators and buffer manager, plus the front end of the DRAM interface). The second is the pixel clock, which provides all timing for the back end (DRAM interface output and the whole of the output pipeline).
Each of these two clocks drives a number of on-chip clock generators. The FIFO, buffer manager and read address generator operate from the same clock (Dφ), while the write address generator uses a similar but separate clock (Wφ). Data is clocked into the DRAM interface on an internal DRAM interface clock (outφ). Dφ, Wφ and outφ are all generated from SYSCLK.
Read and write addresses are clocked in the DRAM interface by the DRAM interface's own clocks.
Data is read out of the DRAM interface on bifRφ and is transferred to the section of the output pipeline named "bushy_ne" (north-east, by virtue of its physical location), which operates on clocks denoted NEφ. The section of the pipeline from the gamma RAMs onwards is clocked on a separate but similar clock (Rφ). bifRφ, NEφ and Rφ are all derived from the pixel clock, pixin.
For testing, all of the major interfaces between blocks have either snoopers or super-snoopers attached, depending on the timing regimes and the type of access required. Block boundaries between separate but similar timing regimes have retiming latches associated with them.
C.2 Buffer management
The purpose of the buffer management block, in accordance with the present invention, is to supply the address generators with indices indicating any of either two or three external buffers for the writing and reading of picture data. The allocation of these indices is influenced by three principal factors, each representing the effect of one of the timing regimes in operation. These are the rate at which picture data arrives at the input to the image formatter (coded data rate), the rate at which data is displayed (display data rate), and the frame rate of the encoded video sequence (presentation rate).
C.2.2 Functional overview
A three-buffer system allows the presentation rate and the display rate to differ (e.g. 2-3 pulldown), so that frames are either repeated or skipped as necessary to achieve the best possible sequence of frames given the timing constraints of the system. Pictures that present some difficulty in decoding can be accommodated in a similar way: if a picture takes longer to decode than the available display time, the previous frame is repeated while everything else "catches up". In a two-buffer system, the three timing regimes must be locked; it is the third buffer that provides the flexibility for taking up slack.
The buffer manager operates by maintaining certain status information associated with each external buffer. This includes flags indicating whether the buffer is in use, full of data, or ready for display, and the picture number, within the sequence, of the picture currently stored in the buffer. The presentation number is also recorded. This number increments every time a picture clock pulse is received, and represents the picture number currently expected for display, based on the frame rate of the encoded sequence.
An arrival buffer (a buffer to which incoming data will be written) is allocated every time a PICTURE_START token is detected at the input, and this buffer is then flagged as IN_USE. On PICTURE_END, the arrival buffer is de-allocated (reset to zero) and the buffer is flagged as either FULL or READY, depending on the relationship between the picture number and the presentation number.
The display address generator requests a new display buffer, once every VSYNC, via a two-wire interface. If there is a buffer flagged as READY, the buffer manager allocates it for display. If there is no READY buffer, the previously displayed buffer is repeated.
Each time a change in the presentation number is detected, every buffer containing a complete picture is tested for READY-ness by examining the relationship between its picture number and the presentation number. Buffers are considered in turn. When any buffer is deemed to be READY, this automatically cancels the READY-ness of any buffer previously flagged as READY; the previous buffer is then flagged as EMPTY. This works because, by virtue of the allocation scheme, later picture numbers are stored in the buffers that are considered later.
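The scan described above can be sketched as follows. This is a simplified software model, not the hardware state machine: the function name, the list-based representation and the `ready_test` parameter are assumptions made for illustration, and the readiness comparison itself is passed in rather than defined here.

```python
EMPTY, FULL, READY, IN_USE = 0b00, 0b01, 0b10, 0b11  # encodings as in Table C.2.1


def rescan_ready(status, pic_num, pres_num, ready_test):
    """Re-test every FULL buffer after the presentation number has changed.

    status and pic_num are lists with one entry per buffer; ready_test is a
    callable (pic_num, pres_num) -> bool. Returns the list position of the
    buffer left flagged READY, or None if there is none.
    """
    ready = next((i for i, s in enumerate(status) if s == READY), None)
    for i in range(len(status)):
        if status[i] == FULL and ready_test(pic_num[i], pres_num):
            if ready is not None:
                status[ready] = EMPTY  # only one buffer may be READY at a time
            status[i] = READY
            ready = i
    return ready


status = [FULL, FULL, IN_USE]
ready = rescan_ready(status, [4, 5, 3], 5, lambda pic, pres: pic <= pres)
```

In this example both complete pictures qualify, so the earlier one is dropped (flagged EMPTY) and the later one remains READY, which corresponds to skipping a frame.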
TEMPORAL_REFERENCE tokens in H.261 cause a buffer's picture number to be modified if skipped pictures are indicated in the input stream. This feature is desirable but is not currently included. Similarly, TEMPORAL_REFERENCE tokens in MPEG have no effect.
A FLUSH token causes the input to stall until every buffer is either EMPTY or has been allocated as the display buffer. The presentation number and picture number are then reset, and a new sequence can begin.
C.2.3 Architecture
C.2.3.1 Interfaces
C.2.3.1.1 Interface to bm_front
All data is input to the buffer manager from the input FIFO, bm_front. The transfer takes place via a two-wire interface, the data being 8 bits wide plus an extension bit. All data arriving at the buffer manager is guaranteed to consist of complete tokens. This is necessary so that presentation numbers and display buffer requests can continue to be processed if significant gaps occur in the data upstream.
C.2.3.1.2 Interface to waddrgen (write address generator)
Tokens (8-bit data, 1-bit extension) are transferred to the write address generator via a two-wire interface. The arrival buffer index is also transferred on the same interface, so that the correct index is available for address generation at the same time as the PICTURE_START token arrives at the write address generator.
C.2.3.1.3 Interface to dispaddr
The interface to the read address generator comprises two separate two-wire interfaces, which can be regarded as acting as "request" and "acknowledge" signals respectively. A single two-wire interface is not adequate, because there is a two-wire-based state machine at each end.
The sequence of events normally associated with the dispaddr interface is as follows. First, dispaddr issues a request, in response to a VSYNC from the display device, by asserting the drq_valid input to the buffer manager. Next, when the buffer manager reaches an appropriate point in its state machine, it accepts the request and allocates a buffer for display. The disp_valid wire is then asserted and the buffer index is transferred; this is normally accepted immediately by dispaddr. In addition, there is one extra wire associated with this last two-wire interface (rst_fld), which indicates that the field number associated with the current index must be reset regardless of the previous field number.
C.2.3.1.4 Microprocessor interface (MPI)
The buffer manager block uses four bits of microprocessor address space, together with the 8-bit data bus and the read and write strobes. There are two select signals: one indicates user-accessible locations, and the other indicates test locations that should not need to be accessed under normal operating conditions.
C.2.3.1.5 Events
The buffer manager can produce two different events: index found and late arrival. The first is asserted when a picture arrives whose PICTURE_START extension byte (picture index) matches the value written into the BU_BM_TARGET_IX register at setup. The second occurs when a display buffer is allocated whose picture number is less than the current presentation number, i.e., the processing in the system pipeline up to the buffer manager has not managed to keep up with the display requirements.
C.2.3.1.6 Picture clock
The picture clock is, in the present invention, the clock signal for the presentation number counter. It is either generated on-chip or taken from an external source (normally the display system). The buffer manager accepts both of these signals and selects one according to the value of Pclk_ext (a bit in the buffer manager's control register). This signal also acts as the enable for the pad picutpad, so that if the image formatter is generating its own picture clock, the clock is also available as an output from the chip.
C.2.3.2 Major functional blocks
The following sections describe the hardware functional blocks that make up the buffer manager schematic (bmLogic).
C.2.3.2.1 Input/output block (bm_input)
This module contains all of the hardware associated with the four two-wire interfaces of the buffer manager (input and output data, drq_valid/accept and disp_valid/accept). The input data register is shown with some token decoding hardware attached to it. The signal vheader at the input to bm_tokdee ensures that the token decoder outputs can only be asserted at a point where a header would be valid (i.e., not in the middle of a token). The rtinmd block acts as the output data register, adjacent to a duplicate input data register for the next block in the pipeline. This accounts for timing differences caused by the different clock generators. The signals "go" and "ngo" are based on the AND of data valid, accept and not stopped, and are used elsewhere in the state machine to indicate whether things are "bunged up" at either the input or the output.
The display index part of this module comprises the two-wire interface together with the equivalent "go" signals, as for data. The rst_fld bit also appears here. If set, this signal remains high until disp_valid has been high for one cycle; it is then reset. In addition, rst_fld is reset after a FLUSH token has caused all of the external buffers to be flagged either as EMPTY or as IN_USE as the display buffer. This is the same point at which the picture numbers and the presentation number are reset.
A small amount of additional circuitry associated with the input data register appears at the next level up the hierarchy. This circuitry produces a signal indicating that the input data register contains a value equal to that written into BU_BM_TARGIX, and it is used for event generation.
C.2.3.2.2 Index block (bm_index)
The index block consists mainly of the 2-bit registers that hold the various strategic buffer indices. These are arr_buf, the buffer to which arriving picture data is being written; disp_buf, the buffer from which picture data is being read for display; and rdy_buf, the index of the buffer containing the most up-to-date picture, which would be displayed if a buffer were requested by dispaddr. There is also a register containing buf_ix, which is used as a general pointer to a buffer. This register is either incremented (the "D" input to the multiplexer), to cycle through the buffers examining their status, or assigned the value of one of arr_buf, disp_buf or rdy_buf when a status needs to be changed. All of these registers (ph0 versions) are accessible from the microprocessor as part of the test address space. Old_ix is simply a re-timed version of buf_ix, and is used to enable the buffer status and picture number registers in the bm_stus block. Both buf_ix and old_ix are decoded into three signals (each can hold the values 1 to 3), which are output from this block. Other outputs indicate whether buf_ix has the same value as arr_buf or disp_buf, and whether either rdy_buf or disp_buf has the value zero. Zero is not a valid reference to a buffer; it merely indicates that no arrival/display/ready buffer is currently allocated.
Arr_buf and disp_buf are enabled by their respective two-wire interface output accept registers.
Additional circuitry at the bmLogic level determines whether the current buffer index (buf_ix) is equal to the maximum index in use, as defined by the value written into the control register at setup. A "1" in the control register indicates a three-buffer system, and a "0" indicates a two-buffer system.
C.2.3.2.3 Buffer status
The main components of the buffer status block are the status and picture number registers for each buffer. Each of the three groups is a master-slave arrangement in which the slaves are banks of three registers and the master is a single register whose output is directed to one of the slaves (switched using register enables, by old_ix). One of the possible inputs to the master is multiplexed between the different slave outputs (indexed by buf_ix at the bmLogic level). Buffer status is decoded at the bmLogic level for use in the state machine logic; it can take any of the values shown in Table C.2.1, or recirculate its previous value. The picture number can take its previous value, or its previous value incremented by one (or by one plus delta, the difference between the actual and expected temporal reference, in the case of H.261). This value is supplied by the 8-bit adder present in the block. The first input to this adder is this_pnum, the picture number of the data currently being written.

Table C.2.1 Buffer status values
Buffer status   Value
EMPTY           00
FULL            01
READY           10
IN_USE          11

This_pnum needs to be stored separately (in its own master-slave arrangement) so that any of the three buffer picture number registers can easily be updated from the current (or previous) picture number, rather than from its own previous picture number (which is almost always out of date). This_pnum is reset to -1 so that, when the first picture arrives, it is incremented by the adder and the value input to the first buffer picture number register is therefore zero.
Note that in the current version, delta is connected to zero, because the temporal reference block that should supply its value is absent.
C.2.3.2.4 Presentation number
The 8-bit presentation number register has an associated presentation flag, which is used in the state machine to indicate that the presentation number has changed since it was last examined. This is necessary because the picture clock is essentially asynchronous and may be active during any state, not just those concerned with the presentation number. The rest of the circuitry in this block is concerned with detecting that a picture clock pulse has occurred and "remembering" this fact. In this way, the presentation number can be updated at a time when it is valid to do so. A representative sequence of events is shown in Figure 156. The signal incr_prn becomes active in the cycle following the rising edge of the re-timed picture clock, and persists until a state is reached in which the presentation number can be modified. Such a state is indicated by the signal en_Prnum. The reason for allowing the presentation number to be updated only during certain states is that it drives a significant amount of logic, including a standard-cell, not particularly fast 8-bit adder that supplies the signal rdyst. It must therefore not change during any state whose successor state depends on this result.
C.2.3.2.5 Temporal reference
The temporal reference block, in accordance with the present invention, is omitted from the current embodiment of the image formatter, but its operation is described here for completeness.
The function of this block is to calculate delta, the difference between the temporal reference value received in a token in an H.261 data stream and the expected temporal reference (one plus the previous value). This allows for frames being skipped in H.261. Temporal reference tokens are ignored in all non-H.261 streams. The calculated value is used in the status block to calculate picture numbers for the buffers. The effect of omitting this block from bmLogic is that picture numbers are always sequential in any sequence, even if the H.261 stream indicates that some frames should be skipped.
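As a worked illustration of the delta calculation, the following minimal sketch assumes modulo-32 arithmetic for the 5-bit temporal reference and modulo-256 arithmetic for the 8-bit picture number. The function names are hypothetical, and the wrap-around behaviour is an assumption rather than something stated in the text.

```python
TR_MOD = 1 << 5    # temporal reference registers are 5 bits wide ([4:0])
PIC_MOD = 1 << 8   # picture number registers are 8 bits wide ([7:0])


def expected_tr(previous_tr):
    """Expected temporal reference: one plus the previous value."""
    return (previous_tr + 1) % TR_MOD


def temporal_delta(received_tr, previous_tr):
    """Number of skipped frames implied by the received temporal reference."""
    return (received_tr - expected_tr(previous_tr)) % TR_MOD


def next_picture_number(previous_pic_num, delta):
    """Previous picture number plus one, plus delta (delta is 0 when tied off)."""
    return (previous_pic_num + 1 + delta) % PIC_MOD


delta = temporal_delta(received_tr=6, previous_tr=3)  # frames 4 and 5 were skipped
pic = next_picture_number(10, delta)
```

With delta held at zero, as in the current embodiment, `next_picture_number` reduces to a plain increment, which is why picture numbers are then always sequential.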
The main components of this block (visible in the schematic bm_tref) are the registers tr, exptr and delta. In the present invention, tr is reset to 0 and is loaded, when appropriate, from the input data register. Similarly, exptr is reset to -1 and is incremented by either 1 or delta during the temporal reference state sequence. Delta is reset to 0 and is loaded with the difference between the other two registers. All three registers are reset after a FLUSH token. The adder in this block is used to calculate both delta and exptr, i.e., a subtract and an add operation respectively, and is controlled by the signal delta_calc.
C.2.3.2.6 Control registers (bm_uregs)
The control registers for the buffer manager reside in the block bm_uregs. These are the access bit register, the setup register (defining the maximum number of external buffers and internal/external picture clock), and the target index register. The access bit is synchronized as expected. The signals stopd_0, stopl_1 and nstopd_1 are derived from the OR of the access bit and the two event stop bits. UPI address decoding for the whole of bmLogic is done by the block bm_udec, which takes the lower 4 bits of the UPI address together with the 2 select signals from the image formatter's top-level address decode.
C.2.3.2.7 Controlling state machine
The state machine logic originally occupied its own block, bm_state. For code generation reasons, it has now been flattened and resides on sheet 2 of the bmLogic schematic.
The main sections of this logic are unchanged. They comprise the decoding, the generation of logic signals for controlling the other bmLogic blocks, and the new state encoding, including the flags from_ps and from_fl, which are used to select the route through the state machine. There are separate blocks to produce the multiplexer control signals for bm_stus and bm_index.
Signals in the state machine hardware have been given simple alphabetical names for ease of typing and reference. They are all listed in Table C.2.2, together with the logic expressions they represent. They also appear as comments in the behavioral M description of bmLogic (bmLogic.M).
Table C.2.2 Signal names used in the state machine
Signal name   Logic expression
A             ST_PRES1.presflg.(bstate==FULL).rdytst.(rdy==0).(ix==max)
B             ST_PRES1.presflg.(bstate==FULL).rdytst.(rdy==0).(ix!=max)
C             ST_PRES1.presflg.(bstate==FULL).rdytst.(rdy!=0)
D             ST_PRES1.presflg.!((bstate==FULL).rdytst).(ix==max)
E             ST_PRES1.presflg.!((bstate==FULL).rdytst).(ix!=max)
F             ST_PRES1.presflg
G             ST_DRQ.drq_valid.disp_acc.(rdy==0).(disp!=0)
PP            ST_DRQ.drq_valid.disp_acc.(rdy==0).(disp!=0).fromps
QQ            ST_DRQ.drq_valid.disp_acc.(rdy==0).(disp!=0).fromfl
RR            ST_DRQ.drq_valid.disp_acc.(rdy==0).(disp!=0).!(fromps+fromfl)
H             ST_DRQ.drq_valid.disp_acc.(rdy!=0).(disp!=0)
I             ST_DRQ.drq_valid.disp_acc.(rdy!=0).(disp==0)
J             ST_DRQ.drq_valid.disp_acc.(rdy==0).(disp==0).fromps
NN            ST_DRQ.drq_valid.disp_acc.(rdy==0).(disp==0).fromfl
OO            ST_DRQ.drq_valid.disp_acc.(rdy==0).(disp==0).!(fromps+fromfl)
K             ST_DRQ.!(drq_valid.disp_acc).fromps
LL            ST_DRQ.!(drq_valid.disp_acc).fromfl
MM            ST_DRQ.!(drq_valid.disp_acc).!(fromps+fromfl)
L             ST_TOKEN.ivr.oar.(idr==TEMPORAL_REFERENCE)
SS            ST_TOKEN.ivr.oar.(idr==TEMPORAL_REFERENCE).H261
TT            ST_TOKEN.ivr.oar.(idr==TEMPORAL_REFERENCE).!H261
M             ST_TOKEN.ivr.oar.(idr==FLUSH)
N             ST_TOKEN.ivr.oar.(idr==PICTURE_START)
O             ST_TOKEN.ivr.oar.(idr==PICTURE_END)
P             ST_TOKEN.ivr.oar.(idr==<OTHER_TOKEN>)
JJ            ST_TOKEN.ivr.oar.(idr==<OTHER_TOKEN>).in_extn
KK            ST_TOKEN.ivr.oar.(idr==<OTHER_TOKEN>).!in_extn
Q             ST_TOKEN.!(ivr.oar)
S             ST_PICTURE_END.(ix==arr).!rdytst.oar
T             ST_PICTURE_END.(ix==arr).rdytst.(rdy==0).oar
U             ST_PICTURE_END.(ix==arr).rdytst.(rdy!=0).oar
W             ST_PICTURE_END.!oar
RorVV         ST_PICTURE_END.!((ix==arr).oar)
V             ST_TEMP_REF0.ivr.oar
W             ST_TEMP_REF0.!(ivr.oar)
X             ST_OUTPUT_TAIL.ivr.oar
FF            ST_OUTPUT_TAIL.ivr.oar.!in_extn
Y             ST_OUTPUT_TAIL.!(ivr.oar)
GG            ST_OUTPUT_TAIL.!(ivr.oar).in_extn
DD            ST_FLUSH.(ix==max).((bstate==VAC)+((bstate==USE).(ix==disp)))
Z             ST_FLUSH.(ix!=max).((bstate==VAC)+((bstate==USE).(ix==disp)))
DDorEE        !((bstate==VAC)+((bstate==USE).(ix==disp))+(ix==max)
AA            ST_ALLOC.(bstate==VAC).oar
BB            ST_ALLOC.(bstate!=VAC).(ix==max)
CC            ST_ALLOC.(bstate!=VAC).(ix!=max)
UU            ST_ALLOC.!oar
C.2.3.2.8 Monitoring operation (bminfo)
The module bminfo is included, in the present invention, so that buffer status information, index values and the presentation number can be observed during simulation. It is written in M and produces an output whenever one of its inputs changes.
C.2.3.3 Register address map
The buffer manager's address space is split into two areas: user-accessible and test. There are therefore two separate enable wires, derived from the top-level decode. Table C.2.3 shows the user-accessible registers, and Table C.2.4 shows the contents of the test space.
Table C.2.3 User-accessible registers
Register name      Address   Bits    Reset state   Function
BU_BM_ACCESS       0x10      [0]     1             Access bit for the buffer manager
BU_BM_CTLO         0x11      [0]     1             Max buf lsb: 1 -> 3 buffers, 0 -> 2
                             [1]     1             External picture clock select
BU_BM_TARGET_IX    0x12      [3:0]   0x0           For detecting the arrival of a picture
BU_BM_PRES_NUM     0x13      [7:0]   0x00          Presentation number
BU_BM_THIS_PNUM    0x14      [7:0]   0xFF          Current picture number
BU_BM_PIC_NUM0     0x15      [7:0]   none          Picture number in buffer 1
BU_BM_PIC_NUM1     0x16      [7:0]   none          Picture number in buffer 2
BU_BM_PIC_NUM2     0x17      [7:0]   none          Picture number in buffer 3
BU_BM_TEMP_REF     0x18      [4:0]   0x00          Temporal reference from stream
Table C.2.4 Test registers
Register name      Address   Bits    Reset state   Function
BU_BM_PRES_FLAG    0x80      [0]     1             Presentation flag
BU_BM_EXP_TR       0x81      [4:0]   0xFF          Expected temporal reference
BU_BM_TR_DELTA     0x82      [4:0]   0x00          Delta
BU_BM_ARR_IX       0x83      [1:0]   0x0           Arrival buffer index
BU_BM_DSP_IX       0x84      [1:0]   0x0           Display buffer index
BU_BM_RDY_IX       0x85      [1:0]   0x0           Ready buffer index
BU_BM_BSTATE3      0x86      [1:0]   0x0           Buffer 3 status
BU_BM_BSTATE2      0x87      [1:0]   0x0           Buffer 2 status
BU_BM_BSTATE1      0x88      [1:0]   0x0           Buffer 1 status
BU_BM_INDEX        0x89      [1:0]   0x0           Current buffer index
BU_BM_STATE        0x8A      [4:0]   0x00          Buffer manager state
BU_BM_FROMPS       0x8B      [0]     0x0           From PICTURE_START flag
BU_BM_FROMFL       0x8C      [0]     0x0           From FLUSH_TOKEN flag
C.2.4 Operation of the state machine
There are 19 states in the buffer manager's state machine, as detailed in Table C.2.5. Their interaction is shown in Figure 157 and is also described in the behavioral description bmLogic.M.
Table C.2.5 Buffer manager states
State          Value
PRES0          0x00
PRES1          0x10
ERROR          0x1F
TEMP_REF0      0x04
TEMP_REF1      0x05
TEMP_REF2      0x06
TEMP_REF3      0x07
ALLOC          0x03
NEW_EXP_TR     0x0D
SET_ARR_IX     0x0E
NEW_PIC_NUM    0x0F
FLUSH          0x01
DRQ            0x0B
TOKEN          0x0C
OUTPUT_TAIL    0x08
VACATE_RDY     0x17
USE_RDY        0x0A
VACATE_DISP    0x09
PICTURE_END    0x02
C.2.4.1 Reset state
The reset state is PRES0, with the flags set to zero so that the main loop is circulated initially.
C.2.4.2 Main loop
The main loop of the state machine comprises the states shown in Figure 153 (highlighted in the main diagram, Figure 152). States PRES0 and PRES1 are concerned with detecting a picture clock via the signal Presflg. Two cycles are allowed for the tests involved, since they all depend on the value of rdyst and on the adder output described in C.2.3.2.4. If the presentation flag is detected, all of the buffers are examined for possible "readiness"; otherwise the state machine simply advances to state DRQ. Each cycle around the PRES0-PRES1 loop examines a different buffer, checking for the full and ready conditions. If these are met, the previous ready buffer (if one exists) is cleared, the new ready buffer is allocated, and its status is updated. This process is repeated until all of the buffers have been examined (index == max buf), and the state then advances. A buffer is deemed to be ready for display when any of the following is true:
(pic_num > pres_num) && ((pic_num - pres_num) >= 128)
or (pic_num < pres_num) && ((pres_num - pic_num) <= 128)
or (pic_num == pres_num).
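The three conditions amount to a "not later than" comparison on 8-bit counters that wrap around. A minimal sketch, transcribing the conditions directly (the function name is hypothetical):

```python
def is_ready(pic_num, pres_num):
    """True when an 8-bit picture number is due for display.

    Both arguments are 8-bit values (0..255). A picture whose number is
    numerically larger than the presentation number is still treated as
    due if the gap is at least half the counter range, i.e. the
    presentation counter has wrapped past it.
    """
    if pic_num > pres_num:
        return (pic_num - pres_num) >= 128
    if pic_num < pres_num:
        return (pres_num - pic_num) <= 128
    return True  # pic_num == pres_num


due_after_wrap = is_ready(250, 2)  # presentation counter wrapped from 255 to 0
not_yet_due = is_ready(6, 5)       # picture 6 is not due until presentation 6
```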
State DRQ checks for a request for a display buffer (drq_valid_reg && disp_acc_reg). If there is no request, the state advances (normally to state TOKEN, as described below). Otherwise, a display buffer index is issued as follows. If there is no ready buffer, the previous index is re-issued or, if there is no previous display buffer, a null index (zero) is issued. If a buffer is ready for display, its index is issued and its state is updated. If necessary, the previous display buffer is cleared. The state machine then advances as before.
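The choice made in state DRQ can be modelled as below. This is a behavioral sketch only: the dictionary representation and function name are assumptions, and the separate USE_RDY and VACATE_DISP states of the hardware are collapsed into a single step.

```python
EMPTY, FULL, READY, IN_USE = 0b00, 0b01, 0b10, 0b11  # encodings as in Table C.2.1


def issue_display_index(status, rdy_buf, disp_buf):
    """Answer one display buffer request.

    status maps buffer index (1..3) to its status; index 0 means "none".
    Returns (issued_index, new_rdy_buf, new_disp_buf).
    """
    if rdy_buf == 0:
        # No ready buffer: repeat the previous display buffer, or issue the
        # null index if nothing has been displayed yet.
        return disp_buf, rdy_buf, disp_buf
    if disp_buf != 0:
        status[disp_buf] = EMPTY   # clear the old display buffer
    status[rdy_buf] = IN_USE       # the ready buffer becomes the display buffer
    return rdy_buf, 0, rdy_buf


status = {1: IN_USE, 2: READY, 3: EMPTY}
issued, rdy_buf, disp_buf = issue_display_index(status, rdy_buf=2, disp_buf=1)
```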
State TOKEN is the typical option for completing the main loop. If there is valid input and the output is not stalled, tokens are examined for strategic values (described in later sections); otherwise control returns to state PRES0.
Control diverges from the main loop only when certain conditions are met. These are described in the following sections.
C.2.4.3 Allocating the ready buffer index
If, during the PRES0-PRES1 loop, a buffer is determined to be ready, any previous ready buffer needs to be vacated, because only one buffer can be designated ready at any time. State VACATE_RDY clears the old ready buffer by setting its state to VACANT, and resets the buffer index to 1 so that, when control returns to state PRES0, all buffers are tested for readiness. The reason for this is that the index is by now pointing at the previous ready buffer (for the purpose of clearing it), and there is no record of the intended new ready buffer index. It is therefore necessary to re-test all of the buffers.
C.2.4.4 Allocating the display buffer index
Allocation of the display buffer index takes place either directly from state DRQ (state USE_RDY) or via state VACATE_DISP, which clears the old display buffer state. The chosen display buffer is flagged as IN_USE, the value of rdy_buf is set to zero, and the index is reset to 1 to return to state DRQ. In addition, disp_buf is given the required index, and the two-wire interface wires (disp_valid, drq_acc) are controlled accordingly. Control returns to state DRQ only so that the decision between states TOKEN, FLUSH and ALLOC does not need to be made in state USE_RDY.
C.2.4.5 Operation when PICTURE_END is received
On receipt of a PICTURE_END token, control transfers from state TOKEN to state PICTURE_END. If the index is not already pointing at the current arrival buffer, it is set to point there so that the buffer's status can be updated. Provided both out_acc_reg and en_full are true, the status is updated as described below; otherwise control remains in state PICTURE_END until they are both true. The en_full signal is supplied by the write address generator to indicate that the swing buffer has swung, i.e., the last block has been written successfully and it is therefore safe to update the buffer status.
The just-completed buffer is tested for readiness and is given the status FULL or READY, depending on the result of the test. If it is ready, rdy_buf is given the value of its index, and the set_la_ev signal (late arrival event) is set high (indicating that the expected display has got ahead of the decoding in time). The new value of arr_buf now becomes zero. If the previous ready buffer needs its status cleared, the index is set to point there and control moves to state VACATE_RDY. Otherwise, the index is reset to 1 and control returns to the start of the main loop.
C.2.4.6 Operation when PICTURE_START is received (allocation of the arrival buffer)
When a PICTURE_START token arrives during state TOKEN, the flag from_ps is set, causing the basic state machine loop to change so that state ALLOC is visited instead of state TOKEN. State ALLOC is concerned with allocating an arrival buffer (into which the arriving picture data can be written), and cycles through the buffers until it finds one whose status is VACANT. A buffer is allocated only if out_acc_reg is high, because the index is output on the data two-wire interface. Accordingly, cycling around the loop continues until this is the case. Once a suitable arrival buffer has been found, its index is assigned to arr_buf and its status is flagged as IN_USE. The index is set to 1, the flag from_ps is reset, and the state is set to advance to NEW_EXP_TR. A check is made on the picture's index (contained in the word immediately following the PICTURE_START) to determine whether it is the same as targ_ix (the target index specified at setup); if so, set_if_ev (index found event) is set high.
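The search performed in state ALLOC can be sketched as a single pass over the buffers. The names and the dictionary representation are assumptions; the hardware keeps cycling until a buffer becomes vacant and the output accepts, whereas this model simply returns the null index so that the caller can retry.

```python
EMPTY, FULL, READY, IN_USE = 0b00, 0b01, 0b10, 0b11  # encodings as in Table C.2.1


def allocate_arrival(status, max_ix, out_accept=True):
    """Pick the first vacant buffer as the arrival buffer.

    status maps buffer index (1..max_ix) to its status. Returns the
    allocated index, or 0 if no buffer is vacant or the output two-wire
    interface is not accepting (the caller is expected to try again).
    """
    if not out_accept:
        return 0
    for ix in range(1, max_ix + 1):
        if status[ix] == EMPTY:
            status[ix] = IN_USE
            return ix
    return 0


status = {1: IN_USE, 2: READY, 3: EMPTY}
arr_buf = allocate_arrival(status, max_ix=3)
```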
The three states NEW_EXP_TR, SET_ARR_IX and NEW_PIC_NUM set up the new expected temporal reference and the picture number for the incoming data. The middle state simply sets the index to arr_buf so that the correct picture number register is updated (note that this_pnum is also updated). Control then proceeds to state OUTPUT_TAIL, which outputs data (assuming favorable two-wire interface signals) until a low extension is encountered. At this point the main loop is restarted. This means that whole data blocks (64 items) are output, within which there are no tests for the presentation flag or for display requests.
C.2.4.7 Operation when FLUSH is received
A FLUSH token in the data stream indicates that sequence information (presentation number, picture number, rst_fld) should be reset. This can happen only when all of the data leading up to the FLUSH has been correctly processed. Accordingly, once a FLUSH has been received, it is necessary to monitor the status of all of the buffers until it is certain that all frames have been handed over to the display, i.e., all but one of the buffers have status EMPTY and the other is IN_USE (as the display buffer). At that point, a "new sequence" can safely be started.
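The completion condition for a flush can be written as a single predicate. This is a minimal sketch with hypothetical names; it checks every buffer at once, whereas the state machine visits the buffers one at a time.

```python
EMPTY, FULL, READY, IN_USE = 0b00, 0b01, 0b10, 0b11  # encodings as in Table C.2.1


def flush_complete(status, disp_buf):
    """True when every buffer is EMPTY or is IN_USE as the display buffer.

    status maps buffer index to status; disp_buf is the display buffer index.
    """
    return all(
        s == EMPTY or (s == IN_USE and ix == disp_buf)
        for ix, s in status.items()
    )


done = flush_complete({1: EMPTY, 2: IN_USE, 3: EMPTY}, disp_buf=2)
pending = flush_complete({1: READY, 2: IN_USE, 3: EMPTY}, disp_buf=2)
```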
When a FLUSH token is detected in state TOKEN, the flag from_fl is set, causing the basic state machine loop to change so that state FLUSH is visited instead of state TOKEN. State FLUSH examines the status of each buffer in turn, waiting for it to become VACANT or IN_USE as display. The state machine simply cycles around the loop until the condition is true, then increments its index and repeats the process until all of the buffers have been visited. When the last buffer fulfills the condition, the presentation number, picture numbers and all of the temporal reference registers assume their reset values, and rst_fld is set to 1. The flag from_fl is reset and normal main loop operation resumes.
C.2.4.8 Operation when TEMPORAL_REFERENCE is received
When a TEMPORAL_REFERENCE token is encountered, a check is made on the H.261 bit and, if it is set, the four states TEMP_REF0 to TEMP_REF3 are visited. These perform the following operations:
TEMP_REF0: temp_ref = in_data_reg;
TEMP_REF1: delta = temp_ref - exp_tr; index = arr_buf;
TEMP_REF2: exp_tr = delta + exp_tr;
TEMP_REF3: pic_num[i] = this_pnum + delta; index = 1.
C.2.4.9 Other tokens and tails
State TOKEN passes control to state OUTPUT_TAIL in all cases other than those outlined above. Control remains there until the last word of the token is encountered (in_extn_reg is low), and the main loop is then re-entered.
C.2.5 Application notes
C.2.5.1 State machine stalling the buffer manager input
This needs are repeatedly done asynchronous inspection to the asynchronous timed events request of visual clock, and display buffer. The requirement that stops the buffer-manager input during these check means when providing data to continuously the input of buffer-manager, and the data transfer rate by buffer-manager will be restricted. A typical state sequence can be PRES0, PRES1, DRQ, TOKEN, OUTPUT_TALL, and each except OUTPUT_TALL continues one-period. This means the expense that 3 cycles will be arranged the piece of each 64 data item. Input therebetween is stopped (during state PRSE0, PRSE1 and DRQ). Therefore, make writing speed slow down 3/64, inother words 5%. When the secondary branch of state machine was carried out under worst case, this number rose to the expense in 13 cycles sometimes. Should be noted that only applicable every frame situation once of so large expense. C.2.5.2 the feature that presents number in during the visit
The particular implementation of bm_pres illustrated by the schematic shown in C.2.3.2.4 means that the presentation number free-runs during upi accesses. If the presentation number must be the same when access is relinquished as it was when access was gained, this can be achieved by reading the presentation number just after access is granted and writing it back just before access is relinquished. Note that this is asynchronous, so it may be necessary to repeat the accesses several times to make sure they are effective.
C.2.5.3 H.261 temporal reference numbers
Module bm_tref (not shown) should be included in bmLogic, and H.261 temporal reference values can then be handled correctly by feeding delta directly from bm_tref into the bm_stus module. If frames always arrive in sequence, the delta input can be held at zero.
C.3 Write address generation
C.3.1 Introduction
The function of the write address generation hardware of the present invention is to produce block addresses for data that is to be written back into the buffers. This takes account of the buffer base address, the component indicated in the stream, the horizontal and vertical sampling within the macroblock, the picture dimensions and the coding standard. Data arrives in macroblock form but must be stored so that lines can easily be retrieved for display.
C.3.2 Functional overview
Each time a new block arrives in the data stream, indicated by a DATA token, the write address generator must produce a new block address. The address does not have to be produced for the DRAM interface immediately, because up to 64 data words can be stored (in the swing buffer) before the address is actually needed. This means that the various terms of the address can be added to a running total in successive cycles, so no hardware multiplier is required. The macroblock counter function is implemented by storing the critical terminal values and the running counts in the register file. These operands are compared and conditionally updated after each block address calculation.
Considering the picture format shown in Figure 161, the expected address sequences for a standard data stream and for an H.261-like stream are shown below. Note that this format does not actually conform to the H.261 specification, because the slices are not wide enough (3 macroblocks rather than 11). The same "half picture width" is used here for convenience, and the sequence is assumed to be "H.261-type". In the example, data arrives as whole 4:2:0 macroblocks, and each component is stored in its own designated area of the buffer.
Standard address sequence:
          000,001,00C,00D,100,200;
          002,003,00E,00F,101,201;
          004,005,010,011,102,202;
          006,007,012,013,103,203;
          008,009,014,015,104,204;
          00A,00B,016,017,105,205;
          018,019,024,025,106,206;
          01A,01B,026......
          ......
          080,081,08C,08D,122,222;
          082,083,08E,08F,123,223;
H.261-type sequence:
          000,001,00C,00D,100,200;
          002,003,00E,00F,101,201;
          004,005,010,011,102,202;
          018,019,024,025,106,206;
          01A,01B,026,027,107,207;
          01C,01D,028,029,108,208;
          030,031,03C,03D,10C,20C;
          032,033,03E,03F,10D,20D;
          034,035,040,041,10E,20E;
          006,007,012,013,103,203;
          008,009,014,015,104,204;
          00A,00B,016,017,105,205;
          01E,01F,02A,02B,109,209;
          020,021,02C,02D,10A,20A;
          022,023,02E,02F,10B,20B;
          036,037,042,043,10F,20F;
          038,039,044,045,110,210;
          03A,03B,046,047,111,211;
          048,049,054,055,112,212;
          04A,04B,056......
          ......
          06A,06B,076,077,11D,21D;
          07E,07F,08A,08B,121,221;
          080,081,08C,08D,122,222;
          082,083,08E,08F,123,223;
C.3.3 Structure
C.3.3.1 Interfaces
C.3.3.1.1 Interface to the buffer manager
The buffer manager outputs data and the buffer index directly to the write address generator, under the control of a two-wire interface. In some ways the write address generator can be regarded as an extension of the buffer manager block, because the two are very closely linked. They do, however, operate from two separate (but similar) clock generators.
C.3.3.1.2 Interface to dramif
The write address generator provides data and addresses for the DRAM interface. Each of these has its own two-wire interface, and dramif uses each of them in a different clock regime. In particular, the address is clocked into dramif by a clock that is unrelated to the write address generator clock, so it is synchronized at the output.
C.3.3.1.3 Microprocessor interface (MPI)
The write address generator uses three bits of microprocessor address space, together with the 8-bit data bus and the read and write strobes. There is a single select bit for register access.
C.3.3.1.4 Events
The write address generator can generate five different events. Two correspond to picture size information appearing in the data stream (hmbs and vmbs), and three correspond to DEFINE_SAMPLING tokens (one event for each component).
C.3.3.2 Basic structure
The structure of the write address generator is shown in the schematic waddrgen.sch. It comprises a datapath, some controlling logic, and snoopers with their synchronization.
C.3.3.2.1 Datapath (bwadpath)
The datapath is of the type described in Section C.5 of this document. It comprises an 18-bit adder/subtracter and a register file (see C.3.3.4), and produces a zero flag (based on the adder output) for use in the control logic.
C.3.3.2.2 Controlling logic
The controlling logic of the present invention consists of the hardware that generates all of the register file load and drive signals, the adder control signals and the two-wire interface signals. It also includes the writable control registers.
C.3.3.2.3 Snoopers and synchronization
There are super snoopers on both the data and address ports. The snoopers in the datapath are controlled as super snoopers from the zcells. The address is synchronized between the write address generator clock and the dramif "clk" regime. Syncifs are used in the zcells for the two-wire interface signals, and simplified synchronizers are used for the address, which passes through the datapath.
C.3.3.3 Controlling logic and state machine
C.3.3.3.1 Input/output block (wa_inout)
This block contains the input two-wire interface and the two output two-wire interfaces, together with latches for the input data (for token decoding) and for the arriving buffer index (which is decoded four ways).
C.3.3.3.2 Two-cycle control block (wa_fc)
The flag fc (first cycle) is maintained here. It indicates whether the state machine is in the middle of a two-cycle operation (i.e., an operation involving an addition).
C.3.3.3.3 Component count (wa_comp)
A separate address is required for the data blocks of each component. This block maintains the current component, based on the type of DATA header received in the input stream.
C.3.3.3.4 Modulo-3 control (wa_mod3)
When generating address sequences for H.261 data streams, it is necessary to count three rows of macroblocks half way across the screen (see C.3.2). This is done by maintaining a modulo-3 counter, which is incremented each time a new row of macroblocks is accessed.
C.3.3.3.5 Control registers (wa_uregs)
Module wa_uregs contains the setup register and the coding standard register; the latter is loaded from the data stream. The setup register uses 3 bits: QCIF (lsb) and the maximum number of components expected in the data stream (bits 1 and 2). The access bit also resides in this block (synchronized as usual). The stop signal is derived at the next level up (waLogic) as the OR of the access bit and the event stop bits. Microprocessor address decoding is done by block wa_udec, which takes the read and write strobes, the select line and the lower two bits of the address bus.
C.3.3.3.6 Controlling state machine (wa_state)
The logic in this block is divided into several distinct areas: state decoding, new-state encoding, derivation of intermediate logic signals, datapath control signals (drive a, drive b, load, adder control and select signals), multiplexer control, two-wire interface control, and the five event signals.
C.3.3.3.7 Event generation
The five event bits are generated as a result of certain tokens arriving at the input. It is important that, in each case, the whole token is received before any event is generated, because the event service routines perform calculations based on the newly received values. For this reason, each bit is delayed by a whole cycle before being input to the event hardware.
C.3.3.4 Register address map
There are two sets of registers in the write address generator block: the top-level setup-type registers, located in the standard cell section, and the keyhole datapath registers. They are listed in Tables C.3.1 and C.3.2, respectively.
Table C.3.1 Top-level registers
Register name      Address   Bits   Reset state   Function
BU_WADDR_COD_STD   0x4       2      0             Coding standard from the data stream
BU_WADDR_ACCESS    0x5       1      0             Access bit
BU_WADDR_CTL1      0x6       3      0             Maximum components [2:1] and QCIF [0]
BU_WA_ADDR_SNP2    0xB0      8                    Snooper on the write address generator address output
BU_WA_ADDR_SNP1    0xB1      8
BU_WA_ADDR_SNP0    0xB2      8
BU_WA_DATA_SNP1    0xB4      8                    Snooper on the write address generator data output
BU_WA_DATA_SNP0    0xB5      8
Table C.3.2 Image formatter address generator keyhole registers
Keyhole register name         Keyhole address   Bits   Comments
BU_WADDR_BUFFER0_BASE_MSB     0x85              2      Must be loaded
BU_WADDR_BUFFER0_BASE_MID     0x86              8
BU_WADDR_BUFFER0_BASE_LSB     0x87              8
BU_WADDR_BUFFER1_BASE_MSB     0x89              2      Must be loaded
BU_WADDR_BUFFER1_BASE_MID     0x8a              8
BU_WADDR_BUFFER1_BASE_LSB     0x8b              8
BU_WADDR_BUFFER2_BASE_MSB     0x8d              2      Must be loaded
BU_WADDR_BUFFER2_BASE_MID     0x8e              8
BU_WADDR_BUFFER2_BASE_LSB     0x8f              8
BU_WADDR_COMP0_HMBADDR_MSB    0x91              2      Test only
BU_WADDR_COMP0_HMBADDR_MID    0x92              8
BU_WADDR_COMP0_HMBADDR_LSB    0x93              8
BU_WADDR_COMP1_HMBADDR_MSB    0x95              2      Test only
BU_WADDR_COMP1_HMBADDR_MID    0x96              8
BU_WADDR_COMP1_HMBADDR_LSB    0x97              8
BU_WADDR_COMP2_HMBADDR_MSB    0x99              2      Test only
BU_WADDR_COMP2_HMBADDR_MID    0x9a              8
BU_WADDR_COMP2_HMBADDR_LSB    0x9b              8
BU_WADDR_COMP0_VMBADDR_MSB    0x9d              2      Test only
BU_WADDR_COMP0_VMBADDR_MID    0x9e              8
BU_WADDR_COMP0_VMBADDR_LSB    0x9f              8
BU_WADDR_COMP1_VMBADDR_MSB    0xa1              2      Test only
BU_WADDR_COMP1_VMBADDR_MID    0xa2              8
BU_WADDR_COMP1_VMBADDR_LSB    0xa3              8
BU_WADDR_COMP2_VMBADDR_MSB    0xa5              2      Test only
BU_WADDR_COMP2_VMBADDR_MID    0xa6              8
BU_WADDR_COMP2_VMBADDR_LSB    0xa7              8
BU_WADDR_VBADDR_MSB           0xa9              2      Test only
BU_WADDR_VBADDR_MID           0xaa              8
BU_WADDR_VBADDR_LSB           0xab              8
Table C.3.2 Image formatter address generator keyhole registers (continued)
Keyhole register name                      Keyhole address   Bits   Comments
BU_WADDR_COMP0_HALF_WIDTH_IN_BLOCKS_MSB    0xad              2      Must be loaded
BU_WADDR_COMP0_HALF_WIDTH_IN_BLOCKS_MID    0xae              8
BU_WADDR_COMP0_HALF_WIDTH_IN_BLOCKS_LSB    0xaf              8
BU_WADDR_COMP1_HALF_WIDTH_IN_BLOCKS_MSB    0xb1              2      Must be loaded
BU_WADDR_COMP1_HALF_WIDTH_IN_BLOCKS_MID    0xb2              8
BU_WADDR_COMP1_HALF_WIDTH_IN_BLOCKS_LSB    0xb3              8
BU_WADDR_COMP2_HALF_WIDTH_IN_BLOCKS_MSB    0xb5              2      Must be loaded
BU_WADDR_COMP2_HALF_WIDTH_IN_BLOCKS_MID    0xb6              8
BU_WADDR_COMP2_HALF_WIDTH_IN_BLOCKS_LSB    0xb7              8
BU_WADDR_HB_MSB                            0xb9              2      Test only
BU_WADDR_HB_MID                            0xba              8
BU_WADDR_HB_LSB                            0xbb              8
BU_WADDR_COMP0_OFFSET_MSB                  0xbd              2      Must be loaded
BU_WADDR_COMP0_OFFSET_MID                  0xbe              8
BU_WADDR_COMP0_OFFSET_LSB                  0xbf              8
BU_WADDR_COMP1_OFFSET_MSB                  0xc1              2      Must be loaded
BU_WADDR_COMP1_OFFSET_MID                  0xc2              8
BU_WADDR_COMP1_OFFSET_LSB                  0xc3              8
BU_WADDR_COMP2_OFFSET_MSB                  0xc5              2      Must be loaded
BU_WADDR_COMP2_OFFSET_MID                  0xc6              8
BU_WADDR_COMP2_OFFSET_LSB                  0xc7              8
BU_WADDR_SCRATCH_MSB                       0xc9              2      Test only
BU_WADDR_SCRATCH_MID                       0xca              8
BU_WADDR_SCRATCH_LSB                       0xcb              8
BU_WADDR_MBS_WIDE_MSB                      0xcd              2      Must be loaded
BU_WADDR_MBS_WIDE_MID                      0xce              8
BU_WADDR_MBS_WIDE_LSB                      0xcf              8
BU_WADDR_MBS_HIGH_MSB                      0xd1              2      Must be loaded
BU_WADDR_MBS_HIGH_MID                      0xd2              8
BU_WADDR_MBS_HIGH_LSB                      0xd3              8
Table C.3.2 Image formatter address generator keyhole registers (continued)
Keyhole register name             Keyhole address   Bits   Comments
BU_WADDR_COMP0_LAST_MB_ROW_MSB    0x105             2      Must be loaded
BU_WADDR_COMP0_LAST_MB_ROW_MID    0x106             8
BU_WADDR_COMP0_LAST_MB_ROW_LSB    0x107             8
BU_WADDR_COMP1_LAST_MB_ROW_MSB    0x109             2      Must be loaded
BU_WADDR_COMP1_LAST_MB_ROW_MID    0x10a             8
BU_WADDR_COMP1_LAST_MB_ROW_LSB    0x10b             8
BU_WADDR_COMP2_LAST_MB_ROW_MSB    0x10d             2      Must be loaded
BU_WADDR_COMP2_LAST_MB_ROW_MID    0x10e             8
BU_WADDR_COMP2_LAST_MB_ROW_LSB    0x10f             8
BU_WADDR_COMP0_HBS_MSB            0x111             2      Must be loaded
BU_WADDR_COMP0_HBS_MID            0x112             8
BU_WADDR_COMP0_HBS_LSB            0x113             8
BU_WADDR_COMP1_HBS_MSB            0x115             2      Must be loaded
BU_WADDR_COMP1_HBS_MID            0x116             8
BU_WADDR_COMP1_HBS_LSB            0x117             8
BU_WADDR_COMP2_HBS_MSB            0x119             2      Must be loaded
BU_WADDR_COMP2_HBS_MID            0x11a             8
BU_WADDR_COMP2_HBS_LSB            0x11b             8
BU_WADDR_COMP0_MAXHB              0x11f             2      Must be loaded
BU_WADDR_COMP1_MAXHB              0x123             2
BU_WADDR_COMP2_MAXHB              0x127             2
BU_WADDR_COMP0_MAXVB              0x12b             2      Must be loaded
BU_WADDR_COMP1_MAXVB              0x12f             2
BU_WADDR_COMP2_MAXVB              0x133             2
The keyhole registers fall broadly into two categories: those that must be loaded with picture size parameters before any address calculation can take place, and those that hold the running totals of the various (horizontal and vertical) block and macroblock counts. The picture size parameters can be loaded in response to any of the interrupts generated by the write address generator, i.e., when any picture size or sampling token appears in the data stream. Alternatively, if the picture size is known before the data stream is received, they can be written just after reset. Example setups are given in Section C.13, and the picture size parameter registers are defined in the next section.
C.3.4 Programming the write address generator
The following datapath registers must contain the correct picture size information before address calculation can proceed. They are illustrated in Figure 162.
1. WADDR_HALF_WIDTH_IN_BLOCKS: defines half the width of the incoming picture, in blocks.
2. WADDR_MBS_WIDE: defines the width of the incoming picture, in macroblocks.
3. WADDR_MBS_HIGH: defines the height of the incoming picture, in macroblocks.
4. WADDR_LAST_MB_IN_ROW: defines the block number of the top left-hand block of the last macroblock in a single, full-width row of macroblocks. Block numbering starts at zero in the top left-hand corner of the leftmost macroblock, increases by one for each block across the frame, and continues with the next row of blocks within the macroblock row.
5. WADDR_LAST_MB_IN_HALF_ROW: similar to the previous item, but defines the block number of the top left-hand block of the last macroblock in a half-width row of macroblocks.
6. WADDR_LAST_ROW_IN_MB: defines the block number of the leftmost block in the last row of blocks within a row of macroblocks.
7. WADDR_BLOCKS_PER_MB_ROW: defines the total number of blocks contained in a single, full-width row of macroblocks.
8. WADDR_LAST_MB_ROW: defines the address of the top left-hand block of the leftmost macroblock in the last row of macroblocks in the picture.
9. WADDR_HBS: defines the width of the incoming picture, in blocks.
10. WADDR_MAXHB: defines the block number of the rightmost block in a row of blocks within a single macroblock.
11. WADDR_MAXVB: defines the height of a single macroblock in blocks, minus 1.
In addition, the registers that define the organization of the DRAM must be programmed. These are the three buffer base registers and the n component offset registers, where n is the number of components expected in the data stream (it can be defined in the data stream, and is at least 1 and at most 3).
Note that many of the parameters specify block numbers or block addresses. This is because the final address is expected to be a block address, and the calculation is based on a cumulative algorithm.
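The register definitions above can be sketched as a small derivation from the picture size. This is an illustrative Python sketch rather than the patent's own procedure: the function name and the formulas are assumptions inferred from the definitions, and they are checked only against the values quoted for Figure 162 (a picture 0x16 macroblocks wide and 0x12 high, with macroblocks of 2 by 2 blocks).

```python
def picture_size_registers(mbs_wide, mbs_high, maxhb=1, maxvb=1):
    """Derive the WADDR_* picture-size values for one component.

    maxhb and maxvb are the macroblock width and height in blocks, minus 1.
    The formulas are inferred from the register definitions (an assumption).
    """
    mb_width = maxhb + 1                  # blocks across one macroblock
    mb_height = maxvb + 1                 # blocks down one macroblock
    hbs = mbs_wide * mb_width             # picture width in blocks
    half_width = hbs // 2
    blocks_per_mb_row = hbs * mb_height
    return {
        "WADDR_HALF_WIDTH_IN_BLOCKS": half_width,
        "WADDR_MBS_WIDE": mbs_wide,
        "WADDR_MBS_HIGH": mbs_high,
        "WADDR_LAST_MB_IN_ROW": hbs - mb_width,
        "WADDR_LAST_MB_IN_HALF_ROW": half_width - mb_width,
        "WADDR_LAST_ROW_IN_MB": hbs * maxvb,
        "WADDR_BLOCKS_PER_MB_ROW": blocks_per_mb_row,
        "WADDR_LAST_MB_ROW": blocks_per_mb_row * (mbs_high - 1),
        "WADDR_HBS": hbs,
        "WADDR_MAXHB": maxhb,
        "WADDR_MAXVB": maxvb,
    }


if __name__ == "__main__":
    for name, value in picture_size_registers(0x16, 0x12).items():
        print(f"{name} = {value:#x}")
```

Every value is a block count or block address, so the hardware only ever needs to add these terms together, in keeping with the cumulative algorithm noted above.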
The layout shown in Figure 162 produces the following register values:
         1) WADDR_HALF_WIDTH_IN_BLOCKS=0x16
         2) WADDR_MBS_WIDE=0x16
         3) WADDR_MBS_HIGH=0x12
         4) WADDR_LAST_MB_IN_ROW=0x2A
         5) WADDR_LAST_MB_IN_HALF_ROW=0x14
         6) WADDR_LAST_ROW_IN_MB=0x2C
         7) WADDR_BLOCKS_PER_MB_ROW=0x58
         8) WADDR_LAST_MB_ROW=0x5D8
         9) WADDR_HBS=0x2C
         10) WADDR_MAXVB=1
         11) WADDR_MAXHB=1
C.3.5 Operation of the state machine
There are 19 states in the buffer manager's state machine, as detailed in Table C.3.3. They interact as shown in Figure 164 and as described in the behavioral description bmlogic.M.
Table C.3.3 Write address generator states
 State     Value
 IDLE     0x00
 DATA     0x10
 CODING_STANDARD     0x0C
 HORZ_MBS0     0x07
 HORZ_MBS1     0x06
 VERT_MBS0     0x0B
 VERT_MBS1     0x0A
 OUTPUT_TAIL     0x08
 HB     0x11
 MB0     0x1D
 MB1     0x12
 MB2     0x15
 MB3     0x13
 MB4     0x05
 MB5     0x14
 MB6     0x15
 MB4A     0x18
 MB4B     0x09
 MB4C     0x17
 MB4D     0x16
 ADDR1     0x19
 ADDR2     0x1A
 ADDR3     0x1B
 ADDR4     0x1C
 ADDR5     0x03
 HSAMP     0x05
 VSAMP     0x04
 PIC_ST1     0x0f
 PIC_ST2     0x01
 PIC_ST3     0x02
C.3.5.1 Calculation of the address
The main part of the write address generator state machine is illustrated down the left-hand side of Figure 164. On receipt of a DATA token, the state machine moves from state IDLE to state ADDR1 and then on through to state ADDR5, from which the 18-bit block address is output under two-wire interface control. The calculations carried out by states ADDR1 through ADDR5 are:
        BU_WADDR_SCRATCH = BU_BUFFERn_BASE + BU_COMPm_OFFSET;
        BU_WADDR_SCRATCH = BU_WADDR_SCRATCH + BU_WADDR_VMBADDR;
        BU_WADDR_SCRATCH = BU_WADDR_SCRATCH + BU_WADDR_HMBADDR;
        BU_WADDR_SCRATCH = BU_WADDR_SCRATCH + BU_WADDR_VBADDR;
        out_addr = BU_WADDR_SCRATCH + BU_WADDR_HB;
The registers used are defined as follows:
1. BU_WADDR_VMBADDR: the block address of (the top left-hand block of) the leftmost macroblock in the row of macroblocks that contains the block whose address is being calculated.
2. BU_WADDR_HMBADDR: the block address of (the top left-hand block of) the top macroblock in the column of macroblocks that contains the block whose address is being calculated.
3. BU_WADDR_VBADDR: the block address, within the macroblock row, of the leftmost block in the row of blocks that contains the block whose address is being calculated.
4. BU_WADDR_HB: the horizontal block number, within the macroblock, of the block whose address is being calculated.
5. BU_WADDR_SCRATCH: a scratch register used for temporary storage of intermediate results.
Considering Figure 163, and taking as an example the calculation of the block at address 0x62D, the following sequence of calculations takes place:
SCRATCH = BUFFERn_BASE + COMPm_OFFSET; (assumed to be 0)
SCRATCH = 0 + 0x5D8;
SCRATCH = 0x5D8 + 0x28;
SCRATCH = 0x600 + 0x2C;
Block address = 0x62C + 1 = 0x62D;
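The same accumulation can be written as a short Python sketch. The function name and argument order are illustrative, and the 18-bit wrap-around applied after each addition is an assumption based on the stated 18-bit adder width rather than something the text specifies. The step labels simply follow the order of the five calculations attributed to states ADDR1 through ADDR5.

```python
ADDR_MASK = (1 << 18) - 1  # block addresses are 18 bits wide


def block_address(buffer_base, comp_offset, vmb_addr, hmb_addr, vb_addr, hb):
    """Accumulate the block address one term at a time, without a multiplier."""
    scratch = (buffer_base + comp_offset) & ADDR_MASK  # step 1: base + component offset
    scratch = (scratch + vmb_addr) & ADDR_MASK         # step 2: add macroblock-row address
    scratch = (scratch + hmb_addr) & ADDR_MASK         # step 3: add macroblock-column address
    scratch = (scratch + vb_addr) & ADDR_MASK          # step 4: add block-row address
    return (scratch + hb) & ADDR_MASK                  # step 5: add horizontal block number


if __name__ == "__main__":
    # Values from the worked example: base + offset assumed to be 0.
    print(hex(block_address(0, 0, 0x5D8, 0x28, 0x2C, 1)))
```

Because every operand is already a block address or a block count, five additions are enough to locate any block in the buffer.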
The contents of the various registers are illustrated in the figure.
C.3.5.2 Calculation of new screen location parameters
When the address has been output, the state machine continues with calculations to update the various screen location parameters described above. States HB and MB0 through MB6 perform these calculations, transferring control at some point to state DATA, from which the remainder of the DATA token is output.
These states are taken in pairs. The first of each pair calculates the difference between the current count and its terminal value, and so generates a zero flag. The second either resets the register or adds a fixed offset (derived from the values in the setup registers, which in turn come from the screen size). In each case, if the count has reached its terminal value (i.e., the zero flag is set), control continues down the "MB" sequence of states. Otherwise, all of the counts are deemed correct (ready for the next address calculation) and control passes to state DATA.
Note that all states involving an addition or subtraction take two cycles to complete (which allows a standard ripple-carry adder to be used). This is achieved by means of the flag fc (first cycle), which alternates between 1 and 0 in adder-based states.
All of the address calculation and screen location calculation states allow data to be output, provided the two-wire interface conditions are suitable.
C.3.5.2.1 Calculation for standard (MPEG-type) sequences
The sequence of operations is as follows (where the zero flag z is based on the adder output):
States HB and MB0:
    scratch = hb - maxhb;
    if (z)
        hb = 0;
    else
    {
        hb = hb + 1;
        new_state = DATA;
    }
States MB1 and MB2:
    scratch = vb_addr - last_row_in_mb;
    if (z)
        vb_addr = 0;
    else
    {
        vb_addr = vb_addr + width_in_blocks;
        new_state = DATA;
    }
States MB3 and MB4:
    scratch = hmb_addr - last_mb_in_row;
    if (z)
        hmb_addr = 0;
    else
    {
        hmb_addr = hmb_addr + maxhb;
        new_state = DATA;
    }
States MB5 and MB6:
    scratch = vmb_addr - last_mb_row;
    if (!z)
        vmb_addr = vmb_addr + blocks_per_mb_row;
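As a cross-check, the paired compare-and-update states can be simulated in a few lines of Python. This is a sketch under several assumptions that are not stated in the text: the example picture of Figure 161 is taken to be 6 by 6 macroblocks with component areas at 0x000, 0x100 and 0x200 (inferred from the listed addresses), the terminal values are held per component, and the horizontal macroblock step is taken as maxhb + 1 blocks, which is what the listed address sequence implies. All function and variable names are illustrative.

```python
def standard_sequence(mbs_wide, mbs_high, comps):
    """Yield block addresses in arrival order for a standard (MPEG-type) stream.

    comps holds one dict per component with keys 'offset', 'maxhb', 'maxvb'.
    """
    state = []
    for c in comps:
        hbs = mbs_wide * (c["maxhb"] + 1)          # component width in blocks
        per_row = hbs * (c["maxvb"] + 1)           # blocks per macroblock row
        state.append({
            "hbs": hbs,
            "last_row_in_mb": hbs * c["maxvb"],
            "last_mb_in_row": hbs - (c["maxhb"] + 1),
            "blocks_per_mb_row": per_row,
            "last_mb_row": per_row * (mbs_high - 1),
            "hmb": 0,
            "vmb": 0,
        })
    hb = vb = 0                                    # shared HB and VBADDR counts
    for _ in range(mbs_wide * mbs_high):
        for c, s in zip(comps, state):
            for _ in range((c["maxhb"] + 1) * (c["maxvb"] + 1)):
                yield c["offset"] + s["vmb"] + s["hmb"] + vb + hb
                if hb != c["maxhb"]:               # states HB and MB0
                    hb += 1
                    continue
                hb = 0
                if vb != s["last_row_in_mb"]:      # states MB1 and MB2
                    vb += s["hbs"]
                    continue
                vb = 0
                if s["hmb"] != s["last_mb_in_row"]:  # states MB3 and MB4
                    s["hmb"] += c["maxhb"] + 1     # assumed step of maxhb + 1
                    continue
                s["hmb"] = 0
                if s["vmb"] != s["last_mb_row"]:   # states MB5 and MB6
                    s["vmb"] += s["blocks_per_mb_row"]


# Assumed layout of the example picture: 4:2:0, one area per component.
FIG161_COMPS = [
    {"offset": 0x000, "maxhb": 1, "maxvb": 1},
    {"offset": 0x100, "maxhb": 0, "maxvb": 0},
    {"offset": 0x200, "maxhb": 0, "maxvb": 0},
]

if __name__ == "__main__":
    seq = list(standard_sequence(6, 6, FIG161_COMPS))
    print(" ".join(f"{a:03X}" for a in seq[:12]))
```

Each early exit corresponds to a transfer to state DATA; only when a count reaches its terminal value does the update continue to the next pair of states.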
(vmb_addr is reset when a PICTURE_START token is detected, rather than when the end of the picture is inferred from the calculations.)
C.3.5.2.2 Calculation for H.261 sequences
The sequence for H.261 calculations diverges from the standard sequence at state MB4:
States HB and MB0: as above
States MB1 and MB2: as above
States MB3 and MB4:
    scratch = hmb_addr - last_mb_in_row;
    if (z & (mod3 == 2))    /* end of slice on right of screen */
    {
        hmb_addr = 0;
        new_state = MB5;
    }
    else if (z)             /* end of row on right of screen */
    {
        hmb_addr = half_width_in_blocks;
        new_state = MB4A;
    }
    else
    {
        scratch = hmb_addr - last_mb_in_half_row;
        new_state = MB4B;
    }
State MB4A:
    vmb_addr = vmb_addr + blocks_per_mb_row;
    new_state = DATA;
States (MB4) and MB4B:
    (scratch = hmb_addr - last_mb_in_half_row;)
    if (z & (mod3 == 2))    /* end of slice on left of screen */
    {
        hmb_addr = hmb_addr + maxhb;
        new_state = MB4C;
    }
    else if (z)             /* end of row on left of screen */
    {
        hmb_addr = 0;
        new_state = MB4A;
    }
    else
    {
        hmb_addr = hmb_addr + maxhb;
        new_state = DATA;
    }
States MB4C and MB4D:
    vmb_addr = vmb_addr - blocks_per_mb_row;
    vmb_addr = vmb_addr - blocks_per_mb_row;
    new_state = DATA;
States MB5 and MB6: as above
C.3.5.3 Operation on PICTURE_START token
When a PICTURE_START token is received, control passes to state PIC_ST1, where the vb_addr register (BU_WADDR_VBADDR) is reset to zero. States PIC_ST2 and PIC_ST3 are then each visited once for each component, resetting hmb_addr and vmb_addr respectively. Control then returns to IDLE via state OUTPUT_TAIL.
C.3.5.4 Operation on DEFINE_SAMPLING token
When a DEFINE_SAMPLING token is received, the component register is loaded with the two least significant bits of the input data. Via states HSAMP and VSAMP, the maxhb and maxvb registers for that component are also loaded. In addition, the appropriate define-sampling event bit is triggered (delayed by one cycle to allow the whole token to be written).
C.3.5.5 Operation on HORIZONTAL_MBS and VERTICAL_MBS
When a HORIZONTAL_MBS or VERTICAL_MBS token arrives, the 14-bit value contained in the token is written, in two cycles, into the appropriate register. The relevant event bit is triggered, delayed by one cycle.
C.3.5.6 Other tokens
The CODING_STANDARD token is detected and causes the input data to be written to the top-level BU_WADDR_COD_STD register. This data is decoded, and the nh261 flag (not H.261) is hardwired to the buffer manager block. All other tokens cause control to move to state OUTPUT_TAIL, which accepts data until the end of the token. Note that it does not actually output any data.
C.4 Read address generator
C.4.1 Overview
The read address generator of the present invention consists of four state machine/datapath blocks. The first, "dline", generates line addresses and distributes them to the other three (one for each component), which are identical page/block address generators called "dramctls". All blocks are linked by two-wire interfaces. The modes of operation cover all combinations of interlaced/progressive data, first field upper/lower, and frame start on the upper/lower/both fields. Table C.3.4 shows the names, addresses and reset states of the dispaddr registers, and Section C.13 gives two examples of programming the address generators.
C.4.2 Line address generator (dline)
This block calculates the line start addresses for each component. Table C.3.4 shows the 18-bit datapath registers in dline.
Note that the distinction between the DISP_register_name and ADDR_register_name registers in dispaddr is as follows: a DISP_ name means that the register relates to the display area to be read out of the DRAM, whereas an ADDR_ name means that the register describes something about the structure of the external buffers.
Operation
The basic operation of dline (ignoring all mode repeats, etc.) is:
if (vsync_start)    /* first active cycle of vsync */
{
    comp = 0;
    DISP_VB_CNT_COMP[comp] = 0;
    LINE[comp] = BUFFER_BASE[comp] + 0;
    LINE[comp] = LINE[comp] + DISP_COMP_OFFSET[comp];
    while (VB_CNT_COMP[comp] < DISP_VBS_COMP[comp])
    {
        while (line_count[comp] < 8)
        {
            while (comp < 3)
            {
                --> OUTPUT LINE[comp] to dramctl[comp]
                LINE[comp] = LINE[comp] + ADDR_HBS_COMP[comp];
                comp = comp + 1;
            }
            line_count[comp] = line_count[comp] + 1;
        }
        VB_CNT_COMP[comp] = VB_CNT_COMP[comp] + 1;
        line_count[comp] = 0;
    }
}
Table C.3.4 Dispaddr datapath registers
Register name        Bus   Keyhole address   Description                                                       Comments
BUFFER_BASE0         A     0x00,01,02,03     Block address of the start of each buffer                         These registers must be loaded by the upi before operation begins
BUFFER_BASE1         A     0x04,05,06,07
BUFFER_BASE2         A     0x08,09,0a,0b
DISP_COMP_OFFSET0    B     0x24,25,26,27     Offset from the buffer base address to where reading begins
DISP_COMP_OFFSET1    B     0x28,29,2a,2b
DISP_COMP_OFFSET2    B     0x2c,2d,2e,2f
DISP_VBS_COMP0       B     0x30,31,32,33     Number of vertical blocks to be read
DISP_VBS_COMP1       B     0x34,35,36,37
DISP_VBS_COMP2       B     0x38,39,3a,3b
ADDR_HBS_COMP0       B     0x3c,3d,3e,3f     Number of horizontal blocks in the data
ADDR_HBS_COMP1       B     0x40,41,42,43
ADDR_HBS_COMP2       B     0x44,45,46,47
LINE0                A     0x0c,0d,0e,0f     Current line address                                              These registers are temporary locations used by dispaddr. Note: all registers are R/W from the upi
LINE1                A     0x10,11,12,13
LINE2                A     0x14,15,16,17
DISP_VB_CNT_COMP0    A     0x18,19,1a,1b     Number of vertical blocks remaining
DISP_VB_CNT_COMP1    A     0x1c,1d,1e,1f
DISP_VB_CNT_COMP2    A     0x20,21,22,23
C.4.3 Dline control registers
The operation described above is modified by the dispaddr control registers, which are shown in Table C.4.3.
Table C.4.3 Control registers
Register name         Address   Bits    Reset state   Function
LINES_IN_LAST_ROW0    0x08      [2:0]   0x07          These three registers determine the number of lines (out of 8) to be read in the last row of blocks
LINES_IN_LAST_ROW1    0x09      [2:0]   0x07
LINES_IN_LAST_ROW2    0x0a      [2:0]   0x07
DISPADDR_ACCESS       0x0b      [0]     0x00          Access bit for dispaddr
DISPADDR_CTL0         0x0c      [1:0]   0x0           SYNC_MODE
                                [2]     0x0           READ_START
                                [3]     0x1           INTERLACED/PROG
                                [4]     0x0           LSB_INVERT
                                [7:5]   0x0           LINE_RPT
DISPADDR_CTL1         0x0d      [0]     0x1           COMP0HOLD
The control bits of DISPADDR_CTL0 and DISPADDR_CTL1 are described in detail below.
C.4.3.1 Dispaddr control registers: LINES_IN_LAST_ROW[component]
These three registers determine, for each component, the number of lines to be read in the last row of blocks. The height of the read window can therefore be any number of lines. This is a back-up feature, because the top, left and right edges of the window lie on block boundaries, and the output controller can clip (discard) any excess lines.
C.4.3.2 DISPADDR_ACCESS
This is the access bit for the whole of dispaddr. When a "1" is written to this location, dispaddr halts synchronously with the clock. The value read back from the access bit remains "0" until dispaddr has safely halted. Once this state has been reached, it is safe to perform asynchronous upi accesses to all of the dispaddr registers. Note that the upi is actually locked out of the datapath registers until the access bit is "1". So that access to dispaddr can be obtained without disrupting the current display or datapath operation, access is granted and released only under the following conditions.
Stopping: access is granted only if the datapath has finished its current two-cycle operation (if it was performing one) and the "safe" signal from the output controller is high. This signal identifies a region of the screen relative to the display window and is programmed in the output controller (not in dispaddr). Note: it is therefore necessary to program the output controller before attempting to gain access to dispaddr.
Starting: access is released only when "safe" is high, or during vsync. This ensures that display does not start too close to the active window.
This scheme allows the controlling software to request access, poll until the end of the display, modify dispaddr and release access. If the software is too slow and does not release the access bit until after vsync, dispaddr will not start until the next safe period. Border color (not rubbish) is displayed during this "lost" (dropped) picture.
C.4.3.3 DISPADDR_CTL0[7:0]
When reading the following descriptions, it is important to understand the distinction between interlaced data and an interlaced display.
Interlaced data can take two forms. The top-level registers support field pictures (each buffer contains one field) and frame pictures (each buffer contains an entire frame, interlaced or not).
DISPADDR_CTL0[7:0] contains the following control bits:
SYNC_MODE[1:0]
With an interlaced display, the vsyncs that refer to the top and bottom fields are distinguished by the field_info pin. In this context, field_info = HIGH means the top field. These two control bits determine on which vsyncs dispaddr requests a new display buffer from the buffer manager, and so synchronize the fields in the buffer (if the data is interlaced) with the fields on the display.
0: new display buffer on the top field
1: bottom field
2: both fields
3: both fields
At startup, dispaddr requests a buffer from the buffer manager on every vsync. Until a buffer is ready, dispaddr receives a zero (no display) buffer. When it finally obtains a valid buffer pointer, dispaddr does not know where it is on the display. It may therefore be necessary to synchronize the start of display with the correct vsync.
READ_START
When an interlaced display starts up, this bit determines on which vsync display actually begins. Having received a display buffer pointer, dispaddr can "sit out" the current vsync in order to line up the fields on the display with the fields in the buffer.
INTERLACED/PROGRESSIVE
0: progressive
1: interlaced
In progressive mode, all lines of the display area of the buffer are read. In interlaced mode, only alternate lines are read, and whether reading starts on the first or the second line is determined by field_info. Note that for (interlaced) field pictures, the system needs to read all lines from each buffer, so this bit should be set to progressive. The mapping between field_info and the first/second line start can be inverted by lsb_invert (so named for historical reasons).
LSB_INVERT
When set, this bit inverts the field_info signal as seen by the line counter. Reading can therefore be started on the correct line of a frame and aligned to the display, regardless of the convention adopted by the encoder, the display or the top-level registers.
LINE_RPT[2:0]
When set, each bit causes the lines of the corresponding component to be read twice (bit 0 affects component 0, and so on). This forms the first part of vertical upsampling. It is used for the 8 times chroma upsampling required to convert QCIF to 601.
COMP0HOLD
This bit programs the ratio between the number of lines read for component 0 and the number of lines read for components 1 and 2 (as distinct from the number displayed).
0: the same number of lines, i.e., 4:4:4 data in the buffer
1: twice as many component 0 lines, i.e., 4:2:0
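The bit layout of DISPADDR_CTL0 can be illustrated with a pair of pack and unpack helpers. This Python sketch is only an illustration of the field positions and reset states given in Table C.4.3; the function names and the dictionary representation are assumptions, not part of the described hardware or its software.

```python
def encode_dispaddr_ctl0(sync_mode=0, read_start=0, interlaced=1,
                         lsb_invert=0, line_rpt=0):
    """Pack the DISPADDR_CTL0 fields; defaults are the tabulated reset states."""
    return ((sync_mode & 0x3)            # bits [1:0]  SYNC_MODE
            | (read_start & 0x1) << 2    # bit  [2]    READ_START
            | (interlaced & 0x1) << 3    # bit  [3]    0 = progressive, 1 = interlaced
            | (lsb_invert & 0x1) << 4    # bit  [4]    LSB_INVERT
            | (line_rpt & 0x7) << 5)     # bits [7:5]  LINE_RPT, one bit per component


def decode_dispaddr_ctl0(value):
    """Unpack an 8-bit DISPADDR_CTL0 value into its named fields."""
    return {
        "SYNC_MODE": value & 0x3,
        "READ_START": (value >> 2) & 0x1,
        "INTERLACED": (value >> 3) & 0x1,
        "LSB_INVERT": (value >> 4) & 0x1,
        "LINE_RPT": (value >> 5) & 0x7,
    }


if __name__ == "__main__":
    # Progressive read, new buffer on both fields, repeat lines of components 1 and 2.
    value = encode_dispaddr_ctl0(sync_mode=2, interlaced=0, line_rpt=0b110)
    print(hex(value), decode_dispaddr_ctl0(value))
```

With the default arguments the packed value equals the reset state implied by the table, in which only the INTERLACED/PROG bit is set.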
Page/line block address generators (dramctls)
When passed a line address, these blocks generate a series of page/line addresses and blocks to be read along the line. A minimum page width of 8 blocks is always assumed, and the outputs consist of a page address, a 3-bit line number, and 3-bit block start and block stop addresses (the line number is calculated by dline and passes through the dramctls unmodified). Thus, to read 48 pixels from line 5 of page 0xaa, starting at the third block from the left (an arbitrary point along an arbitrary line), the addresses passed to the DRAM interface will be:
Page=0xaa
Line=5
Block start=2
Block stop=7
Each of these 3 devices has 5 data path registers. These are shown in Table C.3.4. The basic behavior of each dramctl is:
    while (true)
    {
        CNT_LEFT = 0;
        GET_A_NEW_LINE_ADDRESS from dline;
        BLOCK_ADDR = input_Block_addr + 0;
        PAGE_ADDR = input_page_addr + 0;
        CNT_LEFT = DISP_HBS + 0;
        while (CNT_LEFT > BLOCKS_LEFT)
        {
            BLOCKS_LEFT = 8 - BLOCK_ADDR;
            --> output PAGE_ADDR, start=BLOCK_ADDR, stop=7.
            PAGE_ADDR = PAGE_ADDR + 1;
            BLOCK_ADDR = 0;
            CNT_LEFT = CNT_LEFT - BLOCKS_LEFT;
        }
        /* Last page of line */
        CNT_LEFT = CNT_LEFT + BLOCK_ADDR;
        CNT_LEFT = CNT_LEFT - 1;
        --> output PAGE_ADDR, start=BLOCK_ADDR, stop=CNT_LEFT
    }
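The loop above can be tried out in software. The following is a minimal Python sketch, not the hardware behavior itself: the function name `dramctl_line` and its parameters are hypothetical, and it assumes that the number of blocks left in the current page is computed before it is compared with the remaining count.

```python
def dramctl_line(page_addr, block_addr, hbs, page_width=8):
    """Return (page, start, stop) block runs covering `hbs` blocks of one line.

    Assumption: a page is `page_width` blocks wide and the count of blocks
    left in the current page is known before the comparison is made.
    """
    runs = []
    cnt_left = hbs                          # blocks still to be read
    blocks_left = page_width - block_addr   # blocks left in the first page
    while cnt_left > blocks_left:
        # Read to the end of this page, then move on to the next page.
        runs.append((page_addr, block_addr, page_width - 1))
        page_addr += 1
        block_addr = 0
        cnt_left -= blocks_left
        blocks_left = page_width
    # Last page of the line: stop at the final requested block.
    runs.append((page_addr, block_addr, block_addr + cnt_left - 1))
    return runs
```

With the example from the text (page 0xaa, start block 2, 6 blocks of 8 pixels), the sketch returns a single run with start 2 and stop 7.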
Table C.3.5 Dramctl (0, 1 and 2) data path registers
Register name     Bus   Keyhole address       Description                              Comment
DISP_COMP0_HBS    A     0x48, 49, 4a, 4b      Number of horizontal blocks to be        Must be loaded before operation begins
DISP_COMP1_HBS    A     0x4c, 4d, 4e, 4f      read, c.f. ADDR_HBS
DISP_COMP2_HBS    A     0x50, 51, 52, 53
CNT_left0         A     0x54, 55, 56, 57      Number of blocks remaining to be read    These registers are temporary locations
CNT_left1         A     0x58, 59, 5a, 5b                                               used by dispaddr. Note: all registers
CNT_left2         A     0x5c, 5d, 5e, 5f                                               are R/W from the upi
PAGE_ADDR0        A     0x60, 61, 62, 63      Address of the current page
PAGE_ADDR1        A     0x64, 65, 66, 67
PAGE_ADDR2        A     0x68, 69, 6a, 6b
BLOCK_ADDR0       B     0x6c, 6d, 6e, 6f      Current block address
BLOCK_ADDR1       B     0x70, 71, 72, 73
BLOCK_ADDR2       B     0x74, 75, 76, 77
BLOCK_left0       B     0x78, 79, 7a, 7b      Blocks left in the current page
BLOCK_left1       B     0x7c, 7d, 7e, 7f
BLOCK_left2       B     0x80, 81, 82, 83
Programming
The following 15 dispaddr registers must be programmed before operation can begin.
            BUFFER_BASE0,1,2
            DISP_COMP_OFFSET0,1,2
            DISP_VBS_COMP0,1,2
            ADDR_HBS_COMP0,1,2
            DISP_COMP0,1,2_HBS
Using the reset state of the dispaddr control registers gives a 4:2:0 interlaced display with no line repeats, synchronized to start on the top field (field_info=HIGH). Figure 159, "Buffer 0 containing a SIF (22 * 18 macroblock) picture", shows a typical buffer setup for a SIF picture (Section C.13 covers this example in more detail). Note that in this example DISP_HBS_COMPn equals ADDR_HBS_COMPn. Likewise, the vertical registers DISP_VBS_COMPn and the equivalent write address generator registers are equal, i.e. the area read is the whole buffer.
Windowing with the read address generator
The address generator can be programmed to read only part of a buffer (a window). The size of the window is programmed for each component through the registers DISP_HBS, DISP_VBS, COMPONENT_OFFSET and LINES_IN_LAST_ROW. Figure 160, "SIF component 0 with a display window", shows how this is done (for component 0 only).
In this example the registers would be set as follows:
          BUFFER_BASE0=0x00
          DISP_COMP_OFFSET0=0x2D
          DISP_VBS_COMP0=0x22
          ADDR_HBS_COMP0=0x2C
          DISP_HBS_COMP0=0x2A
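The values above are consistent with a window inset by one block on every side of a 44 x 36 block buffer (a 22 x 18 macroblock SIF picture, with each luminance macroblock counted as 2 x 2 blocks). The following Python sketch reproduces them. It rests on assumptions that the text does not state: that the component offset is a linear block index (rows skipped times buffer width, plus blocks skipped on the first row), and the helper name `window_registers` and its parameters are hypothetical.

```python
def window_registers(buffer_hbs, buffer_vbs, left, right, top, bottom, base=0):
    """Compute read-window register values, all measured in blocks.

    Assumption: the component offset is a linear block index into the buffer,
    i.e. (rows skipped) * (buffer width) + (blocks skipped on the row).
    """
    return {
        "BUFFER_BASE0": base,
        "DISP_COMP_OFFSET0": top * buffer_hbs + left,
        "DISP_VBS_COMP0": buffer_vbs - top - bottom,
        "ADDR_HBS_COMP0": buffer_hbs,
        "DISP_HBS_COMP0": buffer_hbs - left - right,
    }


# SIF component 0: 22 x 18 macroblocks = 44 x 36 blocks, one-block margin.
regs = window_registers(44, 36, left=1, right=1, top=1, bottom=1)
```

Under these assumptions `regs` holds 0x2D, 0x22, 0x2C and 0x2A for the offset, vertical size, buffer width and window width respectively.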
Notes:
A window can only start and stop on block boundaries.
In this example LINES_IN_LAST_ROW is set to 7 (meaning all 8 lines).
This example is not practical for anything other than 4:4:4 data. To correspond, the window edges of the other two components could not lie on block boundaries.
The color space converter cannot be used if the data it receives is not 4:4:4. This means that these read-window methods must be programmed together with the upsamplers to achieve this.
C.5 Address generation data path
The data paths used in dispaddr and waddrgen are identical in structure and width (18 bits); only the number of registers, some masking, and the flags returned to the state machine differ. Figure 165, "Slice of Datapath", shows one slice of the circuit. Registers are individually assigned to drive either the A or the B bus, and their use (assignment) is optimized in the controller. All registers can be loaded from the C bus, although not all "load" signals are driven. All operations involving the adder take two cycles, which allows the adder to use a simple ripple carry. Figure 166, "Two cycle operation of the data path", shows the two cycles and the timing of adding two registers and loading the result back into the "A" bus register. Various flags in the data path are re-timed with "Ph0" so that the C code can be generated. For the same reason, the structure of the data path schematic is slightly unusual: all registers (on the A and B buses) are in one single block, which removes the C-bus paths between cells, so that the C code can be generated more easily. For the upi to access the data path, the access bit must be set; a upi access without this bit set will lock up the upi. Upi access differs for writing and reading:
Write: when the access bit is set, all load signals are disabled. One of the 3 byte-addressed write strobes drives the appropriate byte of the addressed register. The upi data bus runs vertically down the data path (replicated as 2-8-8 bits), and the 18-bit register is written as 3 separate byte writes.
Read: this is done using the A and B buses. Again, the access bit must be set. The addressed register is driven onto the A or B bus, and the upi byte select picks one byte from the relevant bus and drives it onto the upi bus.
Because two-cycle data path operations require the A and B buses to hold their values (and upi accesses disturb them), access must only be granted by the controlling state machine before the start of any data path operation.
All data path registers in both address generators are addressed through a 9-bit-wide keyhole at the top-level addresses 0x28 (msb) and 0x29 (lsb) for the keyhole address and 0x2A for the data. The keyhole addresses are given in Table C.11.2.
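As an illustration of keyhole access, the Python sketch below models a host writing the 9-bit keyhole address to 0x28/0x29 and then moving one byte through 0x2A. It is only a sketch: `FakeBus`, `keyhole_write` and `keyhole_read` are hypothetical names, the bus model is invented for the example, and no auto-increment of the keyhole address is assumed.

```python
KEYHOLE_ADDR_MSB = 0x28  # top-level address of the keyhole address, msb
KEYHOLE_ADDR_LSB = 0x29  # top-level address of the keyhole address, lsb
KEYHOLE_DATA = 0x2A      # top-level address of the keyhole data


class FakeBus:
    """Hypothetical stand-in for the microprocessor interface."""

    def __init__(self):
        self.keyhole = 0   # current 9-bit keyhole address
        self.regs = {}     # keyhole address -> byte

    def write(self, addr, value):
        if addr == KEYHOLE_ADDR_MSB:
            self.keyhole = (self.keyhole & 0xFF) | ((value & 0x01) << 8)
        elif addr == KEYHOLE_ADDR_LSB:
            self.keyhole = (self.keyhole & 0x100) | (value & 0xFF)
        elif addr == KEYHOLE_DATA:
            self.regs[self.keyhole] = value & 0xFF

    def read(self, addr):
        return self.regs.get(self.keyhole, 0) if addr == KEYHOLE_DATA else 0


def keyhole_write(bus, keyhole_addr, byte):
    """Select a keyhole location, then write one byte through the data port."""
    bus.write(KEYHOLE_ADDR_MSB, (keyhole_addr >> 8) & 0x01)
    bus.write(KEYHOLE_ADDR_LSB, keyhole_addr & 0xFF)
    bus.write(KEYHOLE_DATA, byte)


def keyhole_read(bus, keyhole_addr):
    """Select a keyhole location, then read one byte through the data port."""
    bus.write(KEYHOLE_ADDR_MSB, (keyhole_addr >> 8) & 0x01)
    bus.write(KEYHOLE_ADDR_LSB, keyhole_addr & 0xFF)
    return bus.read(KEYHOLE_DATA)
```

Each 18-bit data path register occupies several consecutive keyhole addresses, so a full register access would repeat this sequence once per byte.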
Notes:
1) All address registers in the address generators (dispaddr and waddrgen) hold block addresses; pixel addresses are never used. The only registers that hold line addresses are the 3 LINES_IN_LAST_ROW registers.
2) Some registers are duplicated between the address generators; for example, BUFFER_BASE0 appears in the address space of both dispaddr and waddrgen. These are two separate registers, and both must be loaded. This allows display windowing (reading only part of the display store) and makes it easy to display formats other than 3-component video.
C.6 DRAM interface
C.6.1 Overview
In the present invention, the Spatial Decoder, the Temporal Decoder and the Video Formatter each contain a DRAM interface block specific to that chip. In all 3 devices, the function of the DRAM interface is to transfer data from the chip to the external DRAM and from the external DRAM to the chip, at block addresses supplied by an address generator.
Typically, the DRAM interface runs from a clock that is asynchronous both to the address generators and to the clocks of the blocks through which data passes. This asynchronism is readily managed, however, because the clocks run at approximately the same frequency.
Data is usually transferred between the DRAM interface and the rest of the chip in blocks of 64 bytes (the only exception being prediction data in the Temporal Decoder). Transfers take place by means of a device called a "swing buffer". This is essentially a pair of RAMs operated in a double-buffered configuration: the DRAM interface fills or empties one RAM while another part of the chip empties or fills the other. Each swing buffer has an associated separate bus that carries an address from an address generator.
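The double-buffered arrangement can be pictured with a small software model. This is a minimal Python sketch of the idea only; the class name `SwingBuffer` and its methods are hypothetical, and the asynchronous hand-over between the two clock domains is not modeled.

```python
class SwingBuffer:
    """Two RAMs used alternately: one faces the DRAM interface, one the chip."""

    def __init__(self, size=64):
        self.rams = [bytearray(size), bytearray(size)]
        self.dram_side = 0  # index of the RAM currently owned by the DRAM interface

    def dram_ram(self):
        """RAM that the DRAM interface is currently filling or emptying."""
        return self.rams[self.dram_side]

    def chip_ram(self):
        """RAM that the rest of the chip is currently emptying or filling."""
        return self.rams[1 - self.dram_side]

    def swing(self):
        """Swap ownership once both sides have finished with their RAM."""
        self.dram_side ^= 1


buf = SwingBuffer()
buf.dram_ram()[:] = bytes(range(64))  # DRAM interface fills one RAM
buf.swing()                           # the chip side now sees that block
```

After the swing, the block written by the DRAM side is available on the chip side while the DRAM interface starts on the other RAM.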
Each chip has 4 swing buffers, but their function differs in each case. In the Spatial Decoder, one swing buffer is used to transfer coded data to the DRAM, another to read coded data from the DRAM, the third to transfer tokenized data to the DRAM, and the fourth to read tokenized data from the DRAM. In the Temporal Decoder, one swing buffer is used to write intra or predicted picture data to the DRAM, the second to read intra or predicted picture data from the DRAM, and the other two to read forward and backward prediction data. In the Video Formatter, one swing buffer is used to transfer data to the DRAM and the other 3 to read data from the DRAM, one each for luminance (Y) and the red and blue color difference data (Cr and Cb respectively).
The operation of the generic DRAM interface is described in the Spatial Decoder document. The following section describes the features of the DRAM interface according to the present invention, in particular those of the Video Formatter.
C.6.2 Video Formatter DRAM interface
In the Video Formatter, data is written to the external DRAM in blocks but read out in raster order. Writing is the same as already described for the Spatial Decoder; reading is slightly more complex.
The data in the Video Formatter's external DRAM is organized so that at least 8 blocks of data fit in one page. These 8 blocks are 8 consecutive horizontal blocks. When rasterizing, 8 bytes must be read from each of these 8 consecutive blocks (that is, the same row in each of the 8 blocks) and written into the swing buffer.
Consider the top row (and assume a byte-wide interface). The x address (the 3 LSBs) is set to zero, as is the y address (the 3 MSBs). The x address is then incremented as each of the first 8 bytes is read. At this point the top part of the address (bit 6 and above, where LSB = bit 0) is incremented and the x address (3 LSBs) is reset to zero. This process is repeated until 64 bytes have been read. For a 16-bit or 32-bit wide interface to the external DRAM, the x address is simply incremented by 2 or 4 instead of 1.
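The address sequence for one raster row can be written out explicitly. The Python sketch below assumes the layout implied by the paragraph above (bits 2:0 are x, bits 5:3 are y, bit 6 and above select the block within the page); the function name `raster_row_addresses` and its parameters are hypothetical.

```python
def raster_row_addresses(row, bus_bytes=1, blocks=8):
    """Byte addresses read to fetch one raster row across `blocks` blocks.

    Assumed layout: bits 2:0 = x within the row, bits 5:3 = row (y) within
    the block, bit 6 and above = block number within the page.
    """
    addrs = []
    for block in range(blocks):
        # Step x by the interface width: 1, 2 or 4 bytes per access.
        for x in range(0, 8, bus_bytes):
            addrs.append((block << 6) | (row << 3) | x)
    return addrs
```

For the top row on a byte-wide interface this gives addresses 0 to 7, then 64 to 71, and so on, for 64 accesses in total; a 32-bit interface needs only 16 accesses.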
The address generator can signal to the DRAM interface that fewer than 64 bytes should be read (this may be needed at the beginning or end of a raster line), although a multiple of 8 bytes is always read. This is done using start and stop values. The start value is used for the top part of the address (bit 6 and above); the stop value is compared with it, and a signal is generated that indicates when reading should stop.
C.7 Vertical upsampler
C.7.1 Introduction
Given a raster scan of the pixels of one color component at its input, the vertical upsampler according to the present invention can provide an output scan of twice the height. Mode selection allows the output pixel values to be formed in several ways.
C.7.2 Ports
The input two-wire interface:
·in_valid
·in_accept
·in_data[7:0]
·in_lastpel
·in_lastline
The output two-wire interface:
  ·out_valid
  ·out_accept
  ·out_data[9:0]
  ·out_last
  mode[2:0]
  nupdata[7:0], upaddr, upsel[3:0], uprstr, upwstr
  ramtest
  tdin, tdout, tph0, tckm, tcks
  ph0, ph1, notrst0
C.7.3 Modes
The mode is selected by the input bus mode[2:0].
Mode register values 1 and 7 are not used.
In each of these modes the output pixels are represented as 10-bit values, not as bytes. No rounding or truncation is done in this block. Where necessary, values are shifted left so that all modes use the same range.
C.7.3.1 Mode 0: FIFO
The block simply acts as a FIFO store. The number of output pixels is the same as the number of input pixels. The values are shifted left by 2.
C.7.3.2 Mode 2: Repeat
Every line in the input scan is repeated to produce an output scan of twice the height. Again, the pixel values are shifted left by 2.
Input lines:  A, B, C, D
Output lines: A, A, B, B, C, C, D, D
C.7.3.3 Mode 4: Lower
Each input line produces two output lines. In this "lower" mode, the second of the two lines (the lower one on the display) is the same as the input line. The first of the pair is the average of the current input line and the previous input line. For the first input line, where no previous line is available, the input line is repeated.
This mode should be selected when the chroma samples are co-sited with the lower luma samples.
Input lines:  A, B, C, D
Output lines: A, A, (A+B)/2, B, (B+C)/2, C, (C+D)/2, D
C.7.3.4 Mode 5: Upper
This is similar to "lower" mode, but here the input line forms the upper of the output pair, and the lower line is the average of adjacent input lines. The last output line is a repeat of the last input line.
This mode should be selected when the chroma samples are co-sited with the upper luma samples.
Input lines:  A, B, C, D
Output lines: A, (A+B)/2, B, (B+C)/2, C, (C+D)/2, D, D
C.7.3.5 Mode 6: Central
This "central" mode corresponds to the situation in which the chroma samples lie midway between luma samples. To co-site the output chroma pixels with the luma pixels, a weighted average is used to form the output lines.
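The modes of section C.7.3 can be summarized in a short software model. The following Python sketch is illustrative only: the function name `vertical_upsample` is hypothetical, lines are plain lists of 8-bit values, and the 10-bit outputs are formed by giving every output line weights that sum to 4, so that no rounding is needed. For central mode it assumes weights of 3/4 and 1/4, with the first and last output lines equal to the first and last input lines.

```python
def vertical_upsample(lines, mode):
    """Model of upsampler modes 0, 2, 4, 5 and 6 with 10-bit outputs.

    Every output line is a weighted sum of two input lines whose weights
    add up to 4, which corresponds to the left shift by 2 in the text.
    """
    def mix(a, b, wa, wb):
        return [wa * p + wb * q for p, q in zip(a, b)]

    out = []
    for i, cur in enumerate(lines):
        prev = lines[i - 1] if i > 0 else cur               # repeat first line
        nxt = lines[i + 1] if i + 1 < len(lines) else cur   # repeat last line
        if mode == 0:      # FIFO: pass through
            out.append(mix(cur, cur, 4, 0))
        elif mode == 2:    # repeat: each line twice
            out += [mix(cur, cur, 4, 0), mix(cur, cur, 4, 0)]
        elif mode == 4:    # lower: average first, then the input line
            out += [mix(prev, cur, 2, 2), mix(cur, cur, 4, 0)]
        elif mode == 5:    # upper: input line first, then the average
            out += [mix(cur, cur, 4, 0), mix(cur, nxt, 2, 2)]
        elif mode == 6:    # central: 1/4-3/4 weighted averages
            out += [mix(prev, cur, 1, 3), mix(cur, nxt, 3, 1)]
        else:
            raise ValueError("unused mode")
    return out
```

For two single-pixel input lines A=10 and B=20, central mode yields 40, 50, 70, 80, i.e. A, (3A+B)/4, (A+3B)/4, B expressed on the 10-bit scale.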
Input lines:  A, B, C, D
Output lines: A, (3A+B)/4, (A+3B)/4, (3B+C)/4, (B+3C)/4, (3C+D)/4, (C+3D)/4, D
C.7.4 How it works
There are two line stores, imaginatively labelled "a" and "b". In "FIFO" and "repeat" modes only line store "a" is used. Each store can hold a line of up to 512 pixels (vertical upsampling should be done before horizontal upsampling). In "FIFO" mode the line length is unlimited.
The input signals in_lastpel and in_lastline indicate the end of an input line and the end of the picture. in_lastpel should be high coincident with the last pixel of every line. in_lastline should go high coincident with the last pixel of the last line of the picture.
The output signal out_last is high coincident with the last pixel of each output line.
In "repeat" mode every line is written into store "a". The line is then read out twice. While it is being read out for the second time, the next line can start to be written.
In "lower", "upper" and "central" modes, lines are written alternately into stores "a" and "b". The first line of a picture is always written into store "a". Two small state machines, one for each store, keep track of what each store contains and which output line is being formed. From these states are generated the read and write requests to the line store RAMs, and the signals that determine when the next line may overwrite the current data.
A register (lastaddr) stores the write address when in_lastpel is high, thus providing the line length used in forming the output lines.
C.7.5 UPI
This block contains two 512 * 8 RAM arrays, which can be accessed in the usual way through the microprocessor interface. There are no registers with microprocessor access.
C.8 Horizontal upsampler
C.8.1 Overview
In the present invention, the top-level registers contain 3 identical horizontal upsamplers, one for each color component. All 3 are controlled independently, so only one needs to be described here. From the user's point of view, the only difference is that each horizontal upsampler is mapped to a different set of addresses in the memory map.
The horizontal upsampler performs a combined replication and filtering operation. In all, there are 4 modes of operation:
Table C.7.1 Horizontal upsampler modes
Mode   Function
0      Straight through (no processing); the reset state
1      No upsampling; filter with the 3-tap FIR filter
2      x2 upsampling and filtering
3      x4 upsampling and filtering
C.8.2 Using the horizontal upsampler
The address map of each horizontal upsampler consists of 25 locations, corresponding to twelve 13-bit coefficient registers and one 2-bit mode register. The number written to the mode register determines the mode of operation, as shown in Table C.7.1. Depending on the mode, some or all of the coefficient registers may be used. The corresponding FIR filter is described below.
Depending on the mode of operation, the input Xn is held constant for 1, 2 or 4 clock cycles. The actual coefficients programmed for each mode are as follows:
Table C.7.2 Coefficients for mode 1
Coefficient   All clock cycles
k0            c00
k1            c10
k2            c20
Table C.7.3 Coefficients for mode 2
Coefficient   1st clock cycle   2nd clock cycle
k0            c00               c01
k1            c10               c11
k2            c20               c21
Table C.7.4 Coefficients for mode 3
Coefficient   1st clock cycle   2nd clock cycle   3rd clock cycle   4th clock cycle
k0            c00               c01               c02               c03
k1            c10               c11               c12               c13
k2            c20               c21               c22               c23
Coefficients that are not used in a particular mode need not be programmed when operating in that mode.
To achieve symmetrical filtering, the first and last pixels of each line are repeated before filtering. For example, when upsampling by 2, the first and last pixels of every line are replicated 4 times rather than 2. Because residual data in the filter is discarded at the end of each line, the number of output pixels is still always exactly 1, 2 or 4 times the number in the input stream.
Depending on the values of the coefficients, the output samples are either coincident with the input samples or shifted from them. Below are some example coefficient values for some sample modes. A "-" indicates that the coefficient value is "don't care". All values are in hexadecimal.
Table C.7.5 Sample coefficients
Coefficient   x2 upsampling, output pixels     x2 upsampling, output pixels   x4 upsampling, output pixels
              coincident with input pixels     between input pixels           between input pixels
c00           0000                             01BD                           00E9
c01           0000                             0108                           00B6
c02           -                                -                              012A
c03           -                                -                              0102
c10           0800                             0538                           0661
c11           0400                             0538                           0661
c12           -                                -                              0446
c13           -                                -                              029F
c20           0000                             010B                           00B6
c21           0400                             01BD                           00B9
c22           -                                -                              0290
c23           -                                -                              045F
C.8.3 Description of the horizontal upsampler
The data path of the horizontal upsampler is shown in Figure 168.
Its operation is outlined below for the x4 upsampling case. x2 upsampling and x1 filtering (modes 2 and 1) are degenerate cases of this, and bypass (mode 0) bypasses the whole filter: data passes from the input latch through the final multiplexer straight to the output latch.
1) When valid data is latched in the input latch ("L"), it is held there for 4 clock cycles.
2) The coefficient registers (labelled "COEFF") are multiplexed onto the multiplier in turn, one per clock cycle, while the two sets of 4 pipeline registers (labelled "PIPE") are clocked. Thus, for input data xn, the first pipe fills with the values c00.xn, c01.xn, c02.xn, c03.xn.
3) Similarly, the second multiplier multiplies xn by each of its coefficients in turn, and the third multiplier multiplies it by each of its coefficients in turn.
It can be seen that the output takes the form shown in Table C.7.6.
Table C.7.6 Output sequence for mode 3
Clock cycle   Output
0             c20.x[n] + c10.x[n-1] + c00.x[n-2]
1             c21.x[n] + c11.x[n-1] + c01.x[n-2]
2             c22.x[n] + c12.x[n-1] + c02.x[n-2]
3             c23.x[n] + c13.x[n-1] + c03.x[n-2]
From the point of view of the output, each clock cycle produces one individual pixel. Since each output pixel depends on the weighted values of 12 input pixels (although only 3 of them are different), this can be regarded as a 12-tap filter applied to x4 upsampled input pixels.
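The output sequence in Table C.7.6 amounts to a small polyphase filter, which the Python sketch below reproduces. It is an illustration under assumptions: the function name `horizontal_upsample` is hypothetical, the edge handling (one extra copy of the first and last pixel) is one plausible reading of the pixel repetition described in C.8.2, and the value 0x800 is taken to represent unity gain, as the "coincident" x2 column of Table C.7.5 suggests.

```python
def horizontal_upsample(pixels, coeffs, factor):
    """Polyphase 3-tap upsampler.

    coeffs[k][p] is coefficient c<k><p>; output phase p of input position n is
    c2p*x[n] + c1p*x[n-1] + c0p*x[n-2]. Results are left unscaled.
    Assumption: the line is padded with one copy of its first and last pixel.
    """
    padded = [pixels[0]] + list(pixels) + [pixels[-1]]
    out = []
    for n in range(2, len(padded)):
        x_n, x_n1, x_n2 = padded[n], padded[n - 1], padded[n - 2]
        for p in range(factor):  # one output per clock cycle
            out.append(coeffs[2][p] * x_n + coeffs[1][p] * x_n1 + coeffs[0][p] * x_n2)
    return out


# x2 upsampling, outputs coincident with inputs (first column of Table C.7.5).
COINCIDENT_X2 = [[0x0000, 0x0000], [0x0800, 0x0400], [0x0000, 0x0400]]
scaled = [v // 0x800 for v in horizontal_upsample([10, 20, 30], COINCIDENT_X2, 2)]
```

With these coefficients every input pixel is reproduced unchanged and followed by the average of itself and its right-hand neighbor, and the output length is exactly twice the input length.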
For x2 upsampling the operation is essentially the same, except that the input data is held for only 2 clock cycles. In addition, only two coefficients are used, and the "PIPE" blocks are shortened by means of the multiplexers shown. For x1 filtering the input is held for only one clock cycle. As expected, one coefficient and one "PIPE" stage are used.
Some notes on particular features of the implementation of the present invention follow.
1) The data path width and coefficient width (13-bit two's complement) were chosen so that the same multiplier could be used as was designed for the color space converter. These widths are more than adequate for the purposes of the horizontal upsampler.
2) The multiplexers that multiplex the coefficients onto the multipliers are shared with the UPI readback. This makes the structure of the schematics somewhat complex (mainly because of the difficulty of generating the C code), but the actual circuit is smaller.
3) As in the color space converter, carry-save multipliers are used, and the result is resolved only at the end.
The control of the whole horizontal upsampler can be regarded as a single two-wire interface stage that may produce 2 or 4 times as much data at its output as at its input. The mode programmed through the UPI determines the length of a programmable shift register (bob). Depending on the selected mode, this produces an output pulse every clock cycle, every 2 clock cycles or every 4 clock cycles. This in turn controls the main state machine, whose state is also determined by in_valid and out_accept (for the two-wire interface) and by the signal "in_last". This signal is passed on from the vertical upsampler and is high for the last pixel of every line. It allows the first and last pixels of each line to be replicated twice over and the pipeline to be cleared down between lines (after a line has finished, the pipeline contains partly processed redundant data).
C.9 Color space converter
C.9.1 Overview
In the present invention, the color space converter (CSC) performs a 3 * 3 matrix multiplication on the incoming 9-bit data, followed by an addition:
\[
\begin{pmatrix} y_0 \\ y_1 \\ y_2 \end{pmatrix}
=
\begin{pmatrix} c_{01} & c_{02} & c_{03} \\ c_{11} & c_{12} & c_{13} \\ c_{21} & c_{22} & c_{23} \end{pmatrix}
\begin{pmatrix} x_0 \\ x_1 \\ x_2 \end{pmatrix}
+
\begin{pmatrix} c_{04} \\ c_{14} \\ c_{24} \end{pmatrix}
\]
Here x0-2 are the input data, y0-2 are the output data and cnm are the coefficients. The matrix coefficients have deliberately been given slightly unconventional names, because these names correspond to the signal names in the schematics.
The CSC can perform conversions between a number of different color spaces, although only a limited set of these conversions is used in the top-level registers. The design color space conversions are as follows:
            ER,EG,EB→Y,CR,CB
            R,G,B→Y,CR,CB
            Y,CR,CB→ER,EG,EB
            Y,CR,CB→R,G,B
Here R, G and B are in the range (0...511), and all other quantities are in the range (32...470). Because the input to the top-level registers CSC is Y, CR, CB, only the third and fourth of these equations are relevant.
In the CSC design, the precision of the coefficients was chosen so that, for 9-bit data, all output values are within plus or minus one bit of the values produced by a full floating-point simulation (this is the best accuracy achievable). This gives 13-bit two's complement coefficients for cx1 to cx3 and 14-bit two's complement coefficients for cx4. The coefficients for all the design conversions are given below in both decimal and hexadecimal:
Table C.8.1 Coefficients for various conversions
          ER→Y               R→Y                Y→ER                  Y→R
Coeff     Dec       Hex      Dec       Hex      Dec        Hex        Dec        Hex
c01       0.299     0132     0.256              1.0        04C0       1.159      04AD
c02       0.587     0269     0.502              1.402      059C       1.539      C68E
c03       0.114     0075     0.098              0.0        C0C0       0.0        CCC0
c04       0.0       0000     16                 -179.456   F4C8       -223.478   FIS8
c11       0.5       0200     0.428              1.0        04C0       1.159      04AD
c12       -0.419    FE63     -0.358             -0.714     FD25       -0.335     FCA8
c13       -0.081    FFAD     -0.070             -0.344     FEA0       -0.432     FE64
c14       129.0     0800     128                135.5      0878       139.7      OEBA
c21       -0.169    FF53     -0.144             1.0        04C0       1.159      C4AD
c22       -0.331    FEAD     -0.283             0.0        0CC0       0.0        0CC0
c23       0.5       0200     0.427              1.772      0717       2.071      C848
c24       128       0800     128                -226.816   F1D2       -233.84    EE42
All these numbers can be calculated from the fundamental equation:
       Y=0.299ER+0.587EG+0.114EB
and the following color difference equations:
                 CR=ER-Y
                 CB=EB-Y
The equations in R, G and B are derived after taking into account the full-scale ranges of these quantities.
C.9.2 Using the color space converter
On reset, c01, c12 and c23 are set to 1 and all other coefficients are set to 0. Thus y0=x0, y1=x1 and y2=x2, and all data passes through unchanged. To select a color space conversion, simply write the appropriate coefficients (from Table C.8.1, for example) to the corresponding locations in the address map.
Referring to the schematics, x0...2 correspond to in_data0...2 and y0...2 correspond to out_data0...2. The user should remember that the data input to the CSC must be upsampled to 4:4:4. If it is not, not only is the color space conversion meaningless, but the chip will lock up.
Note that each output can be formed from any allowed combination of coefficients and inputs, plus (or minus) a constant. Thus, for any given color space conversion, the order of the outputs can be changed by swapping rows of the conversion matrix (that is, the addresses to which the coefficients are written).
The CSC is guaranteed to work for all the conversions in Table C.8.1. If other conversions are used, the user must bear these points in mind:
1) The hardware will not work if any intermediate result in the calculation requires more than 10 bits of precision (including the sign bit).
2) The output of the CSC is saturated to the range 0 to 511. That is, any number less than 0 is replaced by 0, and any number greater than 511 is replaced by 511. The implementation of the saturation logic assumes that results are only slightly greater than 511 or slightly less than 0. If the CSC is programmed incorrectly, a common symptom will be that the output saturates all (or most) of the time.
C.9.3 Description of the CSC
The structure of the CSC is shown in Figure 169; because of space constraints, only 2 of the 3 components are shown there. In the figure, "register" or "R" denotes a master-slave register, and "latch" or "L" denotes a transparent latch.
All coefficients are loaded into read/write UPI registers, which are not shown explicitly in the figure. To understand the operation, consider the following sequence with reference to the leftmost component (the one that produces output out_data0):
1) Data arrives at the inputs x0-2 (in_data0-2). This represents a single pixel in the input color space. It is latched.
2) x0 is multiplied by c01 and latched into the first pipeline register. x1 and x2 move on one register.
3) x1 is multiplied by c02, added to x0·c01 and latched into the next pipeline register. x2 moves on one register.
4) x2 is multiplied by c03 and added to the result of (3), forming (x0·c01+x1·c02+x2·c03). This result is latched into the next pipeline register.
5) The result of (4) is added to c04. Because the data passes through the multipliers in carry-save form, this adder is also used to resolve the data from the multiplier chain. Its result is latched into the next pipeline register.
6) The final operation is to saturate the data. Partial results are passed from the resolving adder to the saturate block to achieve this.
It can be seen that the result is y0, as specified by the matrix equation at the beginning of this section. y1 and y2 are formed in the same way.
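The multiply, accumulate, add-constant and saturate sequence can be sketched in Python as follows. This is a floating-point illustration, not the fixed-point carry-save hardware: the function name `csc` and the constant names are hypothetical, the Y→ER coefficients are the decimal values printed in Table C.8.1, and rounding to the nearest integer is an assumption.

```python
def csc(pixel, coeffs):
    """Apply y_i = c_i1*x0 + c_i2*x1 + c_i3*x2 + c_i4, saturated to 0..511."""
    out = []
    for c1, c2, c3, c4 in coeffs:
        acc = c1 * pixel[0]    # step 2: first product
        acc += c2 * pixel[1]   # step 3: second product accumulated
        acc += c3 * pixel[2]   # step 4: third product accumulated
        acc += c4              # step 5: add the constant
        out.append(max(0, min(511, round(acc))))  # step 6: saturate
    return tuple(out)


# Reset state: c01 = c12 = c23 = 1, everything else 0 (data passes unchanged).
RESET = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0)]

# Y,CR,CB -> ER,EG,EB, decimal coefficients from Table C.8.1.
Y_TO_E = [
    (1.0, 1.402, 0.0, -179.456),
    (1.0, -0.714, -0.344, 135.5),
    (1.0, 0.0, 1.772, -226.816),
]
```

With `Y_TO_E`, a pixel whose color differences sit at the 128 mid-point, such as (100, 128, 128), converts to equal ER, EG and EB values, while strongly colored inputs show the saturation at 0 and 511.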
Three multipliers are used, with the coefficient as the multiplicand and the data as the multiplier. This gives an efficient layout: partial results flow down the data path, and the same input data passes through 3 parallel, identical data paths, one for each output.
To achieve the reset state described in section C.9.2, each of the 3 components must be reset in a different way. To avoid having 3 sets of schematics and 3 slightly different layouts, this is done with inputs to the UPI registers that are tied high or low at the top level.
The CSC has almost no control associated with it. Nevertheless, each pipeline stage is a two-wire interface stage, so there is a chain of valid and accept latches with their associated control (in_accept = out_accept_r + !in_valid_r). The CSC is therefore a 5-stage-deep two-wire interface, capable of holding 10 levels of data when stalled.
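The accept equation for one two-wire stage can be written as a one-line function. The Python sketch below reads the control expression as "out_accept_r OR NOT in_valid_r", which is an assumed interpretation of the printed term; the function name `in_accept` is hypothetical, and the registered timing of the real valid/accept latches is not modeled.

```python
def in_accept(out_accept_r, in_valid_r):
    """A stage accepts new data if the next stage accepted, or if it is empty.

    Assumption: the stage equation is in_accept = out_accept_r OR NOT in_valid_r.
    """
    return out_accept_r or not in_valid_r


# Truth table of the stage equation: (out_accept_r, in_valid_r) -> in_accept
table = {(a, v): in_accept(a, v) for a in (False, True) for v in (False, True)}
```

The only combination that stalls the input is a stage that already holds valid data while the following stage is not accepting.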
The output of the CSC contains a re-synchronizing latch, because the next function in the output pipeline runs from a different clock generator.
C.10 Output controller
C.10.1 Introduction
The output controller according to the present invention has the following functions:
It provides data in one of 3 modes:
24-bit 4:4:4
16-bit 4:2:2
8-bit 4:2:2
It aligns the data to the video display window, which is defined by the vsync and hsync pulses and by programmable timing registers.
It adds a border around the video window, if required.
C.10.2 Ports
The input two-wire interface:
·in_valid
·in_accept
·in_data[23:0]
The output two-wire interface:
·out_valid
·out_accept
·out_data[23:0]
·out_active
·out_window
·out_comp[1:0]
in_vsync,in_hsync
nupdata[7:0], upaddr[4:0], upsel, rstr, wstr, tdin, tdout, tph0, tckm, tcks, chiptest, ph0, ph1, notrst0, notrst1
C.10.3 Output modes
The output format is selected by writing to the output mode register.
C.10.3.1 Mode 0
This mode is 24-bit 4:4:4 RGB or YCrCb. Input data passes straight to the output.
C.10.3.2 Modes 1 and 2
These modes present 4:2:2 YCrCb. It is assumed that in_data[23:16] is Y, in_data[15:8] is Cr and in_data[7:0] is Cb.
C.10.3.2.1 Mode 1
In 16-bit YCrCb, Y is presented on out_data[15:8]. Cr and Cb are time-multiplexed onto out_data[7:0], Cb first. out_data[23:16] is not used.
C.10.3.2.2 Mode 2
In 8-bit YCrCb, Y, Cr and Cb are time-multiplexed onto out_data[7:0] in the order Cb, Y, Cr, Y. out_data[23:8] is not used.
C.10.3.3 Output timing
The following registers are used to position the data within the video display window.
vdelay: the number of hsync pulses following a vsync pulse before the first line of video or border.
hdelay: the number of clock cycles between hsync and the first pixel of video or border.
height: the height of the video window, in lines.
width: the width of the video window, in pixels.
north, south: the height of the border above and below the video window respectively, in lines.
west, east: the width of the border to the left and right of the video window respectively, in pixels.
The minimum vdelay is zero, in which case the first hsync is the first active line. The minimum value that can be programmed into hdelay is 2. Note, however, that the actual delay from in_hsync to the first active output pixel is hdelay+1 cycles.
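How the timing registers partition the display can be illustrated with a small classifier. The Python sketch below is an interpretation, not the counter-based hardware: the function name `classify` is hypothetical, lines are counted as hsync pulses since vsync starting at 0, cycles are counted from in_hsync, and one pixel per clock cycle is assumed (as in output mode 0).

```python
def classify(line, cycle, r):
    """Return 'blank', 'border' or 'video' for one output position.

    line:  index of the hsync since vsync (0 = first hsync)
    cycle: clock cycles since in_hsync
    r:     dict with vdelay, hdelay, height, width, north, south, west, east
    Assumption: one pixel per clock cycle.
    """
    y = line - r["vdelay"]
    x = cycle - (r["hdelay"] + 1)  # actual delay is hdelay + 1 cycles
    active_h = r["north"] + r["height"] + r["south"]
    active_w = r["west"] + r["width"] + r["east"]
    if not (0 <= y < active_h and 0 <= x < active_w):
        return "blank"
    in_window = (r["north"] <= y < r["north"] + r["height"]
                 and r["west"] <= x < r["west"] + r["width"])
    return "video" if in_window else "border"


regs = {"vdelay": 2, "hdelay": 2, "height": 4, "width": 6,
        "north": 1, "south": 1, "west": 2, "east": 2}
```

With these example values, the first active position is line 2, cycle 3, and it falls in the border because the north border is one line high.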
Any of the border edges may have the value zero. The border color is selected by writing to the registers border-r, border-g and border-b. The color of the area outside the border is selected by writing to the registers bank-r, bank-g and bank-b. Note that the multiplexing performed in output modes 1 and 2 also affects the border and blank components. That is, the values in these registers correspond to in_data[23:16], in_data[15:8] and in_data[7:0].
C.10.4 Output flags
out_active indicates that the output data is part of the active window, i.e. video data or border.
out_window indicates that the output data is part of the video window.
out_comp[1:0] indicates which color component is present on out_data[7:0] in output modes 1 and 2. In mode 1, 0=Cb and 1=Cr. In mode 2, 0=Y, 1=Cr and 2=Cb.
C.10.5 Two-wire mode
In the present invention, two-wire mode is selected by writing 1 to the two-wire register. It is not selected after reset. In two-wire mode the output timing registers and the sync signals are all ignored, and the flow of data through the block is controlled by out_accept. Note that in normal operation out_accept should be held high.
C.10.6 Snooper
There is a super-snooper on the output of the block, which includes access to the output flags.
C.10.7 How it works
Two identical down-counters keep track of the current position in the display. "vcount" counts down on hsyncs, and is loaded from the appropriate timing register on vsync or when it reaches its terminal count. "hcount" counts down on every pixel, and is loaded on hsync or when it reaches its terminal count. Note that in output mode 2 one pixel corresponds to two clock cycles.
C.11 Clock dividers
C.11.1 Overview
The top-level register block of the present invention contains two identical clock dividers, one generating PICTURE_CLK and the other generating AUDIO_CLK. The clock dividers are identical and independently controlled, so only one is described here. From the user's point of view, the only difference between the two clock dividers is that their divisor registers are mapped to different locations in the memory map.
The function of the clock divider is to provide a divided clock derived from 4X SYSCLK; an even mark-space ratio (duty cycle) is not required.
The divisor needs to be in the range 0 to 16,000,000, so it can be represented with 24 bits, with the restriction that the minimum divisor is 16. This is because the clock divider approximates an even duty cycle (to within one SYSCLK cycle) by using one half of the divisor. Since the maximum clock frequency available is SYSCLK, the maximum divided frequency available is SYSCLK/2. Furthermore, because four counters are used in cascade, divisor/2 must never be less than 8; otherwise the divided output clock is driven to the positive supply.

C.11.2 Using the clock dividers

The address map of each clock divider consists of four locations, corresponding to three 8-bit divisor registers and one 1-bit access register. The clock divider is inactive after power-up and is activated by completing an access to its divisor registers.
The divisor registers can be written in any order, according to the address map in Table C.10.1. The clock divider is activated when it detects a synchronized 0-to-1 transition on its access bit. On detecting the first such transition, the clock divider leaves its reset state and generates a divided clock. Subsequent transitions (assuming the divisor has also changed) simply cause the clock divider to lock to the new frequency "on the fly". Once activated, the clock divider cannot be stopped other than by resetting the chip.
Table C.10.1 Clock divider registers
Address   Register
00b   Access bit
01b   Divisor MSB
10b   Divisor
11b   Divisor LSB
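The programming sequence described above can be sketched as follows, assuming the four locations of Table C.10.1 are byte-wide and sit at consecutive offsets from a base pointer. The function name `clkdiv_program`, the base-pointer interface, and the choice to clear the access bit first (so that the final write is guaranteed to be a 0-to-1 transition) are assumptions made for illustration. The lower limit of 16 is the minimum divisor stated in C.11.1.

```c
#include <stdbool.h>
#include <stdint.h>

/* Register offsets within one clock divider, from Table C.10.1. */
enum {
    CLKDIV_ACCESS  = 0x0,  /* 00b: access bit   */
    CLKDIV_DIV_MSB = 0x1,  /* 01b: divisor MSB  */
    CLKDIV_DIV_MID = 0x2,  /* 10b: divisor      */
    CLKDIV_DIV_LSB = 0x3   /* 11b: divisor LSB  */
};

/* Load a 24-bit divisor and raise the access bit.
 * 'base' is a hypothetical pointer to the divider's four locations.
 * Returns false, writing nothing, if the divisor is out of range. */
bool clkdiv_program(volatile uint8_t *base, uint32_t divisor)
{
    if (divisor < 16u || divisor > 0xFFFFFFu)
        return false;

    base[CLKDIV_ACCESS]  = 0;                        /* assumed: force a later 0-to-1 edge */
    base[CLKDIV_DIV_MSB] = (uint8_t)(divisor >> 16); /* divisor bytes may go in any order */
    base[CLKDIV_DIV_MID] = (uint8_t)(divisor >> 8);
    base[CLKDIV_DIV_LSB] = (uint8_t)divisor;
    base[CLKDIV_ACCESS]  = 1;                        /* 0-to-1 transition activates or relocks */
    return true;
}
```

In Table C.11.1 the BU_DIVA_3..0 and BU_DIVP_3..0 entries have the same shape (one 1-bit location followed by three 8-bit locations), so those addresses are a plausible base for the two dividers, although the source does not state the correspondence explicitly.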
Any divisor value in the range 14 to 16,777,216 can be used.

C.11.3 Description of the clock divider

The clock divider can be implemented with four 22-bit counters. The counters are connected in cascade so that when one counter carries, it activates the next counter in turn. An activated counter counts one quarter of the divisor value before carrying. Each counter therefore takes its turn in forming the pulses of the divided clock.
After carrying, the counter is reloaded with divisor/8, which produces a divided clock with an approximately even duty cycle. Because each counter reloads from the divisor register when it is activated by the preceding counter, the divided clock frequency can be changed on the fly simply by changing the contents of the divisor register.
Each counter is clocked by its own independent clock generator, so that the clock skew between counters can be controlled accurately and each counter can be clocked by a different clock.
A state machine controls the generation of the divisor/4 and divisor/8 values. It also multiplexes the correct source clock from the phase-locked loop (PLL) to the clock generators. The counters are clocked by different clocks depending on the value of the divisor, because different divisor values produce divided clocks whose edges are placed by different combinations of the clocks supplied by the PLL.

C.11.4 Testing the clock divider

The clock divider can be tested by powering up the chip with CHIPTEST held high. This forces all of the clocked logic in the clock divider to be clocked by SYSCLK, as opposed to the clocks generated by the PLL.
The clock divider is designed with full scan, so it can subsequently be tested using standard JTAG access, provided the chip has been powered up as described above.
The function of the clock divider is not guaranteed if CHIPTEST is held high while the device is in normal operation.

C.12 Address maps

C.12.1 Top-level address map
Notes:
1) The register names used in the top-level address map, as set out in Table C.11.1, are the names used during design. These names need not appear in the datasheet.
2) Because this is the complete address map, many of the locations listed here are for test use only.
Table C.11.1 Top-level registers: top-level address map
Register name   Address   Bits   Note
BU_EVENT   0x0   8   Write 1 to reset
BU_MASK   0x1   8   R/W
BU_EN_INTERRUPTS   0x2   1   R/W
BU_WADDR_COD_STD   0x4   2   R/W
BU_WADDR_ACCESS   0x5   1   R/W - access
BU_WADDR_CTL1   0x6   3   R/W
BU_DISPADDR_LINES_IN_LAST_ROW0   0x8   3   R/W
BU_DISPADDR_LINES_IN_LAST_ROW1   0x9   3   R/W
BU_DISPADDR_LINES_IN_LAST_ROW2   0xa   3   R/W
BU_DISPADDR_ACCESS   0xb   1   R/W - access
BU_DISPADDR_CTL0   0xc   8   R/W
BU_DISPADDR_CTL1   0xd   1   R/W
BU_BM_ACCESS   0x10   1   R/W - access
BU_BM_CTL0   0x11   2   R/W
BU_BM_TARGET_IX   0x12   4   R/W
BU_BM_PRES_NUM   0x13   8   R/W - asynchronous
BU_BM_THIS_PNUM   0x14   8   R/W
BU_BM_PIC_NUM0   0x15   8   R/W
BU_BM_PIC_NUM1   0x16   8   R/W
BU_BM_PIC_NUM2   0x17   8   R/W
BU_BM_TEMP_REF   0x18   5   RO
BU_ADDRGEN_KEYHOLE_ADDR_MSB   0x28   1   R/W - address generator keyhole; see Table C.11.2 for contents
BU_ADDRGEN_KEYHOLE_ADDR_LSB   0x29   8
BU_ADDRGEN_KEYHOLE_DATA   0x2a   8
BU_IT_PAGE_START   0x30   5   R/W
BU_IT_READ_CYCLE   0x31   4   R/W
BU_IT_WRITE_CYCLE   0x32   4   R/W
BU_IT_REFRESH_CYCLE   0x33   4   R/W
BU_IT_RAS_FALLING   0x34   4   R/W
BU_IT_CAS_FALLING   0x35   4   R/W
BU_IT_CONFIG   0x36   1   R/W
BU_OC_ACCESS   0x40   1   R/W - access
BU_OC_MODE   0x41   2   R/W
BU_OC_2WIRE   0x42   1   R/W
BU_OC_BORDER_R   0x49   8   R/W
BU_OC_BORDER_G   0x4a   8   R/W
BU_OC_BORDER_B   0x4b   8   R/W
BU_OC_BLANK_R   0x4d   8   R/W
BU_OC_BLANK_G   0x4e   8   R/W
BU_OC_BLANK_B   0x4f   8   R/W
BU_OC_HDELAY_1   0x50   3   R/W
BU_OC_HDELAY_0   0x51   8   R/W
BU_OC_WEST_1   0x52   3   R/W
BU_OC_WEST_0   0x53   8   R/W
BU_OC_EAST_1   0x54   3   R/W
BU_OC_EAST_0   0x55   8   R/W
BU_OC_WIDTH_1   0x56   3   R/W
BU_OC_WIDTH_0   0x57   8   R/W
BU_OC_VDELAY_1   0x58   3   R/W
BU_OC_VDELAY_0   0x59   8   R/W
BU_OC_NORTH_1   0x5a   3   R/W
BU_OC_NORTH_0   0x5b   8   R/W
BU_OC_SOUTH_1   0x5c   3   R/W
BU_OC_SOUTH_0   0x5d   8   R/W
BU_OC_HEIGHT_1   0x5e   3   R/W
BU_OC_HEIGHT_0   0x5f   8   R/W
BU_IF_CONFIGURE   0x60   5   R/W
BU_UV_MODE   0x61   6   R/W - x000x000
BU_COEFF_KEYADDR   0x62   7   R/W - see Table C.11.3 for contents
BU_COEFF_KEYDATA   0x63   8
BU_GA_ACCESS   0x68   1   R/W
BU_GA_BYPASS   0x69   1   R/W
BU_GA_RAM0_ADDR   0x6a   8   R/W
BU_GA_RAM0_DATA   0x6b   8   R/W
BU_GA_RAM1_ADDR   0x6c   8   R/W
BU_GA_RAM1_DATA   0x6d   8   R/W
BU_GA_RAM2_ADDR   0x6e   8   R/W
BU_GA_RAM2_DATA   0x6f   8   R/W
BU_DIVA_3   0x70   1   R/W
BU_DIVA_2   0x71   8   R/W
BU_DIVA_1   0x72   8   R/W
BU_DIVA_0   0x73   8   R/W
BU_DIVP_3   0x74   1   R/W
BU_DIVP_2   0x75   8   R/W
BU_DIVP_1   0x76   8   R/W
BU_DIVP_0   0x77   8   R/W
BU_PAD_CONFIG_1   0x78   7   R/W
BU_PAD_CONFIG_0   0x79   8   R/W
BU_PLL_RESISTORS   0x7a   8   R/W
BU_REF_INTERVAL   0x7b   8   R/W
BU_REVISION   0xff   8   RO - revision
The following registers are in the test space. They are not expected to appear in the datasheet.
BU_BM_PRES_FLAG   0x80   1   R/W
BU_BM_EXP_TR   0x81   --   These registers are missing on rev A
BU_BM_TR_DELTA   0x82   --
BU_BM_ARR_IX   0x83   2   R/W
BU_BM_DSP_IX   0x84   2   R/W
BU_BM_RDY_IX   0x85   2   R/W
BU_BM_BSTATE3   0x86   2   R/W
BU_BM_BSTATE2   0x87   2   R/W
BU_BM_BSTATE1   0x88   2   R/W
BU_BM_INDEX   0x89   2   R/W
BU_BM_STATE   0x8a   6   R/W
BU_BM_FROMPS   0x8b   1   R/W
BU_BM_FROMFL   0x8c   1   R/W
BU_DA_COMP0_SNP3   0x90   8   R/W - snooper on the display address generator address output
BU_DA_COMP0_SNP2   0x91   8
BU_DA_COMP0_SNP1   0x92   8
BU_DA_COMP0_SNP0   0x93   8
BU_DA_COMP1_SNP3   0x94   8
BU_DA_COMP1_SNP2   0x95   8
BU_DA_COMP1_SNP1   0x96   8
BU_DA_COMP1_SNP0   0x97   8
BU_DA_COMP2_SNP3   0x98   8
BU_DA_COMP2_SNP2   0x99   8
BU_DA_COMP2_SNP1   0x9a   8
BU_DA_COMP2_SNP0   0x9b   8
BU_UV_RAM1A_ADDR_1   0xa0   8   R/W - UPI test access to the vertical upsampling RAMs
BU_UV_RAM1A_ADDR_0   0xa1   8
BU_UV_RAM1A_DATA   0xa2   8
BU_UV_RAM1B_ADDR_1   0xa4   8
BU_UV_RAM1B_ADDR_0   0xa5   8
BU_UV_RAM1B_DATA   0xa6   8
BU_UV_RAM2A_ADDR_1   0xa8   8
BU_UV_RAM2A_ADDR_0   0xa9   8
BU_UV_RAM2A_DATA   0xaa   8
BU_UV_RAM2B_ADDR_1   0xac   8
BU_UV_RAM2B_ADDR_0   0xad   8
BU_UV_RAM2B_DATA   0xae   8
BU_WA_ADDR_SNP2   0xb0   8   R/W - snooper on the write address generator address output
BU_WA_ADDR_SNP1   0xb1   8
BU_WA_ADDR_SNP0   0xb2   8
BU_WA_DATA_SNP1   0xb4   8   R/W - snooper on the WA data output
BU_WA_DATA_SNP0   0xb5   8
BU_IF_SNP0_1   0xb8   8   R/W - three snoopers on the dramif data output
BU_IF_SNP0_0   0xb9   8
BU_IF_SNP1_1   0xba   8
BU_IF_SNP1_0   0xbb   8
BU_IF_SNP2_1   0xbc   8
BU_IF_SNP2_0   0xbd   8
BU_IFRAM_ADDR_1   0xc0   1   R/W - UPI access to the IF RAM
BU_IFRAM_ADDR_0   0xc1   8
BU_IFRAM_DATA   0xc2   8
BU_OC_SNP_3   0xc4   8   R/W - snooper on the chip output
BU_OC_SNP_2   0xc5   8
BU_OC_SNP_1   0xc6   8
BU_OC_SNP_0   0xc7   8
BU_YAPLL_CONFIG   0xc8   8   R/W
BU_BM_FRONT_BYPASS   0xca   1   R/W
Table C.11.2 Top-level registers: address generator keyhole
Keyhole register name   Keyhole address   Bits   Note
BU_DISPADDR_BUFFER0_BASE_MSB   0x01   2   18-bit register; must be loaded
BU_DISPADDR_BUFFER0_BASE_MID   0x02   8
BU_DISPADDR_BUFFER0_BASE_LSB   0x03   8
BU_DISPADDR_BUFFER1_BASE_MSB   0x05   2   Must be loaded
BU_DISPADDR_BUFFER1_BASE_MID   0x06   8
BU_DISPADDR_BUFFER1_BASE_LSB   0x07   8
BU_DISPADDR_BUFFER2_BASE_MSB   0x09   2   Must be loaded
BU_DISPADDR_BUFFER2_BASE_MID   0x0a   8
BU_DISPADDR_BUFFER2_BASE_LSB   0x0b   8
BU_DLDPATH_LINE0_MSB   0x0d   2   Test only
BU_DLDPATH_LINE0_MID   0x0e   8
BU_DLDPATH_LINE0_LSB   0x0f   8
BU_DLDPATH_LINE1_MSB   0x11   2   Test only
BU_DLDPATH_LINE1_MID   0x12   8
BU_DLDPATH_LINE1_LSB   0x13   8
BU_DLDPATH_LINE2_MSB   0x15   2   Test only
BU_DLDPATH_LINE2_MID   0x16   8
BU_DLDPATH_LINE2_LSB   0x17   8
BU_DLDPATH_VBCNT0_MSB   0x19   2   Test only
BU_DLDPATH_VBCNT0_MID   0x1a   8
BU_DLDPATH_VBCNT0_LSB   0x1b   8
BU_DLDPATH_VBCNT1_MSB   0x1d   2   Test only
BU_DLDPATH_VBCNT1_MID   0x1e   8
BU_DLDPATH_VBCNT1_LSB   0x1f   8
BU_DLDPATH_VBCNT2_MSB   0x21   2   Test only
BU_DLDPATH_VBCNT2_MID   0x22   8
BU_DLDPATH_VBCNT2_LSB   0x23   8
BU_DISPADDR_COMP0_OFFSET_MSB   0x25   2   Must be loaded
BU_DISPADDR_COMP0_OFFSET_MID   0x26   8
BU_DISPADDR_COMP0_OFFSET_LSB   0x27   8
BU_DISPADDR_COMP1_OFFSET_MSB   0x29   2   Must be loaded
BU_DISPADDR_COMP1_OFFSET_MID   0x2a   8
BU_DISPADDR_COMP1_OFFSET_LSB   0x2b   8
BU_DISPADDR_COMP2_OFFSET_MSB   0x2d   2   Must be loaded
BU_DISPADDR_COMP2_OFFSET_MID   0x2e   8
BU_DISPADDR_COMP2_OFFSET_LSB   0x2f   8
BU_DISPADDR_COMP0_VBS_MSB   0x31   2   Must be loaded
BU_DISPADDR_COMP0_VBS_MID   0x32   8
BU_DISPADDR_COMP0_VBS_LSB   0x33   8
BU_DISPADDR_COMP1_VBS_MSB   0x35   2   Must be loaded
BU_DISPADDR_COMP1_VBS_MID   0x36   8
BU_DISPADDR_COMP1_VBS_LSB   0x37   8
BU_DISPADDR_COMP2_VBS_MSB   0x39   2   Must be loaded
BU_DISPADDR_COMP2_VBS_MID   0x3a   8
BU_DISPADDR_COMP2_VBS_LSB   0x3b   8
BU_ADDR_COMP0_HBS_MSB   0x3d   2   Must be loaded
BU_ADDR_COMP0_HBS_MID   0x3e   8
BU_ADDR_COMP0_HBS_LSB   0x3f   8
BU_ADDR_COMP1_HBS_MSB   0x41   2   Must be loaded
BU_ADDR_COMP1_HBS_MID   0x42   8
BU_ADDR_COMP1_HBS_LSB   0x43   8
BU_ADDR_COMP2_HBS_MSB   0x45   2   Must be loaded
BU_ADDR_COMP2_HBS_MID   0x46   8
BU_ADDR_COMP2_HBS_LSB   0x47   8
BU_DISPADDR_COMP0_HBS_MSB   0x49   2   Must be loaded
BU_DISPADDR_COMP0_HBS_MID   0x4a   8
BU_DISPADDR_COMP0_HBS_LSB   0x4b   8
BU_DISPADDR_COMP1_HBS_MSB   0x4d   2   Must be loaded
BU_DISPADDR_COMP1_HBS_MID   0x4e   8
BU_DISPADDR_COMP1_HBS_LSB   0x4f   8
BU_DISPADDR_COMP2_HBS_MSB   0x51   2   Must be loaded
BU_DISPADDR_COMP2_HBS_MID   0x52   8
BU_DISPADDR_COMP2_HBS_LSB   0x53   8
BU_DISPADDR_CNT_left0_MSB   0x55   2   Test only
BU_DISPADDR_CNT_left0_MID   0x56   8
BU_DISPADDR_CNT_left0_LSB   0x57   8
BU_DISPADDR_CNT_left1_MSB   0x59   2   Test only
BU_DISPADDR_CNT_left1_MID   0x5a   8
BU_DISPADDR_CNT_left1_LSB   0x5b   8
BU_DISPADDR_CNT_left2_MSB   0x5d   2   Test only
BU_DISPADDR_CNT_left2_MID   0x5e   8
BU_DISPADDR_CNT_left2_LSB   0x5f   8
BU_DISPADDR_PAGE_ADDR0_MSB   0x61   2   Test only
BU_DISPADDR_PAGE_ADDR0_MID   0x62   8
BU_DISPADDR_PAGE_ADDR0_LSB   0x63   8
BU_DISPADDR_PAGE_ADDR1_MSB   0x65   2   Test only
BU_DISPADDR_PAGE_ADDR1_MID   0x66   8
BU_DISPADDR_PAGE_ADDR1_LSB   0x67   8
BU_DISPADDR_PAGE_ADDR2_MSB   0x69   2   Test only
BU_DISPADDR_PAGE_ADDR2_MID   0x6a   8
BU_DISPADDR_PAGE_ADDR2_LSB   0x6b   8
BU_DISPADDR_BLOCK_ADDR0_MSB   0x6d   2   Test only
BU_DISPADDR_BLOCK_ADDR0_MID   0x6e   8
BU_DISPADDR_BLOCK_ADDR0_LSB   0x6f   8
BU_DISPADDR_BLOCK_ADDR1_MSB   0x71   2   Test only
BU_DISPADDR_BLOCK_ADDR1_MID   0x72   8
BU_DISPADDR_BLOCK_ADDR1_LSB   0x73   8
BU_DISPADDR_BLOCK_ADDR2_MSB   0x75   2   Test only
BU_DISPADDR_BLOCK_ADDR2_MID   0x76   8
BU_DISPADDR_BLOCK_ADDR2_LSB   0x77   8
BU_DISPADDR_BLOCKS_left0_MSB   0x79   2   Test only
BU_DISPADDR_BLOCKS_left0_MID   0x7a   8
BU_DISPADDR_BLOCKS_left0_LSB   0x7b   8
BU_DISPADDR_BLOCKS_left1_MSB   0x7d   2   Test only
BU_DISPADDR_BLOCKS_left1_MID   0x7e   8
BU_DISPADDR_BLOCKS_left1_LSB   0x7f   8
BU_DISPADDR_BLOCKS_left2_MSB   0x81   2   Test only
BU_DISPADDR_BLOCKS_left2_MID   0x82   8
BU_DISPADDR_BLOCKS_left2_LSB   0x83   8
BU_WADDR_BUFFER0_BASE_MSB   0x85   2   Must be loaded
BU_WADDR_BUFFER0_BASE_MID   0x86   8
BU_WADDR_BUFFER0_BASE_LSB   0x87   8
BU_WADDR_BUFFER1_BASE_MSB   0x89   2   Must be loaded
BU_WADDR_BUFFER1_BASE_MID   0x8a   8
BU_WADDR_BUFFER1_BASE_LSB   0x8b   8
BU_WADDR_BUFFER2_BASE_MSB   0x8d   2   Must be loaded
BU_WADDR_BUFFER2_BASE_MID   0x8e   8
BU_WADDR_BUFFER2_BASE_LSB   0x8f   8
BU_WADDR_COMP0_HMBADDR_MSB   0x91   2   Test only
BU_WADDR_COMP0_HMBADDR_MID   0x92   8
BU_WADDR_COMP0_HMBADDR_LSB   0x93   8
BU_WADDR_COMP1_HMBADDR_MSB   0x95   2   Test only
BU_WADDR_COMP1_HMBADDR_MID   0x96   8
BU_WADDR_COMP1_HMBADDR_LSB   0x97   8
BU_WADDR_COMP2_HMBADDR_MSB   0x99   2   Test only
BU_WADDR_COMP2_HMBADDR_MID   0x9a   8
BU_WADDR_COMP2_HMBADDR_LSB   0x9b   8
BU_WADDR_COMP0_VMBADDR_MSB   0x9d   2   Test only
BU_WADDR_COMP0_VMBADDR_MID   0x9e   8
BU_WADDR_COMP0_VMBADDR_LSB   0x9f   8
BU_WADDR_COMP1_VMBADDR_MSB   0xa1   2   Test only
BU_WADDR_COMP1_VMBADDR_MID   0xa2   8
BU_WADDR_COMP1_VMBADDR_LSB   0xa3   8
BU_WADDR_COMP2_VMBADDR_MSB   0xa5   2   Test only
BU_WADDR_COMP2_VMBADDR_MID   0xa6   8
BU_WADDR_COMP2_VMBADDR_LSB   0xa7   8
BU_WADDR_VBADDR_MSB   0xa9   2   Test only
BU_WADDR_VBADDR_MID   0xaa   8
BU_WADDR_VBADDR_LSB   0xab   8
BU_WADDR_COMP0_HALF_WIDTH_IN_BLOCKS_MSB   0xad   2   Must be loaded
BU_WADDR_COMP0_HALF_WIDTH_IN_BLOCKS_MID   0xae   8
BU_WADDR_COMP0_HALF_WIDTH_IN_BLOCKS_LSB   0xaf   8
BU_WADDR_COMP1_HALF_WIDTH_IN_BLOCKS_MSB   0xb1   2   Must be loaded
BU_WADDR_COMP1_HALF_WIDTH_IN_BLOCKS_MID   0xb2   8
BU_WADDR_COMP1_HALF_WIDTH_IN_BLOCKS_LSB   0xb3   8
BU_WADDR_COMP2_HALF_WIDTH_IN_BLOCKS_MSB   0xb5   2   Must be loaded
BU_WADDR_COMP2_HALF_WIDTH_IN_BLOCKS_MID   0xb6   8
BU_WADDR_COMP2_HALF_WIDTH_IN_BLOCKS_LSB   0xb7   8
BU_WADDR_HB_MSB   0xb9   2   Test only
BU_WADDR_HB_MID   0xba   8
BU_WADDR_HB_LSB   0xbb   8
BU_WADDR_COMP0_OFFSET_MSB   0xbd   2   Must be loaded
BU_WADDR_COMP0_OFFSET_MID   0xbe   8
BU_WADDR_COMP0_OFFSET_LSB   0xbf   8
BU_WADDR_COMP1_OFFSET_MSB   0xc1   2   Must be loaded
BU_WADDR_COMP1_OFFSET_MID   0xc2   8
BU_WADDR_COMP1_OFFSET_LSB   0xc3   8
BU_WADDR_COMP2_OFFSET_MSB   0xc5   2   Must be loaded
BU_WADDR_COMP2_OFFSET_MID   0xc6   8
BU_WADDR_COMP2_OFFSET_LSB   0xc7   8
BU_WADDR_SCRATCH_MSB   0xc9   2   Test only
BU_WADDR_SCRATCH_MID   0xca   8
BU_WADDR_SCRATCH_LSB   0xcb   8
BU_WADDR_MBS_WIDE_MSB   0xcd   2   Must be loaded
BU_WADDR_MBS_WIDE_MID   0xce   8
BU_WADDR_MBS_WIDE_LSB   0xcf   8
BU_WADDR_MBS_HIGH_MSB   0xd1   2   Must be loaded
BU_WADDR_MBS_HIGH_MID   0xd2   8
BU_WADDR_MBS_HIGH_LSB   0xd3   8
BU_WADDR_COMP0_LAST_MB_IN_ROW_MSB   0xd5   2   Must be loaded
BU_WADDR_COMP0_LAST_MB_IN_ROW_MID   0xd6   8
BU_WADDR_COMP0_LAST_MB_IN_ROW_LSB   0xd7   8
BU_WADDR_COMP1_LAST_MB_IN_ROW_MSB   0xd9   2   Must be loaded
BU_WADDR_COMP1_LAST_MB_IN_ROW_MID   0xda   8
BU_WADDR_COMP1_LAST_MB_IN_ROW_LSB   0xdb   8
BU_WADDR_COMP2_LAST_MB_IN_ROW_MSB   0xdd   2   Must be loaded
BU_WADDR_COMP2_LAST_MB_IN_ROW_MID   0xde   8
BU_WADDR_COMP2_LAST_MB_IN_ROW_LSB   0xdf   8
BU_WADDR_COMP0_LAST_MB_IN_HALF_ROW_MSB   0xe1   2   Must be loaded
BU_WADDR_COMP0_LAST_MB_IN_HALF_ROW_MID   0xe2   8
BU_WADDR_COMP0_LAST_MB_IN_HALF_ROW_LSB   0xe3   8
BU_WADDR_COMP1_LAST_MB_IN_HALF_ROW_MSB   0xe5   2   Must be loaded
BU_WADDR_COMP1_LAST_MB_IN_HALF_ROW_MID   0xe6   8
BU_WADDR_COMP1_LAST_MB_IN_HALF_ROW_LSB   0xe7   8
BU_WADDR_COMP2_LAST_MB_IN_HALF_ROW_MSB   0xe9   2   Must be loaded
BU_WADDR_COMP2_LAST_MB_IN_HALF_ROW_MID   0xea   8
BU_WADDR_COMP2_LAST_MB_IN_HALF_ROW_LSB   0xeb   8
BU_WADDR_COMP0_LAST_ROW_IN_MB_MSB   0xed   2   Must be loaded
BU_WADDR_COMP0_LAST_ROW_IN_MB_MID   0xee   8
BU_WADDR_COMP0_LAST_ROW_IN_MB_LSB   0xef   8
BU_WADDR_COMP1_LAST_ROW_IN_MB_MSB   0xf1   2   Must be loaded
BU_WADDR_COMP1_LAST_ROW_IN_MB_MID   0xf2   8
BU_WADDR_COMP1_LAST_ROW_IN_MB_LSB   0xf3   8
BU_WADDR_COMP2_LAST_ROW_IN_MB_MSB   0xf5   2   Must be loaded
BU_WADDR_COMP2_LAST_ROW_IN_MB_MID   0xf6   8
BU_WADDR_COMP2_LAST_ROW_IN_MB_LSB   0xf7   8
BU_WADDR_COMP0_BLOCKS_PER_MB_ROW_MSB   0xf9   2   Must be loaded
BU_WADDR_COMP0_BLOCKS_PER_MB_ROW_MID   0xfa   8
BU_WADDR_COMP0_BLOCKS_PER_MB_ROW_LSB   0xfb   8
BU_WADDR_COMP1_BLOCKS_PER_MB_ROW_MSB   0xfd   2   Must be loaded
BU_WADDR_COMP1_BLOCKS_PER_MB_ROW_MID   0xfe   8
BU_WADDR_COMP1_BLOCKS_PER_MB_ROW_LSB   0xff   8
BU_WADDR_COMP2_BLOCKS_PER_MB_ROW_MSB   0x101   2   Must be loaded
BU_WADDR_COMP2_BLOCKS_PER_MB_ROW_MID   0x102   8
BU_WADDR_COMP2_BLOCKS_PER_MB_ROW_LSB   0x103   8
BU_WADDR_COMP0_LAST_MB_ROW_MSB   0x105   2   Must be loaded
BU_WADDR_COMP0_LAST_MB_ROW_MID   0x106   8
BU_WADDR_COMP0_LAST_MB_ROW_LSB   0x107   8
BU_WADDR_COMP1_LAST_MB_ROW_MSB   0x109   2   Must be loaded
BU_WADDR_COMP1_LAST_MB_ROW_MID   0x10a   8
BU_WADDR_COMP1_LAST_MB_ROW_LSB   0x10b   8
BU_WADDR_COMP2_LAST_MB_ROW_MSB   0x10d   2   Must be loaded
BU_WADDR_COMP2_LAST_MB_ROW_MID   0x10e   8
BU_WADDR_COMP2_LAST_MB_ROW_LSB   0x10f   8
BU_WADDR_COMP0_HBS_MSB   0x111   2   Must be loaded
BU_WADDR_COMP0_HBS_MID   0x112   8
BU_WADDR_COMP0_HBS_LSB   0x113   8
BU_WADDR_COMP1_HBS_MSB   0x115   2   Must be loaded
BU_WADDR_COMP1_HBS_MID   0x116   8
BU_WADDR_COMP1_HBS_LSB   0x117   8
BU_WADDR_COMP2_HBS_MSB   0x119   2   Must be loaded
BU_WADDR_COMP2_HBS_MID   0x11a   8
BU_WADDR_COMP2_HBS_LSB   0x11b   8
BU_WADDR_COMP0_MAXHB   0x11f   2   Must be loaded
BU_WADDR_COMP1_MAXHB   0x123   2
BU_WADDR_COMP2_MAXHB   0x127   2
BU_WADDR_COMP0_MAXVB   0x12b   2   Must be loaded
BU_WADDR_COMP1_MAXVB   0x12f   2
BU_WADDR_COMP2_MAXVB   0x133   2
Table C.11.3 Horizontal upsampler and color-space keyhole address map
Keyhole register name   Keyhole address   Bits   Note
BU_UH0_A00_1   0x0   5   R/W - Coeff 0.0
BU_UH0_A00_0   0x1   8
BU_UH0_A01_1   0x2   5   R/W - Coeff 0.1
BU_UH0_A01_0   0x3   8
BU_UH0_A02_1   0x4   5   R/W - Coeff 0.2
BU_UH0_A02_0   0x5   8
BU_UH0_A03_1   0x6   5   R/W - Coeff 0.3
BU_UH0_A03_0   0x7   8
BU_UH0_A10_1   0x8   5   R/W - Coeff 1.0
BU_UH0_A10_0   0x9   8
BU_UH0_A11_1   0xa   5   R/W - Coeff 1.1
BU_UH0_A11_0   0xb   8
BU_UH0_A12_1   0xc   5   R/W - Coeff 1.2
BU_UH0_A12_0   0xd   8
BU_UH0_A13_1   0xe   5   R/W - Coeff 1.3
BU_UH0_A13_0   0xf   8
BU_UH0_A20_1   0x10   5   R/W - Coeff 2.0
BU_UH0_A20_0   0x11   8
BU_UH0_A21_1   0x12   5   R/W - Coeff 2.1
BU_UH0_A21_0   0x13   8
BU_UH0_A22_1   0x14   5   R/W - Coeff 2.2
BU_UH0_A22_0   0x15   8
BU_UH0_A23_1   0x16   5   R/W - Coeff 2.3
BU_UH0_A23_0   0x17   8
BU_UH0_MODE   0x18   2   R/W
BU_UH1_A00_1   0x20   5   R/W - Coeff 0.0
BU_UH1_A00_0   0x21   8
BU_UH1_A01_1   0x22   5   R/W - Coeff 0.1
BU_UH1_A01_0   0x23   8
BU_UH1_A02_1   0x24   5   R/W - Coeff 0.2
BU_UH1_A02_0   0x25   8
BU_UH1_A03_1   0x26   5   R/W - Coeff 0.3
BU_UH1_A03_0   0x27   8
BU_UH1_A10_1   0x28   5   R/W - Coeff 1.0
BU_UH1_A10_0   0x29   8
BU_UH1_A11_1   0x2a   5   R/W - Coeff 1.1
BU_UH1_A11_0   0x2b   8
BU_UH1_A12_1   0x2c   5   R/W - Coeff 1.2
BU_UH1_A12_0   0x2d   8
BU_UH1_A13_1   0x2e   5   R/W - Coeff 1.3
BU_UH1_A13_0   0x2f   8
BU_UH1_A20_1   0x30   5   R/W - Coeff 2.0
BU_UH1_A20_0   0x31   8
BU_UH1_A21_1   0x32   5   R/W - Coeff 2.1
BU_UH1_A21_0   0x33   8
BU_UH1_A22_1   0x34   5   R/W - Coeff 2.2
BU_UH1_A22_0   0x35   8
BU_UH1_A23_1   0x36   5   R/W - Coeff 2.3
BU_UH1_A23_0   0x37   8
BU_UH1_MODE   0x38   2   R/W
BU_UH2_A00_1   0x40   5   R/W - Coeff 0.0
BU_UH2_A00_0   0x41   8
BU_UH2_A01_1   0x42   5   R/W - Coeff 0.1
BU_UH2_A01_0   0x43   8
BU_UH2_A02_1   0x44   5   R/W - Coeff 0.2
BU_UH2_A02_0   0x45   8
BU_UH2_A03_1   0x46   5   R/W - Coeff 0.3
BU_UH2_A03_0   0x47   8
BU_UH2_A10_1   0x48   5   R/W - Coeff 1.0
BU_UH2_A10_0   0x49   8
BU_UH2_A11_1   0x4a   5   R/W - Coeff 1.1
BU_UH2_A11_0   0x4b   8
BU_UH2_A12_1   0x4c   5   R/W - Coeff 1.2
BU_UH2_A12_0   0x4d   8
BU_UH2_A13_1   0x4e   5   R/W - Coeff 1.3
BU_UH2_A13_0   0x4f   8
BU_UH2_A20_1   0x50   5   R/W - Coeff 2.0
BU_UH2_A20_0   0x51   8
BU_UH2_A21_1   0x52   5   R/W - Coeff 2.1
BU_UH2_A21_0   0x53   8
BU_UH2_A22_1   0x54   5   R/W - Coeff 2.2
BU_UH2_A22_0   0x55   8
BU_UH2_A23_1   0x56   5   R/W - Coeff 2.3
BU_UH2_A23_0   0x57   8
BU_UH2_MODE   0x58   2   R/W
BU_CS_A00_1   0x60   5   R/W
BU_CS_A00_0   0x61   8
BU_CS_A10_1   0x62   5   R/W
BU_CS_A10_0   0x63   8
BU_CS_A20_1   0x64   5   R/W
BU_CS_A20_0   0x65   8
BU_CS_B0_1   0x66   6   R/W
BU_CS_B0_0   0x67   8
BU_CS_A01_1   0x68   5   R/W
BU_CS_A01_0   0x69   8
BU_CS_A11_1   0x6a   5   R/W
BU_CS_A11_0   0x6b   8
BU_CS_A21_1   0x6c   5   R/W
BU_CS_A21_0   0x6d   8
BU_CS_B1_1   0x6e   6   R/W
BU_CS_B1_0   0x6f   8
BU_CS_A02_1   0x70   5   R/W
BU_CS_A02_0   0x71   8
BU_CS_A12_1   0x72   5   R/W
BU_CS_A12_0   0x73   8
BU_CS_A22_1   0x74   5   R/W
BU_CS_A22_0   0x75   8
BU_CS_B2_1   0x76   6   R/W
BU_CS_B2_0   0x77   8
C.13 Picture size parameters

C.13.1 Introduction

The following code fragments detail the processing required in response to picture size interrupts from the write address generator. Note that the picture size parameters can be changed "on the fly" by sending a combination of HORIZONTAL_MBS, VERTICAL_MBS and DEFINE_SAMPLING (for each component) tokens, which results in write address generator interrupts. These tokens can arrive in any order and, in general, any one of them requires the picture size parameters to be recalculated. At setup time, however, it is more efficient to detect the arrival of all the events before performing any calculation.
At setup it is also possible to write specific values directly into the picture size parameter registers, so that there is no need to rely on interrupt processing in response to tokens. For this reason, suitable register values for SIF pictures are also given.

C.13.2 Interrupt processing for picture size parameters
There are five picture size events. The initial response to each event is as follows:
    if (hmbs_event)
        load(mbs_wide);
    else if (vmbs_event)
        load(mbs_high);
    else if (def_samp0_event)
    {
        load(maxhb(0));
        load(maxvb(0));
    }
    else if (def_samp1_event)
    {
        load(maxhb(1));
        load(maxvb(1));
    }
    else if (def_samp2_event)
    {
        load(maxhb(2));
        load(maxvb(2));
    }
In addition, the following calculations are required to keep the picture size parameters consistent:
    if (hmbs_event || vmbs_event ||
        def_samp0_event || def_samp1_event || def_samp2_event)
    {
        for (i = 0; i < max_component; i++)
        {
            hbs(i) = addr_hbs(i) = (maxhb(i) + 1) * mbs_wide;
            half_width_in_blocks(i) = ((maxhb(i) + 1) * mbs_wide) / 2;
            last_mb_in_row(i) = hbs(i) - (maxhb(i) + 1);
            last_mb_in_half_row(i) = half_width_in_blocks(i) - (maxhb(i) + 1);
            last_row_in_mb(i) = hbs(i) * maxvb(i);
            blocks_per_mb_row(i) = last_row_in_mb(i) + hbs(i);
            last_mb_row(i) = blocks_per_mb_row(i) * (mbs_high - 1);
        }
    }
Although it is not strictly necessary, the values of the dispaddr registers (for example, the display window size) may also be modified in response to picture size interrupts. Whether to do so depends on the requirements of the particular application.

C.13.3 Register values for SIF pictures

The values contained in all the picture size registers after the above interrupt processing, for a SIF 4:2:0 stream, are as follows:

C.13.3.1 Initial values
BU_WADDR_MBS_WIDE=0x16
BU_WADDR_MBS_HIGH=0x12
BU_WADDR_COMP0_MAXHB=0x01
BU_WADDR_COMP1_MAXHB=0x00
BU_WADDR_COMP2_MAXHB=0x00
BU_WADDR_COMP0_MAXVB=0x01
BU_WADDR_COMP1_MAXVB=0x00
BU_WADDR_COMP2_MAXVB=0x00

C.13.3.2 Secondary values (after calculation)
BU_WADDR_COMP0_HBS=0x2C
BU_WADDR_COMP1_HBS=0x16
BU_WADDR_COMP2_HBS=0x16
BU_ADDR_COMP0_HBS=0x2C
BU_ADDR_COMP1_HBS=0x16
BU_ADDR_COMP2_HBS=0x16
BU_WADDR_COMP0_HALF_WIDTH_IN_BLOCKS=0x16
BU_WADDR_COMP1_HALF_WIDTH_IN_BLOCKS=0x0B
BU_WADDR_COMP2_HALF_WIDTH_IN_BLOCKS=0x0B
BU_WADDR_COMP0_LAST_MB_IN_ROW=0x2A
BU_WADDR_COMP1_LAST_MB_IN_ROW=0x15
BU_WADDR_COMP2_LAST_MB_IN_ROW=0x15
BU_WADDR_COMP0_LAST_MB_IN_HALF_ROW=0x14
BU_WADDR_COMP1_LAST_MB_IN_HALF_ROW=0x0A
BU_WADDR_COMP2_LAST_MB_IN_HALF_ROW=0x0A
BU_WADDR_COMP0_LAST_ROW_IN_MB=0x2C
BU_WADDR_COMP1_LAST_ROW_IN_MB=0x0
BU_WADDR_COMP2_LAST_ROW_IN_MB=0x0
BU_WADDR_COMP0_BLOCKS_PER_MB_ROW=0x58
BU_WADDR_COMP1_BLOCKS_PER_MB_ROW=0x16
BU_WADDR_COMP2_BLOCKS_PER_MB_ROW=0x16
BU_WADDR_COMP0_LAST_MB_ROW=0x5D8
BU_WADDR_COMP1_LAST_MB_ROW=0x176
BU_WADDR_COMP2_LAST_MB_ROW=0x176
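As a cross-check, the calculations of C.13.2 can be written as a small runnable C routine and applied to the SIF initial values. The struct layout and the name `picture_size_recalc` are illustrative assumptions; only the formulas come from the text above.

```c
#include <stdint.h>

#define MAX_COMPONENT 3

/* Hypothetical software mirror of the picture size registers. */
struct picture_size {
    /* Primary values, loaded from tokens or written at setup. */
    uint32_t mbs_wide, mbs_high;
    uint32_t maxhb[MAX_COMPONENT], maxvb[MAX_COMPONENT];
    /* Secondary values, derived by picture_size_recalc(). */
    uint32_t hbs[MAX_COMPONENT];  /* also written to addr_hbs */
    uint32_t half_width_in_blocks[MAX_COMPONENT];
    uint32_t last_mb_in_row[MAX_COMPONENT];
    uint32_t last_mb_in_half_row[MAX_COMPONENT];
    uint32_t last_row_in_mb[MAX_COMPONENT];
    uint32_t blocks_per_mb_row[MAX_COMPONENT];
    uint32_t last_mb_row[MAX_COMPONENT];
};

/* Recompute every secondary value from the primary values. */
void picture_size_recalc(struct picture_size *p)
{
    for (int i = 0; i < MAX_COMPONENT; i++) {
        uint32_t blocks_wide = p->maxhb[i] + 1;  /* blocks per macroblock, horizontally */

        p->hbs[i] = blocks_wide * p->mbs_wide;
        p->half_width_in_blocks[i] = p->hbs[i] / 2;
        p->last_mb_in_row[i] = p->hbs[i] - blocks_wide;
        p->last_mb_in_half_row[i] = p->half_width_in_blocks[i] - blocks_wide;
        p->last_row_in_mb[i] = p->hbs[i] * p->maxvb[i];
        p->blocks_per_mb_row[i] = p->last_row_in_mb[i] + p->hbs[i];
        p->last_mb_row[i] = p->blocks_per_mb_row[i] * (p->mbs_high - 1);
    }
}
```

With mbs_wide = 0x16, mbs_high = 0x12, maxhb = {1, 0, 0} and maxvb = {1, 0, 0}, this reproduces the secondary values listed above, for example hbs = 0x2C for component 0 and last_mb_row = 0x176 for components 1 and 2.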
Note that if these values are written explicitly at setup, the multi-byte nature of most of the locations must be taken into account.
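A minimal sketch of that byte splitting, assuming each multi-byte location follows the MSB/MID/LSB pattern of Table C.11.2 (a 2-bit MSB followed by two 8-bit bytes at consecutive keyhole addresses). The type and function names are hypothetical, and the bus accesses needed to drive the keyhole address and data registers are deliberately left out.

```c
#include <stdint.h>

/* One keyhole write: which keyhole address gets which byte. */
struct keyhole_write {
    uint16_t addr;
    uint8_t  data;
};

/* Split an 18-bit value into the three writes for an MSB/MID/LSB group.
 * 'msb_addr' is the keyhole address of the _MSB location; the _MID and
 * _LSB locations are assumed to follow at msb_addr + 1 and msb_addr + 2. */
void keyhole_split18(uint16_t msb_addr, uint32_t value,
                     struct keyhole_write out[3])
{
    out[0].addr = msb_addr;
    out[0].data = (uint8_t)((value >> 16) & 0x03u);  /* 2-bit MSB */
    out[1].addr = (uint16_t)(msb_addr + 1u);
    out[1].data = (uint8_t)((value >> 8) & 0xFFu);   /* 8-bit MID */
    out[2].addr = (uint16_t)(msb_addr + 2u);
    out[2].data = (uint8_t)(value & 0xFFu);          /* 8-bit LSB */
}
```

For BU_WADDR_COMP0_LAST_MB_ROW = 0x5D8, whose MSB location is at keyhole address 0x105, this yields the bytes 0x00, 0x05 and 0xD8 for addresses 0x105, 0x106 and 0x107.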
The accompanying drawings, which will be understood by those of ordinary skill in the art, are incorporated in this application to further the understanding of the detailed structure and operation of the present invention.
The pipeline system of the present invention described above satisfies a long-standing need for an improved system. The improved pipeline system of the present invention has an input, an output, and a plurality of processing stages between the input and the output. The processing stages are interconnected by a two-wire interface, and tokens are conveyed along the pipeline. Control and/or data tokens, in the form of a universal adaptation unit, interface with all the processing stages in the pipeline and interact with selected stages to perform control, data and/or combined control-data functions in those stages, so that the processing stages in the pipeline are afforded a high degree of flexibility in configuration and processing. In accordance with the invention, a processing stage can be configured in response to the recognition of at least one token. One of the processing stages may be a start code detector, which receives the input and generates and/or converts tokens.
The invention further relates to an improved pipeline system having a spatial decoding system for video data that includes a Huffman decoder, an index to data, an arithmetic logic unit, and a microcode ROM. The microcode ROM holds a separately stored program for each of a plurality of different picture compression/decompression standards. These programs are selectable by tokens, which facilitates processing for a plurality of different picture standards.
The improved system includes a multi-standard video decompression apparatus having a plurality of stages interconnected by a two-wire interface and arranged as a pipeline processor. Control tokens and data tokens pass over the single two-wire interface, carrying both control and data in token format. A token decode circuit is positioned in certain stages to recognize certain tokens as control tokens pertinent to that stage and to pass unrecognized control tokens along the pipeline. Reconfiguration processing circuits are positioned in selected stages and respond to a recognized control token by reconfiguring that stage to handle an identified data token. A wide variety of unique supporting subsystem circuitry and processing techniques for implementing the system are disclosed.

Claims (7)

1. In a system having an input, an output, and at least one processing stage between the input and the output, characterized in that:
said processing stage is configurable in response to the recognition of at least one token, the token being in the form of a universal adaptation unit, to establish control and/or data functions,
whereby said processing stage is afforded enhanced flexibility in configuration and processing.
2. In a pipeline processing machine having a plurality of processing stages interconnected by a two-wire interface bus, characterized in that: control tokens and data tokens pass over said two-wire interface; and a token decode circuit positioned in certain of said processing stages recognizes certain tokens as control tokens pertinent to that stage and passes unrecognized control tokens along the pipeline.
3. A reconfigurable processing system having a plurality of processing stages, at least some of which use a two-wire interface, with control tokens and data tokens passing over the two-wire interface, characterized by: a token decode circuit responsive to tokens conveyed over the two-wire interface circuit, for recognizing and configuring certain of said tokens pertinent to the processing system, to produce an output indicative of token characteristics; an action identification stage responsive to the output of the token decode circuit, for reconfiguring a processing stage; and at least one processing stage that is reconfigurable in response to said action identification stage to handle data tokens received over said two-wire interface.
4. In a spatial decoding system for video data having a Huffman decoder, an index to the data, and an arithmetic logic unit, characterized by:
a microcode ROM having a separately stored program for each of a plurality of different picture compression/decompression standards, said programs being selectable by an interfacing adaptation unit in the form of a token, whereby processing for a plurality of picture standards is facilitated.
5. In a digital picture information processing system, characterized by:
means for selectively configuring said system to process data in accordance with a plurality of different picture compression/decompression standards.
6. For use in a system having a plurality of processing stages, characterized by:
a universal adaptation unit in the form of an interactive interfacing token for control and/or data functions among said processing stages, whereby said processing stages are afforded enhanced flexibility in configuration and processing.
7. In a system having an input, an output, and a plurality of processing stages between the input and the output, characterized by:
an interactive metamorphic interfacing token, defining a universal adaptation unit,
for control and/or data functions among said processing stages, whereby said processing stages are afforded enhanced flexibility in performing different tasks.
CN95103246A | 1994-03-24 | 1995-03-24 | Reconfigurable processing stage | Pending | CN1137212A (en)

Applications Claiming Priority (4)

Application Number | Priority Date | Filing Date | Title
GB9405914.4 | 1994-03-24
GB9405914A GB9405914D0 (en) | 1994-03-24 | 1994-03-24 | Video decompression
GB9504047.3 | 1995-02-28
GB9504047A GB2288521B (en) | 1995-02-28 | 1995-02-28 | Reconfigurable process stage

Related Child Applications (1)

Application Number | Title | Priority Date | Filing Date
CN98103849A (Division) CN1235483A (en) | Prediction filter | 1994-03-24 | 1998-02-16

Publications (1)

Publication Number | Publication Date
CN1137212A (en) | 1996-12-04

Family

ID=26304581

Family Applications (2)

Application Number | Title | Priority Date | Filing Date
CN95103246A (Pending) CN1137212A (en) | Reconfigurable processing stage | 1994-03-24 | 1995-03-24
CN98103849A (Pending) CN1235483A (en) | Prediction filter | 1994-03-24 | 1998-02-16

Family Applications After (1)

Application Number | Title | Priority Date | Filing Date
CN98103849A (Pending) CN1235483A (en) | Prediction filter | 1994-03-24 | 1998-02-16

Country Status (5)

Country | Link
JP (4) | JP3302527B2 (en)
KR (1) | KR100291532B1 (en)
CN (2) | CN1137212A (en)
CA (3) | CA2145219C (en)
GB (1) | GB2288521B (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US7873105B2 (en) | 2005-04-01 | 2011-01-18 | Broadcom Corporation | Hardware implementation of optimized single inverse quantization engine for a plurality of standards
CN103106120A (en) * | 2011-08-18 | 2013-05-15 | 国际商业机器公司 | Multithreaded physics engine with impulse propagation
CN108111865A (en) * | 2013-01-04 | 2018-06-01 | 索尼公司 | JCTVC-L0226: VPS and VPS_EXTENSION updates
CN109508206A (en) * | 2013-06-28 | 2019-03-22 | 英特尔公司 | Processor, the method and system loaded dependent on the partial width of mode is carried out to wider register
CN109901044A (en) * | 2017-12-07 | 2019-06-18 | 英业达科技有限公司 | The central processing unit differential testing system and method for multi circuit board
CN110688159A (en) * | 2017-07-20 | 2020-01-14 | 上海寒武纪信息科技有限公司 | Neural network task processing system
CN113939776A (en) * | 2019-06-04 | 2022-01-14 | 大陆汽车有限责任公司 | Active data generation taking uncertainty into account

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
FR2794601B1 (en) * | 1999-06-02 | 2001-07-27 | Dassault Automatismes | COMMUNICATION DEVICE FOR COLLECTIVE INFORMATION RECEPTION, IN PARTICULAR OF DIGITAL TELEVISION IMAGES AND / OR MULTIMEDIA DATA
EP1148727A1 (en) * | 2000-04-05 | 2001-10-24 | THOMSON multimedia | Method and device for decoding a digital video stream in a digital video system using dummy header insertion
KR100354768B1 (en) | 2000-07-06 | 2002-10-05 | 삼성전자 주식회사 | Video codec system, method for processing data between the system and host system and encoding/decoding control method in the system
US8284844B2 (en) | 2002-04-01 | 2012-10-09 | Broadcom Corporation | Video decoding system supporting multiple standards
KR100722428B1 (en) * | 2005-02-07 | 2007-05-29 | 재단법인서울대학교산학협력재단 | Reconfigurable array structure with resource sharing and pipelining configuration
KR100711088B1 (en) * | 2005-04-13 | 2007-04-24 | 광주과학기술원 | Integer Converter for Moving Picture Encoder
KR100718135B1 (en) | 2005-08-24 | 2007-05-14 | 삼성전자주식회사 | Apparatus and method for image prediction for multi-format codecs and apparatus and method for image encoding / decoding using same
KR101354659B1 (en) | 2006-11-08 | 2014-01-28 | 삼성전자주식회사 | Method and apparatus for motion compensation supporting multicodec
JP5698428B2 (en) * | 2006-11-08 | 2015-04-08 | 三星電子株式会社 Samsung Electronics Co.,Ltd. | Motion compensation method, recording medium, and motion compensation device
KR101553648B1 (en) | 2009-02-13 | 2015-09-17 | 삼성전자 주식회사 | Processor with reconfigurable architecture
CA2794717C (en) * | 2010-04-02 | 2016-04-19 | Fujitsu Limited | Apparatus and method for orthogonal cover code (occ) generation, and apparatus and method for occ mapping
JP6223323B2 (en) * | 2014-12-12 | 2017-11-01 | Nttエレクトロニクス株式会社 | Decimal pixel generation method
US10720942B2 (en) * | 2015-07-03 | 2020-07-21 | Intel Corporation | Apparatus and method for data compression in a wearable device
CN113591795B (en) * | 2021-08-19 | 2023-08-08 | 西南石油大学 | Lightweight face detection method and system based on mixed attention characteristic pyramid structure

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US4680581A (en) * | 1985-03-28 | 1987-07-14 | Honeywell Inc. | Local area network special function frames
DE69229338T2 (en) * | 1992-06-30 | 1999-12-16 | Discovision Associates, Irvine | Data pipeline system
US5325092A (en) * | 1992-07-07 | 1994-06-28 | Ricoh Company, Ltd. | Huffman decoder architecture for high speed operation and reduced memory
US5298896A (en) * | 1993-03-15 | 1994-03-29 | Bell Communications Research, Inc. | Method and system for high order conditional entropy coding
US5699460A (en) * | 1993-04-27 | 1997-12-16 | Array Microsystems | Image compression coprocessor with data flow control and multiple processing units

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US7873105B2 (en) | 2005-04-01 | 2011-01-18 | Broadcom Corporation | Hardware implementation of optimized single inverse quantization engine for a plurality of standards
CN103106120A (en) * | 2011-08-18 | 2013-05-15 | 国际商业机器公司 | Multithreaded physics engine with impulse propagation
CN103106120B (en) * | 2011-08-18 | 2016-03-16 | 国际商业机器公司 | There is the circuit arrangement of the multithreading physical engine of impulse propagation, system and method thereof
CN108111865A (en) * | 2013-01-04 | 2018-06-01 | 索尼公司 | JCTVC-L0226: VPS and VPS_EXTENSION updates
CN109508206A (en) * | 2013-06-28 | 2019-03-22 | 英特尔公司 | Processor, the method and system loaded dependent on the partial width of mode is carried out to wider register
CN109508206B (en) * | 2013-06-28 | 2023-08-29 | 英特尔公司 | Processor, method and system for mode dependent partial width loading of wider registers
CN110688159A (en) * | 2017-07-20 | 2020-01-14 | 上海寒武纪信息科技有限公司 | Neural network task processing system
CN109901044A (en) * | 2017-12-07 | 2019-06-18 | 英业达科技有限公司 | The central processing unit differential testing system and method for multi circuit board
CN109901044B (en) * | 2017-12-07 | 2021-11-12 | 英业达科技有限公司 | Central processing unit differential test system of multiple circuit boards and method thereof
CN113939776A (en) * | 2019-06-04 | 2022-01-14 | 大陆汽车有限责任公司 | Active data generation taking uncertainty into account

Also Published As

Publication number | Publication date
JPH0918871A (en) | 1997-01-17
GB2288521B (en) | 1998-10-14
CN1235483A (en) | 1999-11-17
GB9504047D0 (en) | 1995-04-19
CA2145426A1 (en) | 1995-09-25
GB2288521A8 (en) | 1996-04-15
JPH0870453A (en) | 1996-03-12
CA2145549A1 (en) | 1995-09-25
CA2145219A1 (en) | 1995-09-25
JPH11266460A (en) | 1999-09-28
GB2288521A (en) | 1995-10-18
JP3302527B2 (en) | 2002-07-15
JPH08116260A (en) | 1996-05-07
KR100291532B1 (en) | 2001-06-01
CA2145219C (en) | 2001-11-27
KR950033896A (en) | 1995-12-26
CA2145549C (en) | 2001-02-20

Similar Documents

Publication | Title
CN1137212A (en) | Reconfigurable processing stage
US6697930B2 (en) | Multistandard video decoder and decompression method for processing encoded bit streams according to respective different standards
US7095783B1 (en) | Multistandard video decoder and decompression system for processing encoded bit streams including start codes and methods relating thereto
US6435737B1 (en) | Data pipeline system and data encoding method
US6047112A (en) | Technique for initiating processing of a data stream of encoded video information
US5805914A (en) | Data pipeline system and data encoding method
US6330665B1 (en) | Video parser
CN1133534A (en) | Detector for initial code
CN1144434A (en) | Video decompression
US6067417A (en) | Picture start token
US5809270A (en) | Inverse quantizer
US6079009A (en) | Coding standard token in a system compromising a plurality of pipeline stages
CN1114489A (en) | Pipeline

Legal Events

Code | Title
C06 | Publication
PB01 | Publication
C10 | Entry into substantive examination
SE01 | Entry into force of request for substantive examination
C02 | Deemed withdrawal of patent application after publication (patent law 2001)
WD01 | Invention patent application deemed withdrawn after publication
