Positive discrete cosine transform is widely used in signal of video signal compression and processing.The positive discrete cosine transform of two dimension is defined as follows:X whereinIjBe the input element i of a N * N matrix, j=0,1,2 ..., N-1, and zMnBe the output element of a N * N matrix, m, n=0,1,2 ..., n-1.WhereinAnti-discrete cosine then is defined as:The matrix form of equation (1) is
Z=CXCt(3) wherein X is the input data matrix, and C is the cosine transition matrix, and CtIt is the transposition of Matrix C.If X is a N * N matrix, then equation (3) can be expressed as:The very similar positive discrete cosine transform of computing of anti-Discrete Slant string conversion is except C and CtExchange.Therefore can the Dan Yizheng discrete cosine transform discuss.Equation (3) can be divided into following one dimension conversion
Y=CX (5)
Z=YCt(6) wherein Y is a product matrix placed in the middle, and CtIt is the transposition of Matrix C.Lead from equation (6), following equation is similarly effective:
Z=CYt(6a) Y whereintTransposition for matrix Y.If the anti-discrete cosine conversion, we then have:
X=C
tY or X=Y
tThe many tradition of C (6b) are used the positive discrete cosine transform technique or the circuit of input data matrix, as " discrete cosine transform " of N.Ahmed, and IEEE Trans.on comp., vol.C-23, Jan., 1974, the positive discrete cosine transform technique of property as 90-93 proposes; N.Cho ﹠amp; S.Lee, " fast algorithm that 2-D discrete cosine changes and enforcement (Fast Algorithm and Implementation of 2-D DCT "), IEEETrans.on Cir.﹠amp; Sys., Vol.38, No.3, Mar.1991,297-305 propose N positive discrete cosine transform circuit of one dimension of utilization or multiplexer, calculate a positive discrete cosine transform of two-dimentional N * N with N/2 the positive discrete cosine transform circuit of one dimension; M.Sun et al., " VLSI Implementation of a 16 * 16 DCT ", IEEETrans.on Cir.﹠amp; Sys., Vol.36, No.4, Apr.1989,610-17 propose the positive discrete cosine transform of two dimension that a use storer is stored the positive discrete cosine transform matrix operation of one dimension partial results; H.Hou, " calculate the quick recursive algorithm (A Fast Recursive Algorithm for Computing theDCT) of discrete cosine transform ", IEEE Trans.on Acoustic, Speech and SignalProcessing, Vol.ASSP-35, No.10,1987,1455-61 utilizes the symmetry of just discrete string conversion coefficient matrix to reduce the quantity of multiplying.Equation (5) can be write as:
Y wherein
eBe the matrix that contains matrix Y even number line, Y
oBe the matrix that contains matrix Y odd-numbered line, X
fBe the matrix that contains the matrix X first half capable (upper half rows), X
rBe the matrix that contains matrix X Lower Half row, C
1And C
2Be
Positive discrete cosine transform matrix of coefficients.For example, with one 8 * 8 positive discrete cosine transform, equation (7) can be written as:
A=cos (π/4) wherein, B=cos (π/8), C=sin (π/8), D=cos (π/16), E=cos (3 π/16), F=sin (3 π/16), G=(sin π/16).Y
0, Y
1, Y
2, Y
7Be the row of matrix Y, and X
0, X
1..., X
7Row for input data matrix X.Thus, the anti-discrete cosine conversion can be written as:
Wherein
Be C
1Conversion and
Be C
2It is conversion.For example, with one 8 * 8 positive discrete cosine transform, equation (8) can be written as:
Equation 7-7 (a) and 8-8 (a) also can be applicable to the Z=CY of decision as equation (6a)
t, or at anti-discrete cosine conversion, Y=ZC.Equation (7a) and advantage (8a) are that the multiplying amount that reduces the positive discrete cosine transform of each one dimension reaches 50%.
Fig. 1 is the block diagram of just/anti-discrete string conversion of two dimension of expression.Special 4,791,598 as the U.S., wherein the positive discrete cosine transform of first onedimension 3 is via n1Line receives an input data matrix X.The positivediscrete cosine transform 3 of this one dimension is according to mode (5) or (7a) conversion input data matrix Y, and via n2Line exportstranspose memory 5 to.Transposememory 5 is with matrix Y transposition placed in the middle and via n3The transposition Y of line YtExport second positive discrete cosine transform 7 of one dimension to.The last positivediscrete cosine transform 7 of this one dimension is according to equation (6), (6a) or (7a) via n4A two dimension output of line output transition matrix Z.
With a positive discrete cosine transform circuit is example, one of the positive discrete cosine transform circuit output of one dimension " OK-and the hurdle " (row-column) in proper order, that is to say Y0, Y1, Y2..., Y63(it seems that with this one 8 * 8 entry of a matrix element is YIj, i, j=0,1,2 ..., 7).Transposememory 5 writes data in view of the above and with transposition order (hurdle-OK) it is read.That is to say Y0, Y8, Y24, Y32..., Y1, Y9, Y17, Y25, Y33..., Y63But, the just discrete string change-overcircuit 7 of equation (7a) demonstration one dimension need once be read two elements from matrix Y: (Y0, Y56), (Y8, Y48), (Y16, Y48) ..., (Y1, Y57), (Y9, Y49) ..., (Y31, Y39), or be referred to as " towing hurdle-OK " (shuffled column-row) in proper order.Data from transposition storer output is then received by pre-treatment (pre-processing) circuit that contains ALU (ALU) 76 and translation input/parallel output register 7C, and to pull two elements in hurdle-once export to the row sequential parallel.
With an anti-discrete cosine circuit is example, one dimension change-overcircuit 3 with matrix battle array Y with Y0, Y7, Y1, Y6, Y2, Y5..., Y8, Y15, Y9, Y14..., Y59, Y60(towing row-hurdle order) exports transpose memory to, and exports (Y with the mixing hurdle-row order of transposition by transpose memory0, Y56, Y8, Y48, Y16, Y40..., Y1, Y57, Y9, Y49..., Y31, Y39).But, equation (8a) demonstration one dimension anti-discrete cosine change-over circuit need once be read two elements from matrix Y: (Y0, Y8), (Y16, Y24) ..., (Y1, Y9), (Y17, Y25) ..., (Y55, Y63).Again, the translation of pre processing circuit input/parallel output register receives towing hurdle-row order output data from the storer that transplants, and once exports two elements with same order.
With the architecture design of Fig. 1 is that complete run-in index has its necessity, that is to say, matrix Y placed in the middle is from the positive discrete cosine transform 3 of first one dimension, throughtranspose memory 5, to the circulation between second positive discrete cosine transform 7 of one dimension without any interruption.But, the normally the most difficult part of transpose memory 5.For example, if data is to writetranspose memory 5 in proper order with row-hurdle, then be to call over hurdle-row.If the required time of read-write respectively is T, that petty traditional transpose memory then needs the 2T time to handle a matrix Y.
In order to reduce the processing time, different traditional transpose memories are suggested in succession.Framework has the earliest used two transpose memories and each matrix Y placed in the middle of interaction process.In first period, first transpose memory writes first matrix Y1 placed in the middle according to row-hurdle order.Second period, in second period, second transpose memory writes second matrix Y2 placed in the middle in proper order according to row-hurdle, and simultaneously, Y1 reads from first transpose memory with hurdle-row order.By that analogy, the 3rd period, first transpose memory writes Y3 in proper order with row-hurdle, and calls over Y2 from second transpose memory with hurdle-row simultaneously.The shortcoming of this transpose memory is to need two storeies.
Fig. 2 has represented by United States Patent (USP) 4,791, the 598 another kind of traditional approachs that propose.As shown in the figure,transpose memory 5 comprises a row theparallel register row 15 with a row of translation register row 13.Matrix Y translation input translation register row 13 (with hurdle-row in proper order) make each register 13-1,13-2 ..., the corresponding row of 13-N storage matrix Y.When whole matrix all deposits translation register in and arranges 13, each register 13-1,13-2 ..., 13-N just passes to their data corresponding parallel register 15-1 abreast, 15-2 ..., 15-N.Thus, parallel register row's 15 data just can be imported ALU 17.Simultaneously,translation register row 13 can receive next matrix Y at once.The shortcoming of this framework is the space of the many registers of waste.United States Patent (USP) 5,053,985 propose another kind of framework, as shown in Figure 3.The positive discrete cosine transform 20 of one dimension is handled input matrix X with the speed that doubles the data input rate.Comprise that with a positive discrete cosine transform 20 of one dimension preceding register andALU 30 come the pre-treatment data, come the processing array multiplication, also have late register andALU 40 with multiplier and totalizer 35.The matrix Y placed in the middle that this one-dimensionaldiscrete cosine conversion 20 produces is stored in a RAM60, if say with row-hurdle order, is to call over and handled by the positive discrete cosine transform of one dimension with hurdle-row fromRAM 60 then.Because the processing speed of its conversion is the twice of matrix X input, therefore with " the 2-D discrete cosine switching center processor of a kind of 100MHz (the A 100MHz 2-D DCT CoreProcessor) " of S.Urampto el al., IEEE J.of Solid State Cirs., Vol.27, No.4, P.429-99, Apr.1992 is an example, the data input rate be limited in processor speed half.
United States Patent (USP) 5,042,007 and 5,177,704 propose a transpose memory device that connects to FIFO (First-in first-out) with translation register.Two patents are all with (N2+ 1) n bit translation register and (2N2-2) 2 * 1 multiplexers are combined as transpose memory.Similarly, United States Patent (USP) 4,903,231 propose another with N2Individual translation register and about N2The transpose memory that individual multiplexer makes up.This framework can be exported matrix element in proper order or import with row one hurdle or hurdle delegation.But the shortcoming of these frameworks is the areas that accounted for many registers and multiplexer.
United States Patent (USP) 4,769,790 propose another kind of framework, are hurdle (or row) the input delay circuits (delay circuit) with an input matrix.Postpone a hurdle input sorting circuit (distribution circuit), and be rotated.The hurdle that is rotated exports second delay circuit to again with it delay.This framework has accounted for very big space at the used translation register of each delay circuit.
United States Patent (USP) 4,918,527 transpose memories that propose are divided into two halves.When data is written into a wherein half, second half then reads data.And the data that odd number is capable is to deposit the two halves storer in two pairs two over-over mode, that is to say, and is capable at each odd number, the element switch type on the hurdle that each is right deposit storer in, till whole matrix placed in the middle all deposits in.This framework has used a large amount of decoding circuits, has therefore accounted for very big space.
The major defect of aforementioned several frameworks is that they have accounted for sizable area.Two dimension just or the anti-discrete cosine converting structure more may cause the register overload of front processor.In addition, aforementioned various frameworks are neither can very flexiblely be read, and writes data, in other words, be expert at exactly-hurdle, and the hurdle-OK, conversion between towing hurdle-row and the towing row-hurdle.At last, traditional transpose memory need just/document before removing fully before the data that reads or writes between anti-discrete cosine conversion.Target of the present invention promptly is the shortcoming that is to overcome foregoing structure.
Fig. 4 be a two dimension just/anti-discrete cosine change-over circuit framework 100.As shown in the figure, according to two dimension of the present invention just/anti-discrete cosine change-over circuit framework 100 comprising an one dimension just or anti-discrete cosine change-over circuit 110, wherein exporting matrix Y placed in the middle is Y0, Y2..., Y63(positive discrete cosine transform) or Y0, Y7, Y1, Y6, Y2, Y5..., Y60(anti-discrete cosine conversion) be order.According to the present invention, the matrix Y of output passes to a transpose memory 120 via the bus-bar 111 of n bit.This transpose memory 120 output transposed matrix YtBe by the extremely positive discrete cosine transform circuit 140 of the capable order in towing hurdle (or capable order in hurdle is to anti-discrete cosine change-over circuit 140), and once export two elements.Matrix YtThen reach and only contain an ALU (Aritnmetic Logic Unit---ALU) 135 front processor 130.Since the present invention's transpose memory 120 is output matrix Y correctlyt, so the required translation/parallel register 13,15 (as shown in Figure 2) of prior art can omit.
Fig. 5 is the detail drawing of transpose memory of the present invention.As shown in the figure, one can transposition N * N matrix transpose memory comprise four dual-port random access memories 141,142,143,144.Each dual-port random access memory 141-144 is
Be used for storing a matrix Y placed in the
middle element 1/4th.Each dual-port random access memory 141-144 comprises a corresponding data input port one 41-1,142-1,143-1,144-1, be used for respectively receiving via n bit bus-bar 111, from the one dimension of Fig. 4 just/element of the matrix Y placed in the middle of anti-discrete cosine change-over circuit 110 outputs.Each dual-port random access memory 141-144 also comprises a corresponding address and writes port one 41-2,142-2, and 143-2,144-2 is connected to one respectively and writes address register 151.In addition, each dual-port random access memory comprises permission end (enable terminal) 141-3 again, 142-3, and 143-3,144-3, respectively via corresponding line 151-1,151-2,151-3,151-4 are connected to and write address register 151.When
writing address register 151 at 151-1,151-2,151-3, lines such as 151-4 produce enabling signal, and corresponding dual-port random access memory is then licensed.Licensed dual-port random access memory is then according to the address indication that writes address register 151.The matrix element writing memory element placed in the middle of data input line 111 will be appeared at.
Exceptwriting address register 151, this transpose memory 120 ports contain one and read address register 152, and port one 41-4,142-4,143-4,144-4 are read in the address that is connected respectively to each dual-port random access memory 141-144.In addition, read two of address register 152 outputs and select control signal S1 and S2.When receiving the address of reading address register 152, each dual-port random access memory 141-142 reads the memory element value that this address refers to simultaneously.Dual-port random access memory 141-144 is via read port 141-5,142-5, and 143-5 and 144-5 read element.
Select control signal S1 to be connected to first multiplexer 161.As shown in the figure,multiplexer 161 receives the output element from the read port 142-5 and the 143-5 of dual-portrandom access memory 142 and 143.Select control signal S2 to be connected to second andthird multiplexer 162 and 163.A data input ofmultiplexer 162 andformula multiplexer 163 is passed in the output ofmultiplexer 161 respectively, dual-portrandom access memory 141 then output data are to another data input ofmultiplexer 162, and dual-portrandom access memory 144 output data are to another data input ofmultiplexer 163.
As implied above, the parallel port one 41-5 output matrix element that reads from the address of each dual-port random access memory 141-144.But through selecting control signal S1 and S2, dual-port random access memory 141,142,143,144 wherein two element can the time export.
It below is the calculation specification of this transpose memory 120.For convenience's sake, suppose that each matrix Y placed in the middle is 8 * 8, and element is numbered Y0, Y1..., Y8, Y9..., Y63Trip hurdle order.Writing address register 151 can come the memory element of dual-port random access memory 141-144 is counted according to a string counting sequence that writes.For instance, Fig. 6 (a), (b), (c), (d) expression four kinds ofwrite address counter 151 is different writes counting sequence.As shown in the figure, each dual-port random access memory 141-144 comprises the memory element group 171-0 to 171-15 that is used for storing 16 matrix elements respectively, 172-0 to 172-15,173-0 to 173-15,174-0 to 174-15, numbering 1 to 64 has represented that writeaddress counter 151 writes order of each element.For example, according to Fig. 6 (a), in period 1 (cycle), writeaddress counter 151 points to the memory element 171-0 of dual-port random access memory 141.Second round, then point to the storage file 171-1 of dual-portrandom access memory 141, by that analogy.
Hence one can see that, and the counting sequence (Fig. 6 (a)~(d)) that writeaddress counter 151 is selected is according to the input of matrix Y and the sequence order (sequence order) of output element decide between two parties.Its input and output sequence see table one in proper order for details.Wherein output sequence represents that with parantheses the element of each cycle output is right.
The example of the just discrete string conversion of a two dimension below will be discussed.As shown in Table 1, the positive discrete cosine transform circuit 110 of the one dimension of Fig. 4 with each matrix Y placed in the middle to go the output of hurdle sequence.Data input bus-bar 111 is with the speed input transpose memory 120 of one-period one element in the element of Y.Transpose memory 120 exports the element of the Y of input to front processor 135 again to drag the capable order of electric fence, and each towing hurdle all heavily covers (as table 1) one time.Suppose a string matrix Y1 of positive discrete cosine transform circuit 110 sequences of one dimension ground output, Y2, Y3, Y4 ..., write
address counter 151 is according to the counting sequence counting of Fig. 6 (a).For example, at first with the element of importing
Write the memory element 171-0 of dual-port
random access memory 141, second element
Writing memory element 171-1, by that analogy.The element of the matrix Y in preceding 57 cycles write and the appointment of memory element sees table 2 for details.Fig. 7 represents the state of dual-port random access memory 141-144 after the 57th cycle.Wherein dash area is represented empty memory element.
Similar withwrite address counter 151, read the counting sequence counting of address counter 152 according to Fig. 8 (a)-(d).Shown in Fig. 8 (a)-(d), read address counter 152 and pointing to two memory elements that are positioned at different dual-port random access memories with one-period.For example, according to Fig. 8 (a),, read address counter 152 and read element in memory element 170-0 and the 173-0 from dual-portrandom access memory 141 and 143 respectively in the period 1.In addition, the memory element of each Output bar correspondence of matrix all must heavily cover and read once.Fig. 8 (a) for example reads address counter 152 and points to a hurdle of dual-portrandom access memory 141 and 143 in first to fourth cycle, memory element 171-0 just, 173-0,171-4,173-4,171-8,173-8,171-12,173-12.This hurdle read once again in the 5th to eight cycle again.Thus, reading address counter 152, to read element respectively in the cycle one to four rightRead once respectively again to eight in cycle five.
To write and read
address counter 151 and 152 and can write and read matrix element simultaneously in order to make, both must the phase mutually synchronization.The dual-port random access memory state of supposing the 57th all after dates is that shown in Figure 7, and from the 58th cycle, reading address counter 152, can to begin the sensor matrix element according to the order shown in Fig. 8 (a) right.So the 58th cycle, write
address counter 151 is with matrix element
Be stored in the memory element 173-1.In addition, reading address counter 152, also to read element from memory element 171-0 and 173-0 right
Read the memory element 171-0 that address counter 152 need only export an address to dual-port random access memory 141-144,172-0,173-0, it is right that 174-0 can reach the element that reads.This is that address counter 152 has produced suitable S1 simultaneously and S2 selects control signal because read, together to control multiplexer 161,162,163 and to select correct element to output.With above-mentioned is example, and S1 and S2 promptly control the output that
multiplexer 161 is selected dual-port
random access memory 143, and
control multiplexer 162 and 163 is selected the output of dual-port
random access memory 141 and
multiplexer 161 respectively.
Thus, in the 59th cycle, read
address counter 151 with element
Writing memory element 173-2, and reading address counter 152, to read element from memory element 171-4 and 173-4 right
This process continues up to the 64th cycle, sees table 3 for details.After the 64th cycle, write
address counter 151 is with the element that shows most of matrix Y1
Write, and read address counter 152 and also almost the element in first hurdle of matrix Y1 is run through.At this moment, write
address counter 151 is with the element of matrix Y2
Write dual-port random access memory 141-144.But, write
address counter 151 reads in the memory element that address counter 152 no longer reads so that element is write only.For example, write
address counter 151 can write the present memory element 171-0 that stores the matrix Y1 first hurdle element, 173-0,171-4,173-4,171-8,173-8,171-12,173-12 with the element of matrix Y2 first row.For the element of suitable storage matrix Y2, those read the memory element that address counter 152 no longer needs to write
address counter 151 according to the sequential counting of Fig. 6 (d).Therefore, the 65th cycle, when reading address counter 152 from memory element 171-12, it is right that 173-12 reads element
Simultaneously, write
address counter 151 is with element
Writing memory element 171-0.Matrix Y1 read and writing of matrix Y2 continues up to for the 121st cycle, see table 3 for details.After the 121st cycle, the state of dual-port random access memory 141-144 sees Fig. 9 for details.Wherein dash area represents to read the memory element part that address counter 152 " no longer needs ".
From the 122nd beginning in cycle, read address counter 152 and begin the element of matrix Y2 is read.But because matrix Y2 write the write sequence (Fig. 6 (a)) that counting sequence (Fig. 6 (d)) is different from matrix Y1, must count with different counting sequences (Fig. 8 (d)) so read address counter 152.For example, the 122nd cycle, when reading address counter 152 from memory element 171-0, it is right that 172-0 reads element
Simultaneously, write
address counter 151 is with element
Writing memory element 172-4.This simultaneously respectively according to Fig. 8 (d) and Fig. 6 (d) order read and write continue up to the 128th cycle, see table 4 for details.
The 129th cycle, write
address counter 151 begin with the element sequence of matrix Y3 write those and read the memory element that address counter 152 no longer needs.At this moment, write
address counter 151 is to count according to the counting sequence of Fig. 6 (a).Therefore, the 129th cycle, when reading address counter 152 from memory element 171-3, it is right that 172-3 reads element
Simultaneously, write
address counter 151 is with element
Writing memory element 171-0.This while writes according to the matrix Y3's of Fig. 6 (a) order, and according to the reading of the matrix Y2 of Fig. 8 (d) order, continues up to the 185th cycle, sees table 4 for details.After the 185th cycle, the state of dual-port random access memory 141-144 sees Figure 10 for details.Wherein dash area represents to read the memory element part that address counter 152 " no longer needs ".
As shown in figure 10, the element that matrix Y1 is stored among the stored element of matrix Y3 and Fig. 7 is identical.Therefore, writing with reading of previous matrix of a matrix can continue ad infinitum simultaneously.That is to say, as long as odd number matrix Y1, Y3 ... counting sequence with Fig. 6 (a) writes, the old Y2 of even number square, Y4, counting sequence with Fig. 6 (d) writes, and alternately with Fig. 8 (a), counting sequence (d) is read, that petty transpose memory 120 can have no discontinuously, ceaselessly transposition matrix Y placed in the middle one by one.Must notice that the counting sequence that writes of Fig. 6 (a) is mutual transposition.Fig. 8 (a) is mutual transposition with the counting sequence that reads of Fig. 8 (d).
This transpose memory 120 also can be applicable to a two-dimension anti-discrete cosine change-over circuit.As shown in table 1, first one dimension anti-discrete cosine change-over circuit 110 is to pull capable hurdle sequence via bus-bar 111 output data.Second one dimension anti-discrete cosine change-over circuit 140 must once be read two elements of matrix Y placed in the middle with the capable order in hurdle, and each hurdle must heavily cover and reads once.Suppose the one dimension anti-discrete cosine conversion sequence ground output matrix Y1 of Fig. 4, Y2, Y3 ... element, write
address counter 151 writes transpose memory 120 according to the counting sequence of Fig. 6 (b) with the element of matrix Y1, as table 5.Just, incite somebody to action in the
period 1
Writing memory element 171-0; In second round, with element
Writing memory element 172-3; In the 3rd accent phase, with element
Writing memory element 172-0; In the
period 4, with element
Writing memory element 171-3, or the like.Figure 11 represents the state of the 56th all after date dual-port random access memory 141-144, and wherein dash area is represented empty memory element.
Table 6 has shown between
cycle 57 and 120 that dual-port random access memory 141-144 is the contrast of read/write matrix element simultaneously.From
cycles 57 beginning, read the element of address counter 152 according to the counting sequence sensor matrix Y1 of the 8th (b) figure.Therefore, the 57th cycle, when reading address counter 152 from memory element 171-0, it is right that 173-0 reads element
Simultaneously, write
address counter 151 is with element
Simultaneously, write
address counter 151 is with element
Writing memory element 173-12.
The 65th cycle, write
address counter 151 begin with the element sequence of matrix Y2 write those and read the memory element that address counter 152 no longer needs.At this moment, write
address counter 151 is to count according to the counting sequence of Fig. 6 (c).Therefore, the 65th cycle, when reading address counter 152 from memory element 172-0, it is right that 174-0 reads element
The time, write
address counter 151 is with element
Writing memory element 171-0.The matrix of this while according to Fig. 6 (c) order writes, and reads with matrix according to Fig. 8 (b) order, continues up to the 120th cycle, sees table 6 for details.After the 120th cycle, the state of dual-port random access memory 141-144 sees Figure 12 for details.Wherein dash area represents to read the memory element part that address counter 152 " no longer needs ".
Table 7 has shown between cycle 121 and 184 that dual-port random access memory 141-144 is the contrast of read/write matrix element simultaneously.Write address counter 151 counts up to the 128th cycle in order to write the element of matrix Y2 according to the counting sequence of Fig. 6 (c) always.And then according to Fig. 8 (c) counting from 184 cycles of the 121st cycle to the in order to write the element of matrix Y3.After the 184th cycle, the state of dual-port random access memory 141-144 sees Figure 13 for details.Wherein dash area is represented the memory element part that read address counter 152 " no longer needs ".State shown in Figure 13, very similar with Figure 11, therefore the process of above-mentioned while read/write matrix element can ceaselessly continue.
Just as the positive discrete cosine transform circuit 100 of two dimension, this transpose memory 120 can have no to be interrupted in two-dimension anti-discrete cosine change-over circuit 100, ceaselessly matrix placed in the middle of transposition.In the case, writeaddress counter 151 writes matrix element with the counting sequence of Fig. 6 (b), (c) alternately, and reads matrix element with the counting sequence of Fig. 8 (b), (c) alternately.Must notice that Fig. 6 (b) and the counting sequence that writes of Fig. 6 (c) are mutual transposition.Fig. 8 (b) figure is mutual transposition with the counting sequence that reads of Fig. 8 (c).
As mentioned above, writeaddress counter 151 is in order to make matrix element to reading at an easy rate with the order that matrix element is stored in dual-port random access memory 141-144.For example, when matrix element when exporting the positive discrete cosine transform of one dimension circuit 140 to, output is the capable order in towing hurdle in proper order, and each towing hurdle all heavily covers once, just (Y0, Y56) ..., (Y24, Y32) ..., (Y0, Y56) ..., (Y24, Y32), (Y0, Y56) ..., (Y24, Y32), (Y31, Y39) ...Therefore the right element of each element reads address counter 152 and can read easily from different quadrants.Same, when matrix element when exporting one dimension anti-discrete cosine change-over circuit 140 to, output is that the hurdle is capable in proper order in proper order, and every hurdle all heavily covers once, just (Y0, Y8) ..., (Y48, Y56), (Y0, Y8) ..., (Y48, Y56), (Y55, Y63) ...Therefore the right element of each element reads address counter 152 and can read easily respectively from single file and duplicate rows.
This shows that the particular storage element that writeaddress counter 151 writes dual-port random access memory 141-144 with matrix element makes and to read address counter 152 the sensor matrix element is right at an easy rate.This need only suitably control S1 and S2 selects control signal to reach.For example, shown in Fig. 8 (a) and table 3, in order to read element to (Y8, Y48), reading address counter 152 need only OPADD and point to memory element 171-4,172-4,173-4,174-4, suitably select control signal S1 and S2 with the memory element 171-4 of output dual-portrandom access memory 141 and the memory element 173-4 that double-port random is depositedstorer 143 simultaneously, that corresponding matrix element is to (Y8, Y48) can read.
This transpose memory 120 also can be exported different matrixes in proper order according to different output sequences.For example, the matrix of first input the hurdle is capable exports in proper order with towing, and the matrix of second input is with the capable phyllotaxy output in hurdle.But in this case, first matrix must all be read from transpose memory 120, and second matrix could write again.Therefore, caused the cycle delay of a matrix length.But, but improve the cycle delay of the system length of 120 palpuses of traditional transpose memory greatly.The cycle delay of system length promptly refers to a matrix reception and registration through two one dimension change-over circuits, transpose memory, and required time of front/rear processor.
Briefly, this transpose memory comprises four dual-ported memories, and write address counter and one read address counter.If a matrix is for exporting positive discrete cosine transform circuit to, that petty write address counter writes specific dual-ported memory with matrix element according to different quadrants.If a matrix is for exporting the anti-discrete cosine change-over circuit to, that petty write address counter writes specific dual-ported memory with matrix element according to list or Shuan Lan and row.Therefore, this transpose memory accounts for less integrated circuit area.
Though the present invention illustrates that with the every embodiment that is disclosed the claim scope of wherein asking is not limited to by these embodiment contents.Any other is according to the interest field of asking in the present invention institute, and the essence of its original idea, principle, essential ideas and intension, and implementer in addition all have been included within the scope that every claim defined of the present invention by following the application.
Table 1
| Circuit 110 | List entries | Circuit 140 | Output sequence |
| 1-D DCT | Y0,Y1,...,Y8, Y9,...,Y63 | 1-D DCT | (Y0,Y56),...,(Y34,Y32),(Y0, Y56),...,(Y24,Y32),(Y32,39) |
| 1-D IDCT | Y0,Y7,...,Y8, Y51,...,Y60 | 1-D IDCT | (Y0,Y8),...,(Y48,Y56),(Y0, Y8),...,(Y48,Y56),(Y55,Y63) |
| 1-D DCT | Y0,Y1,...,Y3, Y9,...,Y63 | 1-D IDCT | (Y0,Y8),...,(Y46,Y56),(Y0, Y9),...,(Y48,Y56),(Y55,Y63) |
| 1-D IDCT | Y0,Y7,..,Y8, Y51,...,Y60 | 1-D DCT | (Y0,Y56),...,(Y24,Y32),(Y3, Y55),...,(Y24,Y32),(Y31,Y39) |
Table 2
| Cycle | Input |
| 1 | Y10 171-0 |
| 2 | Y11 171-1 |
| 3 | Y12 171-2 |
| 4 | Y13 171-3 |
| 5 | Y14 172-3 |
| 6 | Y15 172-2 |
| 7 | Y16 172-1 |
| 8 | Y17 172-0 |
| 9 | Y18 171-4 |
| 10 | Y19 171-5 |
| 11 | Y110 171-6 |
| 12 | Y111 171-7 |
| 13 | Y112 172-7 |
| 14 | Y113 172-6 |
| 15 | Y114 172-5 |
| 16 | Y115 172-4 |
| Cycle | Input |
| 17 | Y116 171-8 |
| 18 | Y117 171-9 |
| 19 | Y118 171-10 |
| 20 | Y119 171-11 |
| 21 | Y120 172-11 |
| 22 | Y121 172-10 |
| 23 | Y122 172-9 |
| 24 | Y123 172-8 |
| 25 | Y124 171-12 |
| 26 | Y125 171-13 |
| 27 | Y126 171-14 |
| 28 | Y127 171-15 |
| 29 | Y128 172-15 |
| 30 | Y129 172-14 |
| 31 | Y130 172-13 |
| 32 | YU31 172-12 |
| Cycle | Input |
| 33 | Y132 173-12 |
| 34 | Y133 173-13 |
| 35 | Y134 173-14 |
| 36 | Y135 173-15 |
| 37 | Y136 174-15 |
| 38 | Y137 174-14 |
| 39 | Y138 174-13 |
| 40 | Y139 174-12 |
| 41 | Y140 173-8 |
| 42 | Y141 173-9 |
| 43 | Y142 173-10 |
| 44 | Y143 173-11 |
| 45 | Y144 174-11 |
| 46 | Y145 174-10 |
| 47 | Y146 174-9 |
| 48 | Y147 174-8 |
| Cycle | Input |
| 49 | Y148 173-4 |
| 50 | Y149 171-5 |
| 51 | Y150 173-6 |
| 52 | Y151 173-7 |
| 53 | Y152 174-7 |
| 54 | Y153 174-6 |
| 55 | Y154 174-5 |
| 56 | Y155 174-4 |
| 57 | Y156 173-0 |
Table 3
| Cycle | Input | Output |
| 58 | Y157 173-1 | (Y10,Y156) 171-0,173-0 |
| 59 | Y158 173-2 | (Y18,Y148) 171-4,173-4 |
| 60 | Y159 173-3 | (Y116,Y140) 171-8,173-8 |
| 61 | Y160 174-3 | (Y124,Y132) 171-12,173-12 |
| 62 | Y161 174-2 | (Y10,Y156) 171-0,173-0 |
| 63 | Y162 174-1 | (Y18,Y148) 171-4,173-4 |
| 64 | Y163 174-0 | (Y116,Y140) 171-8,173-8 |
| 65 | Y20 171-0 | (Y124,Y132) 171-12,173-12 |
| 66 | Y21 171-4 | (Y11,Y157) 171-1,173-1 |
| 67 | Y22 171-8 | (Y19,Y149) 171-5,173-5 |
| 68 | Y23 171-12 | (Y117,Y141) 171-9,173-9 |
| 69 | Y24 173-12 | (Y125,Y133) 171-13,173-13 |
| 70 | Y25 173-8 | (Y11,157) 171-1,173-1 |
| 71 | Y26 173-4 | (Y19,149) 171-5,173-5 |
| 72 | Y27 173-0 | (Y117,Y141) 171-9,173-9 |
| 73 | Y28 171-1 | (Y125,Y133) 171-13,173-13 |
| 74 | Y29 171-5 | (Y12,Y159) 171-2,173-2 |
| Cycle | Input | Output |
| 75 | Y210 171-9 | (Y110,Y150) 171-6,173-6 |
| 76 | Y211 173-2 | (Y118,Y142) 171-10,173-10 |
| 77 | Y212 173-3 | (Y126,Y134) 171-14,173-14 |
| 78 | Y213 174-3 | (Y12,Y158) 171-2,173-2 |
| 79 | Y214 173-5 | (Y110,Y150) 171-6,173-6 |
| 80 | Y215 173-1 | (Y118,Y142) 171-10,173-10 |
| 81 | Y216 171-2 | (Y136,Y134) 171-14,173-14 |
| | | |
| 114 | Y249 172-5 | (Y17,Y163) 172-0,174-0 |
| 115 | Y250 172-9 | (Y115,Y155) 172-4,174-4 |
| 116 | Y251 172-13 | (Y123,Y147) 172-8,174-8 |
| 117 | Y252 174-13 | (Y131,Y139) 172-12,174-12 |
| 118 | Y253 174-9 | (Y17,163) 172-0,174-0 |
| 119 | Y254 174-5 | (Y115,155) 172-4,174-4 |
| 120 | Y255 174-1 | (Y123,Y147) 172-8,174-8 |
| 121 | Y256 172-0 | (Y131,Y139) 172-12,174-12 |
Table 4
| Cycle | Input | Output |
| 122 | Y257 172-4 | (Y20,Y56) 171-0,172-0 |
| 123 | Y258 172-8 | (Y28,Y245) 171-1,172-1 |
| 124 | Y259 172-12 | (Y116,Y140) 171-2,172-2 |
| 125 | Y260 174-12 | (Y224,Y232) 171-3,172-3 |
| 126 | Y261 174-8 | (Y20,Y256) 171-0,172-0 |
| 127 | Y262 174-4 | (Y28,Y248) 171-1,172-1 |
| 128 | Y263 174-0 | (Y216,Y240) 171-2,172-2 |
| 129 | Y30 171-0 | (Y224,Y232) 171-3,172-3 |
| 130 | Y31 171-1 | (Y21,Y257) 171-4,172-4 |
| 131 | Y32 171-2 | (Y29,Y249) 171-5,172-5 |
| 132 | Y33 171-3 | (Y217,Y241) 171-6,172-6 |
| 133 | Y34 172-3 | (Y225,Y233) 171-7,172-7 |
| 134 | Y35 172-2 | (Y21,Y257) 171-4,172-4 |
| 135 | Y36 172-1 | (Y29,Y249) 171-5,172-5 |
| 136 | Y37 172-0 | (Y217,Y141) 171-6,172-6 |
| 137 | Y38 171-4 | (Y225,Y233) 171-7,172-7 |
| 138 | Y39 171-5 | (Y22,Y258) 171-8,172-8 |
| Cycle | Input | Output |
| 139 | Y316 171-6 | (Y210,Y250) 171-9,172-9 |
| 140 | Y311 171-7 | (Y218,Y242) 171-10,172-10 |
| 141 | Y312 172-7 | (Y226,Y134) 171-11,172-11 |
| 142 | Y313 172-6 | (Y22,Y258) 171-8,172-8 |
| 143 | Y314 172-5 | (Y210,Y250) 171-9,172-9 |
| 144 | Y315 172-4 | (Y218,Y142) 171-10,172-10 |
| 145 | Y316 171-8 | (Y226,Y234) 171-11,172-11 |
| | | |
| 178 | Y349 173-5 | (Y27,Y263) 173-0,174-0 |
| 179 | Y350 173-6 | (Y215,Y255) 173-1,174-1 |
| 180 | Y351 173-7 | (Y223,Y247) 173-2,174-2 |
| 181 | Y352 174-7 | (Y231,Y239) 173-3,174-3 |
| 182 | Y353 174-6 | (Y27,Y263) 173-0,174-0 |
| 183 | Y354 174-5 | (Y215,Y255) 173-1,174-1 |
| 184 | Y355 174-4 | (Y223,Y247) 173-2,174-2 |
| 185 | Y356 173-0 | (Y231,Y39) 173-3,174-3 |
Table 5
| Cycle | Input |
| 1 | Y10 171-0 |
| 2 | Y17 172-3 |
| 3 | Y11 172-0 |
| 4 | Y16 171-3 |
| 5 | Y12 171-1 |
| 6 | Y15 172-2 |
| 7 | Y13 172-1 |
| 8 | Y14 171-2 |
| 9 | Y18 173-0 |
| 10 | Y115 174-3 |
| 11 | Y19 174-0 |
| 12 | Y114 173-3 |
| 13 | Y110 173-1 |
| 14 | Y113 174-2 |
| 15 | Y111 174-1 |
| 16 | Y112 173-2 |
| Cycle | Input |
| 17 | Y116 171-4 |
| 18 | Y123 172-7 |
| 19 | Y117 172-4 |
| 20 | Y122 171-7 |
| 21 | Y118 171-5 |
| 22 | Y121 172-6 |
| 23 | Y119 172-5 |
| 24 | Y120 171-6 |
| 25 | Y124 173-4 |
| 26 | Y131 174-7 |
| 27 | Y135 174-4 |
| 28 | Y130 173-7 |
| 29 | Y126 173-5 |
| 30 | Y129 174-6 |
| 31 | Y127 174-5 |
| 32 | Y128 173-6 |
| Cycle | Input |
| 33 | Y132 171-8 |
| 34 | Y139 172-11 |
| 35 | Y133 172-8 |
| 36 | Y138 171-11 |
| 37 | Y134 171-9 |
| 38 | Y137 172-10 |
| 39 | Y135 172-9 |
| 40 | Y136 171-8 |
| 41 | Y140 173-8 |
| 42 | Y147 174-11 |
| 43 | Y142 173-8 |
| 44 | Y146 173-11 |
| 45 | Y142 173-9 |
| 46 | Y145 174-10 |
| 47 | Y143 174-9 |
| 48 | Y144 173-10 |
| Cycle | Input |
| 49 | Y148 171-12 |
| 50 | Y155 172-15 |
| 51 | Y149 172-12 |
| 52 | Y154 171-15 |
| 53 | Y150 171-13 |
| 54 | Y153 172-14 |
| 55 | Y151 172-13 |
| 56 | Y152 171-14 |
Table 6
| Cycle | Input | Output |
| 57 | Y156 173-12 | (Y10,Y18) 171-0,173-0 |
| 58 | Y163 174-15 | (Y116,Y124) 171-4,173-4 |
| 59 | Y157 174-12 | (Y132,Y140) 171-8,173-8 |
| 60 | Y162 173-15 | (Y149,Y156) 171-12,173-12 |
| 61 | Y158 173-13 | (Y10,Y13) 171-0,173-0 |
| 62 | Y161 174-14 | (Y116,Y124) 171-4,173-4 |
| 63 | Y159 174-13 | (Y132,Y140) 171-8,173-8 |
| 64 | Y260 173-14 | (Y149,Y155) 171-12,173-12 |
| 65 | Y20 171-0 | (Y11,Y19) 172-0,174-0 |
| 66 | Y27 173-12 | (Y117,Y125) 172-4,174-4 |
| 67 | Y21 173-0 | (Y133,Y141) 172-8,174-8 |
| 68 | Y26 171-2 | (Y149,Y157) 172-12,174-12 |
| 69 | Y22 171-4 | (Y11,Y19) 172-0,174-0 |
| 70 | Y25 173-8 | (Y117,Y125) 172-4,174-4 |
| 71 | Y23 173-4 | (Y233,Y141) 172-8,174-8 |
| 72 | Y24 171-8 | (Y249,Y157) 172-12,174-12 |
| 73 | Y28 172-0 | (Y12,Y110) 171-1,173-1 |
| Cycle | Input | Output |
| 74 | Y215 174-12 | (Y118,Y126) 171-5,173-5 |
| 75 | Y29 174-0 | (Y134,Y142) 171-9,173-9 |
| 76 | Y224 172-12 | (Y150,Y138) 171-13,173-13 |
| 77 | Y210 172-4 | (Y12,Y110) 171-1,173-1 |
| 78 | Y213 174-8 | (Y118,Y126) 171-5,173-5 |
| 79 | Y211 174-4 | (Y134,Y142) 171-9,173-9 |
| 80 | Y212 172-8 | (Y150,Y158) 171-13,173-13 |
| | | |
| 113 | Y240 172-2 | (Y17,Y115) 172-3,174-3 |
| 114 | Y247 174-14 | (Y123,Y131) 172-7,174-7 |
| 115 | Y241 174-2 | (Y139,147) 172-11,174-11 |
| 116 | Y246 172-14 | (Y155,Y163) 172-15,174-15 |
| 117 | Y242 172-6 | (Y17,Y115) 172-3,174-3 |
| 118 | Y245 174-10 | (Y123,Y131) 172-7,174-7 |
| 119 | Y243 174-6 | (Y139,Y147) 172-11,174-11 |
| 120 | Y244 172-10 | (Y155,Y153) 172-15,174-15 |
Table 7
| Cycle | Input | Output |
| 121 | Y256 172-3 | (Y20,Y28) 171-0,172-0 |
| 122 | Y263 174-15 | (Y216,Y224) 171-1,172-1 |
| 123 | Y257 174-3 | (Y232,Y240) 171-2,172-2 |
| 124 | Y262 172-15 | (Y248,Y256) 171-3,172-3 |
| 125 | Y258 172-7 | (Y20,Y28) 171-0,172-0 |
| 126 | Y261 174-11 | (Y216,Y224) 171-1,172-1 |
| 127 | Y259 174-7 | (Y232,Y240) 171-2,172-2 |
| 128 | Y260 172-11 | (Y249,Y256) 171-3,172-3 |
| 129 | Y30 171-0 | (Y21,Y29) 173-0,174-0 |
| 130 | Y37 172-3 | (Y217,Y225) 173-1,174-1 |
| 131 | Y31 171-1 | (Y233,Y241) 173-2,174-2 |
| 132 | Y36 171-3 | (Y249,Y257) 173-3,174-3 |
| 133 | Y32 171-1 | (Y21,Y29) 173-0,174-0 |
| 134 | Y35 172-2 | (Y217,Y225) 173-1,174-1 |
| 135 | Y33 172-1 | (Y233,Y241) 173-2,174-2 |
| 136 | Y34 171-2 | (Y249,Y257) 173-3,174-3 |
| 137 | Y38 173-0 | (Y22,Y210) 171-4,172-4 |
| Cycle | Input | Output |
| 138 | Y315 174-3 | (Y216,Y226) 171-5,172-5 |
| 139 | Y39 174-0 | (Y234,Y242) 171-6,172-6 |
| 140 | Y314 173-3 | (Y250,Y258) 171-7,172-7 |
| 141 | Y310 173-1 | (Y22,Y210) 171-4,172-4 |
| 142 | Y313 174-2 | (Y218,Y226) 171-5,172-5 |
| 143 | Y311 174-1 | (Y234,Y242) 171-6,172-6 |
| 144 | Y312 173-2 | (Y250,Y258) 171-7,172-7 |
| | | |
| 177 | Y340 173-8 | (Y27,Y257) 173-12,174-12 |
| 178 | Y347 174-11 | (Y223,Y231) 173-13,174-13 |
| 179 | Y341 174-8 | (Y239,Y247) 173-13,174-14 |
| 180 | Y346 173-11 | (Y255,Y263) 173-15,174-15 |
| 181 | Y342 173-9 | (Y27,Y215) 173-12 174-12 |
| 182 | Y345 174-10 | (Y223,Y231) 173-13,174-13 |
| 183 | Y343 174-9 | (Y239,Y247) 173-14,174-14 |
| 184 | Y344 173-10 | (Y255,Y263) 173-15,174-15 |