Movatterモバイル変換


[0]ホーム

URL:


CN1076838C - Transpose Memory for Discrete Cosine Transform/Inverse Discrete Cosine Transform Circuit - Google Patents

Transpose Memory for Discrete Cosine Transform/Inverse Discrete Cosine Transform Circuit
Download PDF

Info

Publication number
CN1076838C
CN1076838CCN94109115ACN94109115ACN1076838CCN 1076838 CCN1076838 CCN 1076838CCN 94109115 ACN94109115 ACN 94109115ACN 94109115 ACN94109115 ACN 94109115ACN 1076838 CCN1076838 CCN 1076838C
Authority
CN
China
Prior art keywords
matrix
memory
mentioned
elements
counter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN94109115A
Other languages
Chinese (zh)
Other versions
CN1122026A (en
Inventor
张翊峰
高进南
黄柏川
Original Assignee
Industrial Technology Research Institute ITRI
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial Technology Research Institute ITRIfiledCriticalIndustrial Technology Research Institute ITRI
Priority to CN94109115ApriorityCriticalpatent/CN1076838C/en
Publication of CN1122026ApublicationCriticalpatent/CN1122026A/en
Application grantedgrantedCritical
Publication of CN1076838CpublicationCriticalpatent/CN1076838C/en
Anticipated expirationlegal-statusCritical
Expired - Fee Relatedlegal-statusCriticalCurrent

Links

Images

Landscapes

Abstract

A transpose memory includes four dual-port memories, a first counter for writing elements into the dual-port memories, and a second counter for reading elements from the dual-port memories. If the received matrix is to be output to the first type switching circuit, the first counter writes each received matrix element into a specific memory corresponding to a quarter quadrant of the matrix. If the received matrix is to be output to the second type conversion circuit, the first counter writes the received matrix elements into a specific dual port memory corresponding to a column and a row of the matrix in either the singular or the even number.

Description

The transpose memory of discrete cosine transform/anti-discrete cosine change-over circuit
The present invention relates to one and assign into two transpose memories between the one dimension change-over circuit.It can be used on the circuit of positive discrete cosine transform (Discrete CosineTransform-DCT) or anti-discrete cosine conversion (Inverse DCT).Thus, it is compressed with usefulness to signal of video signal especially.
Positive discrete cosine transform is widely used in signal of video signal compression and processing.The positive discrete cosine transform of two dimension is defined as follows:Zmn=4N2C(m)c(n)Σi=0N-1Σj=0N-1Xijcos(2i+1)mII2Ncos(2j+1)nII2N-----(1)X whereinIjBe the input element i of a N * N matrix, j=0,1,2 ..., N-1, and zMnBe the output element of a N * N matrix, m, n=0,1,2 ..., n-1.WhereinAnti-discrete cosine then is defined as:Xxj=Σm=0N-1Σn=0N-1c(m)c(n)Zmncos(2i+1)mII2Ncos(2j+1)nII2N-----(2)The matrix form of equation (1) is
Z=CXCt(3) wherein X is the input data matrix, and C is the cosine transition matrix, and CtIt is the transposition of Matrix C.If X is a N * N matrix, then equation (3) can be expressed as:Z11Z12...Z1NZ21Z22...Z2N..ZN1ZN2...ZNNC11C12...C3NC21C22...CNN..CN1CN2...CNNX11X12...X1NX21X22...X2N..XN1XN2...XNNC11C21...CN1C11C22...CN2..C1NC2N...CNN----(4)The very similar positive discrete cosine transform of computing of anti-Discrete Slant string conversion is except C and CtExchange.Therefore can the Dan Yizheng discrete cosine transform discuss.Equation (3) can be divided into following one dimension conversion
Y=CX (5)
Z=YCt(6) wherein Y is a product matrix placed in the middle, and CtIt is the transposition of Matrix C.Lead from equation (6), following equation is similarly effective:
Z=CYt(6a) Y whereintTransposition for matrix Y.If the anti-discrete cosine conversion, we then have:
X=CtY or X=YtThe many tradition of C (6b) are used the positive discrete cosine transform technique or the circuit of input data matrix, as " discrete cosine transform " of N.Ahmed, and IEEE Trans.on comp., vol.C-23, Jan., 1974, the positive discrete cosine transform technique of property as 90-93 proposes; N.Cho ﹠amp; S.Lee, " fast algorithm that 2-D discrete cosine changes and enforcement (Fast Algorithm and Implementation of 2-D DCT "), IEEETrans.on Cir.﹠amp; Sys., Vol.38, No.3, Mar.1991,297-305 propose N positive discrete cosine transform circuit of one dimension of utilization or multiplexer, calculate a positive discrete cosine transform of two-dimentional N * N with N/2 the positive discrete cosine transform circuit of one dimension; M.Sun et al., " VLSI Implementation of a 16 * 16 DCT ", IEEETrans.on Cir.﹠amp; Sys., Vol.36, No.4, Apr.1989,610-17 propose the positive discrete cosine transform of two dimension that a use storer is stored the positive discrete cosine transform matrix operation of one dimension partial results; H.Hou, " calculate the quick recursive algorithm (A Fast Recursive Algorithm for Computing theDCT) of discrete cosine transform ", IEEE Trans.on Acoustic, Speech and SignalProcessing, Vol.ASSP-35, No.10,1987,1455-61 utilizes the symmetry of just discrete string conversion coefficient matrix to reduce the quantity of multiplying.Equation (5) can be write as:YoYe=C1C1C2-C2XfXr---(7)Y whereineBe the matrix that contains matrix Y even number line, YoBe the matrix that contains matrix Y odd-numbered line, XfBe the matrix that contains the matrix X first half capable (upper half rows), XrBe the matrix that contains matrix X Lower Half row, C1And C2Be
Figure C9410911500081
Positive discrete cosine transform matrix of coefficients.For example, with one 8 * 8 positive discrete cosine transform, equation (7) can be written as:Y1Y3Y5Y7DEFGE-G-D-FF-DGEG-FE-DX0-X7X1-X6X2-X5X3-X4---Y0Y2Y4Y6=AAAABC-C-BA-A-AAC-BB-CX0+X7X1+X6X2+X5X3+X4---(7a)A=cos (π/4) wherein, B=cos (π/8), C=sin (π/8), D=cos (π/16), E=cos (3 π/16), F=sin (3 π/16), G=(sin π/16).Y0, Y1, Y2, Y7Be the row of matrix Y, and X0, X1..., X7Row for input data matrix X.Thus, the anti-discrete cosine conversion can be written as:XfXr=C1tC1tC1t-C2tYoYe---(8)Wherein
Figure C9410911500084
Be C1Conversion and
Figure C9410911500085
Be C2It is conversion.For example, with one 8 * 8 positive discrete cosine transform, equation (8) can be written as:X0X1X2X3=AAAABC-C-BA-A-AAC-BB-CY0Y2Y4Y6+DEFGE-G-D-FF-DGEG-FE-DY1Y3Y5Y7X7X6X5X4AAAABC-C-BA-A-AAC-BB-CY0Y2Y4Y6+DEFGEG-D-FF-DGEG-FE-EY1Y3Y5Y7---(8a)Equation 7-7 (a) and 8-8 (a) also can be applicable to the Z=CY of decision as equation (6a)t, or at anti-discrete cosine conversion, Y=ZC.Equation (7a) and advantage (8a) are that the multiplying amount that reduces the positive discrete cosine transform of each one dimension reaches 50%.
Fig. 1 is the block diagram of just/anti-discrete string conversion of two dimension of expression.Special 4,791,598 as the U.S., wherein the positive discrete cosine transform of first onedimension 3 is via n1Line receives an input data matrix X.The positivediscrete cosine transform 3 of this one dimension is according to mode (5) or (7a) conversion input data matrix Y, and via n2Line exportstranspose memory 5 to.Transposememory 5 is with matrix Y transposition placed in the middle and via n3The transposition Y of line YtExport second positive discrete cosine transform 7 of one dimension to.The last positivediscrete cosine transform 7 of this one dimension is according to equation (6), (6a) or (7a) via n4A two dimension output of line output transition matrix Z.
With a positive discrete cosine transform circuit is example, one of the positive discrete cosine transform circuit output of one dimension " OK-and the hurdle " (row-column) in proper order, that is to say Y0, Y1, Y2..., Y63(it seems that with this one 8 * 8 entry of a matrix element is YIj, i, j=0,1,2 ..., 7).Transposememory 5 writes data in view of the above and with transposition order (hurdle-OK) it is read.That is to say Y0, Y8, Y24, Y32..., Y1, Y9, Y17, Y25, Y33..., Y63But, the just discrete string change-overcircuit 7 of equation (7a) demonstration one dimension need once be read two elements from matrix Y: (Y0, Y56), (Y8, Y48), (Y16, Y48) ..., (Y1, Y57), (Y9, Y49) ..., (Y31, Y39), or be referred to as " towing hurdle-OK " (shuffled column-row) in proper order.Data from transposition storer output is then received by pre-treatment (pre-processing) circuit that contains ALU (ALU) 76 and translation input/parallel output register 7C, and to pull two elements in hurdle-once export to the row sequential parallel.
With an anti-discrete cosine circuit is example, one dimension change-overcircuit 3 with matrix battle array Y with Y0, Y7, Y1, Y6, Y2, Y5..., Y8, Y15, Y9, Y14..., Y59, Y60(towing row-hurdle order) exports transpose memory to, and exports (Y with the mixing hurdle-row order of transposition by transpose memory0, Y56, Y8, Y48, Y16, Y40..., Y1, Y57, Y9, Y49..., Y31, Y39).But, equation (8a) demonstration one dimension anti-discrete cosine change-over circuit need once be read two elements from matrix Y: (Y0, Y8), (Y16, Y24) ..., (Y1, Y9), (Y17, Y25) ..., (Y55, Y63).Again, the translation of pre processing circuit input/parallel output register receives towing hurdle-row order output data from the storer that transplants, and once exports two elements with same order.
With the architecture design of Fig. 1 is that complete run-in index has its necessity, that is to say, matrix Y placed in the middle is from the positive discrete cosine transform 3 of first one dimension, throughtranspose memory 5, to the circulation between second positive discrete cosine transform 7 of one dimension without any interruption.But, the normally the most difficult part of transpose memory 5.For example, if data is to writetranspose memory 5 in proper order with row-hurdle, then be to call over hurdle-row.If the required time of read-write respectively is T, that petty traditional transpose memory then needs the 2T time to handle a matrix Y.
In order to reduce the processing time, different traditional transpose memories are suggested in succession.Framework has the earliest used two transpose memories and each matrix Y placed in the middle of interaction process.In first period, first transpose memory writes first matrix Y1 placed in the middle according to row-hurdle order.Second period, in second period, second transpose memory writes second matrix Y2 placed in the middle in proper order according to row-hurdle, and simultaneously, Y1 reads from first transpose memory with hurdle-row order.By that analogy, the 3rd period, first transpose memory writes Y3 in proper order with row-hurdle, and calls over Y2 from second transpose memory with hurdle-row simultaneously.The shortcoming of this transpose memory is to need two storeies.
Fig. 2 has represented by United States Patent (USP) 4,791, the 598 another kind of traditional approachs that propose.As shown in the figure,transpose memory 5 comprises a row theparallel register row 15 with a row of translation register row 13.Matrix Y translation input translation register row 13 (with hurdle-row in proper order) make each register 13-1,13-2 ..., the corresponding row of 13-N storage matrix Y.When whole matrix all deposits translation register in and arranges 13, each register 13-1,13-2 ..., 13-N just passes to their data corresponding parallel register 15-1 abreast, 15-2 ..., 15-N.Thus, parallel register row's 15 data just can be imported ALU 17.Simultaneously,translation register row 13 can receive next matrix Y at once.The shortcoming of this framework is the space of the many registers of waste.United States Patent (USP) 5,053,985 propose another kind of framework, as shown in Figure 3.The positive discrete cosine transform 20 of one dimension is handled input matrix X with the speed that doubles the data input rate.Comprise that with a positive discrete cosine transform 20 of one dimension preceding register andALU 30 come the pre-treatment data, come the processing array multiplication, also have late register andALU 40 with multiplier and totalizer 35.The matrix Y placed in the middle that this one-dimensionaldiscrete cosine conversion 20 produces is stored in a RAM60, if say with row-hurdle order, is to call over and handled by the positive discrete cosine transform of one dimension with hurdle-row fromRAM 60 then.Because the processing speed of its conversion is the twice of matrix X input, therefore with " the 2-D discrete cosine switching center processor of a kind of 100MHz (the A 100MHz 2-D DCT CoreProcessor) " of S.Urampto el al., IEEE J.of Solid State Cirs., Vol.27, No.4, P.429-99, Apr.1992 is an example, the data input rate be limited in processor speed half.
United States Patent (USP) 5,042,007 and 5,177,704 propose a transpose memory device that connects to FIFO (First-in first-out) with translation register.Two patents are all with (N2+ 1) n bit translation register and (2N2-2) 2 * 1 multiplexers are combined as transpose memory.Similarly, United States Patent (USP) 4,903,231 propose another with N2Individual translation register and about N2The transpose memory that individual multiplexer makes up.This framework can be exported matrix element in proper order or import with row one hurdle or hurdle delegation.But the shortcoming of these frameworks is the areas that accounted for many registers and multiplexer.
United States Patent (USP) 4,769,790 propose another kind of framework, are hurdle (or row) the input delay circuits (delay circuit) with an input matrix.Postpone a hurdle input sorting circuit (distribution circuit), and be rotated.The hurdle that is rotated exports second delay circuit to again with it delay.This framework has accounted for very big space at the used translation register of each delay circuit.
United States Patent (USP) 4,918,527 transpose memories that propose are divided into two halves.When data is written into a wherein half, second half then reads data.And the data that odd number is capable is to deposit the two halves storer in two pairs two over-over mode, that is to say, and is capable at each odd number, the element switch type on the hurdle that each is right deposit storer in, till whole matrix placed in the middle all deposits in.This framework has used a large amount of decoding circuits, has therefore accounted for very big space.
The major defect of aforementioned several frameworks is that they have accounted for sizable area.Two dimension just or the anti-discrete cosine converting structure more may cause the register overload of front processor.In addition, aforementioned various frameworks are neither can very flexiblely be read, and writes data, in other words, be expert at exactly-hurdle, and the hurdle-OK, conversion between towing hurdle-row and the towing row-hurdle.At last, traditional transpose memory need just/document before removing fully before the data that reads or writes between anti-discrete cosine conversion.Target of the present invention promptly is the shortcoming that is to overcome foregoing structure.
A transpose memory is accepted a matrix according to first order, according to second order matrix element of receiving is just being exported to or the anti-discrete cosine change-over circuit then.A transpose memory comprises 4 dual-port random access memories (dual-portRAM).The element of 1/4th quadrants of each dual-ported memory storage matrix.A transpose memory comprises that simultaneously a register is used for the matrix element of each reception is write a certain specific storer.In addition, comprise that also a register calls over element to (element pair) according to second.
First register writes a matrix element the specific dual-ported memory of selection to reach the purpose of reading a pair of element simultaneously.And must not increase extra register.If element is to output to positive discrete cosine transform circuit, each dual-ported memory is assigned with 1/4 matrix element.Thus, first register writes a specific dual-ported memory that is assigned to each matrix element.That is to say that the matrix element in the upper left corner writes first dual-ported memory; The upper right corner write second; The lower left corner write the 3rd; The lower right corner write the 4th.If element is to output to the anti-discrete cosine change-over circuit, each dual-ported memory then is assigned to matrix element to go the hurdle interactively.Thus, first register writes specific dual-ported memory respectively with the matrix element on " list " or " two " row and hurdle.That is to say that the matrix element that is positioned at single file and single hurdle writes first storer; Be positioned at second storer that write on the two hurdles of single file; Be positioned at the 3rd storer that write on duplicate rows list hurdle; Be positioned at the 4th storer that write on the two hurdles of duplicate rows.
Therefore, first kind of wiring method allows element to reading with towing order (Shuffledorder).For instance, element Y0, Y8, Y16, Y24(being positioned at the upper left corner of matrix) writes on first storer; And be positioned at the element Y in the matrix lower left corner32, Y40, Y48, Y56Be stored in second storer.Therefore, second register can be at an easy rate to pull the capable element that calls in hurdle to (Y0, Y56), (Y8, Y48), (Y16, Y40), (Y24, Y32).Similarly, second kind of wiring method allows element to reading with non-towing order (Non-Shuffled order).For instance, be positioned at the element Y of matrix single file0, Y16, Y32, Y48Write on first storer; And be positioned at the element Y of matrix duplicate rows8, Y24, Y40, Y56Be stored in second storer.Therefore, second counter can be at an easy rate to pull the capable element that calls in hurdle to (Y0, Y8), (Y16, Y24), (Y32, Y40), (Y48, Y56).
In fact, the transpose memory that the present invention proposes can be assigned between two one dimension change-over circuits, for example between two just discrete (or anti-discrete) cosine change-over circuits.Thus, first and second registers can reach the computing (Pipeline Transpose operation) of pipeline transposition with the synchronization of the matrix element of input.That is to say that first register is when second register read the matrix element that last time writes, with the matrix element write store that receives.And the memory storage element that writes the element place was vacateed earlier before next element writes.So mutual read-write, this transpose memory promptly can a kind of conversion pattern to write element right, again to read with one or another kind of pattern.
Briefly, this transpose memory can be with four storeies than small size, and two registers and a little logical circuit combine.And under various input sequence, export data simply regularly.
Fig. 1 be a conventional two-dimensional just/block diagram of anti-discrete cosine change-over circuit;
Fig. 2 is the block diagram of a traditional transpose memory;
Fig. 3 is the block diagram of another kind of traditional transpose memory;
Fig. 4 be one according to the present invention Wei Ji Foundation two dimension just/block diagram of anti-discrete cosine change-over circuit;
Fig. 5 is a block diagram according to detailed transpose memory of the present invention;
Fig. 6 (a) (b), (c), is according to the address register order that specifically writes of the present invention (d);
The transpose memory state of Fig. 7 example key diagram 5: according to Fig. 6 (a) write address register order and with the state after partly the writing of first matrix;
Fig. 8 (a) (b), (c), is according to the address register order of specifically reading of the present invention (d);
The transpose memory state of Fig. 9 example key diagram 5: according to Fig. 6 (d) write address register order, the part of second matrix is write, and according to the state after the address register calls over first matrix of reading of Fig. 8 (a);
Figure 10 example illustrates the transpose memory state of another Fig. 5: according to Fig. 6 (a) write address register order, with partly writing of the 3rd matrix, and call over second state behind the matrix according to the address register that reads of Fig. 8 (d);
Figure 11 example illustrates the transpose memory state of a kind of Fig. 5; According to Fig. 6 (b) write address register order and with the state after partly the writing of first matrix;
The transpose memory state of Figure 12 example key diagram 5: according to Fig. 6 (c) write address register order, with partly writing of second matrix, and according to the form after the address register calls over first matrix of reading of Fig. 8 (b);
The transpose memory state of Figure 13 example key diagram 5; According to Fig. 6 (b) write address register order, with partly writing of the 3rd matrix, and call over second state after the square limit according to the address register that reads of Fig. 8 (c).
Fig. 4 be a two dimension just/anti-discrete cosine change-over circuit framework 100.As shown in the figure, according to two dimension of the present invention just/anti-discrete cosine change-over circuit framework 100 comprising an one dimension just or anti-discrete cosine change-over circuit 110, wherein exporting matrix Y placed in the middle is Y0, Y2..., Y63(positive discrete cosine transform) or Y0, Y7, Y1, Y6, Y2, Y5..., Y60(anti-discrete cosine conversion) be order.According to the present invention, the matrix Y of output passes to a transpose memory 120 via the bus-bar 111 of n bit.This transpose memory 120 output transposed matrix YtBe by the extremely positive discrete cosine transform circuit 140 of the capable order in towing hurdle (or capable order in hurdle is to anti-discrete cosine change-over circuit 140), and once export two elements.Matrix YtThen reach and only contain an ALU (Aritnmetic Logic Unit---ALU) 135 front processor 130.Since the present invention's transpose memory 120 is output matrix Y correctlyt, so the required translation/parallel register 13,15 (as shown in Figure 2) of prior art can omit.
Fig. 5 is the detail drawing of transpose memory of the present invention.As shown in the figure, one can transposition N * N matrix transpose memory comprise four dual-port random access memories 141,142,143,144.Each dual-port random access memory 141-144 is
Figure C9410911500161
Be used for storing a matrix Y placed in themiddle element 1/4th.Each dual-port random access memory 141-144 comprises a corresponding data input port one 41-1,142-1,143-1,144-1, be used for respectively receiving via n bit bus-bar 111, from the one dimension of Fig. 4 just/element of the matrix Y placed in the middle of anti-discrete cosine change-over circuit 110 outputs.Each dual-port random access memory 141-144 also comprises a corresponding address and writes port one 41-2,142-2, and 143-2,144-2 is connected to one respectively and writes address register 151.In addition, each dual-port random access memory comprises permission end (enable terminal) 141-3 again, 142-3, and 143-3,144-3, respectively via corresponding line 151-1,151-2,151-3,151-4 are connected to and write address register 151.Whenwriting address register 151 at 151-1,151-2,151-3, lines such as 151-4 produce enabling signal, and corresponding dual-port random access memory is then licensed.Licensed dual-port random access memory is then according to the address indication that writes address register 151.The matrix element writing memory element placed in the middle of data input line 111 will be appeared at.
Exceptwriting address register 151, this transpose memory 120 ports contain one and read address register 152, and port one 41-4,142-4,143-4,144-4 are read in the address that is connected respectively to each dual-port random access memory 141-144.In addition, read two of address register 152 outputs and select control signal S1 and S2.When receiving the address of reading address register 152, each dual-port random access memory 141-142 reads the memory element value that this address refers to simultaneously.Dual-port random access memory 141-144 is via read port 141-5,142-5, and 143-5 and 144-5 read element.
Select control signal S1 to be connected to first multiplexer 161.As shown in the figure,multiplexer 161 receives the output element from the read port 142-5 and the 143-5 of dual-portrandom access memory 142 and 143.Select control signal S2 to be connected to second andthird multiplexer 162 and 163.A data input ofmultiplexer 162 andformula multiplexer 163 is passed in the output ofmultiplexer 161 respectively, dual-portrandom access memory 141 then output data are to another data input ofmultiplexer 162, and dual-portrandom access memory 144 output data are to another data input ofmultiplexer 163.
As implied above, the parallel port one 41-5 output matrix element that reads from the address of each dual-port random access memory 141-144.But through selecting control signal S1 and S2, dual-port random access memory 141,142,143,144 wherein two element can the time export.
It below is the calculation specification of this transpose memory 120.For convenience's sake, suppose that each matrix Y placed in the middle is 8 * 8, and element is numbered Y0, Y1..., Y8, Y9..., Y63Trip hurdle order.Writing address register 151 can come the memory element of dual-port random access memory 141-144 is counted according to a string counting sequence that writes.For instance, Fig. 6 (a), (b), (c), (d) expression four kinds ofwrite address counter 151 is different writes counting sequence.As shown in the figure, each dual-port random access memory 141-144 comprises the memory element group 171-0 to 171-15 that is used for storing 16 matrix elements respectively, 172-0 to 172-15,173-0 to 173-15,174-0 to 174-15, numbering 1 to 64 has represented that writeaddress counter 151 writes order of each element.For example, according to Fig. 6 (a), in period 1 (cycle), writeaddress counter 151 points to the memory element 171-0 of dual-port random access memory 141.Second round, then point to the storage file 171-1 of dual-portrandom access memory 141, by that analogy.
Hence one can see that, and the counting sequence (Fig. 6 (a)~(d)) that writeaddress counter 151 is selected is according to the input of matrix Y and the sequence order (sequence order) of output element decide between two parties.Its input and output sequence see table one in proper order for details.Wherein output sequence represents that with parantheses the element of each cycle output is right.
The example of the just discrete string conversion of a two dimension below will be discussed.As shown in Table 1, the positive discrete cosine transform circuit 110 of the one dimension of Fig. 4 with each matrix Y placed in the middle to go the output of hurdle sequence.Data input bus-bar 111 is with the speed input transpose memory 120 of one-period one element in the element of Y.Transpose memory 120 exports the element of the Y of input to front processor 135 again to drag the capable order of electric fence, and each towing hurdle all heavily covers (as table 1) one time.Suppose a string matrix Y1 of positive discrete cosine transform circuit 110 sequences of one dimension ground output, Y2, Y3, Y4 ..., writeaddress counter 151 is according to the counting sequence counting of Fig. 6 (a).For example, at first with the element of importing
Figure C9410911500181
Write the memory element 171-0 of dual-portrandom access memory 141, second element
Figure C9410911500182
Writing memory element 171-1, by that analogy.The element of the matrix Y in preceding 57 cycles write and the appointment of memory element sees table 2 for details.Fig. 7 represents the state of dual-port random access memory 141-144 after the 57th cycle.Wherein dash area is represented empty memory element.
Similar withwrite address counter 151, read the counting sequence counting of address counter 152 according to Fig. 8 (a)-(d).Shown in Fig. 8 (a)-(d), read address counter 152 and pointing to two memory elements that are positioned at different dual-port random access memories with one-period.For example, according to Fig. 8 (a),, read address counter 152 and read element in memory element 170-0 and the 173-0 from dual-portrandom access memory 141 and 143 respectively in the period 1.In addition, the memory element of each Output bar correspondence of matrix all must heavily cover and read once.Fig. 8 (a) for example reads address counter 152 and points to a hurdle of dual-portrandom access memory 141 and 143 in first to fourth cycle, memory element 171-0 just, 173-0,171-4,173-4,171-8,173-8,171-12,173-12.This hurdle read once again in the 5th to eight cycle again.Thus, reading address counter 152, to read element respectively in the cycle one to four right(Y01,Y561),(Y81,Y481),(Y161,Y401),(Y241,Y321),Read once respectively again to eight in cycle five.
To write and readaddress counter 151 and 152 and can write and read matrix element simultaneously in order to make, both must the phase mutually synchronization.The dual-port random access memory state of supposing the 57th all after dates is that shown in Figure 7, and from the 58th cycle, reading address counter 152, can to begin the sensor matrix element according to the order shown in Fig. 8 (a) right.So the 58th cycle, writeaddress counter 151 is with matrix elementBe stored in the memory element 173-1.In addition, reading address counter 152, also to read element from memory element 171-0 and 173-0 right
Figure C9410911500193
Read the memory element 171-0 that address counter 152 need only export an address to dual-port random access memory 141-144,172-0,173-0, it is right that 174-0 can reach the element that reads.This is that address counter 152 has produced suitable S1 simultaneously and S2 selects control signal because read, together to control multiplexer 161,162,163 and to select correct element to output.With above-mentioned is example, and S1 and S2 promptly control the output thatmultiplexer 161 is selected dual-portrandom access memory 143, andcontrol multiplexer 162 and 163 is selected the output of dual-portrandom access memory 141 andmultiplexer 161 respectively.
Thus, in the 59th cycle, readaddress counter 151 with element
Figure C9410911500201
Writing memory element 173-2, and reading address counter 152, to read element from memory element 171-4 and 173-4 rightThis process continues up to the 64th cycle, sees table 3 for details.After the 64th cycle, writeaddress counter 151 is with the element that shows most of matrix Y1Write, and read address counter 152 and also almost the element in first hurdle of matrix Y1 is run through.At this moment, writeaddress counter 151 is with the element of matrix Y2(Y02,Y12,······,Y82,Y92,······,Y632)Write dual-port random access memory 141-144.But, writeaddress counter 151 reads in the memory element that address counter 152 no longer reads so that element is write only.For example, writeaddress counter 151 can write the present memory element 171-0 that stores the matrix Y1 first hurdle element, 173-0,171-4,173-4,171-8,173-8,171-12,173-12 with the element of matrix Y2 first row.For the element of suitable storage matrix Y2, those read the memory element that address counter 152 no longer needs to writeaddress counter 151 according to the sequential counting of Fig. 6 (d).Therefore, the 65th cycle, when reading address counter 152 from memory element 171-12, it is right that 173-12 reads element
Figure C9410911500206
Simultaneously, writeaddress counter 151 is with element
Figure C9410911500207
Writing memory element 171-0.Matrix Y1 read and writing of matrix Y2 continues up to for the 121st cycle, see table 3 for details.After the 121st cycle, the state of dual-port random access memory 141-144 sees Fig. 9 for details.Wherein dash area represents to read the memory element part that address counter 152 " no longer needs ".
From the 122nd beginning in cycle, read address counter 152 and begin the element of matrix Y2 is read.But because matrix Y2 write the write sequence (Fig. 6 (a)) that counting sequence (Fig. 6 (d)) is different from matrix Y1, must count with different counting sequences (Fig. 8 (d)) so read address counter 152.For example, the 122nd cycle, when reading address counter 152 from memory element 171-0, it is right that 172-0 reads elementSimultaneously, writeaddress counter 151 is with element
Figure C9410911500212
Writing memory element 172-4.This simultaneously respectively according to Fig. 8 (d) and Fig. 6 (d) order read and write continue up to the 128th cycle, see table 4 for details.
The 129th cycle, writeaddress counter 151 begin with the element sequence of matrix Y3 write those and read the memory element that address counter 152 no longer needs.At this moment, writeaddress counter 151 is to count according to the counting sequence of Fig. 6 (a).Therefore, the 129th cycle, when reading address counter 152 from memory element 171-3, it is right that 172-3 reads element
Figure C9410911500213
Simultaneously, writeaddress counter 151 is with element
Figure C9410911500214
Writing memory element 171-0.This while writes according to the matrix Y3's of Fig. 6 (a) order, and according to the reading of the matrix Y2 of Fig. 8 (d) order, continues up to the 185th cycle, sees table 4 for details.After the 185th cycle, the state of dual-port random access memory 141-144 sees Figure 10 for details.Wherein dash area represents to read the memory element part that address counter 152 " no longer needs ".
As shown in figure 10, the element that matrix Y1 is stored among the stored element of matrix Y3 and Fig. 7 is identical.Therefore, writing with reading of previous matrix of a matrix can continue ad infinitum simultaneously.That is to say, as long as odd number matrix Y1, Y3 ... counting sequence with Fig. 6 (a) writes, the old Y2 of even number square, Y4, counting sequence with Fig. 6 (d) writes, and alternately with Fig. 8 (a), counting sequence (d) is read, that petty transpose memory 120 can have no discontinuously, ceaselessly transposition matrix Y placed in the middle one by one.Must notice that the counting sequence that writes of Fig. 6 (a) is mutual transposition.Fig. 8 (a) is mutual transposition with the counting sequence that reads of Fig. 8 (d).
This transpose memory 120 also can be applicable to a two-dimension anti-discrete cosine change-over circuit.As shown in table 1, first one dimension anti-discrete cosine change-over circuit 110 is to pull capable hurdle sequence via bus-bar 111 output data.Second one dimension anti-discrete cosine change-over circuit 140 must once be read two elements of matrix Y placed in the middle with the capable order in hurdle, and each hurdle must heavily cover and reads once.Suppose the one dimension anti-discrete cosine conversion sequence ground output matrix Y1 of Fig. 4, Y2, Y3 ... element, writeaddress counter 151 writes transpose memory 120 according to the counting sequence of Fig. 6 (b) with the element of matrix Y1, as table 5.Just, incite somebody to action in theperiod 1Writing memory element 171-0; In second round, with element
Figure C9410911500222
Writing memory element 172-3; In the 3rd accent phase, with element
Figure C9410911500223
Writing memory element 172-0; In theperiod 4, with element
Figure C9410911500224
Writing memory element 171-3, or the like.Figure 11 represents the state of the 56th all after date dual-port random access memory 141-144, and wherein dash area is represented empty memory element.
Table 6 has shown betweencycle 57 and 120 that dual-port random access memory 141-144 is the contrast of read/write matrix element simultaneously.Fromcycles 57 beginning, read the element of address counter 152 according to the counting sequence sensor matrix Y1 of the 8th (b) figure.Therefore, the 57th cycle, when reading address counter 152 from memory element 171-0, it is right that 173-0 reads element
Figure C9410911500225
Simultaneously, writeaddress counter 151 is with elementSimultaneously, writeaddress counter 151 is with elementWriting memory element 173-12.
The 65th cycle, writeaddress counter 151 begin with the element sequence of matrix Y2 write those and read the memory element that address counter 152 no longer needs.At this moment, writeaddress counter 151 is to count according to the counting sequence of Fig. 6 (c).Therefore, the 65th cycle, when reading address counter 152 from memory element 172-0, it is right that 174-0 reads elementThe time, writeaddress counter 151 is with element
Figure C9410911500231
Writing memory element 171-0.The matrix of this while according to Fig. 6 (c) order writes, and reads with matrix according to Fig. 8 (b) order, continues up to the 120th cycle, sees table 6 for details.After the 120th cycle, the state of dual-port random access memory 141-144 sees Figure 12 for details.Wherein dash area represents to read the memory element part that address counter 152 " no longer needs ".
Table 7 has shown between cycle 121 and 184 that dual-port random access memory 141-144 is the contrast of read/write matrix element simultaneously.Write address counter 151 counts up to the 128th cycle in order to write the element of matrix Y2 according to the counting sequence of Fig. 6 (c) always.And then according to Fig. 8 (c) counting from 184 cycles of the 121st cycle to the in order to write the element of matrix Y3.After the 184th cycle, the state of dual-port random access memory 141-144 sees Figure 13 for details.Wherein dash area is represented the memory element part that read address counter 152 " no longer needs ".State shown in Figure 13, very similar with Figure 11, therefore the process of above-mentioned while read/write matrix element can ceaselessly continue.
Just as the positive discrete cosine transform circuit 100 of two dimension, this transpose memory 120 can have no to be interrupted in two-dimension anti-discrete cosine change-over circuit 100, ceaselessly matrix placed in the middle of transposition.In the case, writeaddress counter 151 writes matrix element with the counting sequence of Fig. 6 (b), (c) alternately, and reads matrix element with the counting sequence of Fig. 8 (b), (c) alternately.Must notice that Fig. 6 (b) and the counting sequence that writes of Fig. 6 (c) are mutual transposition.Fig. 8 (b) figure is mutual transposition with the counting sequence that reads of Fig. 8 (c).
As mentioned above, writeaddress counter 151 is in order to make matrix element to reading at an easy rate with the order that matrix element is stored in dual-port random access memory 141-144.For example, when matrix element when exporting the positive discrete cosine transform of one dimension circuit 140 to, output is the capable order in towing hurdle in proper order, and each towing hurdle all heavily covers once, just (Y0, Y56) ..., (Y24, Y32) ..., (Y0, Y56) ..., (Y24, Y32), (Y0, Y56) ..., (Y24, Y32), (Y31, Y39) ...Therefore the right element of each element reads address counter 152 and can read easily from different quadrants.Same, when matrix element when exporting one dimension anti-discrete cosine change-over circuit 140 to, output is that the hurdle is capable in proper order in proper order, and every hurdle all heavily covers once, just (Y0, Y8) ..., (Y48, Y56), (Y0, Y8) ..., (Y48, Y56), (Y55, Y63) ...Therefore the right element of each element reads address counter 152 and can read easily respectively from single file and duplicate rows.
This shows that the particular storage element that writeaddress counter 151 writes dual-port random access memory 141-144 with matrix element makes and to read address counter 152 the sensor matrix element is right at an easy rate.This need only suitably control S1 and S2 selects control signal to reach.For example, shown in Fig. 8 (a) and table 3, in order to read element to (Y8, Y48), reading address counter 152 need only OPADD and point to memory element 171-4,172-4,173-4,174-4, suitably select control signal S1 and S2 with the memory element 171-4 of output dual-portrandom access memory 141 and the memory element 173-4 that double-port random is depositedstorer 143 simultaneously, that corresponding matrix element is to (Y8, Y48) can read.
This transpose memory 120 also can be exported different matrixes in proper order according to different output sequences.For example, the matrix of first input the hurdle is capable exports in proper order with towing, and the matrix of second input is with the capable phyllotaxy output in hurdle.But in this case, first matrix must all be read from transpose memory 120, and second matrix could write again.Therefore, caused the cycle delay of a matrix length.But, but improve the cycle delay of the system length of 120 palpuses of traditional transpose memory greatly.The cycle delay of system length promptly refers to a matrix reception and registration through two one dimension change-over circuits, transpose memory, and required time of front/rear processor.
Briefly, this transpose memory comprises four dual-ported memories, and write address counter and one read address counter.If a matrix is for exporting positive discrete cosine transform circuit to, that petty write address counter writes specific dual-ported memory with matrix element according to different quadrants.If a matrix is for exporting the anti-discrete cosine change-over circuit to, that petty write address counter writes specific dual-ported memory with matrix element according to list or Shuan Lan and row.Therefore, this transpose memory accounts for less integrated circuit area.
Though the present invention illustrates that with the every embodiment that is disclosed the claim scope of wherein asking is not limited to by these embodiment contents.Any other is according to the interest field of asking in the present invention institute, and the essence of its original idea, principle, essential ideas and intension, and implementer in addition all have been included within the scope that every claim defined of the present invention by following the application.
Table 1
Circuit 110List entriesCircuit 140Output sequence
1-D DCT Y0,Y1,...,Y8, Y9,...,Y63 1-D DCT (Y0,Y56),...,(Y34,Y32),(Y0, Y56),...,(Y24,Y32),(Y3239)
1-D IDCT Y0,Y7,...,Y8, Y51,...,Y60 1-D IDCT (Y0,Y8),...,(Y48,Y56),(Y0, Y8),...,(Y48,Y56),(Y55,Y63)
1-D DCT Y0,Y1,...,Y3, Y9,...,Y63 1-D IDCT (Y0,Y8),...,(Y46,Y56),(Y0, Y9),...,(Y48,Y56),(Y55,Y63)
1-D IDCT Y0,Y7,..,Y8, Y51,...,Y60 1-D DCT (Y0,Y56),...,(Y24,Y32),(Y3, Y55),...,(Y24,Y32),(Y31,Y39)
Table 2
CycleInput
1 Y10 171-0
2 Y11 171-1
3 Y12 171-2
4 Y13 171-3
5 Y14 172-3
6 Y15 172-2
7 Y16 172-1
8 Y17 172-0
9 Y18 171-4
10 Y19 171-5
11 Y110 171-6
12 Y111 171-7
13 Y112 172-7
14 Y113 172-6
15 Y114 172-5
16 Y115 172-4
CycleInput
17 Y116 171-8
18 Y117 171-9
19 Y118 171-10
20 Y119 171-11
21 Y120 172-11
22 Y121 172-10
23 Y122 172-9
24 Y123 172-8
25 Y124 171-12
26 Y125 171-13
27 Y126 171-14
28 Y127 171-15
29 Y128 172-15
30 Y129 172-14
31 Y130 172-13
32 YU31 172-12
CycleInput
33 Y132 173-12
34 Y133 173-13
35 Y134 173-14
36 Y135 173-15
37 Y136 174-15
38 Y137 174-14
39 Y138 174-13
40 Y139 174-12
41 Y140 173-8
42 Y141 173-9
43 Y142 173-10
44 Y143 173-11
45 Y144 174-11
46 Y145 174-10
47 Y146 174-9
48 Y147 174-8
CycleInput
49 Y148 173-4
50 Y149 171-5
51 Y150 173-6
52 Y151 173-7
53 Y152 174-7
54 Y153 174-6
55 Y154 174-5
56 Y155 174-4
57 Y156 173-0
Table 3
CycleInputOutput
58 Y157 173-1 (Y10,Y156) 171-0,173-0
59 Y158 173-2 (Y18,Y148) 171-4,173-4
60 Y159 173-3 (Y116,Y140) 171-8,173-8
61 Y160 174-3 (Y124,Y132) 171-12,173-12
62 Y161 174-2 (Y10,Y156) 171-0,173-0
63 Y162 174-1 (Y18,Y148) 171-4,173-4
64 Y163 174-0 (Y116,Y140) 171-8,173-8
65 Y20 171-0 (Y124,Y132) 171-12,173-12
66 Y21 171-4 (Y11,Y157) 171-1,173-1
67 Y22 171-8 (Y19,Y149) 171-5,173-5
68 Y23 171-12 (Y117,Y141) 171-9,173-9
69 Y24 173-12 (Y125,Y133) 171-13,173-13
70 Y25 173-8 (Y11157) 171-1,173-1
71 Y26 173-4 (Y19149) 171-5,173-5
72 Y27 173-0 (Y117,Y141) 171-9,173-9
73 Y28 171-1 (Y125,Y133) 171-13,173-13
74 Y29 171-5 (Y12,Y159) 171-2,173-2
CycleInputOutput
75 Y210 171-9 (Y110,Y150) 171-6,173-6
76 Y211 173-2 (Y118,Y142) 171-10,173-10
77 Y212 173-3 (Y126,Y134) 171-14,173-14
78 Y213 174-3 (Y12,Y158) 171-2,173-2
79 Y214 173-5 (Y110,Y150) 171-6,173-6
80 Y215 173-1 (Y118,Y142) 171-10,173-10
81 Y216 171-2 (Y136,Y134) 171-14,173-14
114 Y249 172-5 (Y17,Y163) 172-0,174-0
115 Y250 172-9 (Y115,Y155) 172-4,174-4
116 Y251 172-13 (Y123,Y147) 172-8,174-8
117 Y252 174-13 (Y131,Y139) 172-12,174-12
118 Y253 174-9 (Y17163) 172-0,174-0
119 Y254 174-5 (Y115155) 172-4,174-4
120 Y255 174-1 (Y123,Y147) 172-8,174-8
121 Y256 172-0 (Y131,Y139) 172-12,174-12
Table 4
CycleInputOutput
122 Y257 172-4 (Y20,Y56) 171-0,172-0
123 Y258 172-8 (Y28,Y245) 171-1,172-1
124 Y259 172-12 (Y116,Y140) 171-2,172-2
125 Y260 174-12 (Y224,Y232) 171-3,172-3
126 Y261 174-8 (Y20,Y256) 171-0,172-0
127 Y262 174-4 (Y28,Y248) 171-1,172-1
128 Y263 174-0 (Y216,Y240) 171-2,172-2
129 Y30 171-0 (Y224,Y232) 171-3,172-3
130 Y31 171-1 (Y21,Y257) 171-4,172-4
131 Y32 171-2 (Y29,Y249) 171-5,172-5
132 Y33 171-3 (Y217,Y241) 171-6,172-6
133 Y34 172-3 (Y225,Y233) 171-7,172-7
134 Y35 172-2 (Y21,Y257) 171-4,172-4
135 Y36 172-1 (Y29,Y249) 171-5,172-5
136 Y37 172-0 (Y217,Y141) 171-6,172-6
137 Y38 171-4 (Y225,Y233) 171-7,172-7
138 Y39 171-5 (Y22,Y258) 171-8,172-8
CycleInputOutput
139 Y316 171-6 (Y210,Y250) 171-9,172-9
140 Y311 171-7 (Y218,Y242) 171-10,172-10
141 Y312 172-7 (Y226,Y134) 171-11,172-11
142 Y313 172-6 (Y22,Y258) 171-8,172-8
143 Y314 172-5 (Y210,Y250) 171-9,172-9
144 Y315 172-4 (Y218,Y142) 171-10,172-10
145 Y316 171-8 (Y226,Y234) 171-11,172-11
178 Y349 173-5 (Y27,Y263) 173-0,174-0
179 Y350 173-6 (Y215,Y255) 173-1,174-1
180 Y351 173-7 (Y223,Y247) 173-2,174-2
181 Y352 174-7 (Y231,Y239) 173-3,174-3
182 Y353 174-6 (Y27,Y263) 173-0,174-0
183 Y354 174-5 (Y215,Y255) 173-1,174-1
184 Y355 174-4 (Y223,Y247) 173-2,174-2
185 Y356 173-0 (Y231,Y39) 173-3,174-3
Table 5
CycleInput
1 Y10 171-0
2 Y17 172-3
3 Y11 172-0
4 Y16 171-3
5 Y12 171-1
6 Y15 172-2
7 Y13 172-1
8 Y14 171-2
9 Y18 173-0
10 Y115 174-3
11 Y19 174-0
12 Y114 173-3
13 Y110 173-1
14 Y113 174-2
15 Y111 174-1
16 Y112 173-2
CycleInput
17 Y116 171-4
18 Y123 172-7
19 Y117 172-4
20 Y122 171-7
21 Y118 171-5
22 Y121 172-6
23 Y119 172-5
24 Y120 171-6
25 Y124 173-4
26 Y131 174-7
27 Y135 174-4
28 Y130 173-7
29 Y126 173-5
30 Y129 174-6
31 Y127 174-5
32 Y128 173-6
CycleInput
33 Y132 171-8
34 Y139 172-11
35 Y133 172-8
36 Y138 171-11
37 Y134 171-9
38 Y137 172-10
39 Y135 172-9
40 Y136 171-8
41 Y140 173-8
42 Y147 174-11
43 Y142 173-8
44 Y146 173-11
45 Y142 173-9
46 Y145 174-10
47 Y143 174-9
48 Y144 173-10
CycleInput
49 Y148 171-12
50 Y155 172-15
51 Y149 172-12
52 Y154 171-15
53 Y150 171-13
54 Y153 172-14
55 Y151 172-13
56 Y152 171-14
Table 6
CycleInputOutput
57 Y156 173-12 (Y10,Y18) 171-0,173-0
58 Y163 174-15 (Y116,Y124) 171-4,173-4
59 Y157 174-12 (Y132,Y140) 171-8,173-8
60 Y162 173-15 (Y149,Y156) 171-12,173-12
61 Y158 173-13 (Y10,Y13) 171-0,173-0
62 Y161 174-14 (Y116,Y124) 171-4,173-4
63 Y159 174-13 (Y132,Y140) 171-8,173-8
64 Y260 173-14 (Y149,Y155) 171-12,173-12
65 Y20 171-0 (Y11,Y19) 172-0,174-0
66 Y27 173-12 (Y117,Y125) 172-4,174-4
67 Y21 173-0 (Y133,Y141) 172-8,174-8
68 Y26 171-2 (Y149,Y157) 172-12,174-12
69 Y22 171-4 (Y11,Y19) 172-0,174-0
70 Y25 173-8 (Y117,Y125) 172-4,174-4
71 Y23 173-4 (Y233,Y141) 172-8,174-8
72 Y24 171-8 (Y249,Y157) 172-12,174-12
73 Y28 172-0 (Y12,Y110) 171-1,173-1
CycleInputOutput
74 Y215 174-12 (Y118,Y126) 171-5,173-5
75 Y29 174-0 (Y134,Y142) 171-9,173-9
76 Y224 172-12 (Y150,Y138) 171-13,173-13
77 Y210 172-4 (Y12,Y110) 171-1,173-1
78 Y213 174-8 (Y118,Y126) 171-5,173-5
79 Y211 174-4 (Y134,Y142) 171-9,173-9
80 Y212 172-8 (Y150,Y158) 171-13,173-13
113 Y240 172-2 (Y17,Y115) 172-3,174-3
114 Y247 174-14 (Y123,Y131) 172-7,174-7
115 Y241 174-2 (Y139147) 172-11,174-11
116 Y246 172-14 (Y155,Y163) 172-15,174-15
117 Y242 172-6 (Y17,Y115) 172-3,174-3
118 Y245 174-10 (Y123,Y131) 172-7,174-7
119 Y243 174-6 (Y139,Y147) 172-11,174-11
120 Y244 172-10 (Y155,Y153) 172-15,174-15
Table 7
CycleInputOutput
121 Y256 172-3 (Y20,Y28) 171-0,172-0
122 Y263 174-15 (Y216,Y224) 171-1,172-1
123 Y257 174-3 (Y232,Y240) 171-2,172-2
124 Y262 172-15 (Y248,Y256) 171-3,172-3
125 Y258 172-7 (Y20,Y28) 171-0,172-0
126 Y261 174-11 (Y216,Y224) 171-1,172-1
127 Y259 174-7 (Y232,Y240) 171-2,172-2
128 Y260 172-11 (Y249,Y256) 171-3,172-3
129 Y30 171-0 (Y21,Y29) 173-0,174-0
130 Y37 172-3 (Y217,Y225) 173-1,174-1
131 Y31 171-1 (Y233,Y241) 173-2,174-2
132 Y36 171-3 (Y249,Y257) 173-3,174-3
133 Y32 171-1 (Y21,Y29) 173-0,174-0
134 Y35 172-2 (Y217,Y225) 173-1,174-1
135 Y33 172-1 (Y233,Y241) 173-2,174-2
136 Y34 171-2 (Y249,Y257) 173-3,174-3
137 Y38 173-0 (Y22,Y210) 171-4,172-4
CycleInputOutput
138 Y315 174-3 (Y216,Y226) 171-5,172-5
139 Y39 174-0 (Y234,Y242) 171-6,172-6
140 Y314 173-3 (Y250,Y258) 171-7,172-7
141 Y310 173-1 (Y22,Y210) 171-4,172-4
142 Y313 174-2 (Y218,Y226) 171-5,172-5
143 Y311 174-1 (Y234,Y242) 171-6,172-6
144 Y312 173-2 (Y250,Y258) 171-7,172-7
177 Y340 173-8 (Y27,Y257) 173-12,174-12
178 Y347 174-11 (Y223,Y231) 173-13,174-13
179 Y341 174-8 (Y239,Y247) 173-13,174-14
180 Y346 173-11 (Y255,Y263) 173-15,174-15
181 Y342 173-9 (Y27,Y215) 173-12 174-12
182 Y345 174-10 (Y223,Y231) 173-13,174-13
183 Y343 174-9 (Y239,Y247) 173-14,174-14
184 Y344 173-10 (Y255,Y263) 173-15,174-15

Claims (10)

Translated fromChinese
1.一种转置存储器,其根据第一个序列顺序输入矩阵元素,及根据第二个序列顺序输出矩阵元素至第一或第二型式转换电路,其特征在于,该存储器装置包含:1. A transposition memory, which inputs matrix elements according to a first sequence order, and outputs matrix elements to a first or second type conversion circuit according to a second sequence order, characterized in that the memory device comprises:至少四个双端口存储器,每一个双端口存储器用以存储四分之一个矩阵的元素;At least four dual-port memories, each dual-port memory is used to store a quarter of the elements of the matrix;一个第一个计数器,作为所述的第一型式转换电路,并用来将收到的每一个矩阵元素写入对应于所述的元素的矩阵象限的特定之存储器的特定之存储元件;且作为所述的第二型式转换电路,并用来将收到的每一个矩阵元素写入对应于所述的元素的矩阵单或双行与栏的特定存储器的特定存储元件;及a first counter, as said first type conversion circuit, and for writing each matrix element received into a specific storage element of a specific memory corresponding to said matrix quadrant of said element; and as said The second type conversion circuit described above, and used to write each matrix element received into a specific storage element of a specific memory corresponding to the matrix single or double row and column of said element; and一个第二个计数器,根据所述第二个序列顺序同时从所述存储器的特定存储元件读出所述元素对应的地址。A second counter reads the addresses corresponding to the elements from the specific storage elements of the memory at the same time according to the second sequence order.2.如权利要求1所述的转置存储器,其特征在于,所述的第一型式转换电路为正离散余弦转换电路,而所述的第二型式转换电路为反离散余弦转换电路。2. The transpose memory according to claim 1, wherein the first type conversion circuit is a positive discrete cosine conversion circuit, and the second type conversion circuit is an inverse discrete cosine conversion circuit.3.如权利要求1所述的转置存储器,其特征在于,所述的矩阵元素从所述的转置存储器以所述的第二个序列顺序的输出,是所述的转置存储器以所述的第一个序列顺序接收的所述的矩阵元素之转置。3. The transpose memory of claim 1, wherein the output of said matrix elements from said transpose memory in said second sequential order is said transpose memory in said The transpose of the elements of the matrix received in the first sequence order described above.4.如权利要求1所述的转置存储器,其特征在于,所述的双端口存储器为双端口随机存取存储器。4. The transpose memory according to claim 1, wherein said dual-port memory is a dual-port random access memory.5.如权利要求1所述的转置存储器,其特征在于,上述的转置存储器接收一序列的矩阵,且在一个特定型式的转换之下,所述的第一个计数器根据一个写入顺序,将每一收到的单数矩阵的所述矩阵元素写入单数的所述的存储器,并根据另一个为所述写入顺序的转置的写入顺序,将每一收到的双数矩阵的所述的矩阵元素写入偶数的所述的存储。5. The transpose memory of claim 1, wherein the above-mentioned transpose memory receives a sequence of matrices, and under a specific type of conversion, the first counter is based on a write order , write said matrix elements of each received singular matrix into singular said memory, and write each received double matrix according to another write order that is the transposition of said write order The elements of the matrix are written to even numbers of the storage.6.如权利要求1所述的转置存储器,其特征在于,上述的第一及第二个计数器可以同时计数,且上述的第一及第二个计数器的上述的计数顺序除了相互同步,也与接收上述的转置存储器的矩阵元素同步,由此,上述的第二个计数器只在上述的第一个计数器把将要被上述的第二个计数器读取的上述的矩阵元素写入上述的特定的存储元件之后,才把存储在上述存储元件内上述的接收的矩阵元素以一特定的序列顺序读出。6. The transposition memory as claimed in claim 1, wherein said first and second counters can count at the same time, and the above-mentioned counting order of said first and second counters is not only synchronous with each other, but also Synchronous with receiving the matrix elements of the above-mentioned transpose memory, whereby the above-mentioned second counter writes the above-mentioned matrix elements to be read by the above-mentioned second counter into the above-mentioned specific After the storage element, the above-mentioned received matrix elements stored in the above-mentioned storage element are sequentially read out in a specific sequence.7.如权利要求1所述的转置存储器,其特征在于,上述的第一个计数器平行地输出一个计数到上述四个存储器的写入地址输入端,另外,上述的第一个计数器输出适当的起动信号到上述的存储器,并根据上述第一个记数器的上述计数的指示,一次只将接收的矩阵元素存储到上述存储器的一个存储元件。7. The transposition memory as claimed in claim 1, characterized in that, said first counter outputs a count in parallel to the write address input terminals of said four memories, and in addition, said first counter outputs an appropriate and storing the received matrix elements in only one storage element of said memory at a time as indicated by said count of said first counter.8.如权利要求1所述的转置存储器,其特征在于,还包含:8. The transpose memory as claimed in claim 1, further comprising:一个第一个多工器,接收从第一个上述的存储器和第二个上述的存储器输出的矩阵元素并连接至第一和第二个资料输入,另外接收上述之第二个计数器的第一个选择控制信号并连接至该第一个多工器的选择控制输入;a first multiplexer receiving matrix elements output from the first aforementioned memory and the second aforementioned memory and connected to the first and second data inputs, and additionally receiving the first of the aforementioned second counters a selection control signal and connected to the selection control input of the first multiplexer;一个第二个多工器,接收从第三个上述的存储器和上述的第一个多工器所选择输出的矩阵元素并连接至第一和第二个资料输入,另外接收上述的第二个计数器的第二个选择控制信号并连接至该第二个多工器的选择控制输入;a second multiplexer receiving matrix elements selected from the third aforementioned memory and the output of the first aforementioned multiplexer and connected to the first and second data inputs, additionally receiving the aforementioned second a second selection control signal of the counter and connected to the selection control input of the second multiplexer;一个第三个多工器,接收从第一个上述的存储器和第四个上述的存储器输出的矩阵元素并连接至第一和第二个资料输入,另外接收上述的第二个计数器的上述的第二个选择控制信号并连接至该第三个多工器的选择控制输入;a third multiplexer receiving the matrix elements from the output of the first said memory and the fourth said memory and connected to the first and second data inputs, additionally receiving said said second counter a second selection control signal connected to the selection control input of the third multiplexer;上述的第二个计数器平行地输出一个计数到上述的四个存储器的每一个,并输出特定的第一和第二个选择控制信号,根据上述的第二个计数器的上述的计数的指示,从其中两个上述的存储器的存储元件选择出矩阵元素。The above-mentioned second counter outputs a count in parallel to each of the above-mentioned four memories, and outputs specific first and second selection control signals, according to the indication of the above-mentioned count of the above-mentioned second counter, from The storage elements of two of the above-mentioned memories are selected as matrix elements.9.如权利要求1所述的转置存储器,其特征在于,上述的第一个记数器将收到的上述的矩阵元素写到指定的双端口存储器的特定的存储元件,使得上述的第二个计数器只要能够输出单一个地址到上述的双端口存储器,就可以读出上述矩阵元素的元素对应地址。9. The transpose memory as claimed in claim 1, characterized in that, said first counter writes said matrix element received to a specific storage element of a designated dual-port memory, so that said first counter As long as the two counters can output a single address to the above-mentioned dual-port memory, the address corresponding to the element of the above-mentioned matrix element can be read out.10.一个二维转换电路,其特征在于包含:10. A two-dimensional conversion circuit, characterized in that it comprises:第一个和第二个一维转换电路,上述的第一个一维转换电路接收一个输入矩阵,并根据第一个序列顺序输出一个一维转换矩阵的元素;The first and second one-dimensional conversion circuits, the above-mentioned first one-dimensional conversion circuit receives an input matrix, and sequentially outputs elements of a one-dimensional conversion matrix according to the first sequence;一个转置存储器,接收上述的第一个序列顺序的矩阵元素,并包含:A transpose memory that receives the matrix elements in the first sequence order above and contains:四个双端口存储器,而每一个存储器用以存储四分之一个上述的第一个序列顺序的上述的矩阵的元素;four dual-port memories, and each memory is used to store a quarter of the elements of the above-mentioned matrix in the order of the first sequence;一个第一个记数器,作为第一型式转换电路的上述的第一个一维转换电路,并用来将收到的每一个上述的第一个序列顺序的矩阵元素写入对应到上述的元素的矩阵象限的特定存储器的特定存储元件,并作为第二型式转换电路的上述的第二个一维转换电路,并用来将收到的每一个上述的第一个序列顺序的矩阵元素写入对应于上述元素的矩阵单或双行与栏的特定存储器的特定存储元件;及A first counter, which is used as the above-mentioned first one-dimensional conversion circuit of the first type conversion circuit, and is used to write each matrix element of the above-mentioned first sequence sequence received corresponding to the above-mentioned element The specific storage element of the specific memory of the matrix quadrant, and the above-mentioned second one-dimensional conversion circuit as the second type conversion circuit, and is used to write each received matrix element of the above-mentioned first sequence order into the corresponding specific storage elements of specific memory in single or double rows and columns of the matrix of said elements; and一个第二个计数器,根据第二个序列顺序用来从上述存储器的特定存储元件读出上述一维转置矩阵之元素对应的地址;A second counter is used to read the address corresponding to the element of the above-mentioned one-dimensional transposition matrix from the specific storage element of the above-mentioned memory according to the second sequence order;上述的第二个一维转换电路接收上述的第二个序列的元素,并输出一个二维转换矩阵。The above-mentioned second one-dimensional transformation circuit receives the elements of the above-mentioned second sequence, and outputs a two-dimensional transformation matrix.
CN94109115A1994-08-191994-08-19 Transpose Memory for Discrete Cosine Transform/Inverse Discrete Cosine Transform CircuitExpired - Fee RelatedCN1076838C (en)

Priority Applications (1)

Application NumberPriority DateFiling DateTitle
CN94109115ACN1076838C (en)1994-08-191994-08-19 Transpose Memory for Discrete Cosine Transform/Inverse Discrete Cosine Transform Circuit

Applications Claiming Priority (1)

Application NumberPriority DateFiling DateTitle
CN94109115ACN1076838C (en)1994-08-191994-08-19 Transpose Memory for Discrete Cosine Transform/Inverse Discrete Cosine Transform Circuit

Publications (2)

Publication NumberPublication Date
CN1122026A CN1122026A (en)1996-05-08
CN1076838Ctrue CN1076838C (en)2001-12-26

Family

ID=5033774

Family Applications (1)

Application NumberTitlePriority DateFiling Date
CN94109115AExpired - Fee RelatedCN1076838C (en)1994-08-191994-08-19 Transpose Memory for Discrete Cosine Transform/Inverse Discrete Cosine Transform Circuit

Country Status (1)

CountryLink
CN (1)CN1076838C (en)

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
JP3971135B2 (en)*2001-07-112007-09-05株式会社テクノマセマティカル DCT matrix decomposition method and DCT apparatus
US7082450B2 (en)*2001-08-302006-07-25Nokia CorporationImplementation of a transform and of a subsequent quantization
CN100366068C (en)*2004-03-092008-01-30北京中星微电子有限公司A storage space saved storage processing method
CN109408117B (en)*2018-10-082021-01-26京东方科技集团股份有限公司 Matrix transposition device and method, and display device
US10831507B2 (en)2018-11-212020-11-10SambaNova Systems, Inc.Configuration load of a reconfigurable data processor
US11188497B2 (en)2018-11-212021-11-30SambaNova Systems, Inc.Configuration unload of a reconfigurable data processor
US10698853B1 (en)2019-01-032020-06-30SambaNova Systems, Inc.Virtualization of a reconfigurable data processor
US10768899B2 (en)2019-01-292020-09-08SambaNova Systems, Inc.Matrix normal/transpose read and a reconfigurable data processor including same
US11386038B2 (en)2019-05-092022-07-12SambaNova Systems, Inc.Control flow barrier and reconfigurable data processor
US11055141B2 (en)2019-07-082021-07-06SambaNova Systems, Inc.Quiesce reconfigurable data processor
CN110826711B (en)*2019-10-292022-04-26中昊芯英(杭州)科技有限公司Matrix processing device, method and equipment
US11809908B2 (en)2020-07-072023-11-07SambaNova Systems, Inc.Runtime virtualization of reconfigurable data flow resources
US11782729B2 (en)2020-08-182023-10-10SambaNova Systems, Inc.Runtime patching of configuration files
WO2022062682A1 (en)*2020-09-272022-03-31中科寒武纪科技股份有限公司Data processing device, integrated circuit chip, device, and implementation method therefor
US11409540B1 (en)2021-07-162022-08-09SambaNova Systems, Inc.Routing circuits for defect repair for a reconfigurable data processor
US11327771B1 (en)2021-07-162022-05-10SambaNova Systems, Inc.Defect repair circuits for a reconfigurable data processor
US11556494B1 (en)2021-07-162023-01-17SambaNova Systems, Inc.Defect repair for a reconfigurable data processor for homogeneous subarrays

Citations (4)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
CN87103462A (en)*1986-05-121987-11-25菲利浦光灯制造公司Discrete cosine conversion device
CN87104093A (en)*1986-06-061987-12-16汤姆生无线电报总公司 Calculating device for one-dimensional cosine transform and image coding device and decoding device including the computing device
US5181183A (en)*1990-01-171993-01-19Nec CorporationDiscrete cosine transform circuit suitable for integrated circuit implementation
US5291429A (en)*1991-03-081994-03-01Fujitsu LimitedCircuit for matrix calculation of discrete cosine transformation

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
CN87103462A (en)*1986-05-121987-11-25菲利浦光灯制造公司Discrete cosine conversion device
CN87104093A (en)*1986-06-061987-12-16汤姆生无线电报总公司 Calculating device for one-dimensional cosine transform and image coding device and decoding device including the computing device
US5181183A (en)*1990-01-171993-01-19Nec CorporationDiscrete cosine transform circuit suitable for integrated circuit implementation
US5291429A (en)*1991-03-081994-03-01Fujitsu LimitedCircuit for matrix calculation of discrete cosine transformation

Also Published As

Publication numberPublication date
CN1122026A (en)1996-05-08

Similar Documents

PublicationPublication DateTitle
CN1076838C (en) Transpose Memory for Discrete Cosine Transform/Inverse Discrete Cosine Transform Circuit
US5481487A (en)Transpose memory for DCT/IDCT circuit
CA1277036C (en)Orthogonal transform processor
CA1290854C (en)Two-dimensional discrete cosine transform processor
CN1346460A (en) Parallel Data Processing and Reordering
CN103369326B (en)Be suitable to the transform coder of high-performance video coding standard HEVC
CN1009034B (en)Discrete cosine conversion device
Mehlhorn et al.Area—Time optimal VLSI integer multiplier with minimum computation time
US5508538A (en)Signal processing applications of massively parallel charge domain computing devices
Chang et al.Hardware-efficient DFT designs with cyclic convolution and subexpression sharing
CN1118033C (en)Discrete 3780-point Fourier transformation processor system and its structure
Zhang et al.Hardware architecture design of block-matching and 3D-filtering denoising algorithm
Patil et al.Low Power High Speed VLSI Architecture for 1-D Discrete Wavelet Transform
Mohanty et al.New scan method and pipeline architecture for VLSI implementation of separable 2-D FIR filters without transposition
US8572148B1 (en)Data reorganizer for fourier transformation of parallel data streams
Usha Bhanu et al.High-speed systolic VLSI architecture for 2-D forward lifting-based DWT
Aggoun et al.A parallel 3D DCT architecture for the compression of integral 3D images
Avila et al.A one gigaflop VLSI systolic processor
Patil et al.Design of High Speed VLSI Architecturefor 1-D Discrete Wavelet Transform
Demassieux et al.Orthogonal transforms
Nakano et al.FIFO-based hardware sorters for high bandwidth memory
Zhang et al.A low area pipelined 2-D DCT architecture for JPEG encoder
Prots' ko et al.The Analysis of Models of the Block-cyclic Structures of the DCT-II core for the Synthesis of Fast Algorithms.
Cohen et al.The Conveyor-an Interconnection Device for Parallel Volumetric Transformations
Gong et al.New cost-effective VLSI implementation of 2D discrete cosine transform and its inverse

Legal Events

DateCodeTitleDescription
C10Entry into substantive examination
SE01Entry into force of request for substantive examination
C06Publication
PB01Publication
C14Grant of patent or utility model
GR01Patent grant
C41Transfer of patent application or patent right or utility model
TR01Transfer of patent right

Effective date of registration:20090109

Address after:Taipei City, Taiwan, China

Patentee after:Institute for Information Industry

Address before:Taiwan, China

Patentee before:Industrial Technology Research Institute

ASSSuccession or assignment of patent right

Owner name:YULIN TECHNOLOGY CO., LTD.

Free format text:FORMER OWNER: INDUSTRIAL TECHNOLOGY RESEARCH INSTITUTE

Effective date:20090109

C17Cessation of patent right
CF01Termination of patent right due to non-payment of annual fee

Granted publication date:20011226

Termination date:20100819


[8]ページ先頭

©2009-2025 Movatter.jp