WO2002093934A1 - Image compression and transmission - Google Patents

Image compression and transmission

Info

Publication number
WO2002093934A1
WO2002093934A1
Authority
WO
WIPO (PCT)
Prior art keywords
motion vectors
array
hash values
frame
generating
Prior art date
Application number
PCT/GB2002/002236
Other languages
French (fr)
Inventor
Farrukh N. Alavi
G. M. Megson
Original Assignee
Salgen Systems Limited
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Salgen Systems Limited
Publication of WO2002093934A1

Abstract

A method of compressing image data comprising the steps of generating a set of motion vectors representative of one or more image frames, generating, by means of a predetermined hash function, a set of hash values corresponding to said motion vectors, and storing as a code book said hash values in the form of a table or array. Vector quantisation may be used to index the hash values in the table or array to enable them to be retrieved by a decoder.

Description

IMAGE COMPRESSION AND TRANSMISSION
This invention relates to a method and apparatus for compression of images, in particular moving images such as video sequences and the like, for transmission across a communication network.
Digital video has been developed to a great extent over recent years and, in view of the large range of applications to which it lends itself, particularly with the very high uptake and growth in personal computers and workstations and the popularity of the global Internet, substantial research and development has been dedicated to techniques for compression, decompression and transmission of video. In general, the aim is to improve the efficiency of compression as well as the effectiveness of the transport medium.
From a technical perspective, the main aim is to reduce both storage and transmission costs, i.e. to improve coding efficiency. However, one of the main concerns is the inherent trade-off between coding efficiency and video fidelity. Industry standards such as H.261 and MPEG define standard formats for compressed video data (but not implementations), such that video fidelity can be improved as better codecs are developed without having to redefine the standard. Further, the defined standards enable a range of bitrates to be supported, so that the quality of the reproduced video becomes a function of the cost of the hardware that the user can afford.
Thus, MPEG specifies both a syntax and a semantics for a legal video bitstream at the encoder stage, and a definition for synchronisation and demultiplexing of the bitstream into its constituent parts (i.e. video, audio and other data) at the decoder stage, the latter permitting the video playback quality to scale with the abilities of the target hardware.
The video algorithms defined by MPEG are based on a class of video compression algorithms that aim to maximally reduce the natural spatio-temporal redundancy both within and between video frames in order to deliver compression. A key feature to exploit in such redundancy elimination is that of the motion of rigid bodies in a sequence of frames. Clearly, by encoding the relevant object once, and subsequently transmitting merely its spatial translation, much irrelevant data is eliminated from the encoding process. Algorithms which attempt to achieve this effect are known as motion compensation algorithms.
Substantial work has been carried out to develop sophisticated models for motion compensation. Two main classes of such algorithms have become predominant, namely block-matching algorithms (BMA) which look at the translation of groups of pixels, and pel-recursive algorithms which are concerned with individual pixel translations.
By their very nature, pel-recursive algorithms produce better video fidelity. However, such algorithms are also more expensive to compute. BMA routines have therefore become the de facto standard in a great majority of modern implementations, and a substantial amount of research and development has been put into improving BMA over the first basic procedure outlined back in 1981. Such improvements aim to reduce BMA's computational expense, as well as increase its overall quality.
It is well known that video sequences contain both intra-frame (spatially local) and inter-frame (temporally global) correlations, and methods to exploit this redundancy have been considered since the early 1970s. The earliest method considered interframe ('delta-coded') sequences, where the intensity differences between pels in successive frames were coded, and this method provided the basis for all modern predictive coding techniques. The basic idea is to look at the following two variables:
* p(x, y; t): the value of a pixel at location (x, y) at time t; and
* p_pred(x, y; t): the predicted value of the same pixel.
The difference between the two, ε = p_pred(x, y; t) - p(x, y; t), is known as the error signal or residual that is to be transmitted to the receiver, where it is combined with p_pred(x, y; t) to reconstruct p(x, y; t). Clearly, the better the predicted value, the smaller the error signal or residual (ε); hence, compression is optimised by minimising the residual. The overall compression can be further improved if the error signal is transformed (the Discrete Cosine Transform, or DCT, is the usual preferred approach), leading to the so-called hybrid coding techniques.
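To make the arithmetic concrete, the following sketch computes ε for a pair of tiny frames and shows that the decoder recovers the original exactly; the frame values and the trivial previous-frame predictor are illustrative assumptions, not data from the patent.

```python
import numpy as np

# Illustrative frames (values invented for this sketch).
prev_frame = np.array([[10, 12], [14, 16]], dtype=np.int16)  # p(x, y; t-1)
curr_frame = np.array([[11, 12], [15, 18]], dtype=np.int16)  # p(x, y; t)

# Trivial predictor: assume no motion, so p_pred(x, y; t) = p(x, y; t-1).
predicted = prev_frame

# Error signal, using the sign convention above:
# epsilon = p_pred(x, y; t) - p(x, y; t)
residual = predicted - curr_frame

# The receiver combines p_pred with the residual to reconstruct p(x, y; t).
reconstructed = predicted - residual
assert np.array_equal(reconstructed, curr_frame)
```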
Even further compression results if the motion of rigid bodies is taken into account. For still frames, the position vector of an object, represented as r = (x, y), does not change between frames. However, where there is motion, some translated r' = (x', y') makes a better predictor. Hence, the motion vector Δr = r - r' can be encoded and transmitted in addition to ε, at least for the pels for which motion can be identified. This is known as motion-compensated interframe predictive coding. The evaluation of Δr at the encoder is called motion estimation, whereas, at the decoder end, the exploitation of this information in pel reconstruction is called motion compensation.
Various motion compensation algorithms have been proposed and, as stated above, the block matching algorithm (BMA) remains the most widely used, primarily for the simplicity of its concept and its hardware realisability. The BMA typically begins by partitioning a frame of video pixels into non-overlapping macroblocks of size N x N. Each macroblock in the frame being encoded (the 'current block') is compared with potential matches ('candidate blocks') in the previous, or reference, frame. For a maximum vector displacement of w pixels, a given macroblock is searched within a search window of size (N+2w) x (N+2w), as shown in Figure 1 of the drawings. The range of the motion vector is constrained by controlling the size of the search window. The displacement is taken to be that comparison which maximises or minimises a function, a distortion measure, representing the matching criterion. Many such functions have been proposed, such as the cross-correlation function (CCF), the mean square error (MSE), the mean absolute error (MAE) and the cross-search algorithm (CSA).
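A minimal software rendering of this exhaustive block-matching search is sketched below, using the MAE as the distortion measure; the function name and parameters are illustrative assumptions rather than anything mandated by the standards.

```python
import numpy as np

def block_match(current_block, ref_frame, bx, by, w):
    """Exhaustive BMA sketch: scan the (N+2w) x (N+2w) window of the
    reference frame around the current block's position (bx, by) and
    return the displacement (dx, dy) that minimises the MAE."""
    N = current_block.shape[0]
    H, W = ref_frame.shape
    best_mae, best_vector = np.inf, (0, 0)
    for dy in range(-w, w + 1):
        for dx in range(-w, w + 1):
            x, y = bx + dx, by + dy
            if x < 0 or y < 0 or x + N > W or y + N > H:
                continue  # candidate block would fall outside the frame
            candidate = ref_frame[y:y + N, x:x + N].astype(float)
            mae = np.mean(np.abs(candidate - current_block))
            if mae < best_mae:
                best_mae, best_vector = mae, (dx, dy)
    return best_vector, best_mae
```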
As stated above, block matching algorithms represent a trade-off between block reconstruction accuracy and hardware/computational expense vis-à-vis pel-recursive techniques. Furthermore, the magnitude of the motion vectors generated by means of block-matching can be relatively large, which is counter-productive within a compression strategy, especially in the case where the video data is to be transmitted at a relatively low bitrate, in which case the proportion of the transmission burst assigned to motion vectors can become disproportionate. It is for this reason that residuals from motion estimation are first compressed themselves (by means of a lossy transform encoder and an entropy encoder) before transmission. This encoding scheme has been adopted by the MPEG, H.261 and H.263 standards.
In spite of the trade-off in quality which is inherent in block-matching algorithms, the search for good motion vectors adds substantial computational overhead to the encoding process. It takes substantially longer to perform motion estimation than it does to perform motion compensation. In fact, the block-matching process is the most time-consuming part of the entire encoding process. Thus, in encoding MPEG video, the algorithms must perform a tight balancing act between the conflicting requirements of short encoding times, high image quality and high compression ratios. Encoding times can be reduced by reducing the search area for good motion vectors, but this has a direct impact on the image quality, which in turn varies inversely with the compression ratio.
The problem is even more severe for H.261, which is intended for video conferencing applications and the like, in which case the encoding process takes place on-line.
We have now devised a technique which overcomes the problems outlined above and provides a method and apparatus for encoding image data which substantially reduces motion estimation times relative to the prior art techniques identified above.
In accordance with a first aspect of the present invention, there is provided a method of compressing image data comprising the steps of generating a set of motion vectors representative of one or more image frames, generating, by means of a predetermined hash function, a set of hash values corresponding to said motion vectors, and storing as a codebook said hash values in the form of a table or array.
Also in accordance with the first aspect of the present invention, there is provided an apparatus for compressing image data comprising means for generating a set of motion vectors representative of one or more image frames, means for generating, using a predetermined hash function, a set of hash values corresponding to said motion vectors, and codebook means for storing said hash values in the form of a table or array.
In accordance with a second aspect of the present invention, there is provided a method of compressing image data comprising the steps of generating a set of motion vectors representative of one or more image frames, storing as a codebook data representative of said motion vectors in the form of a table or array, and using vector quantisation to index the data stored in said table or array for retrieval of said data by decoding means.
Also in accordance with the second aspect of the present invention, there is provided an apparatus for compressing image data comprising means for generating a set of motion vectors representative of one or more image frames, codebook means for storing data representative of said motion vectors in the form of a table or array, and vector quantisation means for indexing the data stored in said table or array so that it can be retrieved.
An exemplary embodiment of the invention will now be described with reference to the accompanying drawings, in which:
Figure 1 is a schematic diagram illustrating a macroblock and search window used in a BMA compression technique according to the prior art;
Figure 2 is a schematic diagram illustrating the integration of a vector quantiser codebook with a hash table, the diagram showing a hash table with buckets, each having M slots;
Figure 3 is a schematic diagram illustrating an exemplary embodiment of hardware for Vector Quantised Hashing (VQH), the name of our proposed algorithm for motion estimation; and
Figure 4 is a graph showing the plot of P_BMA = N^2 A^2.
The concept of a look-up table is well known in engineering, and may be defined as a set of (name, attribute) pairings for storing data items. There are three basic operations which may be required to be performed on such a look-up table:
1. Insert a data item
2. Delete a data item
3. Search for a data item
Intuitively, since tables are stored just like arrays, such operations may be expected to cost O(n) for n items. However, in accordance with this exemplary embodiment of the present invention, better performance can be obtained by the use of a technique known as 'hashing', whereby the search criterion replaces a sequence of operations by a single operation involving the computation of a function known as a 'hash function'.
For the purpose of the present description, assume that the size of the hash table is fixed (i.e. 'static hashing', as opposed to 'dynamic hashing' in which the table size may vary). The address of a data item x stored within the hash table may be computed by evaluating the hash function h(x). Typically, hash tables are partitioned into b 'buckets', with each bucket consisting of s 'slots'. Each slot is capable of storing exactly one data item, and it is often the case that s=1, i.e. each bucket stores just one data item.
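The bucket-and-slot organisation just described can be sketched as follows; the modulus hash function and the default sizes b and s are placeholders for whatever a real design would choose.

```python
class StaticHashTable:
    """Static hash table with b buckets of s slots each (s = 1 gives
    the common one-item-per-bucket case described above)."""

    def __init__(self, b=16, s=4):
        self.b, self.s = b, s
        self.buckets = [[] for _ in range(b)]

    def h(self, x):
        # The hash function: one evaluation replaces an O(n) table scan.
        return hash(x) % self.b

    def insert(self, x):
        bucket = self.buckets[self.h(x)]
        if len(bucket) >= self.s:
            raise OverflowError("bucket full")  # the overflow case above
        bucket.append(x)

    def search(self, x):
        return x in self.buckets[self.h(x)]
```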
The construction of the hash function h(·) is the most crucial aspect of designing a hash table. Not only should h(·) be easy to compute, but it should also ideally generate a unique address within the hash table for each argument. It is not possible for the hash table to hold every possible value of the argument. Hence, it has been found that collisions often occur, i.e. h(x) = h(y) for two data items x ≠ y. Another problem is that of overflow, whereby a data item is mapped by h(·) into an existing bucket that is already full. Ideally, therefore, hash functions should be designed to minimise the possibility of both collisions and overflow.
It has been found that there are advantages to encoding groups of image sequences, as opposed to encoding individual samples. A technique known as vector quantisation (VQ) utilises this finding and offers a way of performing lossy compression along the way.
VQ is essentially the multi-dimensional generalisation of scalar quantisation, as is commonly employed in analog-to-digital conversion processes. In analytical terms, if X is an N-dimensional source vector, then VQ is a mapping Q such that:

Q : R^N → C

where C is an L-dimensional set, L < N, such that C = {Y_1, ..., Y_L}, and the Y_i ∈ R^N for i = 1, ..., L. C is usually termed the 'codebook', and the Y_i the 'code vectors'. The VQ operator Q partitions R^N into L disjoint and exhaustive regions {P_1, ..., P_L}, each of which has a single coarse-grained representation.
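In code, the mapping Q reduces to a nearest-code-vector search; the Euclidean distance and the toy codebook below are illustrative assumptions.

```python
import numpy as np

def vq_encode(X, codebook):
    """Map a source vector X in R^N to the index i of its nearest code
    vector Y_i, i.e. identify the region P_i into which X falls."""
    distances = np.linalg.norm(codebook - X, axis=1)
    return int(np.argmin(distances))

# Toy codebook C with L = 3 code vectors in R^2:
C = np.array([[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]])
print(vq_encode(np.array([0.9, 1.2]), C))  # -> 1 (code vector [1.0, 1.0])
```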
In multi-dimensional signal processing, X may be taken to be a pixel macroblock that is quantised under the operation Q into a finite codebook. The latter is generated once, and a copy is provided to both the encoder and the decoder. It is then sufficient merely to store or transmit the output of the codebook in order to represent any source vector. The technique operates as a pattern matching algorithm. It is well known in the engineering literature and is an integral part of MPEG's repertoire of routines. In this exemplary embodiment of the present invention, Q is reinterpreted as the hash function h(X) and, together with an appropriately sized two-dimensional array, enables the implementation of a hash table.
Referring to Figure 2 of the drawings, a source vector X is mapped by Q into a bucket, and occupies a unique, but arbitrary, slot position. Each bucket therefore holds all the source vectors that are sufficiently close to the appropriate code vector which is their quantised representation within the source regions P_i. With this interpretation, and assuming that there are no restrictions on the size of the hash table, it is possible to represent the entire domain of source vectors completely accurately, in spite of the fact that Q is usually a dimensionality-reduction operator. In other words, the combination of VQ and a hash table loaded in the manner described above provides a way of achieving non-lossy representation of a source frame. This combined structure will be hereinafter referred to as a 'Vector Quantised Hash Table' or VQHT.
In order to support motion compensation, MPEG classifies video frames into three categories as follows:
1. I(ntra)-frames, which are independently coded without reference to any other frames.
2. P(redicted)-frames, which exploit motion compensation in order to improve compression. A predicted frame is coded with reference to a preceding I- or P-frame.
3. B(idirectional)-frames, which rely upon both preceding and subsequent frames. Such frames use bidirectional interpolation between I- and P-frames, but are not used for coding other frames. They also have the highest compression efficiency.
Furthermore, MPEG specifies two parameters, N and M, which keep a count of the frame distance (i.e. number of frames) between, respectively, two successive I-frames (a span also known as a GOP or 'Group of Pictures') and two successive P-frames. Typically, the boundary between GOPs is dictated by a scene cut; hence, N is a function of the number of such cuts in a video. M, however, is not defined by MPEG, and is left to the discretion of the encoder.
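For illustration, the frame-type pattern implied by particular choices of N and M can be generated as follows; the values N=12, M=3 are common in practice but are assumptions here, since M in particular is left to the encoder.

```python
def gop_pattern(N=12, M=3):
    """Return one GOP's frame types: an I-frame every N frames, an
    anchor (I- or P-) frame every M frames, B-frames in between."""
    return "".join(
        "I" if i % N == 0 else ("P" if i % M == 0 else "B")
        for i in range(N)
    )

print(gop_pattern())  # -> IBBPBBPBBPBB
```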
The generation of P-frames is crucial for efficient coding, but is also the most expensive part of MPEG, since motion estimation is directly involved. The decoding process uses a macroblock and a motion vector to reconstruct a P-frame, based on a closest match search of the preceding frame. Note that the use of the word 'preceding' does not imply frame adjacency, since B-frames typically interleave I- and P-frames.
In addition, MPEG does not specify how a closest match should be implemented; encoders have the task therefore of minimising the difference between a predicted and an actual macroblock.
In the following, the concept of performing motion estimation and compensation using the VQHT technique discussed above is described. For simplicity, both forward-predicted and bidirectionally-predicted frames are referred to as P-frames in the following description. For a given GOP, the process begins by encoding an I-frame (or a P-frame from which a subsequent P-frame is to be deduced) into a VQHT. As described above, this provides a complete and non-lossy representation of an I-frame. From an implementation perspective, encoding involves a two-stage process:
1. Codebook generation, in which a decision is made on the number of bucket entries L in the codebook C. Representative code vectors from the I-frame are computed (using any of the standard VQ training algorithms) and stored in C. Clearly, the larger the codebook, the smaller the quantisation error during encoding and look-up. It follows, therefore, that the minimum bound on L should be at least equal to the maximum number of motion vectors that any subsequent predicted frame will require. Thus, video fidelity becomes a function of the codebook size, as well as the size of the hash table.
2. Hash Table loading, in which the VQHT bucket slots are filled up by feeding every possible source vector (macroblock) from the I-frame through the hash function and storing it (together with its co-ordinates) in its appropriate bucket. Bucket slots are filled up sequentially in this manner.
Clearly, the above two processes must be performed exactly once for a given GOP. The resulting VQHT structure must be made available to the encoder.
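A software sketch of this two-stage construction is given below. Proper VQ training (e.g. K-means) is replaced by simple subsampling of the I-frame macroblocks, so the function is an illustrative assumption rather than the patent's prescribed implementation.

```python
import numpy as np

def build_vqht(i_frame, N=8, L=64):
    """Stage 1: generate a codebook of up to L code vectors from the
    I-frame. Stage 2: hash every I-frame macroblock into its bucket,
    storing the block together with its (x, y) coordinates."""
    H, W = i_frame.shape
    blocks = [
        (i_frame[y:y + N, x:x + N].astype(float).flatten(), (x, y))
        for y in range(0, H - N + 1, N)
        for x in range(0, W - N + 1, N)
    ]
    # Stage 1: codebook generation (subsampling stands in for training).
    step = max(1, len(blocks) // L)
    codebook = np.array([b[0] for b in blocks[::step]][:L])
    # Stage 2: hash table loading, with Q acting as the hash function.
    buckets = [[] for _ in range(len(codebook))]
    for vec, xy in blocks:
        idx = int(np.argmin(np.linalg.norm(codebook - vec, axis=1)))
        buckets[idx].append((vec, xy))
    return codebook, buckets
```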
To encode a subsequent P-frame, a set of motion vectors is required for those macroblocks which will be predicted during the decoding stage. The generation of motion vectors using the VQHT involves the simple act of a hash table lookup. The P-frame macroblock whose motion vector is required is hashed directly into a bucket entry. The corresponding motion vector is then obtained simply by searching all slots for that I-frame macroblock which minimises a distance metric. The co-ordinate difference between the P-frame and I-frame macroblocks so found defines the motion vector. This can now be DCT-encoded before being transmitted to the decoder in the usual MPEG manner.
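Continuing the sketch above, a motion vector for a P-frame macroblock is then obtained by one hash evaluation and a bucket scan; the L1 distortion measure and the fall-back for an empty bucket are assumptions of this sketch.

```python
import numpy as np

def motion_vector(p_block, xy, codebook, buckets):
    """Hash the P-frame macroblock into its bucket, pick the stored
    I-frame macroblock with the least distortion, and return the
    coordinate difference as the motion vector."""
    vec = p_block.astype(float).flatten()
    idx = int(np.argmin(np.linalg.norm(codebook - vec, axis=1)))
    if not buckets[idx]:
        return None  # empty bucket: no match available for prediction
    best_vec, best_xy = min(
        buckets[idx], key=lambda entry: np.sum(np.abs(entry[0] - vec))
    )
    return (xy[0] - best_xy[0], xy[1] - best_xy[1])
```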
The encoder structures required in a hardware implementation of VQH can be partitioned into pre-processing and postprocessing stages. For pre-processing, all that is required is a vector quantiser (which is normally a part of MPEG anyway) and some local buffer memory which stores the buckets and slots comprising the VQHT. It is possible to construct control logic that will directly fill up the VQHT from the vector quantiser's output when it is given an I-frame to encode. This, however, could also be done in software without incurring a significant performance penalty.
In BMA, the generation of motion vectors from subsequent P-frames is, as noted above, a computationally intensive task. Several high-throughput systolic designs have been suggested and implemented in order to achieve this.
In the VQH approach, it is possible to design, for post-processing purposes, efficient dataflow hardware which will give rise to a high-performance motion vector generation engine. Such a design is illustrated in Figure 3 of the drawings, and consists of the following:
* A shift register array, which takes as its input a linearised macroblock from a P-frame that is to be encoded. The geometry of this array is arranged such that the outputs are simply equal to the inputs, but with each component staggered by one computational cycle from its predecessor;
* A codebook buffer, which contains the code vectors which will be filled in by the hardware vector quantiser;
* A VQHT buffer, which contains the representation of the I-frame, and is filled in during the pre-processing stage;
* A systolic sorter: the P-frame macroblock that is to be encoded needs to be hashed into its appropriate bucket, and the corresponding I-frame macroblock with the least distance metric needs to be found. For this reason, a systolic sorter is included, the function of which is two-fold. Firstly, it sorts the output metrics from the codebook array in order to find the corresponding bucket. Secondly, it sorts the output metrics from the bucket in order to find those with the least distortion;
* An array of comparators: the distortions from the sorter array need not be unique, particularly if the macroblock represents a region of low spatial gradient (i.e. minimum motion). Thus it is necessary to compare the sorted outputs with each other in order to tag all those which are equal. The comparator array performs this task; and
* A mean absolute differencer: at this stage, there exists a set of bucket entries which have an identical (and minimum) distance metric between the P-frame macroblock to be encoded and the I-frame. It now remains to find from these entries that unique entry which minimises the co-ordinate metric. The two-dimensional differencer performs this task. It takes as its input the (x, y) coordinates of the P-frame macroblock to be encoded as well as the outputs from the comparator array. It then performs a metric computation (an L2-norm) between this coordinate and the coordinates of all candidate I-frame macroblocks. The resulting calculation tags the coordinates of the best-matching I-frame macroblock.
In BMA, to compare an N x N macroblock once requires O(N^2) operations (the O-notation is well known in complexity analysis, and provides a way of expressing an upper bound). The number of such macroblock searches required within a search window is, from Figure 1, (2w+1)^2 ≈ O(w^2). Consider a square frame of dimensions A x A pixels. Since N+2w=A, we have O(w^2) ≡ O(A^2). If an exhaustive search for a motion vector is carried out over the entire frame, a total of P_BMA = O(N^2 A^2) = O(N^2)·O(A^2) operations is required for every P-frame. The function grows relatively rapidly, as shown in Figure 4 of the drawings.
In the VQHT approach to motion estimation according to this exemplary embodiment of the present invention, it is necessary to factor in the initial, but one-off, cost of generating the codebook at the start of a GOP. Using the convention illustrated in Figure 2, if there are L bucket entries, with the largest bucket having at most M slots, then a VQHT training and loading algorithm based on K-means clustering can be shown to require O(LM) operations. (Note: M is bounded by A^2, as explained above, but it is realistic to expect that M < A^2.) The look-up for generating a motion vector requires simply O(L) operations followed by at most O(M) slot searches. This gives a total of P_VQHT = O(LM) operations to set up a GOP, followed by O(L) + O(M) operations for every P-frame that is subsequently encoded using it. With standard VQ, the greater the codebook size, the more accurate is the quantised representation. However, with the VQHT of the present invention, the reduction in accuracy entailed by small values of L is compensated for by an increase in the maximum slot size M. The extremal cases are simply L=A^2, M=1 against L=1, M=A^2. By choosing mid-point values L=M=(1/2)A, we obtain P_VQHT = O(A^2) + O(A).
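The following arithmetic makes the comparison concrete for one illustrative frame size (the numbers are examples, not figures from the patent):

```python
# Illustrative operation counts for an A x A frame with N x N macroblocks.
A, N = 352, 16
L = M = A // 2                     # the mid-point choice L = M = (1/2)A

p_bma = (N ** 2) * (A ** 2)        # P_BMA = O(N^2 A^2) per P-frame
p_vqht_setup = L * M               # one-off O(LM) cost per GOP
p_vqht_frame = L + M               # O(L) + O(M) per P-frame

print(p_bma)                       # 31719424
print(p_vqht_setup, p_vqht_frame)  # 30976 352
```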
Thus, in the above description a new method of finding the closest match in video compression is presented based on two new ideas, namely the use of a hash table for storing the motion vectors, and the use of vector quantisation (VQ) as an indexing method for a hash table. A systolic architecture is also proposed for implementing the described algorithm in hardware.
An embodiment of the present invention has been described above by way of example only and it will be apparent to persons skilled in the art that modifications and variations can be made to the described embodiment without departing from the scope of the invention.

Claims

CLAIMS:
1. A method of compressing image data comprising the steps of generating a set of motion vectors representative of one or more image frames, generating, by means of a predetermined hash function, a set of hash values corresponding to said motion vectors, and storing as a code book said hash values in the form of a table or array.
2. A method according to claim 1, including the step of using vector quantisation to index the hash values stored in said table or array for retrieval thereof by decoding means.
3. Apparatus for compressing image data comprising means for generating a set of motion vectors representative of one or more image frames, means for generating, using a predetermined hash function, a set of hash values corresponding to said motion vectors, and code book means for storing said hash values in the form of a table or array.
4. Apparatus according to claim 3, further comprising vector quantisation means for indexing the hash values stored in said table or array so that they can be retrieved.
5. A method of compressing image data comprising the steps of generating a set of motion vectors representative of one or more image frames, storing as a code book data representative of said motion vectors in the form of a table or array, and using vector quantisation to index the data stored in said table or array for retrieval of said data by decoding means.
6. A method according to claim 5, wherein said data representative of said motion vectors comprises a set of hash values corresponding to said motion vectors, said hash values being generated by means of a predetermined hash function.
7. Apparatus for compressing image data comprising means for generating a set of motion vectors representative of one or more image frames, code book means for storing data representative of said motion vectors in the form of a table or array, and vector quantisation means for indexing the data stored in said table or array so that it can be retrieved.
8. Apparatus according to claim 7, wherein said data representative of said motion vectors comprises a set of hash values corresponding to said motion vectors, said hash values being generated by means of a predetermined hash function.
9. A method or apparatus according to any one of the preceding claims, wherein said vector quantisation is a mapping Q such that:
Q : R^N → C
where C is an L-dimensional set, L < N, such that C = {Y_1, ..., Y_L}, and the Y_i ∈ R^N for i = 1, ..., L.
10. A method or apparatus according to claim 9, wherein Q is interpreted as the hash function h = h(X) (where X ∈ R^N).
PCT/GB2002/002236 | 2001-05-14 | 2002-05-14 | Image compression and transmission | WO2002093934A1 (en)

Applications Claiming Priority (2)

Application Number | Priority Date | Filing Date | Title
GB0111627.6 | 2001-05-14
GB0111627A (GB2375673A) | 2001-05-14 | 2001-05-14 | Image compression method using a table of hash values corresponding to motion vectors

Publications (1)

Publication Number | Publication Date
WO2002093934A1 (en) | 2002-11-21

Family

ID=9914510

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
PCT/GB2002/002236 (WO2002093934A1) | Image compression and transmission | 2001-05-14 | 2002-05-14

Country Status (2)

Country | Link
GB (1) | GB2375673A (en)
WO (1) | WO2002093934A1 (en)


Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US4979039A (en)* | 1989-01-30 | 1990-12-18 | Information Technologies Research Inc. | Method and apparatus for vector quantization by hashing
EP0576765A1 (en)* | 1992-06-30 | 1994-01-05 | International Business Machines Corporation | Method for coding digital data using vector quantizing techniques and device for implementing said method
US5832131A (en)* | 1995-05-03 | 1998-11-03 | National Semiconductor Corporation | Hashing-based vector quantization

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
EP0731614A2 (en)* | 1995-03-10 | 1996-09-11 | Kabushiki Kaisha Toshiba | Video coding/decoding apparatus

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
CHOO C Y ET AL: "A hashing-based scheme for organizing vector quantization codebook", ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 1995. ICASSP-95., 1995 INTERNATIONAL CONFERENCE ON DETROIT, MI, USA 9-12 MAY 1995, NEW YORK, NY, USA,IEEE, US, 9 May 1995 (1995-05-09), pages 2495 - 2498, XP010151838, ISBN: 0-7803-2431-5*
CHOO C Y ET AL: "Interframe hierarchical vector quantization using hashing-based reorganized codebook", CODING AND SIGNAL PROCESSING FOR INFORMATION STORAGE, PHILADELPHIA, PA, USA, 23-24 OCT. 1995, vol. 2605, Proceedings of the SPIE - The International Society for Optical Engineering, 1995, SPIE-Int. Soc. Opt. Eng, USA, pages 151 - 157, XP001089677, ISSN: 0277-786X*
KRUSE S-M: "SCENE SEGMENTATION FROM DENSE DISPLACEMENT VECTOR FIELDS USING RANDOMIZED HOUGH TRANSFORM", SIGNAL PROCESSING. IMAGE COMMUNICATION, ELSEVIER SCIENCE PUBLISHERS, AMSTERDAM, NL, vol. 9, no. 1, 1 November 1996 (1996-11-01), pages 29 - 41, XP000630907, ISSN: 0923-5965*
LEI XU ET AL: "RANDOMIZED HOUGH TRANSFORM (RHT): BASIC MECHANISMS, ALGORITHMS, AND COMPUTATIONAL COMPLEXITIES", CVGIP IMAGE UNDERSTANDING, ACADEMIC PRESS, DULUTH, MA, US, vol. 57, no. 2, 1 March 1993 (1993-03-01), pages 131 - 154, XP000226771, ISSN: 1049-9660*

Cited By (38)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US8295617B2 (en) | 2008-05-19 | 2012-10-23 | Citrix Systems, Inc. | Systems and methods for enhanced image encoding
WO2009143120A3 (en)* | 2008-05-19 | 2010-04-01 | Citrix Systems, Inc. | Systems and methods for enhanced image encoding
US10264290B2 (en) | 2013-10-25 | 2019-04-16 | Microsoft Technology Licensing, Llc | Hash-based block matching in video and image coding
EP3061233A4 (en)* | 2013-10-25 | 2016-10-12 | Microsoft Technology Licensing Llc | Representing blocks with hash values in video and image coding and decoding
US11076171B2 (en) | 2013-10-25 | 2021-07-27 | Microsoft Technology Licensing, Llc | Representing blocks with hash values in video and image coding and decoding
US10567754B2 (en) | 2014-03-04 | 2020-02-18 | Microsoft Technology Licensing, Llc | Hash table construction and availability checking for hash-based block matching
US10368092B2 (en) | 2014-03-04 | 2019-07-30 | Microsoft Technology Licensing, Llc | Encoder-side decisions for block flipping and skip mode in intra block copy prediction
CN106105197B (en) | 2014-03-17 | 2019-01-15 | 高通股份有限公司 | For the encoder searches based on hash of intra block duplication
US9715559B2 (en) | 2014-03-17 | 2017-07-25 | Qualcomm Incorporated | Hash-based encoder search for intra block copy
CN106105197A (en)* | 2014-03-17 | 2016-11-09 | 高通股份有限公司 | The encoder searches based on hash replicating for intra block
WO2015142829A1 (en)* | 2014-03-17 | 2015-09-24 | Qualcomm Incorporated | Hash-based encoder search for intra block copy
US9858922B2 (en) | 2014-06-23 | 2018-01-02 | Google Inc. | Caching speech recognition scores
US10681372B2 (en) | 2014-06-23 | 2020-06-09 | Microsoft Technology Licensing, Llc | Encoder decisions based on results of hash-based block matching
TWI548266B (en)* | 2014-06-24 | 2016-09-01 | 愛爾達科技股份有限公司 | Multimedia file storage system and related devices
US11025923B2 (en) | 2014-09-30 | 2021-06-01 | Microsoft Technology Licensing, Llc | Hash-based encoder decisions for video coding
US10204619B2 (en) | 2014-10-22 | 2019-02-12 | Google Llc | Speech recognition using associative mapping
US9786270B2 (en) | 2015-07-09 | 2017-10-10 | Google Inc. | Generating acoustic models
US10803855B1 (en) | 2015-12-31 | 2020-10-13 | Google Llc | Training acoustic models using connectionist temporal classification
US11769493B2 (en) | 2015-12-31 | 2023-09-26 | Google Llc | Training acoustic models using connectionist temporal classification
US11341958B2 (en) | 2015-12-31 | 2022-05-24 | Google Llc | Training acoustic models using connectionist temporal classification
US10229672B1 (en) | 2015-12-31 | 2019-03-12 | Google Llc | Training acoustic models using connectionist temporal classification
US11594230B2 (en) | 2016-07-15 | 2023-02-28 | Google Llc | Speaker verification
US11017784B2 (en) | 2016-07-15 | 2021-05-25 | Google Llc | Speaker verification across locations, languages, and/or dialects
US10403291B2 (en) | 2016-07-15 | 2019-09-03 | Google Llc | Improving speaker verification across locations, languages, and/or dialects
US10390039B2 (en) | 2016-08-31 | 2019-08-20 | Microsoft Technology Licensing, Llc | Motion estimation for screen remoting scenarios
US11095877B2 (en) | 2016-11-30 | 2021-08-17 | Microsoft Technology Licensing, Llc | Local hash-based motion estimation for screen remoting scenarios
EP3723370A1 (en)* | 2017-04-21 | 2020-10-14 | Zenimax Media Inc. | Player input motion compensation by anticipating motion vectors
US11695951B2 (en) | 2017-04-21 | 2023-07-04 | Zenimax Media Inc. | Systems and methods for player input motion compensation by anticipating motion vectors and/or caching repetitive motion vectors
US11323740B2 (en) | 2017-04-21 | 2022-05-03 | Zenimax Media Inc. | Systems and methods for player input motion compensation by anticipating motion vectors and/or caching repetitive motion vectors
US11330291B2 (en) | 2017-04-21 | 2022-05-10 | Zenimax Media Inc. | Systems and methods for player input motion compensation by anticipating motion vectors and/or caching repetitive motion vectors
EP3723045A1 (en)* | 2017-04-21 | 2020-10-14 | Zenimax Media Inc. | Player input motion compensation by anticipating motion vectors
US11503332B2 (en) | 2017-04-21 | 2022-11-15 | Zenimax Media Inc. | Systems and methods for player input motion compensation by anticipating motion vectors and/or caching repetitive motion vectors
US11533504B2 (en) | 2017-04-21 | 2022-12-20 | Zenimax Media Inc. | Systems and methods for player input motion compensation by anticipating motion vectors and/or caching repetitive motion vectors
EP3613014A4 (en)* | 2017-04-21 | 2020-07-22 | Zenimax Media Inc. | Compensation of a player's input motions through anticipating motion vectors
US11601670B2 (en) | 2017-04-21 | 2023-03-07 | Zenimax Media Inc. | Systems and methods for player input motion compensation by anticipating motion vectors and/or caching repetitive motion vectors
US10706840B2 (en) | 2017-08-18 | 2020-07-07 | Google Llc | Encoder-decoder models for sequence to sequence mapping
US11776531B2 (en) | 2017-08-18 | 2023-10-03 | Google Llc | Encoder-decoder models for sequence to sequence mapping
US11202085B1 (en) | 2020-06-12 | 2021-12-14 | Microsoft Technology Licensing, Llc | Low-cost hash table construction and hash-based block matching for variable-size blocks

Also Published As

Publication number | Publication date
GB2375673A (en) | 2002-11-20
GB0111627D0 (en) | 2001-07-04

Similar Documents

Publication | Title
WO2002093934A1 (en) | Image compression and transmission
JP4662636B2 (en) | Improvement of motion estimation and block matching pattern
US8761254B2 (en) | Image prediction encoding device, image prediction decoding device, image prediction encoding method, image prediction decoding method, image prediction encoding program, and image prediction decoding program
EP0615386B1 (en) | A motion vector processor for compressing video signal
US20110261886A1 (en) | Image prediction encoding device, image prediction encoding method, image prediction encoding program, image prediction decoding device, image prediction decoding method, and image prediction decoding program
KR100955396B1 (en) | Two-prediction encoding method and apparatus, two-prediction decoding method and apparatus and recording medium
US20060039470A1 (en) | Adaptive motion estimation and mode decision apparatus and method for H.264 video codec
US20070217515A1 (en) | Method for determining a search pattern for motion estimation
JP2000222587A (en) | Motion estimation using orthogonal transformation/domain block matching
JP2009510845A (en) | Multidimensional neighborhood block prediction for video encoding
CN1457604A (en) | Motion information encoding and decoding method
WO2012006304A2 (en) | Motion compensation using vector quantized interpolation filters
CN114390289B (en) | Reference pixel candidate list construction method, device, equipment and storage medium
EP1389875A2 (en) | Method for motion estimation adaptive to DCT block content
US5699129A (en) | Method and apparatus for motion vector determination range expansion
Bégaint et al. | Deep frame interpolation for video compression
EP4272444A1 (en) | A method, an apparatus and a computer program product for encoding and decoding
US20210235107A1 (en) | Efficient video motion estimation by reusing a reference search region
JP2006191642A (en) | Residual coding in compliance with video standard using non-standardized vector quantization coder
US20060104358A1 (en) | Method and apparatus for motion estimation using adaptive search pattern for video sequence compression
CN100584010C (en) | Power optimized collocated motion estimation method
US6931066B2 (en) | Motion vector selection based on a preferred point
Kommerla et al. | Real-Time Applications of Video Compression in the Field of Medical Environments
US12267484B2 (en) | Warped reference list for warped motion video coding
RU2701058C1 (en) | Method of motion compensation and device for its implementation

Legal Events

Date | Code | Title | Description
AK | Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SD SE SG SI SK SL TJ TM TN TR TT TZ UA UG US UZ VN YU ZA ZM ZW

AL | Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 | Ep: the epo has been informed by wipo that ep was designated in this application
REG | Reference to national code

Ref country code: DE

Ref legal event code: 8642

32PN | Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 69(1) EPC

122 | Ep: pct application non-entry in european phase
NENP | Non-entry into the national phase

Ref country code: JP

WWW | Wipo information: withdrawn in national office

Country of ref document: JP

