







Cross-Reference to Related Application
This application claims priority to U.S. Provisional Patent Application No. 63/290,073, filed on December 16, 2021, the content of which is incorporated herein by reference in its entirety.
Technical Field
The present disclosure relates generally to video coding and, more particularly, to methods and apparatus for efficient video coding using parallelization techniques.
Background
Unless otherwise indicated herein, the approaches described in this section are not prior art to the claims listed below and are not admitted to be prior art by inclusion in this section.
Video coding generally involves encoding a video (i.e., an original video) into a bitstream by an encoder, transmitting the bitstream to a decoder, and parsing and processing the bitstream by the decoder to produce a reconstructed video. The encoder may employ various coding modes or tools when encoding the video. One purpose of these tools is to reduce the total size of the bitstream that must be transmitted to the decoder while still giving the decoder enough information about the original video for it to generate a reconstructed video that is faithful to the original. For example, the final version of the Versatile Video Coding (VVC) standard, the state-of-the-art video coding standard published in 2020, newly defines various coding tools that achieve a coding gain (e.g., a Bjontegaard Delta-Rate gain) of about 40% over the previous-generation standard, namely the High Efficiency Video Coding (HEVC) standard published in 2013. The new coding tools provided by VVC make high-performance video coding possible and support new video use cases, such as viewport-dependent 360° video streaming, with advanced features such as region-wise random access and signal-to-noise ratio (SNR) scalability.
For example, the VVC standard includes new coding tools related to intra prediction, such as matrix-based intra prediction (MIP), chroma separate tree (CST), intra sub-partitions (ISP), and intra block copy (IBC). New coding tools related to inter prediction, such as adaptive motion vector resolution (AMVR), merge mode with motion vector difference (MMVD), combined inter/intra prediction (CIIP), and geometric partitioning mode (GPM), are also included in VVC. VVC further includes new tools that apply to both intra-picture and inter-picture prediction, such as sample adaptive offset (SAO), adaptive loop filter (ALF), cross-component adaptive loop filter (CCALF), and joint coding of chroma residuals (JCCR). In addition, VVC includes new tools related to block partitioning in the encoder, such as ternary tree partitioning (TT), binary tree plus ternary tree partitioning (BT_TT), a larger maximum coding tree unit size of 64 pixels x 64 pixels (CTU64), and a larger maximum transform unit size of 32 pixels x 32 pixels (TU32). Other newly developed video coding standards follow the same trend as VVC and include more coding tools to achieve better coding performance.
The coding tools that an encoder needs to use therefore depend on which video coding standard or standards the encoder is designed to support. As video coding standards evolve, more and more coding tools are defined in them, and a general-purpose video encoder is expected to implement a wide variety of coding tools. Consequently, for each picture or portion thereof to be encoded, it is important that the encoder can quickly determine a preferred or otherwise suitable coding tool to apply to the immediate video data, so that the desired video quality is achieved at a reasonable coding cost.
Summary
The following summary is illustrative only and is not intended to be limiting in any way. That is, the following summary is provided to introduce concepts, highlights, benefits, and advantages of the novel and non-obvious techniques described herein. Select implementations are further described below in the detailed description. Thus, the following summary is not intended to identify essential features of the claimed subject matter, nor is it intended for use in determining the scope of the claimed subject matter.
An objective of the present disclosure is to provide schemes, concepts, designs, techniques, methods, and apparatus pertaining to video coding using parallelization techniques. It is believed that the various embodiments in the present disclosure achieve benefits including improved coding latency, simplified search memory access, and/or reduced hardware overhead.
In one aspect, a method of encoding video data using a preferred coding tool is presented. The method may involve receiving, by a plurality of processing elements (PEs), the video data to be evaluated, with each PE configured to perform a coding efficiency evaluation for a respective coding tool. In some embodiments, each PE may be a low-complexity rate-distortion optimizer (LC-RDO). The method may subsequently involve calculating, by each of the PEs performing the coding efficiency evaluation, a respective figure of merit (FOM) that is specific to the respective coding tool and the video data. In some embodiments, the FOM may be a sum of squared differences (SSD), a sum of absolute differences (SAD), or a sum of absolute transformed differences (SATD). The method may also involve determining a coding tool specific to the video data by comparing the FOMs calculated by the PEs. In some embodiments, the method may further involve determining a set of parameter settings pertaining to the determined coding tool. Finally, the method may involve encoding the video data using the determined coding tool and parameter settings.
In some embodiments, the video data may be a coding block (CB) that is divided into a plurality of sub-blocks forming an array of columns and rows. When receiving the video data, each PE may receive several consecutive sub-blocks at a time. The number of sub-blocks that each PE receives at a time may equal the number of PEs involved, i.e., the number of coding tools being evaluated. In some embodiments, the PEs may receive and process the video data using a serpentine-scan processing order that proceeds by column or by row.
In some embodiments, the sub-blocks may be stored in a cache memory having a plurality of memory banks. The banks may be divided into two groups, each of which may have as many banks as there are PEs. In an event that the PEs receive the sub-blocks through a column-wise serpentine scan or raster scan, any two adjacent columns of sub-blocks are stored in the two groups of banks, respectively. In an event that the PEs receive the sub-blocks through a row-wise serpentine scan or raster scan, any two adjacent rows of sub-blocks are stored in the two groups of banks, respectively.
In another aspect, an apparatus is presented that comprises a cache memory, a processor, a plurality of processing elements (PEs), and a comparator. The processor is configured to store video data in the cache memory according to a bank allocation scheme that is specific to the video data. The processor determines the bank allocation scheme based on various factors, such as the size of a coding block of the video data, the size of a sub-block of the video data, the number of PEs operating concurrently in a time-interleaved manner, and the scan order (e.g., raster scan or serpentine scan) used to process the sub-blocks. Each PE is configured to apply a respective coding mode or coding tool to the video data and subsequently determine its coding efficiency by calculating a figure of merit (FOM) such as a sum of squared differences (SSD), a sum of absolute differences (SAD), or a sum of absolute transformed differences (SATD). The comparator is configured to compare the FOMs calculated by the PEs and thereby determine the coding tool.
Brief Description of the Drawings
The accompanying drawings are included to provide a further understanding of the present disclosure and are incorporated in and constitute a part of the present disclosure. The drawings illustrate implementations of the present disclosure and, together with the description, serve to explain the principles of the present disclosure. It is appreciable that the drawings are not necessarily drawn to scale, as some components may be shown out of proportion to their size in an actual implementation in order to clearly illustrate the concepts of the present disclosure.
FIG. 1 is a diagram of an example design in accordance with an implementation of the present disclosure.
FIG. 2 is a diagram of an example design in accordance with an implementation of the present disclosure.
FIG. 3 is a diagram of an example design in accordance with an implementation of the present disclosure.
FIG. 4 is a diagram of an example design in accordance with an implementation of the present disclosure.
FIG. 5 is a diagram of an example design in accordance with an implementation of the present disclosure.
FIG. 6 is a diagram of an example coding efficiency evaluation apparatus in accordance with an implementation of the present disclosure.
FIG. 7 is a flowchart of an example process in accordance with an implementation of the present disclosure.
FIG. 8 is a diagram of an example electronic system in accordance with an implementation of the present disclosure.
Detailed Description
Detailed embodiments and implementations of the claimed subject matter are disclosed herein. It shall be understood, however, that the disclosed embodiments and implementations are merely illustrative of the claimed subject matter, which may be embodied in various forms. The present disclosure may be embodied in many different forms and should not be construed as limited to the exemplary embodiments and implementations set forth herein. Rather, these exemplary embodiments and implementations are provided so that the description of the present disclosure is thorough and complete and fully conveys the scope of the present disclosure to those skilled in the art. In the description below, details of well-known features and techniques may be omitted to avoid unnecessarily obscuring the presented embodiments and implementations.
Implementations in accordance with the present disclosure relate to various techniques, methods, schemes, and/or solutions pertaining to efficient parallelized video coding and search memory access. According to the present disclosure, a number of possible solutions may be implemented separately or jointly. That is, although these possible solutions may be described below separately, two or more of them may be implemented in one combination or another.
1. Parallel Coding Tool Evaluation
As described elsewhere herein above, it is important for an encoder (i.e., a video encoder) to quickly determine which coding tool is suitable for encoding the video data at hand. The encoder then encodes the video data using the determined coding tool rather than the other coding modes that it is also capable of performing. The encoder may determine that a certain coding tool is the most suitable based on various factors, such as specific properties of the video to be encoded or specific characteristics of the encoded bitstream. Moreover, different portions of the video data may be encoded using different coding tools or modes. For example, each frame of a video may be divided into non-overlapping blocks, sometimes referred to as coding blocks (CBs), and each frame may be divided into a plurality of slices, each slice having an associated group of non-overlapping blocks. The video data may be encoded in such a way that each slice (i.e., its coding blocks) is encoded with a respective coding tool.
To determine the coding tool (i.e., the most suitable coding tool for encoding the video data at hand or a slice thereof), the encoder may need to evaluate several candidate coding tools using at least a portion of the video data to be encoded. Because the coding tool must be determined quickly, the purpose of the evaluation is not to obtain fine (i.e., highly accurate) coding results. Instead, it is to obtain a coarse (i.e., less accurate) result for each candidate coding tool in a timely manner, so that the results can be compared and the coding tool determined accordingly. The encoder subsequently encodes the video data using the determined coding tool. This evaluation process is hereinafter interchangeably referred to as the "coding tool evaluation process" or the "coding efficiency evaluation process".
It is worth noting that the determined coding tool generally depends on the video data to be encoded, because a coding tool that suits one type of video data may not suit another type equally well. For example, different coding tools may be determined for video data that mainly contains natural images and for video data that mainly contains screen content.
To evaluate multiple candidate coding tools in a timely manner, the encoder may parallelize the evaluation process. That is, two or more processing elements (PEs) operate concurrently, with each PE evaluating the performance (e.g., coding efficiency) of a respective candidate coding tool. FIG. 1 is a diagram of an example design in accordance with an implementation of the present disclosure, wherein a parallel coding tool evaluation scheme 100 is presented. In scheme 100, the parallelization is realized by four concurrently operating PEs, namely PEs 130, 131, 132, and 133. Each of PEs 130-133 is configured to evaluate the coding efficiency of a respective coding tool as applied to video data stored in a search memory 110. For example, PE 130 is configured to perform the coding efficiency evaluation for coding tool T0, and PE 131 for coding tool T1. Likewise, PE 132 and PE 133 perform the coding efficiency evaluation for coding tools T2 and T3, respectively. Each of coding tools T0, T1, T2, and T3 may be one of the coding tools defined in VVC, HEVC, or another video coding standard, such as the VVC coding tools described elsewhere herein above.
As stated above, PEs 130-133 are intended to evaluate the efficiency of the coding tools in a timely manner. Therefore, the PEs are typically implemented with simple evaluation algorithms involving low-complexity hardware and/or software modules. For example, each of PEs 130-133 may be a low-complexity rate-distortion optimizer (LC-RDO) configured to evaluate the coding efficiency of a coding tool by performing relatively simple calculations, such as spatial pixel filtering, calculation of absolute pixel-wise differences, calculation of pixel-wise squared differences, and calculation of pixel-wise transformed differences. In general, each of PEs 130-133 may have a pipeline structure or architecture that comprises multiple processing stages. The pipeline structure is configured to process data by passing it sequentially from one stage to the next. In some embodiments, PEs 130-133 may fetch the video data incrementally from the search memory (cache) 110 for processing. For example, each of PEs 130-133 may be an LC-RDO having a pipeline structure that comprises a horizontal filtering (HFIR) stage, followed by a vertical filtering (VFIR) stage, followed by a distortion calculation (DIST) stage, followed by a comparison (COMP) stage. The LC-RDO may process the data incrementally using the pipeline stages, with each stage processing a different portion of the data during each pipeline cycle.
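Each of PEs 130-133 uses the same video data, namely video data 113 stored in search memory 110, to evaluate the coding efficiency of the respective coding tool. In some embodiments, video data 113 may comprise a coding block (CB) of a video. A coding tool 160, once determined in scheme 100, is used to encode CB 113. Coding tool 160 is determined to be one of coding tools T0, T1, T2, and T3 by a comparator 150, which is configured to compare the evaluation results generated by PEs 130-133. Each of PEs 130-133 may perform the coding efficiency evaluation by applying its respective coding tool to video data 113, thereby generating an evaluation result. For example, PE 130 may perform the coding efficiency evaluation by applying coding tool T0 to video data 113, producing an evaluation result represented by a figure of merit (FOM) 140. Similarly, PEs 131, 132, and 133 may perform the coding efficiency evaluation by applying coding tools T1, T2, and T3, respectively, to video data 113, producing evaluation results represented by FOMs 141, 142, and 143, respectively. In some embodiments, each of FOMs 140-143 may be a sum of squared differences (SSD), a sum of absolute differences (SAD), or a sum of absolute transformed differences (SATD) between the resulting encoded video and the original video data 113, with the sum taken over every pixel of video data 113. Comparator 150 may compare FOMs 140-143 and determine which of coding tools T0, T1, T2, and T3 is coding tool 160, which is later used to encode CB 113. For example, each of FOMs 140-143 may be a respective SSD value, and comparator 150 may compare FOMs 140-143 and determine that FOM 142 has the lowest value among them. Comparator 150 may accordingly determine that coding tool T2 is the coding tool 160 to be used for encoding video data 113.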
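The FOM calculation and the comparison step described above can be illustrated with a minimal Python sketch. The function names `sad`, `ssd`, and `select_tool`, the list-of-rows block representation, and the sample FOM values are assumptions made for illustration only; they do not describe the actual circuitry of PEs 130-133 or comparator 150.

```python
from typing import Dict, Sequence

Block = Sequence[Sequence[int]]  # rows of pixel values (illustrative representation)

def sad(original: Block, reconstructed: Block) -> int:
    """Sum of absolute differences over every pixel of the block."""
    return sum(abs(o - r)
               for row_o, row_r in zip(original, reconstructed)
               for o, r in zip(row_o, row_r))

def ssd(original: Block, reconstructed: Block) -> int:
    """Sum of squared differences over every pixel of the block."""
    return sum((o - r) ** 2
               for row_o, row_r in zip(original, reconstructed)
               for o, r in zip(row_o, row_r))

def select_tool(foms: Dict[str, int]) -> str:
    """Comparator step: pick the tool whose FOM is lowest (lower distortion is better)."""
    return min(foms, key=foms.get)

# Hypothetical FOM values reported by four PEs, one per candidate tool.
foms = {"T0": 120, "T1": 95, "T2": 60, "T3": 88}
chosen = select_tool(foms)
```

With the hypothetical values above, `T2` has the lowest FOM and is selected, which mirrors the example in which FOM 142 is the lowest.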
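In some embodiments, in addition to the determined coding tool 160, comparator 150 may determine a set of coding parameters to be used together with the determined coding tool 160 for encoding video data 113. To this end, some of PEs 130-133 may be configured to operate with the same coding tool but with different coding parameter settings. For example, T0 and T1 may be the same coding tool, with PEs 130 and 131 operating with different coding parameter settings applied to that tool, e.g., a first set of coding parameters and a second set of coding parameters. The resulting FOMs 140 and 141 then indicate which of the first and second sets of coding parameters is preferred. The preferred set of coding parameters is included as part of the determined coding tool 160.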
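In some embodiments, scheme 100 may involve a PE (e.g., PE 130, 131, 132, or 133) that includes a high-complexity rate-distortion optimizer (HC-RDO) in place of, or in addition to, the LC-RDO of the PE. The HC-RDO may be cascaded with the LC-RDO of the PE. Compared with an implementation in which the PE has only an LC-RDO, a PE with an HC-RDO may determine or otherwise calculate the respective FOM (i.e., FOM 140, 141, 142, or 143) with higher accuracy by means of more complex computations, albeit usually at the cost of more processing time. Owing to the higher accuracy, the coding tool 160 determined with PEs involving an HC-RDO may differ from the coding tool 160 determined with PEs involving only an LC-RDO, and may be better suited for the encoding, with enhanced coding efficiency and/or performance.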
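2. Time-Interleaved Cache Access
Search memory 110 is sometimes referred to as a "cache" or "cache memory". Cache 110 is designed as temporary storage for video data such as CB 113 during the coding tool evaluation process, in which PEs 130-133 may access cache 110 repeatedly to load different portions of CB 113. However, cache 110 cannot provide simultaneous access to every one of PEs 130-133. That is, even though scheme 100 shows that PEs 130-133 may access cache 110 through data buses 120, 121, 122, and 123, this property of cache 110 requires that, at any given time, only one of data buses 120-123 be "open", i.e., transferring data from cache 110 to one of PEs 130-133. It follows that true parallelization among PEs 130-133 is possible only if cache 110 is duplicated into multiple copies, each accessed by a respective one of PEs 130-133. Duplicating cache 110 is not an attractive solution for parallelization, because the hardware cost of the duplicate copies is high and may be impractical.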
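FIG. 2 is a diagram of an example design in accordance with an implementation of the present disclosure, wherein parallelization in a practical sense is achieved without duplicating cache 110. Specifically, FIG. 2 illustrates a time-interleaved cache access method 200, with which PEs 130-133 may operate concurrently while no more than one of data buses 120-123 is open for accessing cache 110 at any time. That is, at any given time, no more than one of PEs 130-133 may receive data (e.g., video data 113) from cache 110.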
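PEs 130-133 need not load or otherwise read the entirety of video data 113 from cache 110 before starting the coding efficiency evaluation process. Instead, PEs 130-133 may load only a portion of video data 113, such as a portion 115 of CB 113. PEs 130-133 may not need to access cache 110 to load more of CB 113 until portion 115 has been processed. Each of PEs 130-133 may have an internal memory, commonly referred to as a "line buffer", for storing the portion of video data 113 that is currently loaded. The PE may access the line buffer to retrieve that portion of video data 113 for the coding tool evaluation process. The PE may use the line buffer to hold or otherwise store the portion of video data 113 until the cache access window opens again, at which time the next portion of video data 113 is loaded. The line buffer may then be replenished with the newly loaded portion of video data 113.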
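In some embodiments, CB 113 may be divided into a plurality of non-overlapping sub-blocks, typically of the same size (e.g., 4 pixels in height and 4 pixels in width). That is, the sub-blocks of CB 113 may form an array of columns and rows of CB 113. Portion 115 of CB 113 may comprise several sub-blocks, e.g., those labeled "0", "1", "2", "3", "4", and "5". In addition, as described elsewhere herein above, each of PEs 130-133 may be an LC-RDO pipeline consisting of an HFIR stage, a VFIR stage, a DIST stage, and a COMP stage. Data passes through the stages of the LC-RDO pipeline, being processed first by the HFIR stage, then by the VFIR stage, then by the DIST stage, and finally by the COMP stage. FIG. 2 provides a timeline 299 that shows the progression of the first 13 pipeline cycles, i.e., pipeline cycles 1-13.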
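Referring to FIG. 2, PEs 130-133 access cache 110 in a time-interleaved manner. For example, during the first pipeline cycle it is the turn of PE 130 to access cache 110 (indicated by the "read" stage in the figure), during which PE 130 loads sub-block "0" of CB 113. PEs 131-133 load sub-block "0" of CB 113 in the following three pipeline cycles, i.e., the second, third, and fourth pipeline cycles, respectively. After PEs 131-133 have loaded sub-block "0" in turn, PE 130 accesses cache 110 again in the fifth pipeline cycle, during which PE 130 loads the next sub-block (i.e., sub-block "1" of CB 113). Likewise, PEs 131-133 load sub-block "1" of CB 113 in the following three pipeline cycles (i.e., the sixth, seventh, and eighth), respectively. After PEs 131-133 have loaded sub-block "1" in turn, PE 130 once again has its turn to access cache 110 in the ninth pipeline cycle, during which PE 130 loads the next sub-block (i.e., sub-block "2" of CB 113). PEs 131-133 load sub-block "2" of CB 113 in the following three pipeline cycles (i.e., the tenth, eleventh, and twelfth), respectively.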
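Since the accesses of PEs 130-133 to cache 110 are time-interleaved, the processing of the loaded sub-blocks within PEs 130-133 is also time-interleaved, owing to the pipelined nature of the PEs. For example, PE 130 finishes processing sub-block "0" of CB 113 at the end of the fourth pipeline cycle (when the COMP stage of PE 130 finishes processing sub-block "0"), whereas PEs 131, 132, and 133 finish processing sub-block "0" of CB 113 at the end of the fifth, sixth, and seventh pipeline cycles, respectively.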
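According to time-interleaved cache access method 200, at any given time at most one of PEs 130-133 loads sub-block data from cache 110. Parallelization scheme 100 can therefore be realized with only one copy of cache 110 by adopting time-interleaved cache access method 200. However, method 200 results in very low PE utilization. As shown in FIG. 2, the PE pipeline stages are idle (i.e., processing no data) during most pipeline cycles. With four PEs operating in parallel, time-interleaved cache access method 200 results in a PE utilization of approximately 25%. When more than four PEs are involved in the parallelization scheme, time-interleaved cache access method 200 results in an even lower PE utilization.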
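The schedule of method 200 and the resulting utilization of roughly 25% can be reproduced with a small Python model. This is only a sketch under stated assumptions: PEs are indexed 0-3 (standing for PE 130-133), pipeline cycles are 1-based, and a sub-block is assumed to leave the COMP stage three cycles after the cycle in which it is read, which matches the completion cycles quoted above. The function names are hypothetical.

```python
NUM_PES = 4         # PE 130-133 in scheme 100
PIPELINE_DEPTH = 4  # assumption: cycles from the read cycle through the end of COMP

def read_cycle(pe: int, sub_block: int, num_pes: int = NUM_PES) -> int:
    """1-based pipeline cycle in which PE `pe` loads `sub_block` under method 200."""
    return sub_block * num_pes + pe + 1

def finish_cycle(pe: int, sub_block: int, num_pes: int = NUM_PES,
                 depth: int = PIPELINE_DEPTH) -> int:
    """Cycle at whose end PE `pe` finishes processing `sub_block`."""
    return read_cycle(pe, sub_block, num_pes) + depth - 1

def utilization(num_sub_blocks: int, num_pes: int = NUM_PES) -> float:
    """Fraction of cycles in which a given pipeline stage of one PE is busy."""
    busy_cycles = num_sub_blocks               # one sub-block per cache window
    total_cycles = num_sub_blocks * num_pes    # one window every num_pes cycles
    return busy_cycles / total_cycles

# Every (PE, sub-block) pair reads in a distinct cycle, so only one bus is open at a time.
reads = [read_cycle(pe, sb) for sb in range(3) for pe in range(NUM_PES)]
```

In this model the utilization is simply the reciprocal of the number of PEs, which is why adding a fifth PE would lower it to about 20%.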
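FIG. 3 is a diagram of an example design in accordance with an implementation of the present disclosure, wherein another time-interleaved cache access method (i.e., method 300) is shown that greatly improves on the low PE utilization of method 200. As shown in FIG. 3, the PEs are idle far less often than in method 200. In fact, after a number of pipeline cycles, the PE utilization of method 300 approaches 100%. Most of the PE idle time of method 200 is eliminated in method 300 by loading more than one sub-block of CB 113 in each cache access window. For example, whereas method 200 has PE 130 load only sub-block "0" of CB 113 from cache 110 during the first pipeline cycle, method 300 loads four sub-blocks, namely sub-blocks "0", "1", "2", and "3" of CB 113, during the first pipeline cycle. Given that sub-blocks "1", "2", and "3" are loaded and saved to the line buffer of PE 130 in the same pipeline cycle as sub-block "0", the pipeline operations of PE 130 on sub-blocks "1", "2", and "3" can start earlier. For example, PE 130 may start processing sub-block "1" as early as the second pipeline cycle and finish in the fifth pipeline cycle, three pipeline cycles earlier than in method 200. The completion of sub-block "2" by PE 130 is advanced even further, from the twelfth pipeline cycle in FIG. 2 to the sixth pipeline cycle in FIG. 3.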
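Specifically, according to time-interleaved cache access method 300, each of PEs 130-133 receives video data 113 in batches of sub-blocks. Each batch has multiple sub-blocks of video data 113, and the number of sub-blocks in each batch equals the number of PEs operating in parallel in the coding efficiency evaluation process. For example, four PEs (i.e., PEs 130-133) operate in parallel in the coding efficiency evaluation process of parallelization scheme 100. Each of the four PEs is therefore required to load a batch of four sub-blocks of CB 113 (e.g., sub-blocks "0-3", sub-blocks "4-7", or sub-blocks "8-11") each time its time window for accessing cache 110 opens, as shown in time-interleaved cache access method 300.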
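The effect of batch loading on the schedule can be sketched by generalizing the cycle arithmetic to a configurable batch size. The model below is an illustrative assumption, not the disclosed hardware: PEs are indexed from 0, cycles are 1-based, each PE gets one cache window every `num_pes` cycles, the sub-blocks of a batch enter the pipeline on consecutive cycles starting with the read cycle, and a sub-block finishes three cycles after it enters. The model is meaningful for a batch size of 1 (method 200) or a batch size equal to the number of PEs (method 300).

```python
def enter_cycle(pe: int, sub_block: int, num_pes: int = 4, batch_size: int = 4) -> int:
    """Cycle in which `sub_block` enters the pipeline of PE `pe`."""
    batch, offset = divmod(sub_block, batch_size)
    window = batch * num_pes + pe + 1  # cycle in which this PE's cache window opens
    return window + offset             # batch members enter on consecutive cycles

def finish_cycle(pe: int, sub_block: int, num_pes: int = 4,
                 batch_size: int = 4, depth: int = 4) -> int:
    """Cycle at whose end `sub_block` leaves the last (COMP) stage of PE `pe`."""
    return enter_cycle(pe, sub_block, num_pes, batch_size) + depth - 1

# PE 0 (standing for PE 130): completion cycles for sub-blocks "1" and "2".
method_300 = [finish_cycle(0, sb, batch_size=4) for sb in (1, 2)]
method_200 = [finish_cycle(0, sb, batch_size=1) for sb in (1, 2)]

# With batches of four, a PE accepts a new sub-block on every cycle once it has started.
pe1_entries = [enter_cycle(1, sb) for sb in range(12)]
```

Because the batch size equals the number of PEs, each PE has exactly enough buffered sub-blocks to stay busy until its next cache window, which is why the utilization approaches 100%.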
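In some embodiments, cache 110 may be divided into several "banks" (i.e., memory banks). The number of banks is an important parameter of a cache, as it represents the number of data entries that can be read from or written to the cache simultaneously. Specifically, at any time, at most one data entry can be read from or written to a given bank. Given that each of PEs 130-133 is expected to receive four sub-blocks of CB 113 within one pipeline cycle, cache 110 needs to have at least four banks, with the four sub-blocks of a batch fetched in one pipeline cycle stored in four separate banks. As described elsewhere herein below, considerations such as the minimum number of banks that cache 110 must have, and which sub-blocks of video data 113 are stored in which banks, are important design parameters in implementing parallel coding tool evaluation scheme 100 in combination with time-interleaved cache access method 300.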
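3. Sub-Block Scanning Order
As described elsewhere herein above, a coding block may be divided into a plurality of sub-blocks such that the sub-blocks form an array of columns and rows of the coding block. FIG. 4 is a diagram of an example design in accordance with an implementation of the present disclosure, wherein CB 113 is divided into non-overlapping sub-blocks that form an array of columns and rows. Specifically, CB 113 as shown in FIG. 4 has a size of 32 pixels in width and 32 pixels in height, whereas each sub-block has a size of 4 x 4 pixels. CB 113 is therefore divided into 64 sub-blocks, as shown in each of diagrams 411, 412, 451, and 452.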
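According to time-interleaved cache access method 300, each of PEs 130-133 is designed to load or otherwise receive the sub-blocks of CB 113 in batches, each batch containing four consecutive sub-blocks of CB 113. FIG. 4 illustrates two types of scan order that PEs 130-133 may use to receive the sub-blocks of CB 113. Specifically, PEs 130-133 may receive the sub-blocks of CB 113 using a scan order referred to as "raster scan", as shown in diagrams 411 and 412, or a scan order referred to as "serpentine scan", as shown in diagrams 451 and 452. A raster scan may be performed in a column-wise or a row-wise manner. The row-wise manner is shown in diagram 411, wherein PEs 130-133 load the sub-blocks in the first row of CB 113 from left to right, then the second row of CB 113, also from left to right, and so on. The column-wise manner is shown in diagram 412, wherein PEs 130-133 load the sub-blocks in the first column of CB 113 from top to bottom, then the second column of CB 113, also from top to bottom, and so on.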
Likewise, a serpentine scan may be performed in a column-wise or a row-wise manner. In a serpentine scan, the scan direction alternates with every row or column. The column-wise serpentine scan is shown in diagram 451, wherein PEs 130-133 load the sub-blocks in the first column of CB 113 from top to bottom, then the second column of CB 113 from bottom to top, then the third column of CB 113 from top to bottom again, and so on. The row-wise serpentine scan is shown in diagram 452, wherein PEs 130-133 load the sub-blocks in the first row of CB 113 from left to right, then the second row of CB 113 from right to left, then the third row of CB 113 from left to right again, and so on.
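The four scan orders described above (row-wise and column-wise, raster and serpentine) can be generated with one short Python function. The function name `scan_order`, its flags, and the `(row, column)` coordinate convention are illustrative assumptions rather than terminology from the figures.

```python
from typing import List, Tuple

def scan_order(rows: int, cols: int, serpentine: bool,
               by_column: bool) -> List[Tuple[int, int]]:
    """Return sub-block coordinates (row, col) in the order they are fetched.

    Raster scan keeps the same direction for every line; serpentine scan
    reverses the direction on every other line (row or column).
    """
    outer, inner = (cols, rows) if by_column else (rows, cols)
    order = []
    for line in range(outer):
        positions = range(inner)
        if serpentine and line % 2 == 1:
            positions = reversed(positions)  # odd lines run backwards
        for pos in positions:
            order.append((pos, line) if by_column else (line, pos))
    return order

# Column-wise serpentine scan of an 8 x 8 sub-block array (as in diagram 451),
# split into batches of four consecutive sub-blocks.
order = scan_order(8, 8, serpentine=True, by_column=True)
batches = [order[i:i + 4] for i in range(0, len(order), 4)]
```

For the 8 x 8 array, each column holds exactly two batches of four, so every batch produced here stays within a single column.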
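As described elsewhere herein above, each of PEs 130-133 needs to load a batch of four sub-blocks at a time (i.e., during one pipeline cycle) according to time-interleaved cache access method 300. As shown in FIG. 4, each column or row of CB 113 can be loaded in exactly two batches. Whether a raster scan or a serpentine scan is used, no batch of four sub-blocks straddles two columns or two rows. That is, there is no case in which the four sub-blocks of a batch fetched during one pipeline cycle are located in two adjacent columns or rows of CB 113.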
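The assignment of cache banks to the sub-blocks of FIG. 4 can also be determined easily. For example, cache bank allocation 422 may be used for the raster scan of diagram 412 and the serpentine scan of diagram 451. As shown in cache bank allocation 422, cache 110 needs to have four banks, i.e., banks "0", "1", "2", and "3". The sub-blocks of CB 113 are stored in cache 110 according to cache bank allocation 422. That is, the first and fifth sub-blocks of each column are stored in bank "0"; the second and sixth sub-blocks of each column are stored in bank "1"; the third and seventh sub-blocks of each column are stored in bank "2"; and the fourth and eighth sub-blocks of each column are stored in bank "3".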
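The four-bank allocation described above amounts to taking the position of a sub-block within its column modulo four. The Python sketch below, with the hypothetical helper names `bank_for_row` and `column_raster`, checks that every batch of four consecutive sub-blocks in a column-wise scan of the 8 x 8 array touches four different banks, which is what allows a whole batch to be read in one pipeline cycle. Bank labels follow the "0" to "3" naming used for the four banks.

```python
NUM_BANKS = 4  # banks "0".."3" of the four-bank allocation

def bank_for_row(row: int) -> int:
    """Bank holding the sub-block at 0-based position `row` within any column."""
    return row % NUM_BANKS

def column_raster(rows: int, cols: int):
    """Column-wise raster scan: each column from top to bottom, left to right."""
    return [(r, c) for c in range(cols) for r in range(rows)]

order = column_raster(8, 8)
batches = [order[i:i + NUM_BANKS] for i in range(0, len(order), NUM_BANKS)]

# A batch is conflict-free when its four sub-blocks sit in four distinct banks.
conflict_free = all(len({bank_for_row(r) for r, _ in batch}) == NUM_BANKS
                    for batch in batches)
```

The same check passes for a column-wise serpentine scan of this array, since reversing a column only reverses the order in which the four banks are visited.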
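However, for a coding block having a greater or smaller number of sub-blocks in a row or column, or for a different number of parallel PEs involved in parallel coding tool evaluation scheme 100, batches that straddle columns or rows may be unavoidable, and the corresponding cache bank allocation becomes more complicated. For these cases, the serpentine-scan processing order is preferred over the raster-scan processing order, because the cache bank allocation for a serpentine scan is relatively simple in comparison. It may be difficult to find or otherwise determine a cache bank allocation for the raster-scan processing order, because the address difference across columns or rows may vary greatly depending on the size of the coding block used. In contrast, with the serpentine-scan processing order the address difference in a column-crossing or row-crossing scenario is bounded.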
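FIG. 5 is a diagram of an example design in accordance with an implementation of the present disclosure, wherein the possible column-crossing scenarios of a serpentine scan are illustrated regardless of the size of CB 113. Specifically, diagram 540 illustrates all four possibilities of a column-crossing scenario when four PEs are involved in parallel coding tool evaluation scheme 100. As shown in diagram 540, the maximum address difference among the four possibilities equals four times the sub-block height. Similarly, diagram 530 illustrates all three possibilities of a column-crossing scenario when three PEs are involved in parallel coding tool evaluation scheme 100; the maximum address difference among the three possibilities equals three times the sub-block height. Likewise, diagram 550 illustrates all five possibilities of a column-crossing scenario when five PEs are involved in parallel coding tool evaluation scheme 100; the maximum address difference among the five possibilities equals five times the sub-block height.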
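Also shown in FIG. 5 are the bank allocations that correspond to the scenarios of diagrams 530, 540, and 550, namely bank allocations 532, 542, and 552 of cache 110, respectively. The banks of cache 110 may be divided into two groups, each of which may have as many banks as there are PEs. For example, in bank allocation 542, cache 110 has two groups of four banks, the first group consisting of banks "0", "1", "2", and "3" and the second group consisting of banks "4", "5", "6", and "7". The banks of the first group are assigned repeatedly, from top to bottom, to the sub-blocks of every odd-numbered column (i.e., the first, third, fifth, seventh, ninth, and eleventh columns, and so on), whereas the banks of the second group are assigned repeatedly, also from top to bottom, to the sub-blocks of every even-numbered column (i.e., the second, fourth, sixth, eighth, tenth, and twelfth columns, and so on). Any two adjacent columns of sub-blocks are thus stored in the two groups of banks, respectively. As another example, in bank allocation 552, cache 110 has two groups of five banks, the first group consisting of banks "0", "1", "2", "3", and "4" and the second group consisting of banks "5", "6", "7", "8", and "9". The banks of the first group are assigned repeatedly, from top to bottom, to the sub-blocks of every odd-numbered column, whereas the banks of the second group are assigned repeatedly, also from top to bottom, to the sub-blocks of every even-numbered column. Any two adjacent columns of sub-blocks are thus stored in the two groups of banks, respectively.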
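The two-group allocation described above can be expressed as a small formula and checked against a column-wise serpentine scan whose batches do straddle columns. In the Python sketch below, the helper names, the 0-based indexing (so the "first" column is column 0), and the 6-row example array are assumptions chosen for illustration; the single-group mapping is included only to show the bank conflict that the second group avoids.

```python
def two_group_bank(row: int, col: int, num_pes: int) -> int:
    """Bank index: group 0 serves columns 0, 2, 4, ...; group 1 serves columns 1, 3, 5, ..."""
    return (col % 2) * num_pes + (row % num_pes)

def single_group_bank(row: int, col: int, num_pes: int) -> int:
    """Naive allocation with only num_pes banks, shown for comparison."""
    return row % num_pes

def column_serpentine(rows: int, cols: int):
    """Column-wise serpentine scan: even columns top-down, odd columns bottom-up."""
    order = []
    for c in range(cols):
        col_rows = range(rows) if c % 2 == 0 else reversed(range(rows))
        order.extend((r, c) for r in col_rows)
    return order

def conflict_free(order, bank_fn, num_pes: int) -> bool:
    """True when every batch of num_pes consecutive sub-blocks uses distinct banks."""
    batches = [order[i:i + num_pes] for i in range(0, len(order), num_pes)]
    return all(len({bank_fn(r, c, num_pes) for r, c in b}) == len(b) for b in batches)

# Six sub-blocks per column with four PEs: the second batch straddles columns 0 and 1.
order = column_serpentine(rows=6, cols=4)
ok_two_groups = conflict_free(order, two_group_bank, num_pes=4)
ok_single_group = conflict_free(order, single_group_bank, num_pes=4)
```

Here the straddling batch covers rows 4 and 5 of one column and rows 5 and 4 of the next, which collide under the single-group mapping but fall into different groups under the two-group mapping.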
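4. Illustrative Implementations
FIG. 6 illustrates an example apparatus 600 capable of evaluating the coding efficiency of multiple coding tools using the parallelization methods described above. As shown, apparatus 600 receives video data 601 for evaluating the coding tools and accordingly determines a coding tool 660 that is suitable for encoding video data 601. In some embodiments, apparatus 600 may also determine a setting 666 of coding parameters to be used together with the determined coding tool 660. Video data 601 may comprise coding block 113, and the determined coding tool 660 may be an embodiment of coding tool 160. Apparatus 600 may be used to implement parallel coding tool evaluation scheme 100 using time-interleaved cache access method 200 or 300.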
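As shown, apparatus 600 has several components or modules for processing video data 601 and determining coding tool 660, including at least some components selected from a processor 605, a search memory or cache 610, a plurality of processing elements (e.g., PEs 631-634), a memory 640, and a comparator 650. Cache 610 may comprise a plurality of memory banks, such as banks 611-614, each of which is capable of providing a respective data entry simultaneously with the other banks.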
In some embodiments, the modules 605-650 listed above are modules of software instructions executed by one or more processing units (e.g., a processor) of a computing device or electronic apparatus. In some embodiments, modules 605-650 are modules of hardware circuits implemented by one or more integrated circuits (ICs) of an electronic apparatus. Although modules 605-650 are illustrated as separate modules, some of the modules can be combined into a single module.
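Processor 605 is configured to receive and analyze video data 601 and thereby determine a bank allocation (e.g., bank allocation 422, 532, 542, or 552). That is, the bank allocation is specific to video data 601. Processor 605 is also configured to store the sub-blocks of video data 601 in search memory 610 according to the determined bank allocation.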
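Cache 610 may comprise a plurality of memory banks, such as banks 611, 612, 613, and 614. The number of banks may be consistent with (e.g., equal to) the number of banks indicated in the bank allocation determined by processor 605. Cache 610 may have more than the four banks shown in FIG. 6. For example, bank allocation 542 indicates eight different banks for a serpentine scan, and processor 605 may accordingly store video data 601 in eight different banks of cache 610. Cache 610 may embody search memory 110.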
Each of processing elements 631-634 may be an embodiment of one of PEs 130-133. In some embodiments, each of processing elements 631-634 may be a low-complexity RDO pipeline. In some embodiments, each of processing elements 631-634 may additionally or alternatively include a high-complexity RDO. Processing elements 631-634 may be configured to fetch a portion of video data 601 by accessing cache 610 in a time-interleaved manner (e.g., following time-interleaved method 200 or 300). The portion of video data 601 fetched at a time may comprise multiple sub-blocks of video data 601 (e.g., sub-blocks 0-3, 4-7, or 8-11 of portion 115 of CB 113). In some embodiments, each of processing elements 631-634 may include a line buffer configured to temporarily store a batch of sub-blocks fetched from cache 610 until all sub-blocks of the batch have been processed by the pipeline stages of the respective processing element.
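Each of processing elements 631-634 may also be configured to calculate a respective figure of merit (FOM) (e.g., FOM 140, 141, 142, or 143) that indicates the coding efficiency of the respective coding tool as applied to video data 601. The FOM is thus specific to the combination of the respective coding tool and video data 601. The FOM may be a sum of squared differences, a sum of absolute differences, or a sum of absolute transformed differences. The FOMs calculated by processing elements 631-634 may be stored in memory 640 and used as inputs to comparator 650. In some embodiments, processing elements 631-634 may also store the coding parameters used in calculating the FOMs. In some embodiments, each of PEs 631-634 may calculate multiple FOMs for video data 601 using the same coding tool but with different coding parameter settings. That is, in these embodiments, each calculated FOM is specific to the combination of the respective coding tool, the respective coding parameters, and video data 601. Each FOM and the corresponding coding parameter setting may be saved in memory 640.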
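Comparator 650 may be an embodiment of comparator 150 and is configured to determine coding tool 660 by comparing the FOMs calculated by processing elements 631-634 and stored in memory 640. The comparison by comparator 650 may identify a preferred FOM. For example, the preferred FOM may be the SAD having the lowest value, in which case the coding tool that results in the lowest SAD value may be determined as coding tool 660. In some embodiments, comparator 650 may also determine parameter setting 666, which may be the parameter setting used by processing elements 631-634 that results in the preferred FOM (e.g., the one having the lowest SAD value).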
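5. Illustrative Processes
FIG. 7 illustrates an example process 700 in accordance with an implementation of the present disclosure. Process 700 may represent an aspect of implementing the various proposed designs, concepts, schemes, systems, and methods described above. More specifically, process 700 may represent an aspect of the proposed concepts and schemes pertaining to determining a coding tool among multiple coding tools in accordance with the present disclosure. Process 700 may include one or more operations, actions, or functions as illustrated by one or more of blocks 710, 720, 730, and 740. Although illustrated as discrete blocks, the various blocks of process 700 may be divided into additional blocks, combined into fewer blocks, or eliminated, depending on the desired implementation. Moreover, the blocks/sub-blocks of process 700 may be executed in the order shown in FIG. 7 or in a different order. Furthermore, one or more of the blocks/sub-blocks of process 700 may be executed repeatedly or iteratively. Process 700 may be implemented by or in apparatus 600 as well as any variations thereof. Solely for illustrative purposes and without limiting the scope, process 700 is described below in the context of apparatus 600. Process 700 may begin at block 710.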
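At 710, process 700 may involve each processing element of apparatus 600 (e.g., PEs 631-634) receiving the video data (e.g., video data 113 or 601) to be evaluated in the coding efficiency evaluation. Each processing element is configured to perform the coding efficiency evaluation for a respective coding tool (e.g., coding tool T0, T1, T2, or T3 of FIG. 1). In some embodiments, the PEs of apparatus 600 receive video data 601 by accessing cache 610 in a time-interleaved manner. That is, at any time, no more than one PE of apparatus 600 may access cache 610. In some embodiments, video data 601 may comprise a coding block (CB), which may be divided into a plurality of sub-blocks forming an array of columns and rows. The PEs of apparatus 600 may receive the CB in batches of sub-blocks, each batch having multiple sub-blocks. In some embodiments, the number of sub-blocks in a batch equals the number of concurrently operating PEs of apparatus 600. In some embodiments, the sub-blocks of video data 601 may be fetched by the PEs of apparatus 600 using a serpentine scan through the columns or rows of the sub-blocks of video data 601. Process 700 may proceed from 710 to 720.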
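At 720, process 700 may involve each PE of apparatus 600 calculating a respective FOM. In some embodiments, each PE may be an LC-RDO, and the respective FOM may be a sum of squared differences (SSD), a sum of absolute differences (SAD), or a sum of absolute transformed differences (SATD). The FOMs calculated by the PEs of apparatus 600 may be stored in memory 640. In some embodiments, the coding parameters used in calculating the FOMs may also be stored in memory 640. Process 700 may proceed from 720 to 730.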
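At 730, process 700 may involve comparator 650 comparing the FOMs stored in memory 640 and accordingly determining coding tool 660, which is specific to video data 601. In some embodiments, comparator 650 may determine parameter setting 666 to be used together with the determined coding tool 660. The determined parameter setting 666 may be a set of settings comprising values of multiple coding parameters. Process 700 may proceed from 730 to 740.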
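At 740, process 700 may involve processor 605 encoding video data 601 using the determined coding tool 660. In some embodiments, processor 605 may encode video data 601 using the determined coding tool 660 together with the determined parameter setting 666.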
6. Illustrative Electronic System
Many of the above-described features and applications are implemented as software processes that are specified as a set of instructions recorded on a computer-readable storage medium (also referred to as a computer-readable medium). When these instructions are executed by one or more computing or processing units (e.g., one or more processors, cores of processors, or other processing units), they cause the processing units to perform the actions indicated in the instructions. Examples of computer-readable media include, but are not limited to, CD-ROMs, flash drives, random-access memory (RAM) chips, hard drives, erasable programmable read-only memories (EPROMs), and electrically erasable programmable read-only memories (EEPROMs). The computer-readable media do not include carrier waves and electronic signals passing wirelessly or over wired connections.
In this specification, the term "software" is meant to include firmware residing in read-only memory or applications stored in storage, which can be read into memory for processing by a processor. Also, in some embodiments, multiple software inventions can be implemented as sub-parts of a larger program while remaining distinct software inventions. In some embodiments, multiple software inventions can also be implemented as separate programs. Finally, any combination of separate programs that together implement a software invention described here is within the scope of the present disclosure. In some embodiments, the software programs, when installed to operate on one or more electronic systems, define one or more specific machine implementations that carry out the operations of the software programs.
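FIG. 8 conceptually illustrates an electronic system 800 with which some embodiments of the present disclosure are implemented. The electronic system 800 may be a computer (e.g., a desktop computer, personal computer, tablet computer, etc.), a phone, a PDA, or any other sort of electronic device. Such an electronic system includes various types of computer-readable media and interfaces for various other types of computer-readable media. Electronic system 800 includes a bus 805, a processing unit 810, a graphics processing unit (GPU) 815, a system memory 820, a network 825, a read-only memory (ROM) 830, a permanent storage device 835, input devices 840, and output devices 845.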
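The bus 805 collectively represents all system, peripheral, and chipset buses that communicatively connect the numerous internal devices of the electronic system 800. For instance, the bus 805 communicatively connects the processing unit 810 with the GPU 815, the read-only memory 830, the system memory 820, and the permanent storage device 835.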
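From these various memory units, the processing unit 810 retrieves instructions to execute and data to process in order to execute the processes of the present disclosure. The processing unit may be a single processor or a multi-core processor in different embodiments. Some instructions are passed to and executed by the GPU 815. The GPU 815 can offload various computations or complement the image processing provided by the processing unit 810.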
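The read-only memory (ROM) 830 stores static data and instructions that are used by the processing unit 810 and other modules of the electronic system. The permanent storage device 835, on the other hand, is a read-and-write memory device. This device is a non-volatile memory unit that stores instructions and data even when the electronic system 800 is off. Some embodiments of the present disclosure use a mass-storage device (such as a magnetic or optical disk and its corresponding disk drive) as the permanent storage device 835.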
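Other embodiments use a removable storage device (such as a floppy disk, flash memory device, etc., and its corresponding disk drive) as the permanent storage device. Like the permanent storage device 835, the system memory 820 is a read-and-write memory device. However, unlike storage device 835, the system memory 820 is a volatile read-and-write memory, such as a random-access memory. The system memory 820 stores some of the instructions and data that the processor uses at runtime. In some embodiments, processes in accordance with the present disclosure are stored in the system memory 820, the permanent storage device 835, and/or the read-only memory 830. For example, the various memory units include instructions for processing multimedia clips in accordance with some embodiments. From these various memory units, the processing unit 810 retrieves instructions to execute and data to process in order to execute the processes of some embodiments.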
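The bus 805 also connects to the input and output devices 840 and 845. The input devices 840 enable the user to communicate information and select commands to the electronic system. The input devices 840 include alphanumeric keyboards and pointing devices (also called "cursor control devices"), cameras (e.g., webcams), microphones or similar devices for receiving voice commands, etc. The output devices 845 display images generated by the electronic system or otherwise output data. The output devices 845 include printers and display devices, such as cathode ray tubes (CRT) or liquid crystal displays (LCD), as well as speakers or similar audio output devices. Some embodiments include devices, such as a touchscreen, that function as both input and output devices.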
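Finally, as shown in FIG. 8, the bus 805 also couples the electronic system 800 to a network 825 through a network adapter (not shown). In this manner, the computer can be a part of a network of computers, such as a local area network ("LAN"), a wide area network ("WAN"), or an intranet. Any or all components of the electronic system 800 may be used in conjunction with the present disclosure.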
Some embodiments are embodied on a machine-readable or computer-readable medium (alternatively referred to as a computer-readable storage medium, machine-readable medium, or machine-readable storage medium). Some examples of such computer-readable media include RAM, ROM, read-only compact discs (CD-ROM), recordable compact discs (CD-R), rewritable compact discs (CD-RW), read-only digital versatile discs (e.g., DVD-ROM, dual-layer DVD-ROM), a variety of recordable/rewritable DVDs (e.g., DVD-RAM, DVD-RW, DVD+RW, etc.), flash memory (e.g., SD cards, mini-SD cards, micro-SD cards, etc.), magnetic and/or solid-state hard drives, read-only and recordable optical discs, ultra-density optical discs, any other optical or magnetic media, and floppy disks. The computer-readable media may store a computer program that is executable by at least one processing unit and includes sets of instructions for performing various operations. Examples of computer programs or computer code include machine code, such as is produced by a compiler, and files including higher-level code that are executed by a computer, an electronic component, or a microprocessor using an interpreter.
While the above discussion primarily refers to microprocessors or multi-core processors that execute software, many of the above-described features and applications are performed by one or more integrated circuits, such as application-specific integrated circuits (ASICs) or field-programmable gate arrays (FPGAs). In some embodiments, such integrated circuits execute instructions that are stored on the circuit itself. In addition, some embodiments execute software stored in programmable logic devices (PLDs), ROM, or RAM devices.
As used in this specification and any claims of this application, the terms "computer", "server", "processor", and "memory" all refer to electronic or other technological devices. These terms exclude people or groups of people. For the purposes of the specification, the terms "display" or "displaying" mean displaying on an electronic device. As used in this specification and any claims of this application, the terms "computer-readable medium", "computer-readable media", and "machine-readable medium" are entirely restricted to tangible, physical objects that store information in a form that is readable by a computer. These terms exclude any wireless signals, wired download signals, and any other ephemeral signals. While the present disclosure has been described with reference to numerous specific details, one of ordinary skill in the art will recognize that the present disclosure can be embodied in other specific forms without departing from the spirit of the present disclosure.
Additional Notes
The subject matter described herein sometimes illustrates different components contained within, or connected with, different other components. It is to be understood that such depicted architectures are merely examples, and that in fact many other architectures can be implemented which achieve the same functionality. In a conceptual sense, any arrangement of components to achieve the same functionality is effectively "associated" such that the desired functionality is achieved. Hence, any two components herein combined to achieve a particular functionality can be seen as "associated with" each other such that the desired functionality is achieved, irrespective of architectures or intermediate components. Likewise, any two components so associated can also be viewed as being "operably connected", or "operably coupled", to each other to achieve the desired functionality, and any two components capable of being so associated can also be viewed as being "operably couplable" to each other to achieve the desired functionality. Specific examples of operably couplable include, but are not limited to, physically mateable and/or physically interacting components, and/or wirelessly interactable and/or wirelessly interacting components, and/or logically interacting and/or logically interactable components.
Further, with respect to the use of substantially any plural and/or singular terms herein, those having skill in the art can translate from the plural to the singular and/or from the singular to the plural as is appropriate to the context and/or application. The various singular/plural permutations may be expressly set forth herein for the sake of clarity.
Moreover, it will be understood by those skilled in the art that, in general, terms used herein, and especially in the appended claims, e.g., bodies of the appended claims, are generally intended as "open" terms, e.g., the term "including" should be interpreted as "including but not limited to," the term "having" should be interpreted as "having at least," the term "includes" should be interpreted as "includes but is not limited to," etc. It will be further understood by those within the art that if a specific number of an introduced claim recitation is intended, such an intent will be explicitly recited in the claim, and in the absence of such recitation no such intent is present. For example, as an aid to understanding, the following appended claims may contain usage of the introductory phrases "at least one" and "one or more" to introduce claim recitations. However, the use of such phrases should not be construed to imply that the introduction of a claim recitation by the indefinite articles "a" or "an" limits any particular claim containing such introduced claim recitation to implementations containing only one such recitation, even when the same claim includes the introductory phrases "one or more" or "at least one" and indefinite articles such as "a" or "an," e.g., "a" and/or "an" should be interpreted to mean "at least one" or "one or more;" the same holds true for the use of definite articles used to introduce claim recitations. In addition, even if a specific number of an introduced claim recitation is explicitly recited, those skilled in the art will recognize that such recitation should be interpreted to mean at least the recited number, e.g., the bare recitation of "two recitations," without other modifiers, means at least two recitations, or two or more recitations. Furthermore, in those instances where a convention analogous to "at least one of A, B, and C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention, e.g., "a system having at least one of A, B, and C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, and C together, etc. In those instances where a convention analogous to "at least one of A, B, or C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention, e.g., "a system having at least one of A, B, or C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, and C together, etc. It will be further understood by those within the art that virtually any disjunctive word and/or phrase presenting two or more alternative terms, whether in the description, claims, or drawings, should be understood to contemplate the possibilities of including one of the terms, either of the terms, or both terms. For example, the phrase "A or B" will be understood to include the possibilities of "A" or "B" or "A and B."
From the foregoing, it will be appreciated that various implementations of the present disclosure have been described herein for purposes of illustration, and that various modifications may be made without departing from the scope and spirit of the present disclosure. Accordingly, the various implementations disclosed herein are not intended to be limiting, with the true scope and spirit being indicated by the appended claims.
| Application Number | Publication Number | Priority Date | Filing Date | Title |
|---|---|---|---|---|
| CN202211637941.7A | CN116366851A (en) | 2022-12-16 | 2022-12-16 | Video data encoding method and device |
| Publication Number | Publication Date |
|---|---|
| CN116366851A (en) | 2023-06-30 |
| Application Number | Status | Publication Number | Priority Date | Filing Date | Title |
|---|---|---|---|---|---|
| CN202211637941.7A | Pending | CN116366851A (en) | 2022-12-16 | 2022-12-16 | Video data encoding method and device |
| Country | Link |
|---|---|
| CN (1) | CN116366851A (en) |
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN1675933A (en)* | 2002-06-18 | 2005-09-28 | 高通股份有限公司 | Video encoding and decoding techniques |
| US20070290901A1 (en)* | 2004-06-02 | 2007-12-20 | Koninklijke Philips Electronics, N.V. | Encoding and Decoding Apparatus and Corresponding Methods |
| CN101202914A (en)* | 2006-12-01 | 2008-06-18 | 汤姆森许可贸易公司 | Array of processing elements with local registers |
| US20100215105A1 (en)* | 2007-09-13 | 2010-08-26 | Nippon Telegraph And Telephone Corp. | Motion search apparatus in video coding |
| US20120147023A1 (en)* | 2010-12-14 | 2012-06-14 | Electronics And Telecommunications Research Institute | Caching apparatus and method for video motion estimation and compensation |
| WO2013128010A2 (en)* | 2012-03-02 | 2013-09-06 | Canon Kabushiki Kaisha | Method and devices for encoding a sequence of images into a scalable video bit-stream, and decoding a corresponding scalable video bit-stream |
| CN105850132A (en)* | 2014-01-07 | 2016-08-10 | 联发科技股份有限公司 | Method and device for color index prediction |
| JP2016201784A (en)* | 2015-04-09 | 2016-12-01 | 日本電信電話株式会社 | Reference image buffer |
| Publication | Title | Publication Date |
|---|---|---|
| TWI677234B (en) | Secondary transform kernel size selection | |
| EP3334162B1 (en) | Partial decoding for arbitrary view angle and line buffer reduction for virtual reality video | |
| US9392292B2 (en) | Parallel encoding of bypass binary symbols in CABAC encoder | |
| CN101836454B (en) | Method and device for performing parallel CABAC code processing on ordered entropy slices | |
| KR101105531B1 (en) | Mechanism for a parallel processing in-loop deblock filter | |
| US9948934B2 (en) | Estimating rate costs in video encoding operations using entropy encoding statistics | |
| CN1812576B (en) | Deblocking filters for performing horizontal and vertical filtering of video data simultaneously and methods of operating the same | |
| CN110476426A (en) | Multiple conversion estimation | |
| US20140056365A1 (en) | Method for performing parallel coding with ordered entropy slices, and associated apparatus | |
| TWI699111B (en) | Midpoint prediction error diffusion for display stream compression | |
| TWI832628B (en) | Video coding method and apparatus thereof | |
| US20080298473A1 (en) | Methods for Parallel Deblocking of Macroblocks of a Compressed Media Frame | |
| CN103238321A (en) | Video encoding method for encoding hierarchical-structure symbols and a device therefor, and video decoding method for decoding hierarchical-structure symbols and a device therefor | |
| TW201320760A (en) | Video decoding method and related computer readable medium | |
| CN106534850B (en) | Image processing device, image interpolation method, and image coding method | |
| CN101252694A (en) | Frame store compression and address mapping system for block-based video decoding | |
| AU2011203169B2 (en) | Compression of high bit-depth images | |
| TWI832449B (en) | Method and apparatus for video coding | |
| TWI833327B (en) | Video coding method and apparatus thereof | |
| TWI805534B (en) | Video encoding parallelization with time-interleaving cache access | |
| US10728557B2 (en) | Embedded codec circuitry for sub-block based entropy coding of quantized-transformed residual levels | |
| CN116366851A (en) | Video data encoding method and device | |
| US11973985B2 (en) | Video encoder with motion compensated temporal filtering | |
| US20110311152A1 (en) | Image processing device and image processing method | |
| US10939107B2 (en) | Embedded codec circuitry for sub-block based allocation of refinement bits |
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||