




TECHNICAL FIELD
The present disclosure relates to the field of computer technology, and in particular to a deep learning processing device and method supporting encoding and decoding.
BACKGROUND
In recent years, multi-layer artificial neural networks have been widely used in fields such as pattern recognition, image processing, function approximation, and optimization. Multi-layer artificial neural network technology has received extensive attention from academia and industry owing to its high recognition accuracy and good composability. However, when it is applied in practical projects, its large computational workload and the high memory requirements of its models make it difficult to deploy in embedded systems.
In the prior art, a general-purpose processor is usually used to handle multi-layer artificial neural network operations, training algorithms, and their compression coding, the above algorithms being supported by executing general-purpose instructions with general-purpose registers and general-purpose functional components. However, the computing performance of a general-purpose processor is low and cannot meet the performance requirements of typical multi-layer artificial neural network operations. Alternatively, a graphics processing unit (GPU) can be used to support multi-layer artificial neural network operations, training algorithms, and the compression of their coding. However, since the GPU is a device dedicated to graphics and image operations and scientific computing, it provides no native support for multi-layer artificial neural networks, so a large amount of front-end coding work is required to support multi-layer artificial neural network operations, which brings extra overhead. Moreover, the GPU has only a small on-chip cache, so the model data (weights) of a multi-layer artificial neural network must be transferred repeatedly from off-chip, and the GPU cannot compress the model data of the artificial neural network, which incurs a huge power consumption overhead.
SUMMARY
In view of this, the present disclosure proposes a deep learning processing device and method supporting encoding and decoding, so as to encode and decode parameters in real time when performing multi-layer artificial neural network operations.
According to an aspect of the present disclosure, a deep learning processing device supporting encoding and decoding is provided, the device including:
a memory access unit, configured to read and write data in a memory;
an instruction cache unit, connected to the memory access unit and configured to read in instructions of a neural network through the memory access unit and to store the instructions;
a controller unit, connected to the instruction cache unit and configured to obtain the instructions from the instruction cache unit and to decode the instructions into micro-instructions for an operation unit;
a parameter storage unit, connected to the memory access unit and configured to store a first semantic vector transmitted from the memory access unit and, upon receiving a data read instruction, to send the first semantic vector to a parameter decompression unit or the operation unit;
a parameter decompression unit, connected to the parameter storage unit and configured to receive the first semantic vector from the parameter storage unit, to decompress the first semantic vector using a decoder to obtain decompressed parameters corresponding to the first semantic vector, and to send the decompressed parameters to the operation unit;
an operation unit, connected to the parameter storage unit, the parameter decompression unit, and the controller unit, and configured to perform, according to the micro-instructions, operations associated with a neural network model on the received first semantic vector or the decompressed parameters to obtain an output result.
In a possible implementation, the device further includes a parameter compression unit,
the parameter compression unit being connected to the memory access unit and further configured to obtain the weights of the neural network model and/or the input vectors of the neural network model from the memory access unit, and to compress the weights of the neural network model and/or the input vectors of the neural network model using an encoder, so as to obtain the semantic vectors corresponding to the weights of the neural network model and/or the semantic vectors corresponding to the input vectors of the neural network model;
the memory access unit being further configured to store the semantic vectors corresponding to the weights of the neural network model and/or the semantic vectors corresponding to the input vectors of the neural network model in the memory as a second semantic vector.
In a possible implementation, the device is electrically connected to a first compression device, the first compression device being configured to obtain the weights required for the operation of the neural network model and/or the input vectors of the neural network model, and to compress them using an encoder in the first compression device, so as to obtain the semantic vectors corresponding to the weights of the neural network model and/or the semantic vectors corresponding to the input vectors of the neural network model;
the memory access unit is further configured to transmit the weights of the neural network model and/or the input vectors of the neural network model to the first compression device, and/or to store the semantic vectors corresponding to the weights and/or input vectors of the neural network model in the memory as a third semantic vector.
In a possible implementation, the parameter compression unit is connected to the operation unit and is further configured to compress the output result using the encoder to obtain a fourth semantic vector corresponding to the output result;
the memory access unit is further configured to store the fourth semantic vector in the memory.
In a possible implementation, the device is electrically connected to a second compression device, the second compression device being configured to receive the output result and to compress the output result using an encoder in the second compression device to obtain a fifth semantic vector corresponding to the output result;
the memory access unit is further configured to transmit the output result to the second compression device and/or to store the fifth semantic vector in the memory.
In a possible implementation, the parameter compression unit is further configured to determine whether the output result, the weights of the neural network model, or the input vectors of the neural network model are sparse, and when the output result, the weights, or the input vectors are sparse, to send a sparsification flag corresponding to the first semantic vector to the parameter storage unit;
the parameter storage unit is further configured to store the sparsification flag,
wherein the parameter storage unit sending the first semantic vector to the parameter decompression unit or the operation unit upon receiving the data read instruction includes:
sending the first semantic vector to the operation unit when the data read instruction is received and a sparsification flag corresponding to the first semantic vector is stored in the parameter storage unit.
In a possible implementation, the parameter storage unit sending the first semantic vector to the parameter decompression unit or the operation unit upon receiving a data read instruction further includes:
sending the first semantic vector to the parameter decompression unit when the data read instruction is received and no sparsification flag corresponding to the first semantic vector is stored in the parameter storage unit.
In a possible implementation, the encoder and/or the decoder includes one or more of CNN, RNN, BiRNN, GRU, LSTM, COO, CSR, and ELL.
In a possible implementation, the device further includes:
a result cache unit, connected to the operation unit and the memory access unit and configured to store the output result of the last layer of the neural network model after the operation unit has executed the last layer of the neural network model.
According to another aspect of the present disclosure, a neural network chip is proposed, the chip including the above deep learning processing device supporting encoding and decoding.
According to another aspect of the present disclosure, an electronic device is proposed, the electronic device including the above neural network chip.
According to another aspect of the present disclosure, a deep learning processing method supporting encoding and decoding is proposed. The method is applied to a deep learning processing device supporting encoding and decoding, the device including a memory access unit, an instruction cache unit, a controller unit, a parameter storage unit, a parameter decompression unit, and an operation unit. The method includes:
reading and writing, by the memory access unit, data in a memory;
reading in, by the instruction cache unit, instructions of a neural network through the memory access unit, and storing the instructions;
obtaining, by the controller unit, the instructions from the instruction cache unit, and decoding the instructions into micro-instructions for the operation unit;
storing, by the parameter storage unit, a first semantic vector transmitted from the memory access unit, and, upon receiving a data read instruction, sending the first semantic vector to the parameter decompression unit or the operation unit;
receiving, by the parameter decompression unit, the first semantic vector from the parameter storage unit, decompressing the first semantic vector using a decoder to obtain decompressed parameters corresponding to the first semantic vector, and sending the decompressed parameters to the operation unit;
performing, by the operation unit according to the micro-instructions, operations associated with a neural network model on the received first semantic vector or the decompressed parameters, so as to obtain an output result.
In a possible implementation, the deep learning processing device further includes a parameter compression unit, and the method further includes:
obtaining, by the parameter compression unit, the weights of the neural network model and/or the input vectors of the neural network model from the memory access unit, and compressing the weights and/or input vectors using the encoder, so as to obtain the semantic vectors corresponding to the weights of the neural network model and/or the semantic vectors corresponding to the input vectors of the neural network model;
storing, by the memory access unit, the semantic vectors corresponding to the weights of the neural network model and/or the semantic vectors corresponding to the input vectors of the neural network model in the memory as a second semantic vector.
In a possible implementation, the deep learning processing device is electrically connected to a first compression device, and the method further includes:
transmitting, by the memory access unit, the weights of the neural network model and/or the input vectors of the neural network model to the first compression device, and/or storing a third semantic vector in the memory, wherein the third semantic vector is the semantic vector corresponding to the weights of the neural network model and/or the semantic vector corresponding to the input vectors of the neural network model, obtained by the first compression device acquiring the weights required for the operation of the neural network model and/or the input vectors of the neural network model and compressing them with an encoder in the first compression device.
In a possible implementation, the method further includes:
compressing the output result by the encoder of the parameter compression unit to obtain a fourth semantic vector corresponding to the output result;
storing, by the memory access unit, the fourth semantic vector in the memory.
In a possible implementation, the deep learning processing device is electrically connected to a second compression device, and the method further includes:
transmitting, by the memory access unit, the output result to the second compression device and/or storing the fifth semantic vector in the memory, wherein the fifth semantic vector is the semantic vector corresponding to the output result, obtained by the second compression device compressing the output result with an encoder.
In a possible implementation, the method further includes:
determining, by the parameter compression unit, whether the output result, the weights of the neural network model, or the input vectors of the neural network model are sparse, and when the output result, the weights, or the input vectors are sparse, sending a sparsification flag corresponding to the first semantic vector to the parameter storage unit;
storing, by the parameter storage unit, the sparsification flag,
wherein the parameter storage unit sending the first semantic vector to the parameter decompression unit or the operation unit upon receiving the data read instruction includes:
sending the first semantic vector to the operation unit when the data read instruction is received and a sparsification flag corresponding to the first semantic vector is stored in the parameter storage unit.
In a possible implementation, the parameter storage unit sending the first semantic vector to the parameter decompression unit or the operation unit upon receiving a data read instruction further includes:
sending the first semantic vector to the parameter decompression unit when the data read instruction is received and no sparsification flag corresponding to the first semantic vector is stored in the parameter storage unit.
In a possible implementation, the device further includes a result cache unit, and the method further includes:
storing, by the result cache unit, the output result of the last layer of the neural network model after the operation unit has executed the last layer of the neural network model.
The present disclosure can compress the parameters to be compressed, thereby effectively reducing the model size of the neural network and its memory requirements, and thus effectively improving the data processing speed of the neural network.
Other features and aspects of the present disclosure will become apparent from the following detailed description of exemplary embodiments with reference to the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate exemplary embodiments, features, and aspects of the present disclosure and, together with the description, serve to explain the principles of the present disclosure.
FIG. 1 shows a block diagram of a deep learning processing device supporting encoding and decoding according to an embodiment of the present disclosure.
FIG. 2 shows a block diagram of a deep learning processing device supporting encoding and decoding according to an embodiment of the present disclosure.
FIG. 3 shows a schematic diagram of a model in which a parameter compression unit compresses parameters to be compressed and a parameter decompression unit decompresses a semantic vector according to an embodiment of the present disclosure.
FIG. 4 shows a flowchart of a deep learning processing method supporting encoding and decoding according to an embodiment of the present disclosure.
FIG. 5 shows a schematic flow diagram based on a deep learning processing device supporting encoding and decoding according to an embodiment of the present disclosure.
DETAILED DESCRIPTION
Various exemplary embodiments, features, and aspects of the present disclosure will be described in detail below with reference to the accompanying drawings. The same reference numerals in the drawings denote elements with the same or similar functions. Although various aspects of the embodiments are shown in the drawings, the drawings are not necessarily drawn to scale unless otherwise indicated.
The word "exemplary" is used herein to mean "serving as an example, embodiment, or illustration." Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
In addition, numerous specific details are given in the following detailed description in order to better illustrate the present disclosure. Those skilled in the art will understand that the present disclosure may be practiced without certain of these specific details. In some instances, methods, means, components, and circuits well known to those skilled in the art have not been described in detail so as to highlight the subject matter of the present disclosure.
Referring to FIG. 1, FIG. 1 shows a block diagram of a deep learning processing device supporting encoding and decoding according to an embodiment of the present disclosure.
As shown in FIG. 1, the device includes:
a memory access unit 50, configured to read and write data in a memory;
an instruction cache unit 60, connected to the memory access unit 50 and configured to read in instructions of a neural network through the memory access unit and to store the instructions;
a controller unit 70, connected to the instruction cache unit 60 and configured to obtain the instructions from the instruction cache unit 60 and to decode the instructions into micro-instructions for an operation unit 40;
a parameter storage unit 20, connected to the memory access unit and configured to store a first semantic vector transmitted from the memory access unit and, upon receiving a data read instruction, to send the first semantic vector to a parameter decompression unit or the operation unit;
a parameter decompression unit 30, connected to the parameter storage unit 20 and configured to receive the first semantic vector from the parameter storage unit 20, to decompress the first semantic vector using a decoder to obtain the decompressed parameters corresponding to the first semantic vector, and to send the decompressed parameters to the operation unit 40;
an operation unit 40, connected to the parameter storage unit 20, the parameter decompression unit 30, and the controller unit 70, and configured to perform, according to the micro-instructions, operations associated with a neural network model on the received first semantic vector or the decompressed parameters to obtain an output result.
Through the cooperation of the above units of the device, the present disclosure can perform neural-network-related operations with compressed parameters, thereby effectively reducing the model size of the neural network and its memory requirements, and thus effectively improving the data processing speed of the neural network.
In a possible implementation, the output result may be transmitted to the memory access unit 50 through the parameter storage unit 20 and then stored by the memory access unit 50 in the memory or another storage device; in other implementations, the operation unit 40 may also store the output result directly in the memory or another storage device through the memory access unit 50.
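The following Python sketch models the dataflow just described, purely for illustration: dictionary-based storage, an identity decoder, and a tanh computation stand in for the hardware units, and all class and method names are assumptions rather than part of the disclosed device.

```python
import numpy as np

# Illustrative software stand-ins for the hardware units; not the disclosed device.
class DeepLearningProcessor:
    def __init__(self, memory):
        self.memory = memory      # memory reached through the memory access unit 50
        self.param_store = {}     # parameter storage unit 20

    def load_semantic_vector(self, key):
        # memory access unit 50 -> parameter storage unit 20
        self.param_store[key] = self.memory[key]

    def decompress(self, key):
        # parameter decompression unit 30; identity used as a stand-in decoder
        return self.param_store[key]

    def operate(self, x, w):
        # operation unit 40: one illustrative neural-network computation
        return np.tanh(x @ w)

memory = {"w1": np.ones((4, 3)), "x1": np.ones(4)}
proc = DeepLearningProcessor(memory)
proc.load_semantic_vector("w1")
proc.load_semantic_vector("x1")
y = proc.operate(proc.decompress("x1"), proc.decompress("w1"))  # 3-element output
```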
Referring to FIG. 2, FIG. 2 shows a block diagram of a deep learning processing device supporting encoding and decoding according to an embodiment of the present disclosure.
As shown in FIG. 2, the device may further include:
a parameter compression unit 10, connected to the operation unit 40 and configured to compress, using an encoder, the output result or parameters such as the weights and input vectors of the neural network, so as to obtain a semantic vector corresponding to the output result. As an on-chip unit of the deep learning processing device, the parameter compression unit 10 can compress the parameters to be compressed in real time, which can reduce the power consumption of data transfers;
a result cache unit 80, connected to the operation unit 40 and the memory access unit 50 and configured to store the output result.
In other implementations, the output result of the operation unit 40 may be stored in the memory through the memory access unit 50.
In a possible implementation, the device is electrically connected to a first compression device 91, the first compression device being configured to obtain the weights required for the operation of the neural network model and/or the input vectors of the neural network model, and to compress them using an encoder in the first compression device, so as to obtain the semantic vectors corresponding to the weights of the neural network model and/or the semantic vectors corresponding to the input vectors of the neural network model.
In this implementation, the first compression device 91 may interact with the device through the memory access unit 50; for example, the first compression device 91 may obtain the parameters to be compressed from the device through the memory access unit 50, compress them to generate semantic vectors, and then transmit the generated semantic vectors to the device through the memory access unit 50. In other implementations, the first compression device 91 may interact with the device through other components of the device.
The memory access unit is further configured to store the semantic vectors corresponding to the weights of the neural network model and/or the semantic vectors corresponding to the input vectors of the neural network model in the memory as a third semantic vector.
In a possible implementation, the device is electrically connected to a second compression device 92, the second compression device being configured to receive the output result and to compress the output result using an encoder in the second compression device to obtain a fifth semantic vector corresponding to the output result.
In this implementation, the second compression device 92 may interact with the device through the memory access unit 50; for example, the second compression device 92 may obtain the parameters to be compressed from the device through the memory access unit 50, compress them to generate semantic vectors, and then transmit the generated semantic vectors to the device through the memory access unit 50. In other implementations, the second compression device 92 may interact with the device through other components of the device.
The memory access unit is further configured to store the fifth semantic vector in the memory.
It should be understood that the first compression device 91 and the second compression device 92, as devices external to the deep learning processing device (off-chip), compress the intermediate results, final results, intermediate parameters, input vectors, or weights produced during processing, which can reduce the on-chip area of the device.
In other implementations, the first compression device 91 and the second compression device 92 may be a single combined compression device that compresses the parameters to be compressed required by the deep learning processing operations.
In a possible implementation, the memory access unit 50 may include a direct memory access channel connected to the memory interface of the deep learning processing device supporting encoding and decoding, and can read and write data in the memory, for example, input neuron data (input vectors), weights, instructions, output neuron data (output results), and compressed semantic vectors (such as semantic vectors obtained by compressing weights or neuron data).
In a possible implementation, the instructions include instructions for executing neural network algorithms and/or general vector/matrix instructions for executing neural network operations.
In this implementation, the neural network algorithm instructions may include multi-layer perceptron (MLP) instructions, convolution instructions, pooling (POOLing) instructions, and the like, and the general vector/matrix instructions for neural network operations may include matrix multiply instructions, vector add instructions, vector activation function instructions, and the like.
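As a purely illustrative sketch of how such instructions might be expanded into micro-instructions for the operation unit, the following assumes hypothetical opcode names and an arbitrary decode table; the disclosure does not specify an instruction encoding.

```python
from enum import Enum, auto

# Hypothetical opcodes for the instruction kinds named above.
class Op(Enum):
    MLP = auto()        # multi-layer perceptron instruction
    CONV = auto()       # convolution instruction
    POOL = auto()       # pooling instruction
    MAT_MUL = auto()    # matrix multiply instruction
    VEC_ADD = auto()    # vector add instruction
    VEC_ACT = auto()    # vector activation function instruction

def decode(op: Op) -> list[str]:
    """Controller-unit sketch: map one instruction onto the operation unit's
    multiply / adder-tree / activation stages (mapping is illustrative)."""
    table = {
        Op.MLP:     ["multiply", "adder_tree", "activation"],
        Op.CONV:    ["multiply", "adder_tree"],
        Op.POOL:    ["adder_tree"],
        Op.MAT_MUL: ["multiply", "adder_tree"],
        Op.VEC_ADD: ["adder_tree"],
        Op.VEC_ACT: ["activation"],
    }
    return table[op]

print(decode(Op.MLP))  # ['multiply', 'adder_tree', 'activation']
```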
In a possible implementation, the first semantic vector may correspond to the input vectors of the layers of the neural network or to the weights of the neural network. For example, the first semantic vector may be a semantic vector generated by compressing the weights of the neural network, a semantic vector generated by compressing the input vectors of the neural network, or a semantic vector generated by compression-encoding the output results of the layers or the final output result of the neural network. It should be understood that a neural network usually includes N layers, N being an integer greater than 1; the output of the Nth layer may be the input of the (N+1)th layer, and the input of the Nth layer may be the output of the (N-1)th layer. The first semantic vector may be obtained by the parameter storage unit 20 from the memory through the memory access unit 50, or obtained by the parameter storage unit 20 from the parameter compression unit 10.
It should be understood that, in various implementations, the deep learning processing device may include different off-chip or on-chip compression components, for example the parameter compression unit 10 arranged on-chip and the first compression device 91 or the second compression device 92 arranged off-chip; the semantic vectors generated by these compression components from the parameters to be compressed can all be decompressed by the parameter decompression unit 30.
In a possible implementation, the parameter storage unit 20 may also store sparsification flags for the parameters of the neural network. For example, when the weights of the neural network are sparse, the parameter storage unit 20 may store a sparsification flag for the weights, this flag likewise corresponding to the first semantic vector generated by compressing the weights.
In a possible implementation, the data read instruction may be issued by a controller outside the deep learning processing device supporting encoding and decoding, or by the operation unit 40 and the parameter decompression unit 30 within the device.
In a possible implementation, the parameter storage unit 20 sending the semantic vector to the parameter decompression unit or the operation unit upon receiving a data read instruction further includes:
sending the semantic vector to the parameter decompression unit when the data read instruction is received and no sparsification flag corresponding to the semantic vector is stored in the parameter storage unit 20;
sending the semantic vector to the operation unit when the data read instruction is received and a sparsification flag corresponding to the semantic vector is stored in the parameter storage unit 20.
In a possible implementation, the parameter storage unit 20 sends the semantic vector to the operation unit when it receives the data read instruction and a sparsification flag corresponding to the semantic vector is stored in the parameter storage unit.
In a possible implementation, the parameter storage unit 20 sends the semantic vector to the parameter decompression unit when it receives the data read instruction and no sparsification flag corresponding to the semantic vector is stored in the parameter storage unit.
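A minimal sketch of this flag-based routing follows; the set-based flag store and the function arguments are assumptions for illustration only.

```python
# Routing rule above: a stored sparsification flag sends the semantic vector
# straight to the operation unit; otherwise it passes through the parameter
# decompression unit first.
def route(key, param_store, sparse_flags, decompress, operate_on):
    vec = param_store[key]
    if key in sparse_flags:
        return operate_on(vec)             # sparse: consumed directly
    return operate_on(decompress(vec))     # non-sparse: decode/decompress first

store = {"w": [0.0, 1.5, 0.0]}
flags = {"w"}                              # "w" was compressed from sparse weights
result = route("w", store, flags, decompress=lambda v: v, operate_on=sum)
```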
In a possible implementation, the parameter storage unit 20 may also store the data it holds into the memory or another storage device through the memory access unit 50.
In a possible implementation, the parameter decompression unit 30 may decode and decompress the semantic vector, thereby obtaining the same number of decompressed parameters as the parameters to be compressed, the decompressed parameters carrying the information of the parameters to be compressed. For example, when the parameters to be compressed are N weights, the parameter decompression unit 30 may decode and decompress the semantic vector into N decompressed parameters, the N decompressed parameters being substantially equivalent to the N weights.
The parameter decompression unit 30 may decode and decompress the semantic vector through a decoder, thereby obtaining the same number of decompressed parameters as the parameters to be compressed (for example, weights or input vectors).
In a possible implementation, the decoder may include one or more of CNN (Convolutional Neural Network), RNN (Recurrent Neural Network), BiRNN (Bidirectional RNN), GRU (Gated Recurrent Unit), LSTM (Long Short-Term Memory), COO (Coordinate Format), CSR (Compressed Sparse Row), ELLPACK (ELL), and CSC (Compressed Sparse Column).
The choice of decoder may correspond to the encoder; for example, when the encoder is a CNN, the decoder may be a CNN. However, the decoder and the encoder may also be chosen arbitrarily; for example, when the encoder is a CNN, the decoder may be any one or more of CNN, RNN, and so on.
The decoding process is described below, taking an RNN decoder as an example.
Referring again to FIG. 2, the RNN model with which the parameter decompression unit 30 decompresses the semantic vector includes multiple hidden layers (one layer is shown as an example) and an output layer, the output layer being used to output the decompressed parameters.
The process in which the parameter decompression unit 30 decompresses the semantic vector can be regarded as the inverse of the process in which the parameter compression unit 10 compresses the parameters to be compressed: in the decompression stage, the next output can be predicted from the output sequence already generated, thereby decompressing the semantic vector of the hidden layer into the decompressed parameters.
In an RNN, the decoding process can predict the next output yt from the given semantic vector c described above and the already generated output sequence y1, y2, ..., yt-1.
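A minimal numerical sketch of this decoding step is given below, assuming a plain tanh RNN cell; the weight names and dimensions are illustrative, not from the disclosure.

```python
import numpy as np

def rnn_decode(c, steps, W_h, U_y, W_y, b_h, b_y):
    """Expand a fixed-length semantic vector c into `steps` output vectors,
    each y_t predicted from c (via the hidden state) and y_{t-1}."""
    h = c                                  # hidden state initialized from c
    y = np.zeros(W_y.shape[0])             # before y_1, no output exists yet
    outputs = []
    for _ in range(steps):
        h = np.tanh(W_h @ h + U_y @ y + b_h)   # state from h_{t-1} and y_{t-1}
        y = W_y @ h + b_y                      # predict the next output y_t
        outputs.append(y)
    return np.stack(outputs)

rng = np.random.default_rng(0)
H, D = 8, 16                               # hidden width 8; 16 values per step
W_h, U_y = rng.normal(size=(H, H)), rng.normal(size=(H, D))
W_y, b_h, b_y = rng.normal(size=(D, H)), np.zeros(H), np.zeros(D)
params = rnn_decode(rng.normal(size=H), 5, W_h, U_y, W_y, b_h, b_y)  # (5, 16)
```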
In a possible implementation, when the parameters to be compressed are the weights of the neural network model: if the weights are sparse, the semantic vector obtained by compressing them can be used directly by the operation unit 40 for computation or for training the neural network model; if the weights are not sparse, the corresponding semantic vector needs to be decompressed to generate the decompressed parameters, which can then be used directly by the operation unit 40 for computation or for training the neural network model.
In a possible implementation, the operations associated with the neural network model may include training the neural network model and performing related computations.
In a possible implementation, the operation unit 40 may perform multiplication operations, adder-tree operations, and activation function operations on the semantic vector carrying a sparsification flag and on the decompressed parameters.
In this implementation, the operation unit 40 may perform operations according to multiple operation codes.
When the operation unit 40 obtains the first operation code of the first stage, it determines from the first operation code whether a multiplication operation is to be performed; if so, it performs the multiplication and outputs the result to the second stage; if not, it proceeds directly to the second stage.
When the operation unit 40 obtains the second operation code of the second stage, it determines from the second operation code whether an adder-tree operation is to be performed; if so, it performs the adder-tree operation and outputs the result to the third stage; if not, it proceeds directly to the third stage.
When the operation unit 40 obtains the third operation code of the third stage, it determines from the third operation code whether an activation function operation is to be performed; if so, it performs the activation function operation and outputs the result. The parameter compression unit 10 compresses the output result of the operation unit 40 to obtain the fourth semantic vector corresponding to the output result, compressing the multi-dimensional data therein into low-dimensional data and reducing the vector length of the data, thereby relieving the memory pressure of storing the parameters.
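The three-stage dispatch can be summarized by the following sketch; the Boolean opcode encoding and function names are assumptions, since the disclosure only specifies the stages and their skip behavior.

```python
import numpy as np

def execute(x, weights, op1, op2, op3, activation=np.tanh):
    """Three-stage datapath sketch: multiply, adder tree, activation,
    each stage entered only if its operation code requests it."""
    if op1:
        x = x * weights        # stage 1: elementwise multiplication
    if op2:
        x = np.sum(x)          # stage 2: adder-tree reduction
    if op3:
        x = activation(x)      # stage 3: activation function
    return x

# Example: one neuron, y = tanh(sum(w * x))
y = execute(np.array([0.5, -1.0, 2.0]), np.array([0.1, 0.2, 0.3]),
            op1=True, op2=True, op3=True)
```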
In a possible implementation, the neural network model may include multiple layers, for example N layers; the parameter compression unit 10 may compress the output result of each layer of the neural network model separately and store it in the memory through the memory access unit 50, or store it through the parameter storage unit 20.
In this implementation, the parameter compression unit 10 may include multiple sub-parameter compression units (not shown) corresponding to the respective layers of the neural network model, each sub-parameter compression unit being used to separately compression-encode the output result of one layer of the neural network model.
In a possible implementation, the parameter compression unit 10 may also be used to compress the model data of the neural network model, which may include, for example, the input vectors, weights, learning rate, and other parameters of the neural network model.
In this implementation, the parameter compression unit 10 is connected to the memory access unit 50, and may obtain the weights of the neural network model and/or the input vectors of the neural network model from the memory access unit 50 and compress them using the encoder, so as to obtain the semantic vectors corresponding to the weights of the neural network model and/or the semantic vectors corresponding to the input vectors of the neural network model.
The memory access unit 50 may store the semantic vectors corresponding to the weights of the neural network model and/or the semantic vectors corresponding to the input vectors of the neural network model in the memory as a second semantic vector.
The above encoder can compression-encode the parameters of the neural network model (for example, the weights), the output results of each layer, and other parameters to be compressed, thereby compressing the multi-dimensional parameters to be compressed into a fixed-length semantic vector, the semantic vector containing the information of the weights before compression. It should be understood that when weights are selected for compression, any number of weights may be selected.
In a possible implementation, the encoder may include one or more of CNN (Convolutional Neural Network), RNN (Recurrent Neural Network), BiRNN (Bidirectional RNN), GRU (Gated Recurrent Unit), LSTM (Long Short-Term Memory), COO (Coordinate Format), CSR (Compressed Sparse Row), ELL (ELLPACK), and CSC (Compressed Sparse Column).
In this implementation, when the parameter compression unit 10, the first compression device 91, or the second compression device 92 compresses the parameters of the neural network to be compressed, it may determine whether the parameters to be compressed (for example, weights or input vectors) are sparse; when the weights or input vectors are sparse, COO, CSR, ELL, or CSC may preferably be used to compression-encode the sparse weights or input vectors.
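For concreteness, the following is a minimal sketch of CSR (Compressed Sparse Row), one of the sparse formats named above, written from the standard CSR definition rather than from the disclosure.

```python
import numpy as np

def csr_encode(w):
    """Encode a 2-D weight matrix as CSR: non-zero values, their column
    indices, and per-row offsets into the value array."""
    values, col_idx, row_ptr = [], [], [0]
    for row in w:
        for j, v in enumerate(row):
            if v != 0:                 # keep only the non-zero weights
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))    # where the next row starts in `values`
    return np.array(values), np.array(col_idx), np.array(row_ptr)

w = np.array([[0.0, 1.5, 0.0],
              [0.0, 0.0, 0.0],
              [2.0, 0.0, 3.0]])
values, col_idx, row_ptr = csr_encode(w)
# values  -> [1.5, 2.0, 3.0]; col_idx -> [1, 0, 2]; row_ptr -> [0, 1, 1, 3]
```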
In an example, an RNN may be selected as the encoder to encode and compress the weights; the following description takes an RNN encoder as an example.
Referring to FIG. 3, FIG. 3 shows a schematic diagram of a model in which the parameter compression unit 10, the first compression device 91, or the second compression device 92 compresses the parameters to be compressed and the parameter decompression unit 30 decompresses the semantic vector, according to an embodiment of the present disclosure.
When an RNN is used to encode and compress the parameters to be compressed, a layer-wise greedy algorithm may be used to train the deep network.
As shown in FIG. 3, the RNN includes an input layer and multiple hidden layers (two layers as an example). When the parameters to be compressed are compressed with the layer-wise greedy algorithm, the RNN is first trained with multiple vectors (the input vectors and weights of the neural network model), and the first layer of the RNN converts the multiple vectors into a first intermediate vector composed of the activation values of its hidden units; then, with this intermediate vector as the input of the second layer of the RNN, the second layer converts the incoming intermediate vector into a second intermediate vector composed of the activation values of the second layer's hidden units; the same strategy is then applied to the subsequent hidden layers, the output of each layer serving as the input of the next, and the RNN model is trained layer by layer; finally, the last hidden layer at the current time step may be taken as the semantic vector of the hidden layers.
In an RNN, the hidden layer state at the current time step is determined by the hidden layer state at the previous time step and the input at the current time step; for example, this can be expressed by the formula ht = f(ht-1, xt), where ht is the hidden layer state at the current time (time t), ht-1 is the hidden layer state at the previous time (time t-1), and xt is the input to the hidden layer at the current time.
After the hidden layer states at all times have been obtained, the hidden layer states (hT1 to hTx) at the respective times (times T1 to Tx, x being an integer greater than 1) are aggregated to generate the final semantic vector c, c = q({hT1, ..., hTx}), where q denotes some nonlinear function.
However, in an RNN, the hidden layer states of earlier times are no longer visible once the current time step has been computed, so the hidden layer state at the last time (time Tx) can be used as the semantic vector c, that is, c = hTx.
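The encoding just described can be sketched numerically as follows, assuming a plain tanh cell for f; the weight names and sizes are illustrative only.

```python
import numpy as np

def rnn_encode(xs, W_h, U_x, b_h):
    """Compress a sequence of parameter vectors into one fixed-length
    semantic vector: h_t = f(h_{t-1}, x_t), and c = h_Tx (the last state)."""
    h = np.zeros(W_h.shape[0])                 # initial hidden state
    for x in xs:                               # one step per input vector
        h = np.tanh(W_h @ h + U_x @ x + b_h)   # h_t = f(h_{t-1}, x_t)
    return h                                   # c = h_Tx

rng = np.random.default_rng(0)
H, D = 8, 16
W_h, U_x, b_h = rng.normal(size=(H, H)), rng.normal(size=(H, D)), np.zeros(H)
c = rnn_encode(rng.normal(size=(5, D)), W_h, U_x, b_h)  # 5*16 values -> 8
```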
In a possible implementation, a back-propagation algorithm may also be used to adjust the parameters of each layer.
In a possible implementation, the sparsification flag may, for example, be marked with a Boolean variable.
In a possible implementation, the parameter compression unit 10 is further configured to determine whether the parameters to be compressed are sparse and, when the parameters to be compressed are sparse, to send a sparsification flag corresponding to the semantic vector to the parameter storage unit.
The deep learning processing device supporting encoding and decoding described in the present disclosure may be implemented by hardware circuits (for example, but not limited to, an application-specific integrated circuit (ASIC)), and the memory access unit 50, the instruction cache unit 60, the controller unit 70, the parameter compression unit 10, the parameter storage unit 20, the parameter decompression unit 30, the operation unit 40, and the result cache unit 80 may be integrated into a single chip (for example, a neural network chip); the first compression device 91 and the second compression device 92 may likewise be implemented by hardware circuits (for example, but not limited to, an ASIC).
The deep learning processing device supporting encoding and decoding described in the present disclosure may be applied to, but is not limited to, the following scenarios: various electronic products such as data processing equipment, robots, computers, printers, scanners, telephones, tablet computers, smart terminals, mobile phones, driving recorders, navigators, sensors, cameras, cloud servers, still cameras, video cameras, projectors, watches, earphones, mobile storage, and wearable devices; various means of transport such as aircraft, ships, and vehicles; various household appliances such as televisions, air conditioners, microwave ovens, refrigerators, rice cookers, humidifiers, washing machines, electric lamps, gas stoves, and range hoods; and various medical equipment including magnetic resonance imaging machines, B-mode ultrasound scanners, and electrocardiographs.
The present disclosure can compress the parameters to be compressed, thereby effectively reducing the model size of the neural network and its memory requirements, and thus effectively improving the data processing speed of the neural network.
Referring to FIG. 4, FIG. 4 shows a flowchart of a deep learning processing method supporting encoding and decoding according to an embodiment of the present disclosure.
The method is applied to a deep learning processing device supporting encoding and decoding, the device including a memory access unit, an instruction cache unit, a controller unit, a parameter storage unit, a parameter decompression unit, and an operation unit.
As shown in FIG. 4, the method includes:
step S210: storing, by the instruction cache unit, the instructions of the neural network read in through the memory access unit;
step S220: obtaining, by the controller unit, the instructions from the instruction cache unit, and decoding the instructions into micro-instructions for the operation unit;
step S230: storing, by the parameter storage unit, the first semantic vector transmitted from the memory access unit, and, upon receiving a data read instruction, sending the first semantic vector to the parameter decompression unit or the operation unit;
step S240: receiving, by the parameter decompression unit, the first semantic vector from the parameter storage unit, decompressing the first semantic vector using a decoder to obtain the decompressed parameters corresponding to the first semantic vector, and sending the decompressed parameters to the operation unit;
step S250: performing, by the operation unit according to the micro-instructions, operations associated with the neural network model on the received first semantic vector or the decompressed parameters, so as to obtain an output result.
Through the above method, the present disclosure can perform neural-network-related operations with already compressed parameters, thereby improving operation efficiency.
In a possible implementation, the deep learning processing device further includes a parameter compression unit, and the method further includes:
obtaining, by the parameter compression unit, the weights of the neural network model and/or the input vectors of the neural network model from the memory access unit, and compressing the weights and/or input vectors using the encoder, so as to obtain the semantic vectors corresponding to the weights of the neural network model and/or the semantic vectors corresponding to the input vectors of the neural network model;
storing, by the memory access unit, the semantic vectors corresponding to the weights of the neural network model and/or the semantic vectors corresponding to the input vectors of the neural network model in the memory as the second semantic vector.
In a possible implementation, the deep learning processing device is electrically connected to a first compression device, and the method further includes:
storing, by the memory access unit, a third semantic vector in the memory, wherein the third semantic vector is the semantic vector corresponding to the weights of the neural network model and/or the semantic vector corresponding to the input vectors of the neural network model, obtained by the first compression device acquiring the weights required for the operation of the neural network model and/or the input vectors of the neural network model and compressing them with an encoder in the first compression device.
In a possible implementation, the method further includes:
compressing the output result by the encoder of the parameter compression unit to obtain a fourth semantic vector corresponding to the output result;
storing, by the memory access unit, the fourth semantic vector in the memory.
In a possible implementation, the deep learning processing device is electrically connected to a second compression device, and the method further includes:
storing, by the memory access unit, the fifth semantic vector in the memory, wherein the fifth semantic vector is the semantic vector corresponding to the output result, obtained by the second compression device compressing the output result with an encoder.
In a possible implementation, the method further includes:
determining, by the parameter compression unit, whether the output result, the weights of the neural network model, or the input vectors of the neural network model are sparse, and when they are sparse, sending a sparsification flag corresponding to the first semantic vector to the parameter storage unit;
storing, by the parameter storage unit, the sparsification flag,
wherein the parameter storage unit sending the first semantic vector to the parameter decompression unit or the operation unit upon receiving the data read instruction includes:
sending the first semantic vector to the operation unit when the data read instruction is received and a sparsification flag corresponding to the first semantic vector is stored in the parameter storage unit.
In a possible implementation, the method further includes:
sending the first semantic vector to the parameter decompression unit when the data read instruction is received and no sparsification flag corresponding to the first semantic vector is stored in the parameter storage unit.
In a possible implementation, the device further includes a result cache unit, and the method further includes:
storing, by the result cache unit, the output result of the last layer of the neural network model after the operation unit has executed the last layer of the neural network model.
It should be noted that the deep learning processing method supporting encoding and decoding corresponds to the deep learning processing device supporting encoding and decoding described above; for its specific details, please refer to the foregoing description of the device, which is not repeated here.
The present disclosure can compress the parameters to be compressed, thereby effectively reducing the model size of the neural network and its memory requirements, and thus effectively improving the data processing speed of the neural network.
Referring to FIG. 5, FIG. 5 shows a schematic flow diagram based on a deep learning processing device supporting encoding and decoding according to an embodiment of the present disclosure.
As shown in FIG. 5, the parameter compression unit 10 may include a parameter extraction sub-unit 11 and a parameter compression sub-unit 12.
The parameter extraction sub-unit 11 is configured to obtain various parameters of the neural network model, for example the input vectors of the neural network model, the weights, and the output results of each layer of the neural network model.
The parameter extraction sub-unit 11 receives the neural network model and may extract the parameters in the neural network model through the following steps:
step S111: the parameter extraction sub-unit 11 may extract the weights or input vectors in the neural network model and send them to the parameter compression sub-unit 12;
step S112: the parameter extraction sub-unit 11 may extract N-INPUT parameters of the neural network model to be compressed (for example, the output results of each layer of the neural network model) and send them to the parameter compression sub-unit 12.
While extracting these parameters to be compressed, the parameter extraction sub-unit 11 may also determine whether they are sparse; if a parameter to be compressed is sparse, a sparsification flag corresponding to that parameter may be stored in the parameter storage unit 20, and the sparsification flag may be a Boolean variable.
The parameter compression sub-unit 12 may compress the parameters using an auto-encoder compression network model; for example, encoders such as CNN, RNN, BiRNN, GRU, and LSTM may be selected to encode and compress the weights, the input vectors, or the N-INPUT parameters to be compressed, so as to obtain N-COMPRESS fixed-length semantic vectors. It should be noted that the encoder may be any one of those listed above or a combination of them, or another encoder not listed; the number of encoders may be determined as needed.
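As a minimal sketch of this auto-encoder style compression, the following maps N-INPUT parameters to N-COMPRESS fixed-length values and back; the single linear layer with random weights is an illustrative assumption (the disclosure allows CNN, RNN, GRU, LSTM, and other encoders), and a real encoder/decoder pair would be trained to minimize reconstruction error.

```python
import numpy as np

N_INPUT, N_COMPRESS = 64, 8
rng = np.random.default_rng(0)
W_enc = rng.normal(size=(N_COMPRESS, N_INPUT)) / np.sqrt(N_INPUT)
W_dec = rng.normal(size=(N_INPUT, N_COMPRESS)) / np.sqrt(N_COMPRESS)

def compress(params):
    # encoder: N_INPUT parameters -> N_COMPRESS fixed-length semantic vector
    return np.tanh(W_enc @ params)

def decompress(semantic):
    # decoder: semantic vector -> N_INPUT decompressed parameters
    return W_dec @ semantic

weights = rng.normal(size=N_INPUT)
c = compress(weights)          # 8 values stored instead of 64
restored = decompress(c)       # approximation of the original 64 weights
```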
When these parameters to be compressed are sparse parameters, preferably, one or more of COO, CSR, ELL, and CSC may be used to compress the sparse parameters to be compressed.
If the N-COMPRESS fixed-length semantic vectors are obtained by compressing sparsified parameters, they correspond to the sparsification flags in the parameter storage unit 20.
The parameter storage unit 20 receives the semantic vectors from the parameter compression unit 10. When the parameters to be compressed of the network model are needed for training or computation, for example when a data read instruction is received, the parameter storage unit 20 may first make a sparsity determination: if the parameters to be compressed corresponding to the semantic vector carry a sparsification flag, those parameters are sparse parameters, and the semantic vector is sent directly to the operation unit 40 for operation; if the parameters to be compressed corresponding to the semantic vector carry no sparsification flag, those parameters are non-sparse parameters, and the compressed parameters (the fixed-length semantic vectors of the N-COMPRESS parameters) are sent to the decoder network of the parameter decompression unit 30 for decoding and decompression. The parameter storage unit 20 is also used to store the network structure of the neural network model, the compressed network model, and the like.
After receiving the fixed-length semantic vectors of the N-COMPRESS parameters, the parameter decompression unit 30 feeds them as input into the decoder network. The type of decoder is determined by the auto-encoder; for example, one or more of CNN, RNN, BiRNN, GRU, LSTM, COO, CSR, ELL, and CSC may be selected to decode and decompress the fixed-length semantic vectors of the N-COMPRESS parameters, so as to obtain the decompressed N-INPUT decompressed parameters representing the parameters to be compressed (weights, input vectors, etc.), which are output to the operation unit 40 for operation. The decompressed parameters are approximately equal in size to the parameters to be compressed before compression.
The operation unit 40 obtains the fixed-length semantic vectors from the parameter storage unit or the decompressed parameters from the parameter decompression unit, performs operations on them, and outputs the operation results.
The parameter compression sub-unit 12 may also compress the output results of the operation unit using the auto-encoder compression network model; for example, encoders such as CNN, RNN, BiRNN, GRU, and LSTM may be selected to encode and compress the output results, so as to obtain N-COMPRESS semantic vectors, which may be stored in the parameter storage unit 20 or in the memory and obtained through the parameter extraction sub-unit 11 when the neural network model performs the next layer's operation or the next operation.
For example, when the neural network model performs the next layer's operation, the parameter extraction sub-unit 11 obtains a semantic vector from the memory and outputs the semantic vector to the parameter storage unit 20; the parameter storage unit 20 checks whether the semantic vector has a corresponding sparsification flag: if it does, the semantic vector is sent directly to the operation unit 40 for the next layer of the neural network's operation; if no corresponding sparsification flag exists in the parameter storage unit 20, the parameter storage unit 20 sends the semantic vector to the parameter decompression unit 30 for decompression, and after decompressing it into the corresponding decompressed parameters, the parameter decompression unit 30 outputs the decompressed parameters to the operation unit 40, where they serve as the input vector of the next layer of the neural network for operations associated with the neural network.
It should be noted that the deep learning processing method supporting encoding and decoding corresponds to the deep learning processing device supporting encoding and decoding described above; for its specific details, please refer to the foregoing description of the device, which is not repeated here.
需要说明的是,对于前述的各方法实施例,为了简单描述,故将其都表述为一系列的动作组合,但是本领域技术人员应该知悉,本申请并不受所描述的动作顺序的限制,因为依据本申请,某些步骤可以采用其他顺序或者同时进行。其次,本领域技术人员也应该知悉,说明书中所描述的实施例均属于可选实施例,所涉及的动作和单元并不一定是本申请所必须的。It should be noted that, for the sake of simple description, the foregoing method embodiments are all expressed as a series of action combinations, but those skilled in the art should know that the present application is not limited by the described action sequence. Because in accordance with the present application, certain steps may be performed in other orders or concurrently. Secondly, those skilled in the art should also know that the embodiments described in the specification are all optional embodiments, and the actions and units involved are not necessarily required by the present application.
在上述实施例中,对各个实施例的描述都各有侧重,某个实施例中没有详述的部分,可以参见其他实施例的相关描述。In the above-mentioned embodiments, the description of each embodiment has its own emphasis. For parts that are not described in detail in a certain embodiment, reference may be made to the relevant descriptions of other embodiments.
在本申请所提供的几个实施例中,应该理解到,所揭露的装置,可通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性或其它的形式。In the several embodiments provided in this application, it should be understood that the disclosed apparatus may be implemented in other manners. For example, the apparatus embodiments described above are only illustrative, for example, the division of the units is only a logical function division, and there may be other division methods in actual implementation, for example, multiple units or components may be combined or Integration into another system, or some features can be ignored, or not implemented. On the other hand, the shown or discussed mutual coupling or direct coupling or communication connection may be through some interfaces, indirect coupling or communication connection of devices or units, and may be in electrical or other forms.
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。The units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件程序单元的形式实现。In addition, each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit. The above-mentioned integrated units may be implemented in the form of hardware, or may be implemented in the form of software program units.
所述集成的单元如果以软件程序单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储器中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储器中,包括若干指令用以使得一台计算机设备(可为个人计算机、服务器或者网络设备等)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储器包括:U盘、只读存储器(ROM,Read-Only Memory)、随机存取存储器(RAM,Random Access Memory)、移动硬盘、磁碟或者光盘等各种可以存储程序代码的介质。The integrated unit, if implemented in the form of a software program unit and sold or used as a stand-alone product, may be stored in a computer readable memory. Based on this understanding, the technical solution of the present application can be embodied in the form of a software product in essence, or the part that contributes to the prior art, or all or part of the technical solution, and the computer software product is stored in a memory, Several instructions are included to cause a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present application. The aforementioned memory includes: U disk, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), mobile hard disk, magnetic disk or optical disk and other media that can store program codes.
本领域普通技术人员可以理解上述实施例的各种方法中的全部或部分步骤是可以通过程序来指令相关的硬件来完成,该程序可以存储于一计算机可读存储器中,存储器可以包括:闪存盘、只读存储器(英文:Read-Only Memory,简称:ROM)、随机存取器(英文:Random Access Memory,简称:RAM)、磁盘或光盘等。Those skilled in the art can understand that all or part of the steps in the various methods of the above embodiments can be completed by instructing relevant hardware through a program, and the program can be stored in a computer-readable memory, and the memory can include: a flash disk , Read-only memory (English: Read-Only Memory, referred to as: ROM), random access device (English: Random Access Memory, referred to as: RAM), magnetic disk or optical disk, etc.
The embodiments of the present disclosure have been described above. The foregoing description is exemplary rather than exhaustive, and the disclosure is not limited to the embodiments described. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, their practical application, or the technical improvement over technologies in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.