CN110515589B - Multiplier, data processing method, chip and electronic device - Google Patents

Publication number
CN110515589B
Authority
CN
China
Prior art keywords
data
circuit
multiplier
multiplication
partial product
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910819020.4A
Other languages
Chinese (zh)
Other versions
CN110515589A (en)
Inventor
Name withheld at the inventor's request
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Cambricon Information Technology Co Ltd
Original Assignee
Shanghai Cambricon Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Cambricon Information Technology Co Ltd
Priority to CN201910819020.4A
Publication of CN110515589A
Application granted
Publication of CN110515589B
Active
Anticipated expiration

Abstract

The application provides a multiplier, a data processing method, a chip and an electronic device. The multiplier comprises a multiplication circuit, a register control circuit, a register circuit, a state control circuit and a selection circuit. The multiplication circuit comprises a regular signed number encoding subcircuit and an accumulation subcircuit. The output end of the regular signed number encoding subcircuit is connected to the input end of the accumulation subcircuit; the output end of the accumulation subcircuit is connected to the first input end of the register control circuit; the output end of the register control circuit is connected to the input end of the register circuit; the output end of the register circuit is connected to the first input end of the selection circuit; the first output end of the state control circuit is connected to the second input end of the register control circuit; and the second output end of the state control circuit is connected to the second input end of the selection circuit. The multiplier can perform regular signed number encoding on received data, which yields fewer valid partial products and thereby reduces the complexity of implementing the multiplication operation.

Description

Translated from Chinese
Multiplier, data processing method, chip and electronic device

Technical Field

The present application relates to the field of computer technology, and in particular to a multiplier, a data processing method, a chip and an electronic device.

Background

With the continuous development of digital electronic technology, the rapid progress of artificial intelligence (AI) chips places increasingly high demands on high-performance digital multipliers. Neural network algorithms are among the algorithms most widely used in AI chips, and multiplication performed by a multiplier is a common operation in them.

At present, a multiplier encodes every three bits of the multiplier operand as one code, obtains partial products from the codes and the multiplicand, and compresses all partial products with a Wallace tree to obtain the target operation result. In this conventional approach, however, the codes contain many non-zero values, so many corresponding partial products are generated, which makes the multiplication operation complex to implement.

Summary of the Invention

In view of the above technical problem, it is necessary to provide a multiplier, a data processing method, a chip and an electronic device that reduce the number of valid partial products obtained during multiplication and thereby reduce the complexity of the multiplication operation.

An embodiment of the present application provides a multiplier comprising a multiplication circuit, a register control circuit, a register circuit, a state control circuit and a selection circuit. The multiplication circuit comprises a regular signed number encoding subcircuit and an accumulation subcircuit. The output end of the regular signed number encoding subcircuit is connected to the input end of the accumulation subcircuit, the output end of the accumulation subcircuit is connected to the first input end of the register control circuit, the output end of the register control circuit is connected to the input end of the register circuit, the output end of the register circuit is connected to the first input end of the selection circuit, the first output end of the state control circuit is connected to the second input end of the register control circuit, and the second output end of the state control circuit is connected to the second input end of the selection circuit.

In one embodiment, the regular signed number encoding subcircuit includes a regular signed number encoding unit and a partial product acquisition unit. The regular signed number encoding unit is used to receive first data and perform regular signed number encoding on the first data to obtain a target code. The partial product acquisition unit is used to receive second data, obtain original partial products according to the target code and the second data, and obtain the partial products of the target code according to the original partial products. The accumulation subcircuit is used to accumulate the partial products of the target code to obtain a multiplication result. The state control circuit is used to obtain a storage indication signal and a read indication signal. The register control circuit is used to determine, according to the storage indication signal input by the state control circuit, the register circuit in which the multiplication result is stored. The register circuit is used to store the multiplication result. The selection circuit is used to read, according to the received read indication signal, data in the multiplication result stored in the register circuit as the target operation result.

In one embodiment, the regular signed number encoding unit may include a data input port and a target code output port. The data input port is used to receive the first data to be encoded, and the target code output port is used to output the target code obtained by performing regular signed number encoding on the first data.

In one embodiment, the partial product acquisition unit is specifically used to convert the target code to obtain the original partial products, perform sign bit extension on the original partial products to obtain sign-extended partial products, and obtain the partial products of the target code according to the sign-extended partial products.
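
The conversion and sign-extension steps described above can be illustrated with a minimal Python sketch. It is not the circuit of this application: the function name `partial_products`, the least-significant-digit-first ordering, and the choice to extend each row to 2N bits are assumptions made only for illustration.

```python
def partial_products(digits, multiplicand, width):
    """Sketch: turn signed digits (-1, 0, 1) into sign-extended partial products.

    digits       -- target code, least significant digit first (assumed order)
    multiplicand -- signed integer that fits in `width` bits
    width        -- operand bit width N; rows are extended to 2N bits (assumed)
    """
    out_bits = 2 * width
    mask = (1 << out_bits) - 1
    rows = []
    for position, digit in enumerate(digits):
        # Original partial product: -X, 0 or +X, selected by the digit.
        original = digit * multiplicand
        # Shifting by the digit position and masking to 2N bits yields the
        # two's-complement bit pattern, i.e. the sign-extended row.
        rows.append((original << position) & mask)
    return rows


# 7 can be written as 8 - 1, i.e. digits [-1, 0, 0, +1] (LSB first).
rows = partial_products([-1, 0, 0, 1], multiplicand=-3, width=8)
product = sum(rows) & 0xFFFF  # equals the 16-bit pattern of 7 * -3
```

Rows produced by a zero digit are all zero, which is why fewer non-zero digits in the target code translate into fewer valid partial products.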

In one embodiment, the partial product acquisition unit includes a target code input port, a second data input port and a partial product output port. The target code input port is used to receive the target code, the second data input port is used to receive the second data, and the partial product output port is used to output the partial products of the target code.

In one embodiment, the accumulation subcircuit includes a Wallace tree group unit and an accumulation unit, wherein the output end of the Wallace tree group unit is connected to the input end of the accumulation unit. The Wallace tree group unit is used to accumulate the partial products of the target code to obtain an accumulation result, and the accumulation unit is used to accumulate that accumulation result.

In one embodiment, the Wallace tree group unit includes Wallace tree subunits, each of which is used to accumulate the values in one column of the partial products of all target codes.

In one embodiment, the accumulation unit includes an adder, which is used to perform an addition operation on the received corrected accumulation result.

In one embodiment, the adder includes a carry signal input port, a sum signal input port and a result output port. The carry signal input port is used to receive a carry signal, the sum signal input port is used to receive a sum signal, and the result output port is used to output the result of adding the carry signal and the sum signal.
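
As a rough illustration of how partial products can be compressed into a sum signal and a carry signal before a final addition, the following Python sketch applies word-level 3:2 carry-save compression. This is a simplification under stated assumptions: the application describes one Wallace tree subunit per column, whereas this sketch compresses whole rows at once, and the names `carry_save_step`, `wallace_reduce` and `final_add` are hypothetical.

```python
def carry_save_step(a, b, c, mask):
    """3:2 compression: three rows in, a sum row and a carry row out."""
    sum_row = a ^ b ^ c
    carry_row = ((a & b) | (a & c) | (b & c)) << 1
    return sum_row & mask, carry_row & mask


def wallace_reduce(rows, out_bits):
    """Reduce any number of rows to a (sum, carry) pair, modulo 2**out_bits."""
    mask = (1 << out_bits) - 1
    rows = [r & mask for r in rows]
    while len(rows) > 2:
        reduced = []
        # Compress each complete group of three rows into two rows.
        for i in range(0, len(rows) - 2, 3):
            reduced.extend(carry_save_step(rows[i], rows[i + 1], rows[i + 2], mask))
        # Pass through the one or two rows left over after grouping.
        reduced.extend(rows[len(rows) - len(rows) % 3:])
        rows = reduced
    while len(rows) < 2:
        rows.append(0)
    return rows[0], rows[1]


def final_add(sum_signal, carry_signal, out_bits):
    """Final adder: combine the sum signal and the carry signal."""
    return (sum_signal + carry_signal) & ((1 << out_bits) - 1)


s, c = wallace_reduce([5, 9, 250, 3, 77], out_bits=16)
total = final_add(s, c, out_bits=16)  # same value as adding the rows directly
```

Each compression step preserves the total of its three inputs modulo 2**out_bits, so only the final adder needs to propagate carries.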

In one embodiment, the register circuit includes register subcircuits, which are used to store the multiplication results corresponding to different storage indication signals.

An embodiment of the present application provides a multiplier comprising a multiplication circuit and a number conversion circuit. The multiplication circuit includes a regular signed number encoding subcircuit and an accumulation subcircuit. The output end of the regular signed number encoding subcircuit is connected to the input end of the accumulation subcircuit, and the output end of the accumulation subcircuit is connected to the input end of the number conversion circuit. The number conversion circuit includes a first conversion subcircuit and a second conversion subcircuit.

The regular signed number encoding subcircuit is used to perform regular signed number encoding on the received data to obtain a target code, and to obtain the partial products of the target code according to the target code. The accumulation subcircuit is used to perform corrected accumulation on the partial products of the target code to obtain a multiplication result. The first conversion subcircuit and the second conversion subcircuit are each used to perform number conversion on the multiplication result to obtain a target operation result.

In one embodiment, the number conversion circuit includes an input port for receiving a data conversion signal; the data conversion signal is used to determine the type of data conversion performed by the number conversion circuit.

In one embodiment, the first conversion subcircuit is specifically used to convert the multiplication result into a target operation result of floating-point type, and the second conversion subcircuit is specifically used to convert the multiplication result into a target operation result of fixed-point type.

In the multiplier provided by this embodiment, the regular signed number encoding subcircuit performs regular signed number encoding on the received data, so that fewer valid partial products are obtained, which reduces the complexity of implementing the multiplication operation.
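
To make the effect on the number of non-zero digits concrete, here is a minimal Python sketch of a signed-digit recoding. It assumes that "regular signed number encoding" corresponds to canonical signed-digit (non-adjacent form) recoding, which this section does not spell out; the function name `csd_encode` and the default of N + 1 output digits are likewise assumptions.

```python
def csd_encode(value, width, num_digits=None):
    """Recode a signed `width`-bit integer into digits from {-1, 0, 1}.

    Digits are returned least significant first, and no two adjacent
    digits are both non-zero (non-adjacent form).
    """
    if not -(1 << (width - 1)) <= value < (1 << (width - 1)):
        raise ValueError("value does not fit in the given width")
    if num_digits is None:
        num_digits = width + 1  # assumed padding length, a safe upper bound
    digits = []
    n = value
    while n != 0:
        if n & 1:
            # n mod 4 == 1 gives +1, n mod 4 == 3 gives -1.
            digit = 2 - (n & 3)
            n -= digit
        else:
            digit = 0
        digits.append(digit)
        n >>= 1
    return digits + [0] * (num_digits - len(digits))


code = csd_encode(7, 8)
# 7 is 0111 in binary (three non-zero bits); the recoded form is 8 - 1,
# which has only two non-zero digits.
nonzero = sum(1 for d in code if d != 0)
```

Every zero digit corresponds to an all-zero partial product, so a recoding with fewer non-zero digits leaves fewer rows that actually contribute to the accumulation.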

An embodiment of the present application provides a data processing method, the method comprising:

receiving data to be processed;

performing regular signed number encoding on the data to be processed to obtain partial products of a target code;

accumulating the partial products of the target code to obtain a multiplication result;

acquiring a storage indication signal and a read indication signal;

storing a plurality of multiplication results in different register subcircuits according to the storage indication signal;

reading, according to the read indication signal, part of the data of the corresponding multiplication result stored in the different register subcircuits, to obtain a target operation result.

In one embodiment, performing regular signed number encoding on the data to be processed to obtain the partial products of the target code includes:

performing regular signed number encoding on the data to be processed to obtain original partial products;

performing sign bit extension on the original partial products to obtain the partial products of the target code.

In one embodiment, performing regular signed number encoding on the data to be processed to obtain the original partial products includes:

performing regular signed number encoding on the data to be processed to obtain a target code;

performing conversion according to the data to be processed and the target code to obtain the original partial products.

In one embodiment, performing sign bit extension on the original partial products to obtain the partial products of the target code includes: performing bit padding on the original partial products to obtain the partial products of the target code.

In one embodiment, storing the plurality of multiplication results in different register subcircuits according to the storage indication signal includes:

storing a first multiplication result corresponding to a first storage indication signal in a first register subcircuit;

storing a second multiplication result corresponding to a second storage indication signal in a second register subcircuit.

In one embodiment, reading, according to the read indication signal, part of the data of the corresponding multiplication result stored in the different register subcircuits to obtain the target operation result includes:

reading, according to a first read indication signal, a first part of the data in the first multiplication result stored in the first register subcircuit, to obtain a first operation result;

reading, according to a second read indication signal, a second part of the data in the first multiplication result stored in the first register subcircuit, to obtain a second operation result;

reading, according to a third read indication signal, a first part of the data in the second multiplication result stored in the second register subcircuit, to obtain a third operation result;

reading, according to a fourth read indication signal, a second part of the data in the second multiplication result stored in the second register subcircuit, to obtain a fourth operation result.
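
The store and read steps above can be modelled in software roughly as follows. This is only a behavioural sketch: the class name `RegisterCircuit`, the use of a dictionary keyed by the storage indication signal, and the assumption that the "first part" is the high half and the "second part" is the low half of a 2N-bit result are all illustrative choices not fixed by the method.

```python
class RegisterCircuit:
    """Behavioural sketch of register subcircuits holding 2N-bit results."""

    def __init__(self, n_bits):
        self.n_bits = n_bits
        self.subcircuits = {}  # storage indication signal -> stored result

    def store(self, storage_signal, result):
        """Store one multiplication result in the subcircuit for this signal."""
        self.subcircuits[storage_signal] = result & ((1 << (2 * self.n_bits)) - 1)

    def read(self, read_signal):
        """Read one N-bit part; signals 1, 2 map to result 1 and 3, 4 to result 2."""
        storage_signal = (read_signal + 1) // 2
        part = (read_signal - 1) % 2  # 0: first part, 1: second part
        value = self.subcircuits[storage_signal]
        mask = (1 << self.n_bits) - 1
        if part == 0:
            return (value >> self.n_bits) & mask  # assumed: first part = high N bits
        return value & mask                       # assumed: second part = low N bits


regs = RegisterCircuit(n_bits=8)
regs.store(1, 0x12AB)  # first multiplication result
regs.store(2, 0x34CD)  # second multiplication result
parts = [regs.read(signal) for signal in (1, 2, 3, 4)]
first_result = (parts[0] << 8) | parts[1]  # splice the two parts back together
```

Under these assumptions, four read indication signals recover two complete results through an output path that is only N bits wide.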

An embodiment of the present application provides a data processing method, the method comprising:

receiving a data conversion signal and data to be processed;

performing regular signed number encoding on the data to be processed to obtain partial products of a target code;

accumulating the partial products of the target code to obtain a multiplication result;

performing number conversion on the multiplication result according to the data conversion signal to obtain a target operation result, wherein the data conversion signal indicates the data type into which the multiplier needs to convert the target operation result.

The data processing method provided by this embodiment performs regular signed number encoding on the received data to be processed, which reduces the number of valid partial products in the multiplication operation and thereby reduces its complexity.

An embodiment of the present application provides a machine learning computing device that includes one or more of the multipliers described above. The machine learning computing device is used to obtain data to be computed and control information from other processing devices, perform the specified machine learning operations, and pass the execution results to the other processing devices through an I/O interface.

When the machine learning computing device includes a plurality of the multipliers, the multipliers are connected to one another through a preset specific structure and transmit data through it.

The multipliers are interconnected and transmit data through a PCIE bus to support larger-scale machine learning operations; the multipliers share one control system or have their own control systems; the multipliers share memory or have their own memories; and the multipliers may be interconnected in any interconnection topology.

An embodiment of the present application provides a combined processing device that includes the machine learning computing device described above, a universal interconnection interface, and other processing devices. The machine learning computing device interacts with the other processing devices to jointly complete operations specified by the user. The combined processing device may further include a storage device, which is connected to the machine learning computing device and to the other processing devices respectively and is used to store their data.

An embodiment of the present application provides a neural network chip that includes the multiplier described above, the machine learning computing device described above, or the combined processing device described above.

An embodiment of the present application provides a neural network chip packaging structure that includes the neural network chip described above.

An embodiment of the present application provides a board card that includes the neural network chip packaging structure described above.

An embodiment of the present application provides an electronic apparatus that includes the neural network chip described above or the board card described above.

An embodiment of the present application provides a chip that includes at least one multiplier as described in any one of the above.

An embodiment of the present application provides an electronic device that includes the chip described above.

Brief Description of the Drawings

FIG. 1 is a schematic structural diagram of a multiplier provided by an embodiment;

FIG. 2 is a schematic structural diagram of a multiplier provided by another embodiment;

FIG. 3 is a schematic diagram of the distribution of the partial products of nine target codes provided by another embodiment;

FIG. 4 is a detailed circuit structure diagram of the accumulation circuit for 8-bit data operation provided by another embodiment;

FIG. 5 is a schematic flowchart of a data processing method provided by an embodiment;

FIG. 6 is a schematic flowchart of another data processing method provided by another embodiment;

FIG. 7 is a structural diagram of a combined processing device provided by an embodiment;

FIG. 8 is a structural diagram of another combined processing device provided by an embodiment;

FIG. 9 is a schematic structural diagram of a board card provided by an embodiment.

Detailed Description

To make the purpose, technical solutions and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are intended only to explain the present application and not to limit it.

The multiplier provided in the present application can be applied to AI chips, field-programmable gate array (FPGA) chips, or other hardware circuit devices to perform multiplication processing. Its specific structure is shown schematically in FIG. 1 and FIG. 2.

FIG. 1 is a schematic structural diagram of a multiplier provided by an embodiment. The multiplier includes a multiplication circuit 11, a register control circuit 12, a register circuit 13, a state control circuit 14 and a selection circuit 15. The multiplication circuit 11 includes a regular signed number encoding subcircuit 111 and an accumulation subcircuit 112. The output end of the regular signed number encoding subcircuit 111 is connected to the input end of the accumulation subcircuit 112, the output end of the accumulation subcircuit 112 is connected to the first input end of the register control circuit 12, the output end of the register control circuit 12 is connected to the input end of the register circuit 13, the output end of the register circuit 13 is connected to the first input end of the selection circuit 15, the first output end of the state control circuit 14 is connected to the second input end of the register control circuit 12, and the second output end of the state control circuit 14 is connected to the second input end of the selection circuit 15.

The regular signed number encoding subcircuit 111 includes a regular signed number encoding unit 1111 and a partial product acquisition unit 1112. The regular signed number encoding unit 1111 is used to receive first data and perform regular signed number encoding on the first data to obtain a target code. The partial product acquisition unit 1112 is used to receive second data, obtain original partial products according to the target code and the second data, and obtain the partial products of the target code according to the original partial products. The accumulation subcircuit 112 is used to accumulate the partial products of the target code to obtain a multiplication result. The state control circuit 14 is used to obtain a storage indication signal and a read indication signal. The register control circuit 12 is used to determine, according to the storage indication signal input by the state control circuit 14, the register circuit 13 in which the multiplication result is stored. The register circuit 13 is used to store the multiplication result. The selection circuit 15 is used to read, according to the received read indication signal, data in the multiplication result stored in the register circuit 13 as the target operation result.

Specifically, the regular signed number encoding subcircuit 111 can use the regular signed number encoding unit 1111 to perform regular signed number encoding on the received first data and obtain the target code; the first data may be the multiplier operand of the multiplication. Optionally, the partial product acquisition unit 1112 can obtain the original partial products according to the received second data and the target code, and obtain the partial products of the target code according to the original partial products; the second data may be the multiplicand of the multiplication. The multiplier operand and the multiplicand may be fixed-point numbers of the same bit width. Optionally, the register circuit 13 may include multiple storage units. Optionally, the bit width of the multiplication result may be equal to twice the bit width of the data received by the regular signed number encoding subcircuit 111. Optionally, the regular signed number encoding subcircuit 111 may process data of a fixed bit width, and the bit width of the data it receives may be equal to the bit width of the multiplier's input port; in addition, in this embodiment, the bit width of the multiplier's output port may be less than twice the bit width of the input port. Optionally, the selection circuit 15 may have multiple input ports, each of which may have a different function, and may have one output port. Optionally, the bit width of the target operation result may be equal to 1/2 of the bit width of the multiplication result, and this embodiment does not impose any limitation on this. In this embodiment, it can also be understood that the bit width of the target operation result may be less than twice the bit width of the multiplication result. Optionally, the number of target codes may be equal to the number of partial products of the target codes, and a target code may contain three values, namely -1, 0 and 1.

It should be noted that the state control circuit 14 can automatically obtain the corresponding storage indication signal each time the accumulation subcircuit 112 obtains a multiplication result. For example, when the accumulation subcircuit 112 obtains the first multiplication result, the storage indication signal obtained by the state control circuit 14 may be 1; when it obtains the second multiplication result, the storage indication signal may be 2; and so on. Each time the accumulation subcircuit 112 obtains a multiplication result, the value of the storage indication signal obtained by the state control circuit 14 may be the value corresponding to the previous multiplication result plus 1. Optionally, when a multiplication result is present in the register circuit 13, the state control circuit 14 can also automatically obtain the read indication signal corresponding to the current clock cycle number; the state control circuit 14 can obtain the current clock cycle number automatically or receive it from an external device. For example, if the first multiplication result is stored in the register circuit 13 in the first clock cycle, the corresponding read indication signal obtained by the state control circuit 14 may be 1, and the selection circuit 15 can then read part of the data stored in the register circuit 13. In the second clock cycle, the corresponding read indication signal may be 2, and the selection circuit 15 can then read the remaining data of the first multiplication result stored in the register circuit 13. In other words, the multiplier can output one multiplication result over two clock cycles. However, if five clock cycles are needed after the first multiplication result is obtained before the second multiplication result is available, the register circuit 13 can store the second multiplication result only in the sixth clock cycle; at that point, the corresponding read indication signal obtained by the state control circuit 14 may be 3. That is, the value of the read indication signal can be determined according to the number of data items stored in the register circuit 13.
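
The timing example above can be reproduced with a small Python sketch that derives read indication signals from the cycles in which results are stored. The function name `schedule_reads` is hypothetical, and the sketch assumes that the second part of each result is read in the cycle immediately after the first part; the text states this for the first result only.

```python
def schedule_reads(store_cycles, reads_per_result=2):
    """Return (clock_cycle, read_signal, storage_signal) tuples.

    store_cycles     -- clock cycle in which each multiplication result is stored
    reads_per_result -- number of reads needed to output one stored result
    """
    events = []
    read_signal = 0
    # The storage indication signal counts results: 1, 2, 3, ...
    for storage_signal, cycle in enumerate(store_cycles, start=1):
        for part in range(reads_per_result):
            # The read indication signal advances once per read, not per cycle.
            read_signal += 1
            events.append((cycle + part, read_signal, storage_signal))
    return events


# First result stored in cycle 1, second result stored in cycle 6.
timeline = schedule_reads([1, 6])
# timeline == [(1, 1, 1), (2, 2, 1), (6, 3, 2), (7, 4, 2)]
```

The gap between cycles 2 and 6 shows why the read indication signal follows the number of stored results rather than the raw clock cycle count.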

In addition, the multiplication result obtained by the accumulation subcircuit 112 is not itself the target operation result of the multiplier. The target operation result can be obtained by splicing the two operation results that the multiplier outputs in two successive reads: the operation result output by the selection circuit 15 the first time is spliced with the operation result output the second time. Likewise, for every multiplication operation, splicing the two operation results output by the selection circuit 15 gives the target operation result of that multiplication. In addition, the multiplication circuit 11 can also take multiple clock cycles to output one target operation result.

It should be noted that the multiplier can, through the register control circuit 12, receive the multiplication result output by the accumulation subcircuit 112 for each multiplication operation and, according to the received storage indication signal, determine the storage unit in which each multiplication result is stored. Optionally, the selection circuit 15 can, according to the different read indication signals it receives, determine which data of the multiplication result stored in the corresponding register circuit 13 to read. Optionally, if the bit width of the multiplier input port is N and the received data is also N bits wide, the bit width M of the multiplier output port can be equal to 2N/t+deta, where (2N/t+deta)<2N and deta (deta>=0) is a constant. Here t (t>1) is the number of clock cycles that the multiplication circuit 11 normally needs to complete one multiplication operation and obtain one multiplication result; the multiplication result obtained by the accumulation subcircuit 112 in the multiplication circuit 11 is then stored in the register circuit 13. In addition, there is a low-probability case in which the multiplier completes one multiplication operation in m (m<t, and m<=1) clock cycles, obtains one multiplication result, and stores the multiplication result obtained by the accumulation subcircuit 112 in the register circuit 13. Optionally, the selection circuit 15 can read data from the multiplication result stored in the register circuit 13 in two passes. The multiplication result can be 2N bits wide and each read can return N bits, so the two reads return the high N bits and the low N bits of the same multiplication result as two operation results, and splicing these two operation results gives the target operation result of the multiplication.
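The two-pass readout described above can be sketched in Python as follows. This is only an illustrative model, not the circuit itself: the function names `read_halves` and `splice` are hypothetical, and the order of the two reads (high half first) is an assumption, since the text does not fix it.

```python
def read_halves(result: int, n_bits: int) -> tuple[int, int]:
    """Split a 2N-bit multiplication result into its high and low N-bit halves."""
    mask = (1 << n_bits) - 1
    high = (result >> n_bits) & mask  # first read (assumed order)
    low = result & mask               # second read
    return high, low


def splice(high: int, low: int, n_bits: int) -> int:
    """Concatenate the two N-bit reads back into the 2N-bit target result."""
    return (high << n_bits) | low


high, low = read_halves(0xBEEF, 8)
print(hex(high), hex(low), hex(splice(high, low, 8)))
```

With N = 8, a 16-bit stored result leaves through an 8-bit output port in two reads and is reassembled by the consumer.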

In addition, in this embodiment, the partial product acquisition unit 1112 can obtain the sign-extended partial product from the original partial product, and obtain the partial product of the target code from the sign-extended partial product. Optionally, the bit width of the sign-extended partial product can be twice the data bit width N received by the multiplier, and the bit width of the original partial product can be equal to N. Optionally, each of the high N bits of the sign-extended partial product can be equal to the highest bit of the original partial product, i.e. its sign bit. In other words, the high N+1 bits of the sign-extended partial product are all equal, and its low N-1 bits can be equal to the low N-1 bits of the original partial product.

Exemplarily, if the multiplier is currently processing an 8-bit*8-bit fixed-point multiplication and an original partial product obtained by the partial product acquisition unit 1112 is "p7p6p5p4p3p2p1p0", then sign-extending this original partial product gives a partial product that can be expressed as "p7p7p7p7p7p7p7p7p7p6p5p4p3p2p1p0".
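As a minimal sketch of the sign extension in this example, the following Python function widens an N-bit two's complement pattern to 2N bits by replicating its top bit. The name `sign_extend` and the use of Python integers as bit patterns are illustrative assumptions, not part of the source.

```python
def sign_extend(value: int, from_bits: int, to_bits: int) -> int:
    """Replicate the sign bit of a from_bits-wide pattern up to to_bits."""
    sign = (value >> (from_bits - 1)) & 1
    if sign:
        # Set every bit from position from_bits up to to_bits - 1.
        value |= ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
    return value


# p7 = 1: the eight new high bits all become 1.
print(format(sign_extend(0b10110010, 8, 16), "016b"))
# p7 = 0: the eight new high bits all stay 0.
print(format(sign_extend(0b00110010, 8, 16), "016b"))
```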

It can also be understood that, in the distribution of all target-code partial products, each target-code partial product has a corresponding sign-extended partial product. The first target-code partial product can be the first sign-extended partial product. From the second target-code partial product onward, the corresponding sign-extended partial product is shifted left by one bit relative to the previous target-code partial product, and the highest bit of every target-code partial product lies in the same column as the highest bit of the first target-code partial product. Equivalently, from the second target-code partial product onward, the bits of each sign-extended partial product that are shifted left beyond that column do not take part in the addition.

In the multiplier provided by this embodiment, the regular signed number encoding subcircuit performs regular signed number encoding on the received data to obtain the target code and obtains the partial products of the target code from it; the accumulation subcircuit accumulates the sign-extended partial products to obtain the multiplication result; the state control circuit obtains the storage indication signal and the read indication signal; the register control circuit determines, according to the storage indication signal, the register circuit in which the multiplication result is stored, and the register circuit stores it; and the selection circuit reads, according to the read indication signal, data from the multiplication result stored in the register circuit to obtain the target operation result. Because the multiplier applies regular signed number encoding to the received data, it reduces the number of effective partial products obtained during the multiplication, which reduces the complexity of implementing the multiplication. At the same time, the multiplier can improve the efficiency of the multiplication operation and effectively reduce the power consumption of the multiplier.

FIG. 2 is a schematic diagram of the specific structure of a multiplier provided by an embodiment. The multiplier includes a multiplication circuit 21 and a conversion circuit 22. The multiplication circuit 21 includes a regular signed number encoding subcircuit 211 and an accumulation subcircuit 212; the output end of the regular signed number encoding subcircuit 211 is connected to the input end of the accumulation subcircuit 212, and the output end of the accumulation subcircuit 212 is connected to the input end of the conversion circuit 22. The conversion circuit 22 includes a first conversion subcircuit 221 and a second conversion subcircuit 222. The regular signed number encoding subcircuit 211 is used to perform regular signed number encoding on the received data to obtain a target code, and to obtain the partial products of the target code from the target code; the accumulation subcircuit 212 is used to accumulate the partial products of the target code to obtain a multiplication result; and the first conversion subcircuit 221 and the second conversion subcircuit 222 are each used to perform number conversion on the multiplication result to obtain a target operation result.

Optionally, the regular signed number encoding subcircuit 211 includes a regular signed number encoding unit 2111 and a partial product acquisition unit 2112. The regular signed number encoding unit 2111 is used to receive first data and perform the regular signed number encoding on the first data to obtain the target code. The partial product acquisition unit 2112 is used to receive second data, obtain the original partial products from the target code and the second data, and obtain the partial products of the target code from the original partial products.

Specifically, the regular signed number encoding subcircuit 211 can perform regular signed number encoding on the received data. The data can be the multiplier and the multiplicand of the multiplication operation, and the multiplier and the multiplicand can be fixed-point numbers of the same bit width. Optionally, the regular signed number encoding subcircuit 211 can include multiple data processing subcircuits with different functions. Each of these data processing subcircuits can have one or more input ports, and the function of each input port can differ; each can also have an output port, and the function of each output port can differ; and the circuit structures of data processing subcircuits with different functions can differ. Optionally, the conversion circuit 22 can convert the multiplication result output by the accumulation subcircuit 212 into data of a target format, i.e. the target operation result. The multiplication result can be a fixed-point number, and the data of the target format can be a fixed-point number or a floating-point number. In addition, the data bit width of the target format can be less than twice the bit width of the multiplication result. Optionally, the target operation result can be part of the data of the multiplication result. Optionally, the bit width of the target operation result can be equal to 1/2 of the bit width of the multiplication result, or equal to 1/4 of it; this embodiment places no limitation on this. In this embodiment, it can also be understood that the bit width of the target operation result is less than twice the bit width of the multiplication result. In addition, the multiplication result obtained by the accumulation subcircuit 212 is not the target operation result of the multiplication implemented by the multiplier, but only part of the data of the target operation result. Optionally, the number of target codes can be equal to the number of target-code partial products, and the target code can contain three values: -1, 0 and 1.

It should be noted that the regular signed number encoding subcircuit 211 can perform multiplication processing on data of a fixed bit width, and the bit width of the data it receives can be equal to the bit width of the multiplier input port. In addition, in this embodiment, the bit width of the multiplier output port can be less than twice the bit width of the input port.

Optionally, the conversion circuit 22 includes an input port for receiving a data conversion signal. Optionally, the data conversion signal is used to determine the type of data conversion performed by the conversion circuit 22.

Optionally, there can be several data conversion signals, and for each data conversion signal the conversion circuit 22 can convert the received data into data of the corresponding target format. Optionally, the data conversion types can include fixed-point to fixed-point conversion and fixed-point to floating-point conversion. Exemplarily, if the input port and the output port of the multiplier are both N bits wide, the multiplier can obtain a 2N-bit multiplication result and, through the conversion circuit 22, convert this 2N-bit multiplication result into an N-bit target operation result, which can be a floating-point number. Alternatively, the multiplier can, through the conversion circuit 22, convert the 2N-bit multiplication result into an N-bit fixed-point number as the target operation result. In this embodiment, the circuit structure and function of the regular signed number encoding subcircuit 211 are the same as those of the regular signed number encoding subcircuit 111, so the specific structure of the regular signed number encoding subcircuit 211 is not described again here.

The multiplier provided by this embodiment can use the regular signed number encoding subcircuit to perform regular signed number encoding on the received data, reducing the number of effective partial products obtained during the multiplication and thus the complexity of implementing the multiplication. At the same time, the multiplier can improve the efficiency of the multiplication operation and effectively reduce the power consumption of the multiplier.

As one embodiment, the regular signed number encoding unit 1111 may include a first data input port 1111a and a target code output port 1111b. The first data input port 1111a is used to receive the first data on which regular signed number encoding is to be performed, and the target code output port 1111b is used to output the target code obtained by performing regular signed number encoding on the first data.

Specifically, the first data received at the first data input port 1111a of the regular signed number encoding unit 1111 may be the multiplier of the multiplication operation, and the multiplier may be a fixed-point number. Optionally, the second data received by the partial product acquisition unit 1112 may be the multiplicand of the multiplication operation, the multiplicand may be a fixed-point number, and the multiplier and the multiplicand may have the same bit width. Optionally, the number of target codes may be equal to the number of original partial products and to the number of target-code partial products.

It should be noted that the regular signed number encoding method can be characterized as follows. For an N-bit multiplier, processing proceeds from the low-order bits to the high-order bits. If there is a run of l (l>=2) consecutive 1 bits, the run can be converted into the data "1(0)^(l-1)(-1)", i.e. a 1 followed by l-1 zeros and then a -1, and the remaining (N-l) bits are combined with the converted (l+1) digits to obtain new data. This conversion preserves the value, because a run of l ones starting at bit k equals 2^(k+l) - 2^k. The new data is then used as the initial data for the next level of conversion, until the new data obtained after conversion contains no run of l (l>=2) consecutive 1s. When regular signed number encoding is performed on an N-bit multiplier, the bit width of the resulting target code can be equal to (N+1). Further, during regular signed number encoding, the data 11 can be written as (100-001), so 11 can be equivalently converted into 10(-1); the data 111 can be written as (1000-0001), so 111 can be equivalently converted into 100(-1); and so on. Other runs of l (l>=2) consecutive 1s are converted in the same way.

For example, suppose the multiplier received by the regular signed number encoding unit 1111 is "001010101101110". The first-level conversion of this multiplier gives the first new data "0010101011100(-1)0"; the second-level conversion of the first new data gives the second new data "0010101100(-1)00(-1)0"; the third-level conversion of the second new data gives the third new data "0010110(-1)00(-1)00(-1)0"; the fourth-level conversion of the third new data gives the fourth new data "00110(-1)0(-1)00(-1)00(-1)0"; and the fifth-level conversion of the fourth new data gives the fifth new data "010(-1)0(-1)0(-1)00(-1)00(-1)0". The fifth new data contains no run of l (l>=2) consecutive 1s, so it can be called the intermediate code; after one bit-padding step is applied to the intermediate code, the regular signed number encoding is complete. The bit width of the intermediate code can be equal to the bit width of the multiplier. Optionally, if the highest and second-highest digits of the new data (i.e. the intermediate code) obtained after the regular signed number encoding unit 1111 encodes the multiplier are "10" or "01", the regular signed number encoding unit 1111 can pad one digit 0 above the highest digit of the intermediate code, so that the top three digits of the corresponding target code are "010" or "001" respectively. Optionally, the bit width of the intermediate code can be equal to the bit width of the target code minus 1.
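The level-by-level conversion above can be sketched in Python. This is a software model under stated assumptions rather than the encoding unit's actual logic: the multiplier is treated as a non-negative N-bit value, the function name `signed_digit_encode` is hypothetical, and a single low-to-high scan is used in place of repeated passes (each replaced run is followed immediately by a check for a new run starting at the carry digit, which gives the same digits as the five levels in the example). The procedure appears to correspond to what is commonly called canonical signed digit (CSD) recoding.

```python
def signed_digit_encode(multiplier: int, n_bits: int) -> list[int]:
    """Return n_bits + 1 digits in {-1, 0, 1}, most significant first."""
    if not 0 <= multiplier < (1 << n_bits):
        raise ValueError("multiplier must fit in n_bits as a non-negative value")
    # Digits are held least significant first, with one padded 0 on top.
    digits = [(multiplier >> i) & 1 for i in range(n_bits)] + [0]
    i = 0
    while i < n_bits:
        if digits[i] == 1 and digits[i + 1] == 1:
            j = i
            while digits[j] == 1:      # find the end of the run of 1s
                j += 1
            digits[i] = -1             # lowest digit of the run becomes -1
            for k in range(i + 1, j):  # the rest of the run becomes 0
                digits[k] = 0
            digits[j] = 1              # carry into the digit above the run
            i = j                      # the carry may start a new run
        else:
            i += 1
    return digits[::-1]


code = signed_digit_encode(0b001010101101110, 15)
print("".join("(-1)" if d == -1 else str(d) for d in code))
```

For the 15-bit multiplier of the example, the printed string is the fifth new data with one padded 0 in front, i.e. a 16-digit target code.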

It should be noted that the regular signed number encoding unit 1111 can output the target code through the target code output port 1111b. Optionally, the bit width of the target code can be equal to the bit width of the data received by the regular signed number encoding unit 1111, and the target code can contain three values: -1, 0 and 1. It can also be understood that the number of digits contained in the target code can be equal to the bit width of the target code.

In the multiplier provided by this embodiment, the regular signed number encoding unit in the multiplication circuit performs regular signed number encoding on the received data to obtain the target code; the partial product acquisition unit obtains the original partial products from each target code and obtains the partial products of the target code from the original partial products; and the accumulation subcircuit accumulates the partial products of the target code to obtain the multiplication result. The state control circuit obtains the storage indication signal and the read indication signal; the register control circuit determines, according to the storage indication signal, the register circuit in which the multiplication result is stored, and the register circuit stores it; and the selection circuit reads, according to the read indication signal, data from the multiplication result stored in the register circuit to obtain the target operation result. Because the multiplier uses the regular signed number encoding unit to encode the received data, it reduces the number of effective partial products obtained during the multiplication and thus the complexity of implementing the multiplication. At the same time, the multiplier can improve the efficiency of the multiplication operation and effectively reduce the power consumption of the multiplier.

As one embodiment, the partial product acquisition unit 1112 is specifically used to convert the target code to obtain the original partial products, perform sign extension on the original partial products to obtain the sign-extended partial products, and obtain the partial products of the target code from the sign-extended partial products.

Specifically, the conversion can be characterized as converting the digits of the target code into original partial products based on the multiplicand (i.e. X) of the multiplication operation. Optionally, each digit of the target code has a corresponding original partial product: if the digit is -1, the corresponding original partial product can be -X; if the digit is 1, it can be X; and if the digit is 0, it can be 0. Optionally, the original partial product can be a partial product that has not been sign-extended, and its bit width can be the same as the bit width of the data currently processed by the multiplication circuit 11. Optionally, the bit width of the sign-extended partial product can be twice the data bit width N processed by the multiplier, in which case the bit width of the original partial product can be equal to N. Optionally, the low N bits of the sign-extended partial product can be equal to the N bits of the original partial product, and each of the high N bits of the sign-extended partial product can be equal to the highest bit of the original partial product, i.e. its sign bit.
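The digit-to-partial-product mapping can be illustrated with a short Python sketch. It assumes that -X is formed by two's complement negation within N bits, which the text does not state explicitly, and that the digits are supplied least significant first; the name `original_partial_products` is hypothetical. Note that negating the most negative N-bit value would overflow, which this sketch does not handle.

```python
def original_partial_products(digits_lsb_first: list[int], x: int, n_bits: int) -> list[int]:
    """Map each target-code digit (-1, 0, 1) to the N-bit pattern of -X, 0 or X."""
    mask = (1 << n_bits) - 1
    # Masking a Python integer yields its two's complement bit pattern.
    return [(d * x) & mask for d in digits_lsb_first]


# Digits of 7 = 8 - 1, least significant first, applied to X = 5 with N = 4.
rows = original_partial_products([-1, 0, 0, 1, 0], 5, 4)
print([format(r, "04b") for r in rows])
```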

In addition, the partial product acquisition unit 1112 can obtain the partial products of the target code from all of the sign-extended partial products. In the distribution of all target-code partial products, the first target-code partial product can be equal to the first sign-extended partial product. From the second target-code partial product onward, the highest bit of each target-code partial product can lie in the same column as the highest bit of the first target-code partial product, and the bit width of each target-code partial product can be equal to the bit width of the previous target-code partial product minus 1. Equivalently, it can be equal to the bit width 2N of the corresponding sign-extended partial product minus (i-1), where i is the index of the target-code partial product, counted from 1. The distribution of the 9 target-code partial products obtained is shown in FIG. 3.
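A minimal Python sketch of this layout follows, assuming the rows are given as 2N-bit sign-extended patterns with the first partial product first. The function names are hypothetical. Shifting row i left by i-1 positions and discarding everything above column 2N-1 leaves 2N-(i-1) bits of the original row, which matches the widths described above, and the sum of the aligned rows modulo 2^(2N) is the 2N-bit product.

```python
def align_partial_products(extended_rows: list[int], n_bits: int) -> list[int]:
    """Shift row i left by i positions (0-based) and drop bits above column 2N-1."""
    mask = (1 << (2 * n_bits)) - 1
    return [(row << i) & mask for i, row in enumerate(extended_rows)]


def accumulate(aligned_rows: list[int], n_bits: int) -> int:
    """Add the aligned rows, keeping only the low 2N bits."""
    return sum(aligned_rows) & ((1 << (2 * n_bits)) - 1)


# 7 * 5 with N = 4: digits of 7 are (-1, 0, 0, 1, 0) from the low end,
# so the sign-extended 8-bit rows are -5, 0, 0, 5, 0.
rows = [0xFB, 0x00, 0x00, 0x05, 0x00]
aligned = align_partial_products(rows, 4)
print(accumulate(aligned, 4))
```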

Optionally, the partial product acquisition unit 1112 includes a target code input port 1112a, a second data input port 1112b and a partial product output port 1112c. The target code input port 1112a is used to receive the target code, the second data input port 1112b is used to receive the second data, and the partial product output port 1112c is used to output the partial products of the target code.

In this embodiment, the partial product acquisition unit 1112 can receive the target code obtained by the regular signed number encoding unit 1111 through the target code input port 1112a, receive the second data through the second data input port 1112b, perform conversion processing and shift processing according to the target code and the second data to obtain the partial products of the target code, and output the partial products of the target code through the partial product output port 1112c.

The multiplier provided by this embodiment obtains a smaller number of effective partial products, which reduces the complexity of implementing the multiplication. At the same time, the multiplier can improve the efficiency of the multiplication operation and effectively reduce the power consumption of the multiplier.

Another embodiment provides a multiplier that includes the accumulation subcircuit 112. The accumulation subcircuit 112 includes a Wallace tree group unit 1121 and an accumulation unit 1122, and the output end of the Wallace tree group unit 1121 is connected to the input end of the accumulation unit 1122. The Wallace tree group unit 1121 is used to accumulate the partial products of the target code to obtain an accumulation result, and the accumulation unit 1122 is used to accumulate that accumulation result.

Specifically, the Wallace tree group unit 1121 can accumulate the values in all of the target-code partial products obtained by the partial product acquisition unit 1112 to obtain an accumulation result, and the accumulation unit 1122 can accumulate the accumulation result obtained by the Wallace tree group unit 1121 to obtain the target operation result.
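The two-stage accumulation can be modeled in Python with carry-save reduction. This is a generic sketch of the technique, not the exact tree of the embodiment: rows are compressed three at a time into a sum row and a carry row until two rows remain (standing in for the Wallace tree group unit), and one ordinary addition then produces the result (standing in for the accumulation unit). The function name `carry_save_reduce` and the row grouping are assumptions.

```python
def carry_save_reduce(rows: list[int], width: int) -> tuple[int, int]:
    """Compress any number of rows to two rows whose sum mod 2**width is unchanged."""
    mask = (1 << width) - 1
    rows = [r & mask for r in rows]
    while len(rows) > 2:
        reduced = []
        groups = len(rows) // 3
        for g in range(groups):
            a, b, c = rows[3 * g: 3 * g + 3]
            # Bitwise full adder: one sum bit and one carry bit per column.
            sum_row = a ^ b ^ c
            carry_row = (((a & b) | (a & c) | (b & c)) << 1) & mask
            reduced += [sum_row, carry_row]
        reduced += rows[3 * groups:]  # leftover rows pass through unchanged
        rows = reduced
    while len(rows) < 2:
        rows.append(0)
    return rows[0], rows[1]


# Aligned 8-bit rows for 7 * 5 (see the 4-bit example with X = 5).
sum_row, carry_row = carry_save_reduce([0xFB, 0x00, 0x00, 0x28, 0x00], 8)
print((sum_row + carry_row) & 0xFF)  # final addition by the accumulation stage
```

The design point is that no carry propagates across columns during the compression; carry propagation happens only once, in the final addition.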

In the multiplier provided by this embodiment, the Wallace tree group unit accumulates the partial products of the target code and the accumulation unit accumulates the accumulation result to obtain the multiplication result, from which the target operation result is obtained. This keeps the number of effective partial products obtained by the multiplier small and reduces the complexity of implementing the multiplication. At the same time, the multiplier can improve the efficiency of the multiplication operation and effectively reduce the power consumption of the multiplier.

In another embodiment, the Wallace tree group unit 1121 in the multiplier includes Wallace tree sub-units 1121_1 to 1121_n, and these Wallace tree sub-units 1121_1 to 1121_n are used to accumulate each column of values in the partial products of all target codes.

Specifically, the circuit structure of the Wallace tree sub-units 1121_1 to 1121_n may be implemented by a combination of full adders and half adders. A Wallace tree sub-unit can also be understood as a circuit that processes a multi-bit input signal and adds the input bits to obtain a two-bit output signal. Optionally, the number n of Wallace tree sub-units in the Wallace tree group unit 1121 may be equal to twice the bit width of the data currently processed by the multiplication circuit 11, and the n Wallace tree sub-units may process the partial products of the target code in parallel, although they may be connected in series. Optionally, each Wallace tree sub-unit in the Wallace tree group unit 1121 may add the values of one column of the partial products of all target codes, and each Wallace tree sub-unit may output two signals: a carry signal Carry_i and a sum signal Sum_i, where i is the number of the Wallace tree sub-unit and the first Wallace tree sub-unit is numbered 1. Optionally, the number of input signals received by each Wallace tree sub-unit may be equal to the number of target codes, or to the number of sign-extended partial products.

In addition, the signals of each Wallace tree sub-unit in the Wallace tree group unit 1121 may include a carry input signal Cin_i, partial product input signals, and a carry output signal Cout_i. Optionally, the partial product input signals received by each Wallace tree sub-unit may be the values of one column of the partial products of all target codes, and the number of bits of the carry signal Cout_i output by each Wallace tree sub-unit may be N_Cout = floor((N_I + N_Cin)/2) - 1, where N_I is the number of partial product value inputs of the Wallace tree sub-unit, N_Cin is the number of its carry input signals, N_Cout is the minimum number of its carry output signals, and floor(·) is the round-down function. Optionally, the carry input signal received by each Wallace tree sub-unit in the Wallace tree group unit 1121 may be the carry output signal of the previous Wallace tree sub-unit, and the carry input signal received by the first Wallace tree sub-unit may be 0. The number of carry signal input ports of the first Wallace tree sub-unit may be the same as that of the other Wallace tree sub-units.
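The carry-count expression above can be checked with a small behavioural model. The Python below is an illustrative sketch only, not the patent's circuit: the function names (`full_adder`, `half_adder`, `compress_column`) and the order in which bits are fed to the adders are assumptions. It reduces one column of input bits (partial product bits plus carry inputs) with full adders and at most one half adder, and returns the sum bit, the final carry bit, and the intermediate carries that would be passed on to the next sub-unit.

```python
def full_adder(a, b, c):
    """3:2 compressor: returns (sum, carry) with a + b + c == sum + 2 * carry."""
    return a ^ b ^ c, (a & b) | (a & c) | (b & c)


def half_adder(a, b):
    """2:2 compressor: returns (sum, carry) with a + b == sum + 2 * carry."""
    return a ^ b, a & b


def compress_column(bits):
    """Reduce one column of 0/1 values (hypothetical model of one sub-unit).

    Returns (sum_bit, carry_bit, couts):
      sum_bit   - weight-1 output of the column (Sum_i)
      carry_bit - weight-2 output produced by the last adder (Carry_i)
      couts     - remaining weight-2 carries, forwarded to the next column
    """
    pending = list(bits)
    carries = []
    # Full adders: three bits in, one bit stays in this column, one carry out.
    while len(pending) >= 3:
        s, c = full_adder(pending.pop(), pending.pop(), pending.pop())
        pending.append(s)
        carries.append(c)
    # If two bits remain, one half adder finishes the column.
    if len(pending) == 2:
        s, c = half_adder(pending.pop(), pending.pop())
        pending.append(s)
        carries.append(c)
    sum_bit = pending[0] if pending else 0
    carry_bit = carries[-1] if carries else 0
    return sum_bit, carry_bit, carries[:-1]
```

For a column with k = N_I + N_Cin input bits (k at least 2), this model produces floor(k/2) - 1 forwarded carries, which agrees with the expression for N_Cout, and the column value is preserved: the sum of the input bits equals `sum_bit + 2 * (carry_bit + sum(couts))`.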

For example, if the multiplication circuit 11 currently processes an 8-bit * 8-bit multiplication, each sign-extended partial product obtained by the partial product acquisition unit 1112 is "p_i9 p_i9 p_i9 p_i9 p_i9 p_i9 p_i9 p_i9 p_i8 p_i7 p_i6 p_i5 p_i4 p_i3 p_i2 p_i1" (i = 1, …, n = 9), where i denotes the i-th sign-extended partial product. Nine target-code partial products are obtained from the nine sign-extended partial products and are then accumulated. Optionally, the distribution of the nine target-code partial products is shown in Figure 3, where each dot represents one bit of a sign-extended partial product. The first target-code partial product may be the first sign-extended partial product, and each target-code partial product has a corresponding sign-extended partial product. Starting from the second target-code partial product, the corresponding sign-extended partial product may be shifted left by one bit relative to the previous target-code partial product, while the most significant bit of every target-code partial product lies in the same column as the most significant bit of the first target-code partial product. In other words, from the second target-code partial product onward, the higher-order bits pushed beyond that column by the left shift do not take part in the addition.

Counting from the rightmost column to the leftmost column, a total of 16 Wallace tree sub-units are required to accumulate the nine target-code partial products. The connection diagram of the 16 Wallace tree sub-units is shown in Figure 4, where Wallace_i denotes a Wallace tree sub-unit and i is its number, starting from 1. A solid line between two Wallace tree sub-units indicates that the higher-numbered Wallace tree sub-unit has a carry output signal, and a dashed line indicates that the higher-numbered Wallace tree sub-unit has no carry output signal.
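The arrangement described for the 8-bit * 8-bit case can be sketched as follows. This is a hedged illustration, not the patent's circuit: the function names `align_rows` and `column_bits`, the integer representation of the rows, and the use of digits in {-1, 0, 1} to form the partial products are assumptions made for the example. Each 9-bit partial product is sign-extended to 16 bits, row k is shifted left by k bits, and bits pushed past column 16 are discarded.

```python
WIDTH = 16  # 2 * 8 columns for an 8-bit * 8-bit multiplication


def align_rows(partial_products, width=WIDTH):
    """Sign-extend each signed partial product to `width` bits, shift row k
    left by k bits, and drop the bits that fall beyond the top column."""
    mask = (1 << width) - 1
    rows = []
    for k, p in enumerate(partial_products):
        extended = p & mask                   # two's-complement sign extension
        rows.append((extended << k) & mask)   # shifted-out high bits are not added
    return rows


def column_bits(rows, width=WIDTH):
    """Return, for each column (index 0 = rightmost), the bits fed to the
    Wallace tree sub-unit handling that column."""
    return [[(r >> c) & 1 for r in rows] for c in range(width)]
```

As a usage example under the same assumptions, take the multiplicand x = -3 and the signed digits `[-1, 0, 0, 1, 0, 0, 0, 0, 0]` (least significant first, representing 8 - 1 = 7). The rows `align_rows([d * x for d in digits])` then sum, modulo 2^16, to the 16-bit two's-complement pattern of -21, and `column_bits` yields 16 columns of 9 bits each.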

The multiplier provided by this embodiment obtains a small number of effective partial products, which reduces the complexity of the multiplication operation. The multiplier can also improve the efficiency of the multiplication operation and effectively reduce its own power consumption.

In one embodiment, the accumulation unit 1122 in the multiplier includes an adder, and the adder is used to perform an addition operation on the received accumulation operation result.

Specifically, the adder may be an adder of any of several bit widths, and may be a carry-lookahead adder. Optionally, the adder may receive the two signals output by the Wallace tree group unit 1121, add the two signals, and output the multiplication result.

Optionally, the adder includes a carry signal input port, a sum signal input port, and a result output port. The carry signal input port is used to receive the carry signal, the sum signal input port is used to receive the sum signal, and the result output port is used to output the result of accumulating the carry signal and the sum signal.

Specifically, the adder may receive the carry signal Carry output by the Wallace tree group unit 1121 through the carry signal input port, receive the sum signal Sum output by the Wallace tree group unit 1121 through the sum signal input port, and output the result of accumulating Carry and Sum through the result output port.

It should be noted that, during a multiplication operation, the multiplication circuit 11 may use adders of different bit widths to add the carry output signal Carry and the sum output signal Sum output by the Wallace tree group unit 1121, where the bit width of the data the adder can process may be equal to twice the bit width N of the data currently processed by the multiplier. Optionally, each Wallace tree sub-unit in the Wallace tree group unit 1121 may output a carry output signal Carry_i and a sum output signal Sum_i (i = 0, …, 2N-1, where i is the number of the Wallace tree sub-unit and numbering starts from 0). Optionally, the carry signal received by the adder is Carry = {[Carry_0 : Carry_(2N-2)], 0}. That is, the carry output signal Carry received by the adder is 2N bits wide: its first 2N-1 bits correspond to the carry output signals of the first 2N-1 Wallace tree sub-units in the Wallace tree group unit 1121, and its last bit may be replaced by the value 0. Optionally, the sum output signal Sum received by the adder is 2N bits wide, and its bits may be equal to the sum output signals of the individual Wallace tree sub-units in the Wallace tree group unit 1121.

For example, if the multiplication circuit 11 currently processes an 8-bit * 8-bit multiplication, the adder may be a 16-bit carry-lookahead adder. Continuing with Figure 4, the Wallace tree group unit 1121 may output the sum output signals Sum and the carry output signals Carry of the 16 Wallace tree sub-units. The sum signal received by the 16-bit carry-lookahead adder may be the complete sum signal Sum output by the Wallace tree group unit 1121, whereas the carry signal it receives may be the signal Carry formed by combining the value 0 with all of the carry output signals of the Wallace tree group unit 1121 except the one output by the last Wallace tree sub-unit.
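The way the adder's two operands are assembled can be sketched in a few lines. This is an illustrative model under stated assumptions: the function name `assemble_and_add` and the list representation (index 0 is the lowest-numbered sub-unit, taken here as the least significant column) are not from the source. It reads the concatenation {[Carry_0 : Carry_(2N-2)], 0} as placing a 0 in the lowest bit and dropping the last sub-unit's carry, which is equivalent to shifting the carry bits up by one column.

```python
def assemble_and_add(sum_bits, carry_bits):
    """Form the 2N-bit Sum and Carry words from per-column outputs and add them.

    sum_bits[i] and carry_bits[i] are the outputs of sub-unit i, i = 0..2N-1.
    The carry of the last sub-unit is discarded and a 0 fills the lowest bit.
    """
    width = len(sum_bits)
    if len(carry_bits) != width:
        raise ValueError("expected one carry bit per sum bit")
    mask = (1 << width) - 1
    sum_word = sum(bit << i for i, bit in enumerate(sum_bits))
    # Carry_i has twice the weight of Sum_i, so it lands in column i + 1.
    carry_word = sum(bit << (i + 1) for i, bit in enumerate(carry_bits[:-1]))
    return (sum_word + carry_word) & mask
```

In hardware the final addition would be performed by the 2N-bit adder (for example a carry-lookahead adder); the Python `+` here only models its result.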

The multiplier provided by this embodiment obtains a small number of effective partial products, which reduces the complexity of the multiplication operation, improves its efficiency, and effectively reduces the power consumption of the multiplier.

In one embodiment, the register circuit 13 of the multiplier includes register sub-circuits 131, which are used to store the multiplication results corresponding to different storage indication signals.

Specifically, the register circuit 13 may include two or more register sub-circuits 131. The number of register sub-circuits 131 in the register circuit 13 may be equal to 2N_in/N_out, where N_in is the bit width of the data received by the multiplier and N_out (N_out < 2N_in) is the bit width of the data output by the multiplier. Optionally, the bit width of the data stored in a register sub-circuit 131 may be equal to twice the bit width of the multiplier input port. Optionally, the bit width of the data received by the multiplier may be equal to the bit width of the multiplier input port, and the bit width of the data output by the multiplier may be equal to the bit width of the multiplier input port, or may be any other width less than twice the bit width of the input port. For example, if the input port and the output port of the multiplier are both N bits wide, the register circuit 13 needs to be composed of two register sub-circuits 131; if the input port is N bits wide and the output port is N/2 bits wide, the register circuit 13 needs to be composed of four register sub-circuits 131. Optionally, the multiplier may store the multiplication result of each multiplication operation into the corresponding one of the 2N_in/N_out register sub-circuits 131 according to the storage indication signal, where different storage indication signals correspond to different register sub-circuits 131. Optionally, each multiplication result obtained by the multiplier can only be stored in the register sub-circuit 131 that corresponds to the storage indication signal, and cannot be stored in any other register sub-circuit 131.

For example, if the register circuit 13 contains n register sub-circuits 131, numbered 1, 2, 3, ..., n, the first multiplication result obtained by the multiplier may be stored in register sub-circuit 131 No. 1, in which case the value of the storage indication signal may be 1, and the second multiplication result may be stored in register sub-circuit 131 No. 2, in which case the value of the storage indication signal may be 2. Equivalently, when the value of the storage indication signal is odd, the number of the register sub-circuit 131 that stores the multiplication result is also odd, and when the value is even, that number is also even. The value of the storage indication signal may be equal to the number of the register sub-circuit 131 that stores the corresponding multiplication result.
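The sizing rule and the storage mapping described above can be modelled as a short sketch. The names `num_register_subcircuits` and `RegisterBank` are hypothetical, and the requirement that 2N_in be an exact multiple of N_out is an assumption added so that the count is an integer; the source only states N_out < 2N_in.

```python
def num_register_subcircuits(n_in, n_out):
    """Number of register sub-circuits: 2 * N_in / N_out, with N_out < 2 * N_in."""
    if n_out <= 0 or n_out >= 2 * n_in:
        raise ValueError("requires 0 < N_out < 2 * N_in")
    if (2 * n_in) % n_out != 0:
        # Assumption for this sketch: the ratio must be a whole number.
        raise ValueError("2 * N_in must be a multiple of N_out")
    return (2 * n_in) // n_out


class RegisterBank:
    """Behavioural model: one slot per register sub-circuit, numbered from 1."""

    def __init__(self, n_in, n_out):
        self.count = num_register_subcircuits(n_in, n_out)
        self.slots = {k: None for k in range(1, self.count + 1)}

    def store(self, store_signal, product):
        """Store a product in the slot whose number equals the storage signal."""
        if store_signal not in self.slots:
            raise ValueError("no register sub-circuit for this storage signal")
        self.slots[store_signal] = product
```

With N-bit input and output ports the model gives two sub-circuits, and with an N/2-bit output port it gives four, matching the two cases in the text.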

In the multiplier provided by this embodiment, the multiplication result of each multiplication operation is stored in a different register sub-circuit according to the storage indication signal, and the data in the multiplication result held by the corresponding register sub-circuit is then output according to the read indication signal. This allows the target operation result to be output by a multiplier whose output port bit width is not twice its input port bit width. The multiplier also obtains a small number of effective partial products, which reduces the complexity of the multiplication operation.

Another embodiment provides a multiplier in which the accumulation sub-circuit 212 includes a Wallace tree group unit 2121 and an accumulation unit 2122. The output end of the Wallace tree group unit 2121 is connected to the input end of the accumulation unit 2122. The Wallace tree group unit 2121 is used to accumulate the partial products of the target code to obtain an accumulation operation result, and the accumulation unit 2122 is used to accumulate that accumulation operation result to obtain the target operation result.

Specifically, the Wallace tree group unit 2121 may accumulate the values in all of the target-code partial products obtained by the partial product acquisition unit 2112 to obtain an accumulation operation result, and the accumulation unit 2122 may then accumulate the accumulation operation result obtained by the Wallace tree group unit 2121 to obtain the target operation result.

Optionally, the Wallace tree group unit 2121 of the multiplier includes Wallace tree sub-units 2121_1 to 2121_n, and these Wallace tree sub-units 2121_1 to 2121_n are used to accumulate each column of values in the partial products of all target codes.

In this embodiment, the circuit structure and function of the Wallace tree group unit 2121 may be the same as those of the Wallace tree group unit 1121, so the specific structure of the Wallace tree group unit 2121 is not described again here.

In the multiplier provided by this embodiment, the Wallace tree group unit accumulates the partial products of the target code, and the accumulation unit accumulates the result to obtain the multiplication result, from which the target operation result is obtained. The multiplier therefore obtains a small number of effective partial products, which reduces the complexity of the multiplication operation. The multiplier can also improve the efficiency of the multiplication operation and effectively reduce its own power consumption.

In one embodiment, the accumulation unit 2122 of the multiplier includes an adder, and the adder is used to perform an addition operation on the accumulation operation result.

Specifically, the adder may be an adder of any of several bit widths, and may be a carry-lookahead adder. Optionally, the adder may receive the two signals output by the Wallace tree group unit 2121, add the two signals, and output the multiplication result.

In the multiplier provided by this embodiment, the accumulation unit accumulates the two signals output by the Wallace tree group unit and outputs the multiplication result, from which the target operation result is obtained. The multiplier therefore obtains a small number of effective partial products, which reduces the complexity of the multiplication operation. The multiplier can also improve the efficiency of the multiplication operation and effectively reduce its own power consumption.

In one embodiment, the adder of the multiplier includes a carry signal input port, a sum signal input port, and a result output port. The carry signal input port is used to receive the carry signal, the sum signal input port is used to receive the sum signal, and the result output port is used to output the multiplication result obtained by accumulating the carry signal and the sum signal.

Specifically, the adder may receive the carry signal Carry output by the Wallace tree group unit 2121 through the carry signal input port, receive the sum signal Sum output by the Wallace tree group unit 2121 through the sum signal input port, and output the multiplication result obtained by accumulating Carry and Sum through the result output port.

It should be noted that, during a multiplication operation, the multiplication circuit 21 may use adders of different bit widths to add the carry output signal Carry and the sum output signal Sum output by the Wallace tree group unit 2121, where the bit width of the data the adder can process may be equal to twice the bit width N of the data currently processed by the multiplier. Optionally, each Wallace tree sub-unit in the Wallace tree group unit 2121 may output a carry output signal Carry_i and a sum output signal Sum_i (i = 0, …, 2N-1, where i is the number of the Wallace tree sub-unit and numbering starts from 0). Optionally, the carry signal received by the adder is Carry = {[Carry_0 : Carry_(2N-2)], 0}. That is, the carry output signal Carry received by the adder is 2N bits wide: its first 2N-1 bits correspond to the carry output signals of the first 2N-1 Wallace tree sub-units in the Wallace tree group unit 2121, and its last bit may be replaced by 0. Optionally, the sum output signal Sum received by the adder is 2N bits wide, and its bits may be equal to the sum output signals of the individual Wallace tree sub-units in the Wallace tree group unit 2121.

For example, if the multiplication circuit 21 currently processes an 8-bit * 8-bit multiplication, the adder may be a 16-bit carry-lookahead adder. Continuing with Figure 4, the Wallace tree group unit 2121 may output the sum output signals Sum and the carry output signals Carry of the 16 Wallace tree sub-units. The sum signal received by the 16-bit carry-lookahead adder may be the complete sum signal Sum output by the Wallace tree group unit 2121, whereas the carry signal it receives may be the signal Carry formed by combining 0 with all of the carry output signals of the Wallace tree group unit 2121 except the one output by the last Wallace tree sub-unit.

In the multiplier provided by this embodiment, the accumulation unit accumulates the two signals output by the Wallace tree group unit and outputs the multiplication result, from which the target operation result is obtained. The multiplier therefore obtains a small number of effective partial products, which reduces the complexity of the multiplication operation. The multiplier can also improve the efficiency of the multiplication operation and effectively reduce its own power consumption.

Another embodiment provides a multiplier that includes the first conversion sub-circuit 221 and the second conversion sub-circuit 222. The first conversion sub-circuit 221 is specifically used to convert the multiplication result into a floating-point target operation result, and the second conversion sub-circuit 222 is specifically used to convert the multiplication result into a fixed-point target operation result.

Specifically, the bit width of the multiplication result may be equal to twice the bit width of the data received by the multiplier. The bit width of the floating-point operation result and the bit width of the fixed-point operation result may both be equal to the bit width of the multiplier output port, and in the number conversion circuit 22 the bit width of the floating-point operation result may be equal to that of the fixed-point operation result.

It should be noted that, in the number conversion circuit 22, the first conversion sub-circuit 221 and the second conversion sub-circuit 222 are not connected to each other and are mutually independent. For each multiplication operation, the number conversion circuit 22 only needs to use either the first conversion sub-circuit 221 or the second conversion sub-circuit 222 to convert the data and obtain the target operation result. Optionally, the number conversion circuit 22 may determine, according to the received data conversion signal, whether the current multiplication operation requires the first conversion sub-circuit 221 or the second conversion sub-circuit 222.

Optionally, the data conversion signal may take two values, which may be represented in binary as 00 and 01. A data conversion signal of 00 may indicate that the data received by the number conversion circuit 22 is a fixed-point number 2N bits wide that needs to be converted into a fixed-point number N bits wide, together with the position of the decimal point of the converted fixed-point number; the position of the decimal point of the 2N-bit fixed-point number before conversion may be known. A data conversion signal of 01 may indicate that the multiplication result received by the number conversion circuit 22 is a fixed-point number 2N bits wide that needs to be converted into a floating-point number N bits wide. Optionally, the number conversion circuit 22 may, according to which of the two data conversion signals it receives, apply a different conversion to the received multiplication result through the first conversion sub-circuit 221 or the second conversion sub-circuit 222, as follows:

(1) If the data conversion signal received by the number conversion circuit 22 is 00, the number conversion circuit 22 may convert the 2N-bit fixed-point number into an N-bit fixed-point number. In this case, the number conversion circuit 22 may convert the received 2N-bit fixed-point number through the second conversion sub-circuit 222. Specifically, during conversion, the position of the decimal point of the target N-bit fixed-point number is aligned with the position of the decimal point of the 2N-bit fixed-point number before conversion, and a total of N bits around the decimal point of the 2N-bit fixed-point number are then extracted to obtain the converted N-bit fixed-point number. The extraction falls into three cases:

Case a: when all N bits to be extracted lie within the 2N-bit fixed-point number before conversion, the second conversion sub-circuit 222 may directly extract the N bits around the decimal point of the 2N-bit fixed-point number;

Case b: when some of the N bits to be extracted lie within the 2N-bit fixed-point number before conversion, but the high-order part of the N bits has no corresponding bits in the 2N-bit fixed-point number, the second conversion sub-circuit 222 may fill each of those high-order bits with the sign bit of the 2N-bit fixed-point number and then extract the N bits from the padded fixed-point number;

情况c,当即将截取的N位数值中的一部分数值包含在转换前2N比特位宽的定点数内,而需要截取的N位数值中的低位部分数值,在转换前2N比特位宽的定点数内没有对应的部分数值可截取,则第二转换子电路222可以根据转换前2N比特位宽的定点数的正负,对这部分每位数值进行补位,若转换前2N比特位宽的定点数为正数,这部分每位数值可以用数值0补位,否则用数值1补位,然后从补位后的定点数中截取N位数值;Case c, when a part of the N-bit value to be intercepted is contained in the fixed-point number with a width of 2N bits before conversion, and the low-order part of the N-bit value to be intercepted has no corresponding part of the value to be intercepted in the fixed-point number with a width of 2N bits before conversion, the second conversion subcircuit 222 can fill each bit of this part of the value according to the positive or negative of the fixed-point number with a width of 2N bits before conversion. If the fixed-point number with a width of 2N bits before conversion is a positive number, each bit of this part of the value can be filled with a value of 0, otherwise it can be filled with a value of 1, and then the N-bit value is intercepted from the fixed-point number after filling;
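The three extraction cases can be sketched behaviorally as follows. This is only an illustrative model, not the circuit of the embodiment: the function name `truncate_fixed` and the parameters `src_frac` and `dst_frac` (the number of fractional bits before and after conversion) are assumptions introduced here, and numbers are handled as raw two's-complement bit patterns.

```python
def truncate_fixed(bits: int, src_width: int, src_frac: int,
                   dst_width: int, dst_frac: int) -> int:
    """Extract dst_width bits around the aligned decimal point.

    bits      : raw two's-complement pattern of the source (src_width bits)
    src_frac  : fractional bits in the source (hypothetical parameter)
    dst_frac  : fractional bits in the result (hypothetical parameter)
    """
    sign = (bits >> (src_width - 1)) & 1
    # Destination bit k lines up with source bit k + shift once the
    # decimal points are aligned.
    shift = src_frac - dst_frac
    out = 0
    for k in range(dst_width):
        src_index = k + shift
        if 0 <= src_index < src_width:
            bit = (bits >> src_index) & 1  # case a: bit exists in the source
        else:
            # case b (missing high bits) and case c (missing low bits):
            # both fill with the sign value, 0 for positive, 1 for negative
            bit = sign
        out |= bit << k
    return out
```

For instance, with a 16-bit source and an 8-bit result, `truncate_fixed(0x0123, 16, 8, 8, 4)` takes source bits 4 to 11 directly (case a), while a source with no fractional bits and a result with four fractional bits exercises case c.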

(2) If the data conversion signal received by the number conversion circuit 22 is 01, the number conversion circuit 22 may convert a 2N-bit fixed-point number into an N-bit floating-point number, using the first conversion subcircuit 221. Specifically, the highest bit (the sign bit) of the fixed-point number may be used as the sign bit of the converted floating-point number. If the 2N-bit fixed-point number before conversion is positive, the sign bit is removed and the remaining 2N-1 bits are searched from the highest bit toward the lowest bit; when the first 1 is found, the number of bits m that follow it is counted. The exponent field of the converted floating-point number may then equal m plus the exponent offset i, minus the decimal point position of the 2N-bit fixed-point number before conversion. If the 2N-bit fixed-point number before conversion is negative, the sign bit is likewise removed and the remaining 2N-1 bits are searched from the highest bit toward the lowest bit, but m is the number of bits that follow the first 0 found.

In addition, the high n bits of the m bits are extracted as the mantissa of the converted floating-point number. If m >= n, the n bits may be extracted directly as the mantissa. If m < n, n-m bits carrying the value of the highest bit (the sign bit) may be appended after the 2N-bit fixed-point number before conversion.

For example, to convert a 2N-bit fixed-point number into a 16-bit floating-point number, i may equal 16 and n may equal 10; to convert it into a 32-bit floating-point number, i may equal 127 and n may equal 23; to convert it into a 64-bit floating-point number, i may equal 1023 and n may equal 52.
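As a rough illustration of the exponent and mantissa rules above, the sketch below handles only the positive case and the 32-bit target (i = 127, n = 23). The function names and the parameter `frac_bits` (the decimal point position, counted as fractional bits) are assumptions made for this example; the mantissa is truncated rather than rounded, and the exponent is assumed to stay within the normal range.

```python
import struct

def fixed_to_float32_bits(raw: int, frac_bits: int,
                          bias: int = 127, n: int = 23) -> int:
    """Convert a positive fixed-point pattern to a 32-bit float pattern.

    raw       : fixed-point bits with the sign bit (0) already removed
    frac_bits : decimal point position, hypothetical parameter name
    """
    if raw == 0:
        return 0  # zero has no leading 1; encode as +0.0
    m = raw.bit_length() - 1           # bits that follow the leading 1
    exponent = m + bias - frac_bits    # m + i - decimal point position
    below = raw & ((1 << m) - 1)       # the m bits after the leading 1
    if m >= n:
        mantissa = below >> (m - n)    # keep the high n bits
    else:
        mantissa = below << (n - m)    # pad n-m bits; sign value is 0 here
    return (exponent << n) | mantissa  # sign bit is 0 for a positive input

def bits_to_float(bits: int) -> float:
    """Reinterpret a 32-bit pattern as an IEEE 754 single-precision value."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]
```

With 16 fractional bits, the pattern `0x18000` represents 1.5, and `bits_to_float(fixed_to_float32_bits(0x18000, 16))` recovers that value.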

In the multiplier provided by this embodiment, the number conversion circuit converts the multiplication result into data whose bit width equals that of the multiplier's output port before the target operation result is output. The bit width of the target operation result can therefore be less than twice the bit width of the data input to the multiplier, which effectively reduces the bit width the multiplier requires of its input and output ports. In addition, the multiplier obtains fewer effective partial products, which reduces the complexity of the multiplication operation.

FIG. 5 is a flow chart of a data processing method provided by an embodiment. The method may be performed by the multiplier shown in FIG. 1. This embodiment concerns the process of performing a multiplication operation on data. As shown in FIG. 5, the method includes:

S101: receiving data to be processed.

Specifically, the regular signed number encoding subcircuit in the multiplier may receive two pieces of data to be processed. Optionally, the regular signed number encoding subcircuit may process two pieces of data of a fixed bit width, and that fixed bit width may equal the bit width of the multiplier's input port. Optionally, the data to be processed may be fixed-point numbers whose bit width equals the bit width of the multiplier's input port.

S102: performing regular signed number encoding on the data to be processed to obtain the partial products of a target code.

Specifically, regular signed number encoding can be characterized as follows. An N-bit multiplier is processed from its low-order bits toward its high-order bits. If a run of l (l >= 2) consecutive 1s is found, the run is converted into the (l+1)-digit data "1 (0)^(l-1) (-1)", that is, a 1 followed by l-1 zeros and then a -1, and the remaining (N-l) bits are combined with the converted (l+1) digits to form new data. The new data is used as the initial data for the next level of conversion, and this continues until the new data contains no run of l (l >= 2) consecutive 1s. The target code obtained by applying regular signed number encoding to an N-bit multiplier may have a width of (N+1) digits. It should be noted that the number of partial products of the target code may equal the bit width N of the data received by the multiplier plus 1.
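The run-replacement rule described above can be illustrated with a short sketch. It treats the multiplier as an unsigned N-bit pattern, which is an assumption of this example, and the function name `csd_encode` is hypothetical. Digits are stored least-significant first.

```python
def csd_encode(multiplier: int, n: int) -> list[int]:
    """Return n+1 signed digits (LSB first), each in {-1, 0, 1}."""
    # Start from the plain binary digits plus one extra high digit.
    digits = [(multiplier >> k) & 1 for k in range(n)] + [0]
    k = 0
    while k < n:
        if digits[k] == 1:
            # Measure the run of consecutive 1s starting at position k.
            run = k
            while run < n and digits[run] == 1:
                run += 1
            if run - k >= 2:
                # Replace the run with "1 (0)^(l-1) (-1)".
                digits[k] = -1
                for j in range(k + 1, run):
                    digits[j] = 0
                digits[run] = 1  # this position held a 0 before
                k = run          # the new 1 may start another run
                continue
        k += 1
    return digits
```

For example, `csd_encode(0b0111, 4)` yields `[-1, 0, 0, 1, 0]`, which reads as "1 0 0 -1" from high to low and represents 8 - 1 = 7 with two nonzero digits instead of three.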

S103: accumulating the partial products of the target code to obtain a multiplication result.

Specifically, the accumulation subcircuit may accumulate the values in each column of all the partial products of the target code to obtain the multiplication result. Optionally, the bit width of the multiplication result may equal twice the bit width of the data received by the multiplier, or twice the bit width of the multiplier's input port.
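A behavioral sketch of the accumulation step is given below. It is not the column-wise adder structure of the accumulation subcircuit; it only shows the arithmetic result. It assumes that the partial product produced by digit k of the target code carries weight 2^k and that the result is kept to 2N bits; the function name `accumulate` is hypothetical.

```python
def accumulate(partial_products: list[int], n: int) -> int:
    """Sum weighted partial products and keep a 2n-bit result.

    partial_products[k] is the signed value (-X, 0 or X) produced by
    digit k of the target code; weighting by 2**k is an assumption.
    """
    mask = (1 << (2 * n)) - 1
    total = 0
    for k, pp in enumerate(partial_products):
        total = (total + (pp << k)) & mask  # wrap to 2n bits, two's complement
    return total
```

With X = 5 and the target code of 7 (`[-1, 0, 0, 1, 0]`, LSB first), the partial products are `[-5, 0, 0, 5, 0]` and the accumulated result is 40 - 5 = 35.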

S104: acquiring a storage indication signal and a read indication signal.

Specifically, the multiplier may automatically acquire the storage indication signal and the read indication signal through the state control circuit.

S105: storing a plurality of the multiplication results in different register subcircuits according to the storage indication signal.

Specifically, the state control circuit in the multiplier may input the acquired storage indication signal to the register control circuit. According to the received storage indication signal, the register control circuit determines the register subcircuit in which the result of the current multiplication operation is to be stored.

It should be noted that one register subcircuit can store at most one multiplication result, and some of the register subcircuits may be idle.

S106: reading, according to the read indication signal, part of the data of the corresponding multiplication results stored in different register subcircuits to obtain a target operation result.

Specifically, the selection circuit in the multiplier may, according to the received read indication signal, read part of the data of the multiplication result stored in the corresponding register subcircuit as the target operation result. Optionally, a single read need not by itself be the complete target operation result: the target operation result of a multiplication may be spliced from two reads, or from more than two reads. In other words, the bit width of the part of the data read from the multiplication result may equal 1/2 of the bit width of the multiplication result, or may be less than 1/2 of it. Optionally, the bit width of the target operation result may be less than or equal to the bit width of the multiplier's input port.

In the data processing method provided by this embodiment, regular signed number encoding is performed on the received data to obtain the partial products of the target code, the partial products are accumulated to obtain the multiplication result, and the high-order data and low-order data of the multiplication result are read separately as the target operation result. The bit width of the target operation result can therefore be less than twice the bit width of the data input to the multiplier, which effectively reduces the bit width the multiplier requires of its input and output ports. The regular signed number encoding also reduces the number of effective partial products obtained during multiplication, which reduces the complexity of the multiplication operation and improves its efficiency.

In one embodiment, the step in S102 of performing regular signed number encoding on the data to be processed to obtain the partial products of the target code may include:

S1021: performing regular signed number encoding on the data to be processed to obtain original partial products.

Optionally, the step in S1021 of performing regular signed number encoding on the data to be processed to obtain the original partial products may include:

S1021a: performing regular signed number encoding on the data to be processed to obtain a target code.

Specifically, the multiplier may perform regular signed number encoding on the received multiplier operand through the regular signed number encoding unit to obtain the target code. The bit width of the target code may equal the bit width N of the multiplier operand plus 1.

Optionally, the step in S1021a of performing regular signed number encoding on the data to be processed to obtain the target code may include: converting each run of l consecutive 1s in the data to be processed into (l+1) digits whose highest digit is 1, whose lowest digit is -1, and whose remaining digits are 0, thereby obtaining the target code, where l is greater than or equal to 2.

It should be noted that regular signed number encoding can be characterized as follows. An N-bit multiplier is processed from its low-order bits toward its high-order bits. If a run of l (l >= 2) consecutive 1s is found, the run is converted into the (l+1)-digit data "1 (0)^(l-1) (-1)", that is, a 1 followed by l-1 zeros and then a -1, and the remaining (N-l) bits are combined with the converted (l+1) digits to form new data. The new data is used as the initial data for the next level of conversion, and this continues until the new data contains no run of l (l >= 2) consecutive 1s. The target code obtained by applying regular signed number encoding to an N-bit multiplier may have a width of (N+1) digits.

S1022b: performing conversion processing according to the data to be processed and the target code to obtain the original partial products.

It should be noted that the number of original partial products may equal the bit width of the target code.

For example, if the partial product acquisition unit receives an 8-bit multiplicand "x7x6x5x4x3x2x1x0" (denoted X), it may obtain each original partial product directly from X and the three values -1, 0 and 1 that can appear in the target code: when a digit of the target code is -1, the original partial product may be -X; when the digit is 0, the original partial product may be 0; and when the digit is 1, the original partial product may be X. Optionally, the conversion processing can be characterized as converting each digit of the target code into an original partial product based on the multiplicand of the multiplication operation.
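The digit-to-partial-product mapping in this example amounts to a three-entry lookup, sketched below. The function name `original_partial_products` is hypothetical, and the multiplicand is handled as a signed Python integer rather than as a bit pattern.

```python
def original_partial_products(x: int, code: list[int]) -> list[int]:
    """Map each target-code digit (-1, 0, 1) to -X, 0 or X."""
    table = {-1: -x, 0: 0, 1: x}
    # One original partial product per digit of the target code.
    return [table[d] for d in code]
```

For X = 5 and the target code `[-1, 0, 0, 1, 0]` (LSB first), the original partial products are `[-5, 0, 0, 5, 0]`, one per digit, which matches the statement that their number equals the width of the target code.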

S1022: performing sign bit extension on the original partial products to obtain the partial products of the target code.

Optionally, the step in S1022 of performing sign bit extension on the original partial products to obtain the partial products of the target code may specifically include: padding the original partial products to obtain the partial products of the target code.

Specifically, the bit width of a sign-extended partial product may equal twice the bit width N of the data currently processed by the multiplier, the bit width of an original partial product may equal N, and the number of sign extension bits may equal N. Optionally, sign bit extension can be understood as filling every sign extension bit with the value of the sign bit of the original partial product, the sign bit being the highest bit of the original partial product, so as to obtain a sign-extended partial product 2N bits wide. Optionally, the number of padding bits may equal N. Optionally, in the layout of the sign-extended partial products, the highest bits of all sign-extended partial products may lie in the same column, the lowest bits may lie in the same column, and the other corresponding bits may likewise lie in the same columns.
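Sign bit extension from N to 2N bits can be sketched on raw bit patterns as follows; the function name `sign_extend` is an assumption of this example.

```python
def sign_extend(bits: int, n: int) -> int:
    """Extend an n-bit two's-complement pattern to 2n bits."""
    low = bits & ((1 << n) - 1)           # the original n-bit partial product
    sign = (low >> (n - 1)) & 1           # its highest bit is the sign bit
    fill = ((1 << n) - 1) if sign else 0  # n padding bits, all equal to the sign
    return (fill << n) | low
```

For N = 8, `sign_extend(0xFB, 8)` gives `0xFFFB` (the value -5 in both widths), while `sign_extend(0x05, 8)` gives `0x0005`.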

In the data processing method provided by this embodiment, regular signed number encoding is performed on the data to be processed to obtain the original partial products, sign bit extension is performed on the original partial products to obtain the partial products of the target code, and the partial products of the target code are accumulated to obtain the multiplication result. The high-order data and low-order data of the multiplication result are then read separately as the target operation result, so that the bit width of the target operation result can be less than twice the bit width of the data input to the multiplier, which effectively reduces the bit width the multiplier requires of its input and output ports. The method also obtains fewer effective partial products, which reduces the complexity of the multiplication operation and improves its efficiency.

In the data processing method provided by another embodiment, the step in S105 of storing a plurality of the multiplication results in different register subcircuits according to the storage indication signal may specifically include:

S1051: storing the first multiplication result corresponding to a first storage indication signal in a first register subcircuit.

Specifically, the number of storage indication signals may equal the number of multiplication operations performed by the multiplier. Each multiplication operation yields one multiplication result, and the state control circuit acquires one corresponding storage indication signal. When the multiplier performs the first multiplication operation and obtains the first multiplication result, the state control circuit automatically acquires the first storage indication signal; the register control circuit determines, according to the first storage indication signal input by the state control circuit, the first register subcircuit in which the first multiplication result is to be stored, and inputs the first multiplication result to the first register subcircuit for storage.

S1052: storing the second multiplication result corresponding to a second storage indication signal in a second register subcircuit.

It should be noted that when the multiplier performs the second multiplication operation and obtains the second multiplication result, the state control circuit automatically acquires the second storage indication signal; the register control circuit determines, according to the second storage indication signal input by the state control circuit, the second register subcircuit in which the second multiplication result is to be stored, and inputs the second multiplication result to the second register subcircuit for storage. By analogy, the multiplier may store the result of each multiplication operation in a different register subcircuit, following the numbering order of the register subcircuits, so that the results of two consecutive multiplication operations are stored in two adjacent register subcircuits.

In the data processing method provided by this embodiment, the first multiplication result corresponding to the first storage indication signal is stored in the first register subcircuit and the second multiplication result corresponding to the second storage indication signal is stored in the second register subcircuit, which prevents one multiplication result from overwriting another. In addition, the bit width of the target operation result can be less than twice the bit width of the data input to the multiplier, which effectively reduces the bit width the multiplier requires of its input and output ports, and the method obtains fewer effective partial products, which reduces the complexity of the multiplication operation.

In one embodiment, the step in S106 of reading, according to the read indication signal, part of the data of the corresponding multiplication results stored in different register subcircuits to obtain the target operation result may be implemented as follows:

S1061: reading, according to a first read indication signal, a first part of the data of the first multiplication result stored in the first register subcircuit to obtain a first operation result.

S1062: reading, according to a second read indication signal, a second part of the data of the first multiplication result stored in the first register subcircuit to obtain a second operation result.

Specifically, the number of read indication signals acquired by the state control circuit in the multiplier may equal the number of times the multiplier reads an operation result, which is twice the number of multiplication results. Optionally, a multiplication result may comprise two parts, a first part of the data and a second part of the data. For example, if the bit width of the multiplication result equals 2N, the multiplication result may be divided into high N-bit data and low N-bit data, where the first part may be either the high N-bit data or the low N-bit data, and the second part is the other of the two.

S1063: reading, according to a third read indication signal, a first part of the data of the second multiplication result stored in the second register subcircuit to obtain a third operation result.

Optionally, each read indication signal may correspond to the first part or the second part of the data of a multiplication result.

S1064: reading, according to a fourth read indication signal, a second part of the data of the second multiplication result stored in the second register subcircuit to obtain a fourth operation result.

Specifically, the multiplier may perform multiplication operations on multiple groups of data to be processed and obtain multiple multiplication results. After reading the fourth operation result, the multiplier may therefore read part of the data of the next multiplication result according to the next read indication signal.

For example, suppose the input port of the multiplier is 32 bits wide and the output port is 64/t+deta bits wide (in general, the multiplier completes one multiplication operation and obtains one multiplication result after t clock cycles, with t > 1 and deta >= 0), the data received by the multiplier is also 32 bits wide, and the multiplier needs to perform multiplication operations on multiple groups of data to be processed. In this case, the register circuit 13 includes (64/(64/t+deta)) register subcircuits 131 (register subcircuits A1, A2, ..., Ai, where i may equal (64/(64/t+deta))), and the target operation result may be obtained as follows:

If the multiplier obtains the first multiplication result M_0 after t clock cycles (t may be greater than or equal to 0), the register control circuit may store M_0 (64 bits wide) in register subcircuit A1 according to the first storage indication signal. The selection circuit may then read the high 32 bits of M_0 from register subcircuit A1 according to the first read indication signal, as the first operation result of the first multiplication operation.

When the multiplier reaches clock cycle t+1, the selection circuit may read the low 32 bits of M_0 from register subcircuit A1 according to the second read indication signal, as the second operation result of the first multiplication operation. In this embodiment, the multiplier splices the first operation result with the second operation result to obtain the target operation result of the data to be processed.

If the multiplier obtains the second multiplication result M_1 at clock cycle 2t, the register control circuit may store M_1 in register subcircuit A2 according to the second storage indication signal. The selection circuit may then read the high 32 bits of M_1 from register subcircuit A2 according to the third read indication signal, as the third operation result, obtained from the second multiplication operation.

When the multiplier reaches clock cycle 2t+1, the selection circuit may read the low 32 bits of M_1 from register subcircuit A2 according to the fourth read indication signal, as the fourth operation result, obtained from the second multiplication operation. In this embodiment, the multiplier splices the third operation result with the fourth operation result to obtain the target operation result of the data to be processed.

By analogy, each multiplication result is stored in its corresponding register subcircuit according to its storage indication signal, and part of the data of the multiplication results stored in the different register subcircuits is read according to the corresponding read indication signals to obtain the target operation result.
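The store-then-read sequence in this example can be modeled with a short behavioral sketch. It ignores clock timing and the state control circuit, and simply stores each 64-bit result in its own register subcircuit and reads the high half first, then the low half; the function names `read_halves` and `splice` and the register labels are assumptions of this example.

```python
def read_halves(results: list[int], n: int = 32) -> list[int]:
    """Store each 2n-bit result in its own register, then read it in halves."""
    registers: dict[str, int] = {}
    outputs: list[int] = []
    half_mask = (1 << n) - 1
    for idx, m in enumerate(results):
        name = f"A{idx + 1}"                 # A1, A2, ... in storage order
        registers[name] = m & ((1 << (2 * n)) - 1)
        outputs.append(registers[name] >> n)         # read high n bits first
        outputs.append(registers[name] & half_mask)  # then the low n bits
    return outputs

def splice(high: int, low: int, n: int = 32) -> int:
    """Join two n-bit reads back into the 2n-bit multiplication result."""
    return (high << n) | low
```

Reading `M_0 = 0x1122334455667788` this way produces `0x11223344` followed by `0x55667788`, and splicing the two reads restores the original 64-bit value, while each individual read fits a 32-bit output port.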

In addition, if one of the groups of data to be processed contains a zero value, the multiplier can obtain the multiplication result for that group after m (m < t) clock cycles and store it in the corresponding register subcircuit according to the storage indication signal. In the current clock cycle, the multiplier may read part of the data of the multiplication results stored in the register subcircuits according to the read indication signal, and in the next clock cycle it may output the remaining part of the data of the multiplication result. If the next group of data to be processed also contains a zero value and needs only 1 clock cycle to complete the multiplication operation, the multiplier may store that multiplication result in the next adjacent register subcircuit.

In the data processing method provided by this embodiment, the multiplier reads, according to the read indication signal, part of the data of the corresponding multiplication results stored in different register subcircuits to obtain the target operation result. The high-order data and low-order data of each multiplication result are read separately as the target operation result, so that the bit width of the target operation result can be less than twice the bit width of the data input to the multiplier, which effectively reduces the bit width the multiplier requires of its input and output ports. The method also obtains fewer effective partial products, which reduces the complexity of the multiplication operation.

FIG. 6 is a schematic flowchart of a data processing method provided by an embodiment. The method can be performed by the multiplier shown in FIG. 2, and this embodiment concerns the process of multiplying data. As shown in FIG. 6, the method includes:

S201: Receive a data conversion signal and data to be processed.

Specifically, the multiplication circuit in the multiplier can receive two pieces of data to be processed and a data conversion signal. Optionally, the bit width of the data to be processed can equal the bit width of the multiplier's input port. Optionally, when the conversion circuit receives different data conversion signals, it converts the received data into the format that corresponds to each signal.

S202: Perform regular signed number (canonical signed digit, CSD) encoding on the data to be processed to obtain the partial products of the target code.

Specifically, the principle of the regular signed number (canonical signed digit, CSD) encoding can be described as follows. An N-bit multiplier is processed from the low-order bits to the high-order bits. If it contains a run of l (l >= 2) consecutive 1s, the l-bit run is replaced by the (l+1)-digit string "1 (0)^(l-1) (-1)", that is, a 1, followed by l-1 zeros, followed by a -1 in the lowest position of the run. For example, 0111 becomes 1 0 0 (-1), since 7 = 8 - 1. The remaining N-l bits are combined with the converted l+1 digits to form new data, which serves as the initial data for the next round of conversion. The conversion repeats until the new data contains no run of l (l >= 2) consecutive 1s. The target code obtained by encoding an N-bit multiplier can be N+1 digits wide. Note that the number of partial products of the target code can equal the data bit width N received by the multiplier plus 1.
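The encoding rule above can be illustrated with a short Python sketch. It does not replay the run-by-run replacement literally. Instead it uses the standard non-adjacent-form recurrence, which is assumed here to arrive at the same digits as repeating the replacement until no run of consecutive 1s remains. The multiplier is assumed to be an unsigned N-bit value, and the function name `csd_encode` is hypothetical.

```python
def csd_encode(value: int, n_bits: int) -> list[int]:
    """Return n_bits + 1 signed digits in {-1, 0, 1}, least significant first."""
    if not 0 <= value < (1 << n_bits):
        raise ValueError("value does not fit in n_bits bits")
    digits = []
    x = value
    for _ in range(n_bits + 1):
        digit = 0
        if x & 1:
            # Low bits ...01 give +1; low bits ...11 (start of a run) give -1.
            digit = 2 - (x & 3)
            x -= digit
        digits.append(digit)
        x >>= 1
    return digits


# 0111 (7) becomes 1 0 0 -1 when written most significant digit first.
print(csd_encode(0b0111, 4)[::-1])
```

The list has N+1 entries, matching the N+1-digit target code in the text, and no two neighbouring entries are both nonzero.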

S203: Accumulate the partial products of the target code to obtain the multiplication result.

Specifically, the accumulation sub-circuit can accumulate the values in each column of all partial products of the target code to obtain the multiplication result. Optionally, the bit width of the multiplication result can equal twice the bit width of the data received by the multiplier (the data to be processed), or twice the bit width of the multiplier's input port.
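As a rough software analogue of steps S202 and S203, the sketch below forms one partial product per nonzero signed digit and sums them. It assumes unsigned operands and uses ordinary integer addition in place of the column-wise accumulation performed by the accumulation sub-circuit. The names `csd_digits` and `csd_multiply` are hypothetical.

```python
def csd_digits(value: int, n_bits: int) -> list[int]:
    """Signed digits in {-1, 0, 1} of an unsigned value, least significant first."""
    digits, x = [], value
    for _ in range(n_bits + 1):
        digit = 0
        if x & 1:
            digit = 2 - (x & 3)  # +1 for low bits ...01, -1 for low bits ...11
            x -= digit
        digits.append(digit)
        x >>= 1
    return digits


def csd_multiply(multiplicand: int, multiplier: int, n_bits: int) -> tuple[int, int]:
    """Return (product, number of nonzero partial products)."""
    digits = csd_digits(multiplier, n_bits)
    # Each nonzero digit contributes +multiplicand or -multiplicand, shifted by its position.
    partial_products = [d * (multiplicand << i) for i, d in enumerate(digits) if d]
    return sum(partial_products), len(partial_products)


product, count = csd_multiply(13, 15, 4)
print(product, count)
```

With the multiplier 15 (binary 1111), plain binary multiplication needs four nonzero partial products, whereas the signed-digit form 16 - 1 needs only two. This is the reduction in effective partial products that the text refers to.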

S204: Convert the multiplication result according to the data conversion signal to obtain the target operation result, where the data conversion signal indicates the data type into which the multiplier needs to convert the target operation result.

Specifically, the conversion circuit determines from the received data conversion signal whether to convert the multiplication result into a fixed-point result or a floating-point result. For example, suppose the conversion circuit can receive two data conversion signals, 00 and 01, and the input and output ports of the multiplier are both N bits wide. Then 00 indicates that the conversion circuit converts the received 2N-bit multiplication result into an N-bit fixed-point result, and 01 indicates that it converts the 2N-bit result into an N-bit floating-point result. The function assigned to each data conversion signal can be set flexibly. Optionally, each data conversion signal represents one data type into which the multiplier needs to convert the multiplication result.
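The signal-controlled conversion can be sketched as below. The patent does not specify how the 2N-bit result is narrowed, so everything concrete here is an assumption made for illustration: N is fixed at 32, the product is a signed integer, signal 00 saturates it to a signed 32-bit fixed-point word with no fractional bits, and signal 01 produces the IEEE 754 single-precision bit pattern. The function name `convert_result` is hypothetical.

```python
import struct

N_BITS = 32  # assumed port width for this sketch


def convert_result(product: int, signal: str) -> int:
    """Return a 32-bit output word for a signed 64-bit product."""
    if signal == "00":
        # Assumed fixed-point behaviour: saturate to the signed 32-bit range.
        lowest, highest = -(1 << (N_BITS - 1)), (1 << (N_BITS - 1)) - 1
        clamped = max(lowest, min(highest, product))
        return clamped & ((1 << N_BITS) - 1)
    if signal == "01":
        # Assumed floating-point behaviour: round to IEEE 754 single precision.
        return struct.unpack("<I", struct.pack("<f", float(product)))[0]
    raise ValueError(f"unknown data conversion signal: {signal}")


print(hex(convert_result(6, "00")), hex(convert_result(6, "01")))
```

In both cases the output is N bits wide, half the width of the raw product, in line with the N-bit output port in the example above.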

The data processing method provided by this embodiment receives a data conversion signal and data to be processed, multiplies the data to be processed to obtain a multiplication result, and converts the multiplication result according to the data conversion signal to obtain the target operation result. With this method, the bit width of the target operation result can be less than twice the bit width of the multiplier's input data, which effectively reduces the multiplier's requirement on input and output port bit width. In addition, the method yields a small number of effective partial products, which reduces the complexity of the multiplication.

An embodiment of the present application also provides a machine learning operation device that includes one or more of the multipliers described in the present application. The device obtains data to be operated on and control information from other processing devices, performs the specified machine learning operation, and passes the result to peripheral devices through an I/O interface. Peripheral devices include, for example, cameras, displays, mice, keyboards, network cards, Wi-Fi interfaces, and servers. When the device contains more than one multiplier, the multipliers can be linked and exchange data through a specific structure, for example over a PCIe (Peripheral Component Interconnect Express) bus, to support larger-scale machine learning operations. In that case the multipliers can share one control system or each have an independent control system, and they can share memory or each have their own memory. The interconnection can use any interconnection topology.

The machine learning operation device is highly compatible and can be connected to various types of servers through a PCIe interface.

An embodiment of the present application also provides a combined processing device, which includes the machine learning operation device described above, a universal interconnection interface, and other processing devices. The machine learning operation device interacts with the other processing devices to jointly complete the operation specified by the user. FIG. 7 is a schematic diagram of the combined processing device.

The other processing devices include one or more types of general-purpose or special-purpose processors, such as a central processing unit (CPU), a graphics processing unit (GPU), or a neural network processor. The number of processors in the other processing devices is not limited. The other processing devices serve as the interface between the machine learning operation device and external data and control: they move data and perform basic control of the machine learning operation device, such as starting and stopping it. The other processing devices can also cooperate with the machine learning operation device to complete computing tasks.

The universal interconnection interface transmits data and control instructions between the machine learning operation device and the other processing devices. The machine learning operation device obtains the required input data from the other processing devices and writes it to its on-chip storage. It can obtain control instructions from the other processing devices and write them to its on-chip control cache. It can also read data from its storage module and transmit it to the other processing devices.

Optionally, as shown in FIG. 8, the structure may further include a storage device connected to both the machine learning operation device and the other processing devices. The storage device stores data of the machine learning operation device and of the other processing devices, and is particularly suitable for data to be operated on that cannot be held entirely in the internal storage of the machine learning operation device or the other processing devices.

The combined processing device can serve as the system on chip (SoC) of devices such as mobile phones, robots, drones, and video surveillance equipment, which effectively reduces the core area of the control part, increases processing speed, and lowers overall power consumption. In this case, the universal interconnection interface of the combined processing device is connected to certain components of the device, for example cameras, displays, mice, keyboards, network cards, and Wi-Fi interfaces.

Some embodiments also claim a chip that includes the machine learning operation device or the combined processing device described above.

Some embodiments claim a chip package structure that includes the chip described above.

Some embodiments claim a board card that includes the chip package structure described above. As shown in FIG. 9, in addition to the chip 389, the board card may include other supporting components, including but not limited to a storage device 390, a receiving device 391, and a control device 392.

The storage device 390 is connected by a bus to the chip in the chip package structure and is used to store data. The storage device may include multiple groups of storage units 393, each of which is connected to the chip by a bus. Each group of storage units may be DDR SDRAM (Double Data Rate SDRAM, double data rate synchronous dynamic random access memory).

DDR doubles the data rate of SDRAM without raising the clock frequency, because it transfers data on both the rising and the falling edge of the clock. Its transfer rate is therefore twice that of standard SDRAM. In one embodiment, the storage device may include four groups of storage units, and each group may include multiple DDR4 chips. In one embodiment, the chip may contain four 72-bit DDR4 controllers, in each of which 64 bits carry data and 8 bits carry ECC. When DDR4-3200 chips are used in each group of storage units, the theoretical data bandwidth can reach 25600 MB/s.
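The 25600 MB/s figure follows from the transfer rate and the data width. The short Python sketch below works through the arithmetic, assuming that DDR4-3200 means 3200 million transfers per second and that only the 64 data bits of each 72-bit controller count toward bandwidth. The function name is hypothetical.

```python
def ddr_bandwidth_mb_s(mega_transfers_per_s: int, data_bits: int) -> int:
    """Theoretical bandwidth in MB/s: transfers per second times bytes per transfer."""
    return mega_transfers_per_s * data_bits // 8


print(ddr_bandwidth_mb_s(3200, 64))  # 3200 MT/s * 8 bytes per transfer
```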

In one embodiment, each group of storage units includes multiple double data rate synchronous dynamic random access memories arranged in parallel. DDR can transfer data twice in one clock cycle. A DDR controller is provided in the chip to control the data transfer and data storage of each storage unit.

The receiving device is electrically connected to the chip in the chip package structure and implements data transfer between the chip and an external device, such as a server or a computer. In one embodiment, for example, the receiving device may be a standard PCIe (Peripheral Component Interconnect Express) interface, through which the server passes the data to be processed to the chip. Preferably, when a PCIe 3.0 x16 interface is used, the theoretical bandwidth can reach 16000 MB/s. In another embodiment, the receiving device may be another kind of interface. The present application does not limit the specific form of such an interface, as long as the interface unit can provide the transfer function. The chip's computation results are likewise sent back to the external device (for example, the server) through the receiving device.
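The 16000 MB/s figure corresponds to the raw line rate of a PCIe 3.0 x16 link. The sketch below works through the arithmetic, assuming 8 GT/s per lane and, for the second value, the 128b/130b line encoding used by PCIe 3.0. The function name is hypothetical, and protocol overhead above the line encoding is ignored.

```python
def pcie_bandwidth_mb_s(giga_transfers_per_s: float, lanes: int,
                        payload_bits: int = 128, encoded_bits: int = 130) -> tuple[float, float]:
    """Return (raw, effective) one-direction bandwidth in MB/s."""
    raw = giga_transfers_per_s * 1000 * lanes / 8  # one bit per transfer per lane
    effective = raw * payload_bits / encoded_bits  # remove line-encoding overhead
    return raw, effective


raw, effective = pcie_bandwidth_mb_s(8, 16)
print(raw, round(effective))
```

The raw value matches the theoretical bandwidth quoted in the text. The usable rate after line encoding is slightly lower.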

The control device is electrically connected to the chip and monitors the state of the chip. Specifically, the chip and the control device can be electrically connected through an SPI interface. The control device may include a microcontroller unit (MCU). The chip may include multiple processing chips, multiple processing cores, or multiple processing circuits, and can drive multiple loads. The chip can therefore be in different working states, such as heavy load and light load. The control device can regulate the working states of the multiple processing chips, processing cores, or processing circuits in the chip.

Some embodiments claim an electronic device that includes the board card described above.

The electronic device may be a multiplier, a robot, a computer, a printer, a scanner, a tablet computer, a smart terminal, a mobile phone, a dashboard camera, a navigator, a sensor, a webcam, a server, a cloud server, a camera, a video camera, a projector, a watch, a headset, mobile storage, a wearable device, a vehicle, a household appliance, and/or a medical device.

The vehicle includes an airplane, a ship, and/or a road vehicle. The household appliance includes a television, an air conditioner, a microwave oven, a refrigerator, a rice cooker, a humidifier, a washing machine, an electric light, a gas stove, and a range hood. The medical device includes a magnetic resonance imaging scanner, a B-mode ultrasound scanner, and/or an electrocardiograph.

Note that, for simplicity, each of the foregoing method embodiments is described as a combination of circuits. Those skilled in the art should understand, however, that the present application is not limited to the described circuit combinations, because according to the present application some circuits can be implemented in other ways or with other structures. Those skilled in the art should also understand that the embodiments described in the specification are all optional embodiments, and the devices and modules involved are not necessarily required by the present application.

The descriptions of the above embodiments each have their own emphasis. For parts not described in detail in one embodiment, refer to the related descriptions of other embodiments.

The above embodiments express only several implementations of the present application. Their descriptions are relatively specific and detailed, but they should not be understood as limiting the scope of the patent. For a person of ordinary skill in the art, several variations and improvements can be made without departing from the concept of the present application, and these all fall within its scope of protection. The scope of protection of this patent is therefore defined by the appended claims.

Claims (26)

1. A multiplier, comprising: a multiplication circuit, a register control circuit, a register circuit, a state control circuit and a selection circuit, wherein the multiplication circuit comprises a regular signed number coding sub-circuit and an accumulation sub-circuit, the output end of the regular signed number coding sub-circuit is connected with the input end of the accumulation sub-circuit, the output end of the accumulation sub-circuit is connected with the first input end of the register control circuit, the output end of the register control circuit is connected with the input end of the register circuit, the output end of the register circuit is connected with the first input end of the selection circuit, the first output end of the state control circuit is connected with the second input end of the register control circuit, and the second output end of the state control circuit is connected with the second input end of the selection circuit; the regular signed number coding sub-circuit comprises a regular signed number coding unit and a partial product acquisition unit;
2. The multiplier according to claim 1, wherein said partial product obtaining unit is configured to receive second data, obtain an original partial product from said target code and said second data, and obtain a partial product of said target code from said original partial product; the accumulation sub-circuit is used for carrying out accumulation processing on the partial product of the target code to obtain a multiplication result; the state control circuit is used for acquiring a storage indication signal and a reading indication signal; the register control circuit is used for determining the register circuit for storing the multiplication result according to the storage indication signal input by the state control circuit; the register circuit is used for storing the multiplication result; the selection circuit is used for reading data in the multiplication operation result stored in the register circuit according to the received reading indication signal to serve as a target operation result; the second data and the first data are fixed point numbers, and the data bit widths of the second data and the first data are equal.
The regular signed number coding sub-circuit is used for carrying out regular signed number coding processing on received first data to obtain target codes, determining an original partial product according to the target codes and the second data, and obtaining a partial product of the target codes according to the original partial product; the accumulation sub-circuit is used for carrying out accumulation processing on the partial product of the target code to obtain a multiplication result; the first conversion sub-circuit and the second conversion sub-circuit are respectively used for carrying out revolution processing on the multiplication operation result to obtain a target operation result; wherein the data bit width of the target operation result is smaller than 2 times of the data bit width of the multiplication operation result.
Publication information

Application number: CN201910819020.4A
Title: Multiplier, data processing method, chip and electronic device
Priority date: 2019-08-30
Filing date: 2019-08-30
Status: Active
Publications: CN110515589A (en), published 2019-11-29; CN110515589B (en), published 2024-04-09
Family ID: 68629959
Country: CN


Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant
