CN108388943A - A pooling device and method suitable for neural networks - Google Patents

A pooling device and method suitable for neural networks

Info

Publication number
CN108388943A
Authority
CN
China
Prior art keywords
pool
pooling
neuron
module
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810014396.3A
Other languages
Chinese (zh)
Other versions
CN108388943B (en)
Inventor
韩银和
闵丰
许浩博
王颖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Computing Technology of CAS
Original Assignee
Institute of Computing Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Computing Technology of CAS
Priority to CN201810014396.3A
Publication of CN108388943A
Application granted
Publication of CN108388943B
Status: Active
Anticipated expiration

Abstract

Translated from Chinese

The invention relates to a pooling device suitable for neural networks, comprising: a neuron input interface module for receiving neuron data and identifying valid neuron data; a pooling cache module for temporarily storing neuron data to be reused; a pooling calculation module for performing the pooling calculation on the neuron data; a neuron output interface module for outputting the pooling calculation results; and a pooling control module for controlling the individual modules of the pooling device and the pooling process.

Description

Translated from Chinese
A pooling device and method suitable for neural networks

Technical Field

The invention relates to the field of computing, and in particular to a pooling device and method suitable for neural networks.

Background Art

Neural networks are among the most highly developed perception models in the field of artificial intelligence, and their wide range of applications and excellent performance have made them a research hotspot in both academia and industry. Neural networks build models by simulating the neural connection structure of the human brain and have brought breakthroughs to large-scale data processing tasks involving, for example, images, video, or audio. The computation of a neural network can generally be divided into steps such as convolution, activation, and pooling; the feature-map size at each layer of the network can be reduced by the pooling operation to achieve computational convergence, so an efficient pooling device helps to reduce the hardware cost of a neural network.

In practical applications, different neural network models differ in pooling size, in the choice of pooling reuse, and in the scheduling of pooled data. Pooling devices in the prior art, however, can hardly keep the energy consumption of a neural network chip low while meeting the compatibility requirements of neural network accelerators, which greatly limits the efficiency of the chip and its compatibility with different networks.

Therefore, there is a need for a pooling device and method suitable for neural networks that offers good compatibility and low energy consumption.

Summary of the Invention

The present invention provides a pooling device suitable for neural networks, comprising: a neuron input interface module for receiving neuron data and identifying valid neuron data; a pooling cache module for temporarily storing neuron data to be reused; a pooling calculation module for performing the pooling calculation on the neuron data; a neuron output interface module for outputting the pooling calculation results; and a pooling control module for controlling the individual modules of the pooling device and the pooling process.

Preferably, the pooling control module is also used to receive and analyze pooling parameters.

Preferably, the pooling control module determines, according to the pooling parameters, whether to adopt a reuse strategy in the pooling process.

Preferably, the pooling parameters include the stride and side length of the pooling domain.

Preferably, if the stride is smaller than the side length, the reuse strategy is adopted, and the pooling control module controls the pooling calculation module to perform the calculation and controls the pooling cache module to start.

Preferably, the pooling calculation module receives neuron data from the neuron input interface module and the pooling cache module.

Preferably, the neuron data is the neuron data of a single pooling domain formed by splicing.

Preferably, if the stride is equal to the side length, the reuse strategy is not adopted; the pooling control module controls the pooling calculation module to perform the calculation on the neurons directly and controls the pooling cache module not to start.

Preferably, the pooling control module is also used to control the sleep and startup of the pooling device.

According to another aspect of the present invention, a pooling method suitable for neural networks is also provided, comprising the following steps:

receiving and analyzing pooling parameters, generating a valid-data encoding, and determining a reuse strategy;

receiving valid neuron data according to the valid-data encoding, and determining whether to store neuron data for reuse according to the reuse strategy;

performing a pooling calculation on the valid neuron data, or on the neuron data formed by splicing the valid neuron data with the reused neuron data, and outputting the final result of the calculation.

Compared with the prior art, the present invention achieves the following beneficial technical effects: the pooling device and method for neural networks provided by the present invention obtain a valid-data encoding and a reuse strategy by analyzing the pooling parameters of the neural network, and reuse data during the pooling process by storing it temporarily, so that a fixed set of operation units can process neurons from pooling ranges of different sizes in batches, which improves the compatibility of the pooling device; at the same time, a sleep-and-startup mechanism is established for the pooling device, which reduces the energy consumption of the neural network chip.

Brief Description of the Drawings

Fig. 1 shows the pooling device suitable for neural networks provided by the present invention.

Fig. 2 is a flowchart of a method of pooling with the pooling device shown in Fig. 1.

Fig. 3 is a schematic structural diagram of the pooling device in a preferred embodiment of the present invention.

Detailed Description

To make the objects, technical solutions, and advantages of the present invention clearer, the pooling device and method suitable for neural networks provided in the embodiments of the present invention are described in further detail below with reference to the accompanying drawings.

With the development of artificial intelligence in recent years, neural networks based on deep learning have been widely applied to solving abstract problems. A deep neural network describes data features layer by layer through multiple transformation stages, thereby building a computational model composed of a large number of nodes connected in a mesh; these nodes are usually called neurons. In general, a neural network involves a large amount of computation and a complex computation process, and the maximum/minimum and averaging operations of a pooling device allow the computation of the neural network to converge. Designing an efficient pooling device is therefore of great significance for neural network computation.

To address the poor compatibility common to existing pooling devices, the inventors propose, after study, a pooling device and method that can complete pooling tasks of different scales with a pooling calculation module of fixed size and that reuses neurons through a flexible calling scheme, thereby ensuring compatibility with neural network accelerators; at the same time, a working mechanism that combines startup and sleep is adopted to achieve low energy consumption of the neural network chip.

Fig. 1 shows the pooling device suitable for neural networks provided by the present invention. As shown in Fig. 1, the pooling device 101 includes a neuron input interface module 102, a neuron output interface module 105, a pooling cache module 103, a pooling calculation module 104, and a pooling control module 106.

The neuron input interface module 102 can be used to receive, according to a received control signal and the transmission protocol, neuron data of different effective bandwidths transmitted to the pooling device 101 by an external module (for example, an activation module or an external cache module), and to ensure that the data is transmitted accurately. This interface module can establish a continuous data transmission channel with the external module and identify the valid portion of the input neuron data according to the valid-data-segment address encoding carried in the control signal.

The pooling cache module 103 can be used to store the neurons to be reused in pooling, supplying the reused portion of the data for the pooling operation; according to the received control signal, this module assists the data-reading part of the neuron data reuse process.

The pooling calculation module 104 can be used to complete the pooling operation on each input batch of neurons. Its number of pooling operation units is fixed, and neurons from pooling ranges of different sizes can be pooled by splitting them into batches; the module can also temporarily store intermediate results of the pooling operation to support the iterative computation needed to complete a neuron pooling operation.

The neuron output interface module 105 can be used to output the operation results of the pooling device 101 to an external module according to the received control signal and the transmission protocol.

The pooling control module 106 can be used to receive the pooling parameters sent to the pooling device 101 and, according to these parameters, send control signals to the individual modules of the pooling device 101, managing the operation of each module and the pipelined flow of the pooling data through the pooling process. For example, the control module 106 can control the amount of valid data input and output by the neuron input interface module 102 and the neuron output interface module 105, as well as the writing, deletion, and transfer of data in the pooling cache module 103.
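
As an aid to reading Fig. 1, the following minimal Python sketch (a hypothetical software model written for illustration only, not the hardware of the patent; the class and method names are invented) mirrors the same division of responsibilities: a fixed-size calculation module, a reuse cache, input and output interfaces, and a control module that owns the startup/sleep state.

    class PoolingDevice:
        """Toy software stand-in for pooling device 101 and its modules."""

        def __init__(self, num_units: int = 16):
            self.num_units = num_units     # fixed size of pooling calculation module 104
            self.reuse_cache = []          # pooling cache module 103
            self.partial_result = None     # intermediate result held by module 104
            self.asleep = True             # state managed by pooling control module 106

        # Pooling control module 106: startup and sleep of the whole device.
        def wake(self):
            self.asleep = False

        def sleep(self):
            self.asleep = True

        # Neuron input interface module 102: keep only the valid part of a transfer.
        def receive(self, raw_neurons, valid_count):
            return raw_neurons[:valid_count]

        # Neuron output interface module 105: hand a final result to the external module.
        def emit(self, result, sink):
            sink.append(result)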

In one embodiment of the present invention, the pooling control module 106 can also control the startup and sleep of each module of the pooling device 101. For example, the Resnet50 neural network model has about 50 layers in total, of which only 2 require a pooling operation: one with a pooling side length of 3x3 and a stride of 2, and one with a pooling side length of 7x7 and a stride of 7. Since pooling accounts for only a small part of the computation in this model, a startup-and-sleep mechanism for the pooling device can be applied in the other network layers that do not use pooling, thereby reducing the additional energy consumption the pooling module imposes on the neural network accelerator.
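
Continuing the hypothetical sketch above, the loop below illustrates the startup-and-sleep idea: the device stays asleep for layers without pooling and is woken only for the pooling layers. The layer list is a simplified stand-in, not the full Resnet50 configuration.

    device = PoolingDevice(num_units=16)

    # Simplified layer descriptors: only two layers carry pooling parameters.
    layers = [
        {"name": "conv1",   "pooling": None},
        {"name": "pool1",   "pooling": {"side": 3, "stride": 2, "op": "max"}},
        {"name": "conv2_x", "pooling": None},
        # ... remaining convolutional layers ...
        {"name": "pool5",   "pooling": {"side": 7, "stride": 7, "op": "avg"}},
    ]

    for layer in layers:
        if layer["pooling"] is None:
            device.sleep()      # no pooling in this layer: the device stays idle
        else:
            device.wake()       # activation signal: enter the pooling calculation state
            # ... run steps S10-S30 of the pooling method for this layer ...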

According to another aspect of the present invention, a method of pooling neuron data with the above pooling device 101 is also provided. Fig. 2 is a flowchart of a method of pooling with the pooling device shown in Fig. 1. As shown in Fig. 2, the method includes the following steps.

Step S10: receive and analyze the pooling parameters.

When the neural network needs to perform pooling, the pooling control module 106 receives an activation signal from the neural network and thereby starts the modules of the pooling device 101 that are in the dormant state.

After the pooling device has started, the pooling control module 106 can receive the pooling parameters sent from an external neural network module to the pooling device 101 and analyze them, in order to determine the reuse strategy for the pooling operation and to generate the control signals.

The pooling parameters may include the pooling side length, the pooling stride, and the pooling operation type. By analyzing the pooling side length and stride, the pooling control module can judge whether reuse is necessary for the current pooling layer: for example, if the stride equals the side length, the control module can choose the non-reuse strategy; if the stride is smaller than the side length, the reuse strategy must be started and the amount of neuron reuse calculated, after which the neurons of the pooling domain are divided into batches according to the number of operation units of the pooling calculation module 104 and an encoding representing the number of neurons in each batch is generated.
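
As an illustration of this parameter analysis (an assumed Python sketch, not the control logic of the patent; the function name is invented), the helper below derives the reuse decision, the reuse amount for a window that slides in one direction, and the per-batch neuron counts that the text calls the data encoding.

    def analyze_parameters(side_length, stride, num_units):
        """Return (reuse?, reuse amount, per-batch encoding) for one pooling layer."""
        reuse = stride < side_length
        # Overlapping columns between consecutive windows, each side_length neurons tall.
        reuse_amount = (side_length - stride) * side_length if reuse else 0
        # Split one pooling domain into batches of at most num_units neurons.
        remaining = side_length * side_length
        encoding = []
        while remaining > 0:
            encoding.append(min(num_units, remaining))
            remaining -= encoding[-1]
        return reuse, reuse_amount, encoding

    # 3x3 window with stride 2 and 16 operation units: reuse 3 neurons, one batch of 9.
    print(analyze_parameters(3, 2, 16))   # (True, 3, [9])
    # 7x7 window with stride 7: no reuse, batches encoded as (16-16-16-1).
    print(analyze_parameters(7, 7, 16))   # (False, 0, [16, 16, 16, 1])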

In one embodiment of the present invention, the amount of neuron data corresponding to the input bandwidth of the pooling device 101 equals the number of operation units of the pooling calculation module 104.

Step S20: receive and store the neurons.

The pooling control module 106 sends the valid-data encoding and the neuron reuse amount generated in step S10 to the neuron input interface module 102. The neuron input interface module 102 receives valid neuron data from the outside according to the valid-data encoding and the corresponding transmission protocol; at the same time, according to the neuron reuse amount, it copies the portion of the input neurons that is to be reused in the next pass and stores it temporarily in the pooling cache module 103 for later use.

Step S30: perform the pooling calculation and output the result.

After the pooling control module 106 has completed the input and temporary storage of the neuron data in step S20, neurons can be loaded from the neuron input interface module 102 and the pooling cache module 103 and spliced into the neurons of a single pooling domain, which are fed to the pooling calculation module 104; the pooling calculation module 104 selects the calculation type according to the control information received from the pooling control module 106 and performs the pooling operation on the neuron data, the control information containing the pooling calculation type corresponding to the current neuron data.

For the current batch of neuron data, if the result obtained by the pooling calculation module 104 is an intermediate result, that intermediate result is stored temporarily; when the next batch of neuron data belonging to the current pooling range enters the pooling device, the intermediate result is input to the neuron input interface module 102 together with that next batch for iterative calculation. After several such iterations, once the result obtained by the pooling calculation module 104 is the final pooling result, it is passed to the neuron output interface module 105 and output to the external module according to the data transmission protocol.
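
A minimal sketch of this iteration (illustrative only; the real module operates on fixed-width hardware data, and the function name is invented) accumulates an intermediate result across the batches of one pooling domain as follows.

    def pool_domain(batches, op="max"):
        """Iteratively pool one domain whose neurons arrive in several batches."""
        partial = None                          # intermediate result kept between batches
        for batch in batches:
            if op == "max":
                cur = max(batch)
                partial = cur if partial is None else max(partial, cur)
            else:                               # average pooling: carry (sum, count)
                s, c = partial if partial is not None else (0, 0)
                partial = (s + sum(batch), c + len(batch))
        if op == "max":
            return partial                      # final pooling result
        s, c = partial
        return s / c

    # 49 neurons arriving as batches of 16, 16, 16 and 1 (the encoding 16-16-16-1).
    neurons = list(range(49))
    batches = [neurons[0:16], neurons[16:32], neurons[32:48], neurons[48:49]]
    print(pool_domain(batches, op="max"))       # 48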

In one embodiment of the present invention, when the pooling calculation module 104 performs a pooling operation and the input neuron data cannot completely fill the input bandwidth of the pooling calculation module 104, the pooling control module 106 can fill the idle positions of the input neuron data according to the type of pooling operation, thereby ensuring the accuracy of the pooling calculation.
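
How the idle positions might be filled is not spelled out in the text, so the sketch below is an assumption: it pads with the identity element of each operation so that the padding cannot change the result.

    def pad_batch(batch, width, op="max"):
        """Fill idle input positions so a partially filled batch still pools correctly."""
        if op == "max":
            fill = float("-inf")   # never larger than any real neuron value
        elif op == "min":
            fill = float("inf")    # never smaller than any real neuron value
        else:
            fill = 0.0             # average pooling: pad with zeros, but divide by the
                                   # real neuron count rather than the padded width
        return list(batch) + [fill] * (width - len(batch))

    print(pad_batch([3, 1, 4], width=8, op="max"))
    # [3, 1, 4, -inf, -inf, -inf, -inf, -inf]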

Fig. 3 is a schematic structural diagram of the pooling device in a preferred embodiment of the present invention. As shown in Fig. 3, the method of pooling neuron data with the pooling device 101 provided by the present invention is illustrated below with a concrete example.

Assume that the input bandwidth of the pooling device 101 is 128 bits, each neuron occupies 8 bits, and the pooling domain has a stride of 2 and a side length of 3. When the pooling device 101 receives the activation signal, the pooling control module 106 starts it and it switches from the dormant state to the pooling calculation state.

First, step S10 is executed: the pooling control module 106 receives the pooling parameters input by the external module, including the pooling domain side length 3, the pooling stride 2, and the pooling operation type. The pooling control module 106 then analyzes these parameters: since the pooling stride is smaller than the pooling side length, it determines that the neuron reuse mechanism must be started; assuming the pooling window moves in a single direction, the reuse amount is 3.

Next, step S20 is executed: the neurons are received and divided into batches according to the number of operation units of the pooling calculation module 104. Assume that 16 neurons are input and the pooling domain contains 9 valid neurons, so a single input transfer can hold all the neurons of one pooling domain. If these neurons form the first pooling window of a row of the input feature map, the number of valid neurons is 9, and an encoding indicating 9 neurons per batch can be generated.

According to this valid-data encoding, the input interface module 102 receives the 128 bits of valid data out of the data (containing both valid and invalid parts) sent by the external module; at the same time, according to the reuse-amount information, the 3 neurons of the input that are to be reused in the next pass are copied and stored in the pooling cache module 103.

Finally, step S30 is executed: the pooling control module 106 splices the 6 newly input neurons with the 3 neurons to be reused from the cache module into the 9 input neurons of the 3x3 pooling domain and transmits the spliced result to the pooling calculation module, which performs the pooling calculation; the final result is then transmitted to the external module through the neuron output interface module 105.
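
The splicing in this example can be replayed in a few lines of illustrative Python (the row-major layout and the variable names are assumptions): after the first 3x3 window of a feature-map row, the next window is rebuilt from 6 freshly received neurons and the 3 neurons held in the cache.

    # Three feature-map rows, wide enough for two 3x3 windows with stride 2.
    rows = [
        [11, 12, 13, 14, 15],
        [21, 22, 23, 24, 25],
        [31, 32, 33, 34, 35],
    ]

    # First window (columns 0-2): all 9 neurons arrive from the input interface.
    first_window = [r[c] for r in rows for c in range(0, 3)]

    # Column 2 overlaps the next window, so its 3 neurons go into the pooling cache.
    reuse_cache = [r[2] for r in rows]

    # Second window (columns 2-4): only 6 new neurons (columns 3-4) are received and
    # spliced with the 3 cached neurons to rebuild a full 3x3 pooling domain.
    new_neurons = [r[c] for r in rows for c in range(3, 5)]
    second_window = reuse_cache + new_neurons
    assert len(second_window) == 9

    print(max(first_window), max(second_window))   # 33 35 (max pooling results)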

In another embodiment of the present invention, suppose the stride of the pooling domain equals its side length, for example both are 7. When the pooling device 101 receives the activation signal and enters the pooling calculation state, the neuron reuse mechanism does not need to be started; that is, without starting the pooling cache module 103, the input neurons can be transmitted directly to the pooling calculation module 104 for the pooling operation. If the neurons of a pooling domain must be pooled in several batches, the intermediate result can be stored temporarily and pooled together with the subsequent batches of neurons until the pooling of all neurons of the given pooling domain is complete and the result is output. For example, if a single pooling domain contains 49 neurons, they can be divided into 4 batches, producing the encoding (16-16-16-1).

Although the Resnet50 neural network model is used as an example above to describe the pooling device and method provided by the present invention, those of ordinary skill in the art should understand that the pooling device and method described here can also be applied to other neural network models.

Compared with the prior art, the pooling device and method suitable for neural networks provided in the embodiments of the present invention use an appropriate reuse strategy so that pooling tasks of different scales can be completed with only a fixed set of pooling operation units, and complete the reuse of neurons with a flexible neuron calling scheme, thereby achieving compatibility of the pooling device; in addition, a startup-and-sleep mechanism is established for the pooling device, reducing its energy consumption.

Although the present invention has been described by way of preferred embodiments, it is not limited to the embodiments described here, and it also covers the various changes and variations made without departing from its scope.

Claims (10)

CN201810014396.3A | 2018-01-08 | 2018-01-08 | A pooling device and method suitable for neural networks | Active | CN108388943B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201810014396.3A | 2018-01-08 | 2018-01-08 | A pooling device and method suitable for neural networks

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201810014396.3A | 2018-01-08 | 2018-01-08 | A pooling device and method suitable for neural networks

Publications (2)

Publication Number | Publication Date
CN108388943A | 2018-08-10
CN108388943B (en) | 2020-12-29

Family

ID=63076734

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201810014396.3A | Active | CN108388943B (en)

Country Status (1)

Country | Link
CN (1) | CN108388943B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN109558564A (en) * | 2018-11-30 | 2019-04-02 | 上海寒武纪信息科技有限公司 | Operation method, device and related product
CN117273102A (en) * | 2023-11-23 | 2023-12-22 | 深圳鲲云信息科技有限公司 | Apparatus and method for pooling accelerators and chip circuitry and computing device

Citations (8)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
JP2015036939A (en) * | 2013-08-15 | 2015-02-23 | 富士ゼロックス株式会社 | Feature extraction program and information processing apparatus
CN106228240A (en) * | 2016-07-30 | 2016-12-14 | 复旦大学 | Deep convolutional neural network implementation method based on FPGA
CN106355244A (en) * | 2016-08-30 | 2017-01-25 | 深圳市诺比邻科技有限公司 | CNN (convolutional neural network) construction method and system
CN106682734A (en) * | 2016-12-30 | 2017-05-17 | 中国科学院深圳先进技术研究院 | Method and apparatus for increasing generalization capability of convolutional neural network
CN106875012A (en) * | 2017-02-09 | 2017-06-20 | 武汉魅瞳科技有限公司 | A pipelined acceleration system for deep convolutional neural networks based on FPGA
CN106940815A (en) * | 2017-02-13 | 2017-07-11 | 西安交通大学 | A programmable convolutional neural network coprocessor IP core
CN107103113A (en) * | 2017-03-23 | 2017-08-29 | 中国科学院计算技术研究所 | Automated design method and device for a neural network processor, and optimization method
US20170300812A1 (en) * | 2016-04-14 | 2017-10-19 | International Business Machines Corporation | Efficient determination of optimized learning settings of neural networks

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
JP2015036939A (en) * | 2013-08-15 | 2015-02-23 | 富士ゼロックス株式会社 | Feature extraction program and information processing apparatus
US20170300812A1 (en) * | 2016-04-14 | 2017-10-19 | International Business Machines Corporation | Efficient determination of optimized learning settings of neural networks
CN106228240A (en) * | 2016-07-30 | 2016-12-14 | 复旦大学 | Deep convolutional neural network implementation method based on FPGA
CN106355244A (en) * | 2016-08-30 | 2017-01-25 | 深圳市诺比邻科技有限公司 | CNN (convolutional neural network) construction method and system
CN106682734A (en) * | 2016-12-30 | 2017-05-17 | 中国科学院深圳先进技术研究院 | Method and apparatus for increasing generalization capability of convolutional neural network
CN106875012A (en) * | 2017-02-09 | 2017-06-20 | 武汉魅瞳科技有限公司 | A pipelined acceleration system for deep convolutional neural networks based on FPGA
CN106940815A (en) * | 2017-02-13 | 2017-07-11 | 西安交通大学 | A programmable convolutional neural network coprocessor IP core
CN107103113A (en) * | 2017-03-23 | 2017-08-29 | 中国科学院计算技术研究所 | Automated design method and device for a neural network processor, and optimization method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
NICOLA CALABRETTA et al.: "Flow controlled scalable optical packet switch for low latency flat data center network", 2013 15th International Conference on Transparent Optical Networks (ICTON) *
CHANG Liang et al.: "Convolutional Neural Networks in Image Understanding", Acta Automatica Sinica *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN109558564A (en) * | 2018-11-30 | 2019-04-02 | 上海寒武纪信息科技有限公司 | Operation method, device and related product
CN109558564B (en) * | 2018-11-30 | 2022-03-11 | 上海寒武纪信息科技有限公司 | Computing method, device and related products
CN117273102A (en) * | 2023-11-23 | 2023-12-22 | 深圳鲲云信息科技有限公司 | Apparatus and method for pooling accelerators and chip circuitry and computing device
CN117273102B (en) * | 2023-11-23 | 2024-05-24 | 深圳鲲云信息科技有限公司 | Apparatus and method for pooling accelerators and chip circuitry and computing device

Also Published As

Publication number | Publication date
CN108388943B (en) | 2020-12-29

Similar Documents

Publication | Publication Date | Title
CN110597616B (en) | Memory allocation method and device for neural network
CN112862112B (en) | Federated learning method, storage medium, terminal, server, and federated learning system
US11087203B2 (en) | Method and apparatus for processing data sequence
CN105184366A (en) | Time-division-multiplexing general neural network processor
CN111831359B (en) | Weight precision configuration method, device, equipment and storage medium
CN112543918A (en) | Neural network segmentation method, prediction method and related device
CN108196929B (en) | Intelligent loading system, method, storage medium and equipment
CN108345934B (en) | An activation device and method for a neural network processor
CN108304925B (en) | A pooling computation device and method
JP2023531538A (en) | Neural network generation method, device and computer readable storage medium
CN110600020B (en) | Gradient transmission method and device
CN111831358A (en) | Weight accuracy configuration method, apparatus, device and storage medium
CN111831354B (en) | Data precision configuration method, device, chip array, equipment and medium
CN115456149B (en) | Spiking neural network accelerator learning method, device, terminal and storage medium
CN108388943A (en) | A pooling device and method suitable for neural networks
CN111860867B (en) | Model training method and system for hybrid heterogeneous system and related device
CN115113814B (en) | Neural network model online method and related device
CN114253550B (en) | Optimization strategy generation method and operator construction method
CN111831356B (en) | Weight precision configuration method, device, equipment and storage medium
WO2025112979A1 (en) | Parallel strategy optimal selection method, and neural network solver training method and apparatus
CN114237864B (en) | A rapid training system and method for artificial intelligence models
CN119227771A (en) | A model training method, device, equipment, system and storage medium
CN115033388A (en) | Method and system for configuring GPU (graphics processing unit) with parallel flow in artificial intelligence system
CN108733739A (en) | Arithmetic device and method supporting beam search
CN113313239A (en) | Artificial intelligence model design optimization method and device

Legal Events

Date | Code | Title | Description
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
