Indirect supervised training method for a spiking neural network
Technical Field
The invention relates to the technical field of spiking neural networks, and in particular to an indirect supervised training method for a spiking neural network.
Background
Whether in an ANN or an SNN, training (learning) is achieved by adjusting the connection weights between neurons, which corresponds to synaptic plasticity in biological neural networks. The weight-adjustment algorithm is crucial to the learning of artificial neural networks. In the ANN, back propagation with gradient descent has achieved great success, but in the SNN back propagation is no longer directly applicable; the conflict is mainly reflected in two points. First, in a spiking neural network the activation function of the ANN is replaced by a weighted sum of discrete spikes; the spikes can be regarded as Dirac delta functions and have no derivative, so the back-propagation algorithm cannot be applied in the SNN. The second problem is biological plausibility, also known as the weight transport problem, which exists in both the ANN and the SNN: the computation of the back-propagation algorithm requires the weight values of the forward connections, yet such backward connections do not exist in biological systems, making back propagation biologically implausible.
At present, no universally accepted training algorithm has emerged for spiking neural networks. Existing training algorithms can be classified into unsupervised learning and supervised learning according to whether labels are used.
(1) Unsupervised learning
The spiking neural network adopts a structure closer to that of the biological neural network. Although it cannot use the back-propagation algorithm that is so widely applied in the ANN, it can apply learning rules with biological interpretability, whose biological basis is Spike-Timing-Dependent Plasticity (STDP). Its main characteristic is that the connection weight between a pre-synaptic and a post-synaptic neuron is adjusted according to the relative firing times of the two neurons (on the order of 10 ms), with the standard pairwise mathematical approximation:

Δω = A⁺·exp(-Δt/τ) when Δt > 0, and Δω = -A⁻·exp(Δt/τ) when Δt < 0, with Δt = t_post - t_pre

where Δω represents the amount of change in the weight and τ represents the time-window constant. The weight becomes larger when the pre-synaptic neuron fires shortly before the post-synaptic neuron and smaller when it fires after it; the magnitude of the change is governed by the hyperparameters τ, A⁺ and A⁻, which play a role similar to the learning rate in gradient descent. Unsupervised learning methods designed with the STDP rule perform well at feature extraction.
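A minimal Python sketch of this pairwise STDP update; the hyperparameter values TAU, A_PLUS and A_MINUS are illustrative assumptions, not values prescribed by this document:

```python
import numpy as np

TAU = 10.0       # time-window constant, in ms
A_PLUS = 0.01    # potentiation amplitude
A_MINUS = 0.012  # depression amplitude

def stdp_delta_w(t_pre: float, t_post: float) -> float:
    """Weight change for a single pre/post spike pair (times in ms)."""
    dt = t_post - t_pre
    if dt > 0:    # pre-synaptic spike precedes the post-synaptic spike
        return A_PLUS * np.exp(-dt / TAU)
    if dt < 0:    # pre-synaptic spike follows the post-synaptic spike
        return -A_MINUS * np.exp(dt / TAU)
    return 0.0

# A pre-spike at 12 ms followed by a post-spike at 15 ms strengthens
# the synapse; the reversed order weakens it.
print(stdp_delta_w(12.0, 15.0))  # ~ +0.0074
print(stdp_delta_w(15.0, 12.0))  # ~ -0.0089
```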
(2) Supervised learning
SpikeProp is the earliest learning algorithm to adopt error back propagation in a spiking neural network. It uses the spike response model as the neuron model, treats the change of a neuron's activation state as a linear increase over a very short time, requires each neuron to emit only a single spike, and defines the error as the mean square error of the firing times of the output neurons. Learning algorithms such as ReSuMe and SPAN emerged later; they are characterized by one neuron receiving the inputs of multiple neurons and being trained to produce a desired spike-time sequence.
In deep spiking neural networks, supervised learning can be divided into indirect and direct categories. Indirect supervised learning means first training an ANN and then converting it into an SNN; the labels are used for supervised training only during the training of the ANN. The core idea is to interpret the continuous activation values in the ANN as spike firing frequencies in the SNN. Studies in this direction cover constraints on the ANN structure, conversion methods, and so on. For direct supervised learning, some solutions to the conflict between back propagation and the SNN have been proposed; the non-differentiability problem is generally addressed with an approximate, differentiable surrogate function. A study on the weight transport problem found that using random weights instead of the forward weights in back propagation did not significantly affect the results. It should be noted, however, that direct supervised training is still less accurate than indirect supervised training.
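To make the core idea concrete, the following sketch rate-codes a continuous activation value as a spike train whose mean firing rate approximates that value; the Bernoulli (per-step firing probability) coding and the assumption that activations are normalized to [0, 1] are illustrative choices, not details fixed by this document:

```python
import numpy as np

rng = np.random.default_rng(0)

def activation_to_spike_train(a: float, n_steps: int = 1000) -> np.ndarray:
    """Rate-code an ANN activation a (assumed in [0, 1]) as a binary
    spike train: each time step fires with probability a."""
    return (rng.random(n_steps) < a).astype(np.uint8)

train = activation_to_spike_train(0.3)
print(train.mean())  # ~ 0.3, the firing rate recovers the activation
```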
Disclosure of Invention
In order to solve the above problems, the invention provides an indirect supervised training method for a spiking neural network, which applies the indirect supervision mode of the ANN to the SNN.
In order to solve the technical problems, the technical scheme provided by the invention is as follows: an indirect supervised training method of an impulse neural network (ANN), which comprises a method for converting the Artificial Neural Network (ANN) into the impulse neural network (SNN), and the method for converting the Artificial Neural Network (ANN) into the impulse neural network (SNN) comprises the following steps:
the method comprises the following steps: ReLU () is selected as an activation function;
step two: training after setting all biases in the ANN to zero;
step three: and (5) fixing the weight.
Finally, the steps of generating the SNN logical network are summarized as follows (see the integrate-and-fire sketch after step four):
step one: training an ANN by the back-propagation (BP) algorithm according to the set network structure and hyper-parameters to obtain the input weights of all neurons;
step two: converting the multiply-add operation between weights and inputs in the second-generation neuron model into the addition operation of the third-generation neuron model, wherein the addition is triggered by the arrival of a pulse and the ID carried by the pulse is checked to determine whether the firing neuron is connected to the current neuron;
step three: converting the nonlinear activation process of the second-generation neuron model into a threshold judgment: when the membrane potential is greater than the threshold, a new pulse is generated and the membrane potential is reset to zero; otherwise the membrane potential remains unchanged;
step four: fixing all weights.
Further, the ReLU () activation function is different from Sigmoid () and Tanh (), and the output of the ReLU () activation function is a non-negative value, so that the problem that the activation value is a negative number can be solved. Meanwhile, when the input is greater than 0, the ReLU () activation function is linear, and the characteristic can reduce the performance loss from ANN to SNN to a certain extent.
Compared with the prior art, the invention has the following advantages: unlike Tanh(), the output of the ReLU() activation function is non-negative, which solves the problem of negative activation values; and ReLU() is linear when the input is greater than 0, which reduces the performance loss from ANN to SNN to a certain extent.
Drawings
FIG. 1 is a schematic diagram comparing the ANN numeric output with the SNN pulse output for the indirect supervised training method of the spiking neural network according to the present invention.
FIG. 2 is a graph of the Sigmoid() and Tanh() functions for the indirect supervised training method of the spiking neural network according to the present invention.
FIG. 3 is a graph of the ReLU() function for the indirect supervised training method of the spiking neural network according to the present invention.
FIG. 4 is a schematic diagram of the ANN-to-SNN conversion for the indirect supervised training method of the spiking neural network according to the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings.
The transition from ANN to SNN using indirect supervision faces several problems, including:
First, the activation values of the neurons in each layer of a traditional neural network can be either positive or negative, where a negative activation value means inhibition of the neurons in the next layer, and this inhibition is difficult to express accurately in a spiking neural network. The main reasons for this problem are the following (see the numeric example after item (3)):
(1) The input to the network may be negative
Generally, the input of an artificial neural network is preprocessed, and a common preprocessing method is normalization, that is, mapping the input data into the range -1 to 1 through a transformation. The purpose of this is, on the one hand, to increase the generalization capability of the network and, on the other hand, to accelerate the convergence of the network during training.
(2) The multiply-add operation
In an artificial neural network, a neuron converts its inputs into an output through a multiply-add operation of the inputs with the weights, plus a bias, followed by an activation function; both the weights and the bias may be negative.
(3) The activation function
Of the nonlinear activation functions commonly used by artificial neural networks, Sigmoid() outputs values in the range 0 to 1, while Tanh() outputs values in the range -1 to 1 and can therefore produce negative activations.
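The following small numeric example illustrates the three sources of negative values just listed; all numbers are illustrative:

```python
import numpy as np

# (1) A normalized input may be negative: map raw values into [-1, 1].
raw = np.array([0.0, 128.0, 255.0])
x = raw / 127.5 - 1.0                 # first component is -1.0

# (2) Weights and the bias may themselves be negative, so the
# multiply-add can produce a negative pre-activation.
w, b = np.array([0.5, -0.8, 0.2]), -0.1
z = w @ x + b                         # negative for these values

# (3) Tanh() maps into (-1, 1), so the activation can be negative.
print(np.tanh(z))                     # < 0 in this example
```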
The second problem is that the spiking neural network cannot represent a bias in the way the artificial neural network does. In the ANN, each operation of a neuron adds the multiply-add of the inputs and weights to a bias and then passes the result through an activation function; in the SNN, however, the operation of a neuron is converted into pulse triggering, and every time a new pulse arrives the corresponding weight is added to the activation level, so the bias cannot be represented, as the short sketch below illustrates.
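A brief contrast of the two neuron operations; the function names are hypothetical:

```python
import numpy as np

def ann_neuron(x, w, b):
    # ANN: multiply-add of inputs and weights, plus a bias, then ReLU.
    return max(0.0, float(np.dot(w, x) + b))

def snn_neuron_step(potential, incoming_weight):
    # SNN: an arriving pulse simply adds its weight to the membrane
    # potential; there is no per-step term into which a bias could go.
    return potential + incoming_weight
```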
Third, the weights of a trained ANN are usually floating-point numbers, which are difficult to process on an FPGA.
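One common way to handle this, sketched below under the assumption of simple uniform quantization (the document itself only states the problem), is to convert the floating-point weights to signed fixed-point integers:

```python
import numpy as np

def to_fixed_point(w: np.ndarray, frac_bits: int = 8) -> np.ndarray:
    """Quantize floating-point weights to signed fixed-point integers
    with frac_bits fractional bits (uniform quantization sketch)."""
    scale = 1 << frac_bits
    return np.round(w * scale).astype(np.int32)

w = np.array([0.731, -0.215, 0.002])
w_fx = to_fixed_point(w)       # integers an FPGA can process directly
print(w_fx, w_fx / 256.0)      # [187 -55 1], close to the originals
```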
Examples
With reference to FIGS. 1 to 4 of the accompanying drawings, an indirect supervised training method of a spiking neural network (SNN) comprises a method for converting an artificial neural network (ANN) into the spiking neural network (SNN), and the conversion method comprises the following steps:
step one: selecting ReLU() as the activation function;
step two: setting all biases in the ANN to zero and then training;
step three: fixing the weights.
Finally, the steps of generating the SNN logical network are summarized as follows (an end-to-end sketch follows step four):
the method comprises the following steps: according to the set network structure and the set hyper-parameters, training an ANN (artificial neural network) by adopting a BP (back propagation) algorithm to obtain input weights of all neurons;
step two: converting the multiplication and addition operation between the weight and the input in the second generation neuron model into the addition operation of the third generation neuron model, wherein the addition is triggered by the arrival of the pulse, and judging whether the ID of the pulse is connected with the current neuron;
step three: converting the nonlinear activation process in the second generation neuron model into threshold judgment, generating new pulse and setting the membrane potential to zero when the membrane potential is greater than the threshold, otherwise, keeping the membrane potential unchanged;
step four: all weights are fixed.
Unlike Sigmoid () and Tanh (), the output of the ReLU () activation function is a non-negative value, which can solve the problem that the activation value is negative. Meanwhile, when the input is greater than 0, the ReLU () activation function is linear, and the characteristic can reduce the performance loss from ANN to SNN to a certain extent.
The present invention and its embodiments have been described above, but the description is not limiting; what is shown in the drawings is only one embodiment of the present invention, and the actual structure is not limited thereto. In summary, those skilled in the art can, without departing from the spirit and scope of the invention as defined by the appended claims, design or modify other structures for carrying out the same purposes on the basis of the disclosed conception and specific embodiments.