CN112288085B


Info

Publication number: CN112288085B
Application number: CN202011147836.6A
Authority: CN
Inventors: 吴欣欣; 范志华; 轩伟; 李文明; 叶笑春; 范东睿
Original assignee: Institute of Computing Technology of CAS
Current assignee: Institute of Computing Technology of CAS
Priority date: 2020-10-23
Filing date: 2020-10-23
Publication date: 2024-04-09
Anticipated expiration: 2040-10-23
Also published as: CN112288085A

Abstract

The invention provides a convolutional neural network acceleration method and system. An image to be subjected to feature analysis is fed into a convolutional neural network as the input activation, and the weight vector of a filter in the convolutional neural network is decomposed to obtain a symbol (sign) vector corresponding to the weights in the filter. A convolution operation on the symbol vector and the input activation vector yields a first convolution result, a convolution operation on the compensation factor and the input activation vector yields a second convolution result, and the two results are added to obtain a prediction result. When the convolutional neural network performs the convolution computation, the operations related to 0 values are skipped according to the prediction result to obtain the convolution result. The invention predicts the sparsity of the output activations and uses it to guide the original neural network computation to skip operations related to 0 values, thereby reducing the computation of the original network, saving computing resources, reducing power consumption, and improving performance.

Description

Image detection method and system based on convolutional neural network

Technical Field

The present invention relates to computer system architecture, and more particularly to an image detection method and system based on convolutional neural networks.

Background

Neural networks achieve state-of-the-art results in image detection, speech recognition, and natural language processing. As applications grow more complex, neural network models become larger and deeper, which poses many challenges for traditional hardware. To relieve the pressure on hardware resources, sparse networks offer clear advantages in computation, storage, and power consumption. Many algorithms and accelerators for sparse networks have appeared, such as the Sparse BLAS library for CPUs and the cuSPARSE library for GPUs, which accelerate the execution of sparse networks to some extent; dedicated accelerators also perform well in terms of performance and power consumption.

In most deep neural networks (DNNs), the rectified linear unit (ReLU) is widely used at the output of network layers, forcing negative activation data to 0. For the weights, pruning and similar methods exploit the redundancy of the weight data to set some weight values to 0. These methods produce a large number of 0-valued output activations and weights, so a sparse network exhibits both weight sparsity and activation sparsity; modern DNN models have a sparsity of about 50%. Neural network computation consists mainly of multiply-add operations. Because the product of a 0 value and any value is 0, such operations are ineffectual, yet executing them still occupies computing resources. This wastes computing resources and power, prolongs the execution time of the network, and reduces its performance.
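The effect described above can be illustrated with a minimal Python sketch. The activation and weight values are hypothetical and chosen only for illustration; the helper names `relu` and `count_macs` are not from the patent.

```python
def relu(values):
    """Rectified linear unit: negative inputs become 0."""
    return [v if v > 0 else 0.0 for v in values]


def count_macs(inputs, weights):
    """Return (total, ineffectual) multiply-add counts for one dot product.

    A multiply-add is counted as ineffectual when either operand is 0,
    because its product cannot change the accumulated sum.
    """
    total = len(inputs)
    ineffectual = sum(1 for i, w in zip(inputs, weights) if i == 0 or w == 0)
    return total, ineffectual


# Hypothetical pre-activation values of one layer and pruned weights of the next.
pre_activation = [0.4, -1.2, 0.0, 2.5, -0.3, 0.9]
activations = relu(pre_activation)                 # input activations of the next layer
pruned_weights = [0.5, 0.7, -0.2, 0.0, 0.1, -0.6]  # one weight pruned to 0

total, ineffectual = count_macs(activations, pruned_weights)
print(f"{ineffectual} of {total} multiply-adds are ineffectual")
```

In this toy case both activation sparsity (from ReLU) and weight sparsity (from pruning) contribute to the ineffectual operations.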

Disclosure of Invention

To address the large amount of sparse data in neural networks, the invention provides a sparse activation data prediction device that predicts the sparsity of the activation data in advance at a small prediction overhead, in order to guide the original neural network computation. The execution of the neural network is thus divided into two phases: a prediction phase and an execution phase. In the prediction phase, the network computation is performed using the weight signs and the input activation data, and a compensation factor is added to reduce the loss of inference accuracy, producing a prediction of the output activation data. In the execution phase, the predicted output activation data is used so that only the neural network operations associated with positive predicted output activations are executed, and those associated with negative predicted output activations are removed. As a result, the computation of the sparse network is reduced, power consumption is lowered, and the execution performance of the network is improved.

Specifically, to address the shortcomings of the prior art, the invention provides a convolutional neural network acceleration method and system. The method comprises the following steps:

step 1, taking an image to be subjected to feature analysis as the input activation of a convolutional neural network, and decomposing the weight vector of a filter in the convolutional neural network to obtain a symbol (sign) vector corresponding to the weights in the filter;

step 2, performing a convolution operation on the symbol vector and the input activation vector to obtain a first convolution result, performing a convolution operation on the compensation factor and the input activation vector to obtain a second convolution result, and adding the first convolution result and the second convolution result to obtain a prediction result;

step 3, skipping the operations related to 0 values according to the prediction result when the convolutional neural network performs the convolution computation, to obtain a convolution result.

In the convolutional neural network acceleration method, step 3 comprises: determining whether the prediction result contains a value less than or equal to 0; if so, obtaining the vector position of that value in the prediction result, skipping the computation related to that vector position when performing the convolution computation to obtain an activated output result, and setting the value at that vector position in the activated output result to zero to obtain the convolution result.
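The three steps can be sketched as follows in Python. This is a minimal illustration under stated assumptions, not the patented implementation: each output is modelled as a dot product over a flattened input window, ReLU is assumed as the activation, and the function names, weights, and windows are hypothetical.

```python
def symbol_vector(weights):
    """Step 1: sign of each weight, 0 for non-negative and -1 for negative."""
    return [-1 if w < 0 else 0 for w in weights]


def predict(window, weights, a):
    """Step 2: first result (signs) plus second result (compensation factor)."""
    signs = symbol_vector(weights)
    first = sum(i * s for i, s in zip(window, signs))
    second = sum(i * a for i in window)
    return first + second


def predict_then_execute(windows, weights, a):
    """Step 3: skip the exact computation where the prediction is <= 0."""
    outputs, skipped = [], 0
    for window in windows:
        if predict(window, weights, a) <= 0:
            outputs.append(0.0)      # predicted zero output: no multiply-adds
            skipped += 1
        else:
            exact = sum(i * w for i, w in zip(window, weights))
            outputs.append(max(exact, 0.0))   # assumed ReLU activation
    return outputs, skipped


# Hypothetical filter weights and flattened input windows.
weights = [0.5, -0.25, 0.75, -1.0]
windows = [[1, 1, 1, 1], [0, 4, 0, 4], [2, 0, 2, 0]]
outputs, skipped = predict_then_execute(windows, weights, a=0.5)
print(outputs, skipped)
```

For these inputs the first two windows are predicted as non-positive and skipped, and their exact ReLU outputs would indeed be 0, so no accuracy is lost in this particular case. In general the prediction is approximate, which is why the choice of compensation factor matters.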

In the convolutional neural network acceleration method, step 1 comprises: taking the high-order bits of the weight values in the filter of the convolutional neural network as the symbol vector.

In the convolutional neural network acceleration method, the compensation factor is greater than 0 and less than 1.

In the convolutional neural network acceleration method, the convolution computation is given by the following formula:

O＝∑I*W

where W is the weight of the filter, I is the input activation, and O is the convolution result.

The invention also provides a convolutional neural network acceleration system, comprising:

module 1, which takes an image to be subjected to feature analysis as the input activation of a convolutional neural network, and decomposes the weight vector of a filter in the convolutional neural network to obtain a symbol (sign) vector corresponding to the weights in the filter;

module 2, which performs a convolution operation on the symbol vector and the input activation vector to obtain a first convolution result, performs a convolution operation on the compensation factor and the input activation vector to obtain a second convolution result, and adds the first convolution result and the second convolution result to obtain a prediction result;

module 3, which skips the operations related to 0 values according to the prediction result when the convolutional neural network performs the convolution computation, to obtain a convolution result.

In the convolutional neural network acceleration system, module 3 is configured to: determine whether the prediction result contains a value less than or equal to 0; if so, obtain the vector position of that value in the prediction result, skip the computation related to that vector position when performing the convolution computation to obtain an activated output result, and set the value at that vector position in the activated output result to zero to obtain the convolution result.

In the convolutional neural network acceleration system, module 1 is configured to: take the high-order bits of the weight values in the filter of the convolutional neural network as the symbol vector.

In the convolutional neural network acceleration system, the compensation factor is greater than 0 and less than 1.

In the convolutional neural network acceleration system, the convolution computation is given by the following formula:

O＝∑I*W

The technical advance of the invention is a prediction method and system that predict the sparsity of the output activations and use the prediction to guide the original neural network computation to skip operations related to 0 values, thereby reducing the computation of the original network, saving computing resources, reducing power consumption, and improving performance.

Drawings

FIG. 1 is a diagram of the predictor and executor framework based on weight signs;

FIG. 2 is a detailed block diagram of the predictor and executor apparatus based on weight signs;

FIG. 3 is a flowchart of a prediction phase;

FIG. 4 is a flowchart of the execution phase.

Detailed Description

In order to make the above features and effects of the present invention more clearly understood, the following specific examples are given with reference to the accompanying drawings.

The prediction method comprises the following steps:

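The computation of a convolution layer of the neural network is given by the following formula:

$$
\begin{aligned}
O &= \sum I * W \\
  &= \sum I * \big((W_{msb} \ll m) + W_{lsb}\big) \\
  &= \sum I * \big(W_{msb} \cdot 2^{m} + W_{lsb}\big) \\
  &= \sum I * 2^{m} \cdot \big(W_{msb} + W_{lsb} \cdot 2^{-m}\big) \\
  &= \sum 2^{m} \cdot \big(I * W_{msb} + I * W_{lsb} \cdot 2^{-m}\big) \\
  &\approx \sum 2^{m} \cdot \big(I * W_{msb} + I * W_{1} \cdot a\big)
\end{aligned}
$$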

The convolution layer applies the filter weights (W) to the input feature map (I) to extract feature information from the input. Since the filter weights can be decomposed into high-order bits (W_msb) and low-order bits (W_lsb), the convolution operation is divided into two parts: the computation of the input with the high-order filter bits, and the computation of the input with the low-order filter bits. W₁ is a matrix of all 1s whose size is the same as that of the input activation, a is the compensation factor, and O is the convolution result. In a neural network, the filter weights are known parameters. The input image is also called the input activation, and the output feature map obtained after the first convolution layer is also called the output activation. The output feature map of the first layer is in turn the input to the next convolution layer. The neural network is executed layer by layer, with the result of each layer serving as the input to the next.
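The exact part of the decomposition (before the approximation step) can be checked numerically. The following Python sketch assumes 8-bit two's-complement integer weights and m = 7; the patent text does not state the bit width or the value of m, so both are assumptions, as are the function names and sample values.

```python
M = 7  # assumed number of low-order bits (8-bit signed weights)


def split_weight(w, m=M):
    """Split integer w into (w_msb, w_lsb) so that w == (w_msb << m) + w_lsb."""
    w_msb = w >> m               # arithmetic shift: 0 if w >= 0, -1 if w < 0 (for m = 7)
    w_lsb = w & ((1 << m) - 1)   # low m bits, always in [0, 2**m - 1]
    return w_msb, w_lsb


def conv_exact(inputs, weights):
    """Reference dot product O = sum(I * W)."""
    return sum(i * w for i, w in zip(inputs, weights))


def conv_decomposed(inputs, weights, m=M):
    """Same dot product computed as 2**m * sum(I * W_msb) + sum(I * W_lsb)."""
    high = 0
    low = 0
    for i, w in zip(inputs, weights):
        w_msb, w_lsb = split_weight(w, m)
        high += i * w_msb
        low += i * w_lsb
    return (high << m) + low


# Hypothetical integer input activations and 8-bit weights.
inputs = [3, 0, 5, 2]
weights = [17, -128, -3, 127]

msb_parts = [split_weight(w)[0] for w in weights]
print(msb_parts)                          # sign information only
print(conv_exact(inputs, weights))
print(conv_decomposed(inputs, weights))
```

Under these assumptions the high-order part is 0 for non-negative weights and -1 for negative weights, and W_lsb * 2^-m lies in [0, 1). This appears consistent with the sign values and the compensation factor range given elsewhere in the text, although the text itself does not spell out this connection.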

Prediction based on the weight sign bits uses only the highest bits (the sign bits) of the weights to perform a convolution operation that determines whether the final output is 0 or non-0, while the compensation factor a compensates for the resulting loss of accuracy. Determining the compensation factor requires running the neural network with different candidate values in the range 0 to 1; different values affect the accuracy of the result differently, and the best compensation factor is the one with the least impact on accuracy. Assuming a is 0.5, I*W_msb and I*W₁*0.5 are computed. If their sum is negative, the output activation value after the activation function (ReLU) is 0; otherwise, the output activation value is positive. Based on the prediction, the original convolution computation performs only the convolution operations whose predicted result is positive and skips those whose predicted result is negative.

On this basis, the convolution operation is divided into two stages: a prediction stage and an execution stage. In the prediction stage, the prediction device uses the weight sign bits and the compensation factor to predict the output activations. In the execution stage, only the operations for non-0 output activations are executed, according to the prediction result. The prediction device is shown in FIG. 1.

A detailed prediction and execution apparatus based on weight sign prediction is shown in FIG. 2. A convolution operation is performed between the weight signs in the filter and the input activations. If a weight is positive, its sign is 0; if a weight is negative, its sign is -1. Since 0 multiplied by any number is 0, only the operations whose weight sign is -1 are performed: these weights are located through the weight index, and the associated input activations are located through the weight sign index. At the same time, the input activations are combined with the compensation factor a. The two results are added to obtain the predicted output sign. If the predicted value is negative, the output activation becomes 0 after the activation function; if it is positive, the output activation is unchanged by the activation function. On this basis, the index computation unit calculates the indices associated with the non-0 output activations from the sign of the predicted value, the weight index, and the input activation index. According to this index information, the execution stage performs the operations for the non-0 output activations and directly outputs 0 for the 0 output activations.
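A minimal Python sketch of this index-driven prediction is given below. It only models the data flow in software; the function names, the flattened-window representation, and the sample values are hypothetical and do not describe the actual hardware units of FIG. 2.

```python
def weight_sign_index(weights):
    """Positions whose weight is negative (sign -1); other positions have sign 0."""
    return [k for k, w in enumerate(weights) if w < 0]


def predict_output(window, neg_index, a):
    """Predicted value for one output: sign part plus compensation part."""
    sign_part = -sum(window[k] for k in neg_index)  # only the sign -1 terms are computed
    comp_part = a * sum(window)                     # input activations times factor a
    return sign_part + comp_part


def nonzero_output_index(windows, weights, a):
    """Indices of outputs predicted to be non-0 (prediction strictly positive)."""
    neg_index = weight_sign_index(weights)
    return [n for n, window in enumerate(windows)
            if predict_output(window, neg_index, a) > 0]


# Hypothetical filter weights and flattened input windows.
weights = [0.3, -0.6, 0.2, -0.1]
windows = [
    [0.2, 0.9, 0.1, 0.4],
    [0.8, 0.1, 0.6, 0.1],
    [0.5, 0.5, 0.5, 0.5],
]

neg_index = weight_sign_index(weights)
keep = nonzero_output_index(windows, weights, a=0.5)
print(neg_index, keep)
```

Because positions with sign 0 contribute nothing, the sign part needs only additions over the indexed input activations, which is what keeps the prediction overhead small in this sketch.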

The process of predicting the output activations based on weight signs is described below in detail in connection with the execution of a convolution. In this example, the filter size is 2×2, the input activation (Ifmap) size is 4×4, and the compensation factor is assumed to be 0.5.

Step one: obtain the sign of each weight from the filter weights, where the sign is 0 when the weight is greater than 0 and -1 when the weight is less than 0.

Step two: perform a convolution operation between the weight signs and the input activation (Ifmap), in which only the weights with sign -1 and the corresponding Ifmap values take part in the multiply-add operations. As shown in FIG. 3, the convolution results are -0.7, -1, -0.8, and -0.3. At the same time, perform a convolution operation between the compensation factor and the input activation (Ifmap); the convolution results are 0.8, 0.8, 0.85, and 0.95.

Step three: add the respective results to obtain the predicted output data, shown in FIG. 3 as 0.1, -0.2, 0.05, and 0.65.

Step four: in the execution stage, skip the operations whose predicted output activation value is negative, according to the prediction result. In FIG. 3, the result -0.2 is negative, so the convolution operations associated with this value can be skipped in the execution stage. As shown in FIG. 4, the executor only needs to perform the convolution operations for 3 output activation values, which are 0.045, 0.145, and 0.475 respectively. The output activation whose predicted value is -0.2 is directly output as 0.
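The arithmetic of steps two to four can be reproduced with the values quoted from FIG. 3 and FIG. 4. The Python sketch below rounds the sums to two decimals to avoid floating-point noise, and it assumes that the three executed values are listed in output order; the figures themselves are not available here to confirm that ordering.

```python
# Values quoted in the text for FIG. 3.
sign_conv = [-0.7, -1.0, -0.8, -0.3]   # convolution of weight signs with Ifmap
comp_conv = [0.8, 0.8, 0.85, 0.95]     # convolution of compensation factor with Ifmap

# Step three: predicted output data (rounded to two decimals).
predicted = [round(s + c, 2) for s, c in zip(sign_conv, comp_conv)]

# Step four: positions whose predicted value is <= 0 are skipped.
skip_positions = [k for k, p in enumerate(predicted) if p <= 0]

# Values quoted for FIG. 4, assumed to be listed in output order.
executed_values = iter([0.045, 0.145, 0.475])
output = [0.0 if k in skip_positions else next(executed_values)
          for k in range(len(predicted))]

print(predicted)
print(skip_positions)
print(output)
```

Only one of the four outputs is skipped in this example, so a quarter of the output computations are avoided.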

As shown in FIG. 4, it is determined whether the prediction result contains a value less than or equal to 0. If so, the vector position of that value in the prediction result is obtained, and the computation related to that vector position is skipped when the convolution computation is performed; that is, the computations drawn with white fill in the figure are skipped and only the convolutions drawn with gray fill are computed. This yields an activated output result, and the value at that vector position in the activated output result is set to zero to obtain the convolution result.
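The execution stage can be sketched as a 2D convolution that consults a set of skipped output positions. This Python sketch rests on several assumptions not stated in the text: a stride of 2 (chosen so that a 4×4 input and a 2×2 filter give four outputs, as in the example), ReLU as the activation, and hypothetical input, filter, and skip values. The skip set is taken as given, as if produced by the prediction stage.

```python
def conv2d_with_skip(ifmap, kernel, skip, stride):
    """Convolve ifmap with kernel, skipping the output positions listed in skip.

    skip is a set of (row, col) output positions predicted to be <= 0.
    Returns the activated output and the number of multiply-adds performed.
    """
    k = len(kernel)
    out_size = (len(ifmap) - k) // stride + 1
    output = [[0.0] * out_size for _ in range(out_size)]
    macs = 0
    for r in range(out_size):
        for c in range(out_size):
            if (r, c) in skip:
                continue             # skipped position keeps the value 0
            acc = 0.0
            for dr in range(k):
                for dc in range(k):
                    acc += ifmap[r * stride + dr][c * stride + dc] * kernel[dr][dc]
                    macs += 1
            output[r][c] = max(acc, 0.0)   # assumed ReLU activation
    return output, macs


# Hypothetical 4x4 input activation and 2x2 filter.
ifmap = [
    [3, 2, 0, 4],
    [0, 1, 1, 0],
    [2, 0, 1, 1],
    [1, 1, 0, 2],
]
kernel = [[1.0, -0.5], [0.5, 1.0]]

dense_out, dense_macs = conv2d_with_skip(ifmap, kernel, set(), stride=2)
sparse_out, sparse_macs = conv2d_with_skip(ifmap, kernel, {(0, 1)}, stride=2)
print(dense_out, dense_macs)
print(sparse_out, sparse_macs)
```

With these values the skipped position has a negative exact result, so the skipped and the full computation give the same activated output while the skipped version performs 12 instead of 16 multiply-adds.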

The following is an example of a system corresponding to the above method embodiment, and the present implementation system may be implemented in cooperation with the above embodiment. The details of the related art mentioned in the foregoing embodiments are still valid in the present implementation system, and in order to reduce repetition, details are not repeated here. Accordingly, the related technical details mentioned in the present embodiment system can also be applied to the above-described embodiments.


Claims

1. An image detection method based on a convolutional neural network is characterized by comprising the following steps:

step 3, when the convolutional neural network executes convolutional calculation on the image, skipping 0 value related operation according to the prediction result to obtain a convolutional result;

the step 3 comprises the following steps: judging whether a numerical value smaller than or equal to 0 exists in the predicted result, if so, acquiring a vector position of the numerical value smaller than or equal to 0 in the predicted result, skipping a calculation process related to the vector position when performing convolution calculation to obtain an activated output result, and setting the numerical value located at the vector position in the activated output result to zero to obtain the convolution result.

2. The convolutional neural network-based image detection method of claim 1, wherein the step 1 comprises: taking the high-order bits of the weight values in the filter of the convolutional neural network as the symbol vector.

3. The convolutional neural network-based image detection method of claim 1, wherein the compensation factor has a value range greater than 0 and less than 1.

4. The convolutional neural network-based image detection method of claim 1, wherein the convolutional calculation is as shown in the following equation:

O＝∑I*W

5. An image detection system based on a convolutional neural network, comprising:

module 1, which takes an image to be detected as the input activation of a convolutional neural network, and decomposes the weight vector of a filter in the convolutional neural network to obtain a symbol vector corresponding to the weights in the filter;

module 3, which skips the operations related to 0 values according to the prediction result when the convolutional neural network performs the convolution computation on the image, to obtain a convolution result;

the module 3 comprises: judging whether a numerical value smaller than or equal to 0 exists in the predicted result, if so, acquiring a vector position of the numerical value smaller than or equal to 0 in the predicted result, skipping a calculation process related to the vector position when performing convolution calculation to obtain an activated output result, and setting the numerical value located at the vector position in the activated output result to zero to obtain the convolution result.

6. The convolutional neural network-based image detection system of claim 5, wherein the module 1 comprises: taking the high-order bits of the weight values in the filter of the convolutional neural network as the symbol vector.

7. The convolutional neural network-based image detection system of claim 5, wherein the compensation factor has a range of values greater than 0 and less than 1.

8. The convolutional neural network-based image detection system of claim 5, wherein the convolutional calculation is as shown in the following equation:

O＝∑I*W