CN114707650B - Simulation implementation method for improving simulation efficiency - Google Patents

Simulation implementation method for improving simulation efficiency

Info

Publication number
CN114707650B
CN114707650B (application CN202210321357.4A; published as CN114707650A)
Authority
CN
China
Prior art keywords
neural network
point characteristic
simulation
file
folder
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210321357.4A
Other languages
Chinese (zh)
Other versions
CN114707650A (en)
Inventor
朱旭东
吴春选
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Xinmai Microelectronics Co ltd
Original Assignee
Zhejiang Xinmai Microelectronics Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Xinmai Microelectronics Co ltd
Priority to CN202210321357.4A
Publication of CN114707650A
Application granted
Publication of CN114707650B
Legal status: Active
Anticipated expiration

Abstract

The application discloses a simulation implementation method for improving simulation efficiency, relating to the technical field of deep learning. The method comprises the following steps: the quantization set picture quantizes the neural network model through the neural network compiler to generate an executable file; the ten-thousand-person test set generates first input data, a first fixed-point characteristic file and a floating-point characteristic file through the neural network compiler; and if the statistical result of the precision table falls within a preset precision range, the executable file and the first input data are read to simulate the neural network model. The method realizes batch simulation of a plurality of different types of neural network models, ensures the correctness of porting to a chip or FPGA, simulates the different types of neural network models layer by layer so as to cover more simulation verification points and reduce the risk of chip tape-out, and at the same time performs comprehensive accuracy verification through the precision table summarizing the neural network models.

Description

Simulation implementation method for improving simulation efficiency
This application is a divisional application of the application filed on December 31, 2021, with application number 202111653883.2, which discloses a simulation implementation method based on a neural network compiler, the neural network compiler, and a computer-readable storage medium.
Technical Field
The application belongs to the technical field of deep learning, and particularly relates to a simulation implementation method for improving simulation efficiency.
Background
With the development of Internet technology, the massive data collected provide sufficient scenarios for deep-learning training. The development of intelligent algorithms, chiefly convolutional neural networks, depends on such massive data, and their accuracy in fields such as image classification and object recognition has exceeded human recognition accuracy.
For neural network algorithms to be deployed in the security field, the algorithm model trained on a server must be parsed into a computer language that an embedded chip can recognize, which facilitates the installation and monitoring of security cameras.
A convolutional neural network algorithm implemented on a CPU (Central Processing Unit) or GPU (Graphics Processing Unit) is ported to an FPGA (Field-Programmable Gate Array) or a chip for convenient, portable installation. The computing power achievable on a CPU cannot meet current requirements, a GPU implementation cannot be applied to embedded devices, and the 32-bit floating-point forward process is carried out in Python or C++. To reduce chip area and thus cost without losing precision, the model is quantized to an 8-bit fixed-point implementation, and the FPGA or chip is implemented in Verilog (a hardware description language, HDL). The whole neural network model therefore needs to be simulated to verify whether its precision meets the requirements.
The prior technical solutions have the following defects. First, only each intermediate layer of a neural network model can be simulated, the information of each layer must be manually looked up and configured into a file beforehand, the precision test over a ten-thousand-person test set cannot be performed, and the range distribution of different data sets is not simulated. Second, when switching to other types of neural network models or to test sets of different scenes, the operational correctness of the chip or FPGA cannot be guaranteed, which increases chip tape-out cost; furthermore, the neural network is not quantized and floating-point multipliers are used, which degrades operational performance.
Disclosure of Invention
The application aims to provide a simulation implementation method for improving simulation efficiency, so as to solve the technical problems in the prior art that only each intermediate layer of a neural network model can be simulated and that the precision test over a ten-thousand-person test set cannot be performed.
In order to achieve the technical purpose, the application adopts the following technical scheme:
A simulation implementation method for improving simulation efficiency comprises the following steps:
A neural network compiler is constructed and used for receiving quantized set pictures, a plurality of different types of neural network models and a ten-thousand-person test set, and after the neural network compiler performs accuracy verification, the neural network models are simulated layer by layer;
the quantization set picture quantizes the neural network model through the neural network compiler to generate an executable file, and the ten-thousand-person test set generates first input data, a first fixed-point characteristic file and a floating-point characteristic file through the neural network compiler;
comparing the first fixed point characteristic file with the floating point characteristic file, and outputting a precision table summarizing the statistics of the neural network model;
and if the statistical result of the precision table accords with a preset precision range, reading the executable file and the first input data to simulate the neural network model.
Preferably, the method further comprises the steps of:
Building an environment of the neural network compiler, installing the neural network compiler, and testing whether the neural network compiler is successfully installed;
the building environment of the neural network compiler is set to be the same operating system as that of the simulation system.
Preferably, the quantization set picture quantizes the neural network model through the neural network compiler to generate an executable file, which specifically includes the following steps:
preparing different types of neural network models and quantized set pictures in different scenes;
operating the neural network compiler, and quantizing the neural network model according to the quantized set picture to generate the executable file;
the executable file comprises a neural network name identifier, a layer identifier of an input layer, a layer identifier of an intermediate layer, a layer identifier of an output layer, a quantized weight value, a quantized offset value, a layer operation name, layer parameter information, layer association information and layer memory information.
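As an illustrative sketch only (not the patent's actual on-disk format), the information the executable file is said to contain could be represented as:

```python
# Hypothetical sketch of the fields the executable file carries; every
# field name and value here is an assumption for illustration, not the
# patent's actual binary layout.
executable_file = {
    "network_name": "resnet",           # neural network name identifier
    "input_layer_ids": [0],             # layer identifier of the input layer
    "intermediate_layer_ids": [1, 2],   # layer identifiers of the intermediate layers
    "output_layer_ids": [3],            # layer identifier of the output layer
    "quantized_weights": [120, -37],    # quantized (fixed-point) weight values
    "quantized_offsets": [4096],        # quantized offset values
    "layers": [
        {
            "op_name": "convolution",                            # layer operation name
            "params": {"kernel": 3, "stride": 1, "padding": 1},  # layer parameter information
            "links": {"inputs": [0], "outputs": [2]},            # layer association information
            "memory": {"size": 4096, "reuse": None},             # layer memory information
        },
    ],
}
```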
Preferably, the method further comprises the steps of:
Presetting the number of the neural network models, setting the initial cycle count to 0, and judging whether the cycle count matches the preset number of the neural network models;
If the cycle count does not match the preset number of the neural network models, the quantization set picture quantizes the neural network models through the neural network compiler to generate the executable file, and the ten-thousand-person test set generates the first input data, the first fixed-point characteristic file and the floating-point characteristic file through the neural network compiler;
And if the cycle count matches the preset number of the neural network models, ending the flow.
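The cycle-count loop in the steps above can be sketched as follows; the compiler API (`quantize`, `forward`) and the stub class are illustrative assumptions, not the patent's actual interfaces:

```python
class StubCompiler:
    """Stand-in for the patent's neural network compiler (hypothetical API)."""

    def quantize(self, model):
        # Quantization-set pictures quantize the model into an executable file.
        return f"executable:{model}"

    def forward(self, model, test_set):
        # The test set produces input data and fixed/floating-point characteristic files.
        return ("input_data", "fixed_point_file", "float_point_file")


def simulate_all(models, compiler, test_set):
    """Batch-process a preset number of neural network models."""
    preset_count = len(models)   # preset number of neural network models
    loop_count = 0               # initial cycle count set to 0
    results = []
    while loop_count != preset_count:  # cycle count vs. preset model count
        model = models[loop_count]
        executable = compiler.quantize(model)
        inputs, fixed_pt, float_pt = compiler.forward(model, test_set)
        results.append((executable, inputs, fixed_pt, float_pt))
        loop_count += 1          # add 1 each time an executable file is processed
    return results               # count matches the preset number: end the flow
```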
Preferably, the ten thousand person test set generates first input data, a first fixed point characteristic file and a floating point characteristic file through the neural network compiler, and specifically comprises the following steps:
preparing different ten-thousand-person test sets according to different neural network models;
The ten-thousand-person test set generates first input data at the network resolution through a scaling function, and the ten-thousand-person test set is simulated to generate a first fixed-point characteristic file and a floating-point characteristic file.
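A minimal stand-in for the scaling function — a nearest-neighbour rescale in pure Python — is shown below; how the patent's scaling function actually resamples pictures to the network resolution is an assumption:

```python
def scale_to_network_resolution(image, target_h, target_w):
    """Nearest-neighbour rescale of a 2-D pixel grid to the network's
    input resolution (illustrative stand-in for the scaling function)."""
    src_h, src_w = len(image), len(image[0])
    return [
        [image[(y * src_h) // target_h][(x * src_w) // target_w]
         for x in range(target_w)]
        for y in range(target_h)
    ]
```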
Preferably, comparing the first fixed point feature file with the floating point feature file, outputting a precision table for counting the neural network model, and specifically comprising the following steps:
the floating point characteristic file comprises first floating point characteristic data, the fixed point characteristic data in the first fixed point characteristic file is converted into floating point characteristic data, and second floating point characteristic data is generated;
comparing the similarity of the first floating point characteristic data and the second floating point characteristic data, and if the similarity is within a preset variable, meeting the precision requirement; if the similarity is not in the preset variable, the accuracy requirement is not met;
And outputting the similarity statistical result of the first floating point characteristic data and the second floating point characteristic data in a form of a table.
Preferably, if the statistical result of the precision table accords with a preset precision range, the executable file and the first input data are read to simulate the neural network model, and the method specifically includes the following steps:
counting the precision table, wherein the counting result is required to accord with a preset precision range;
Reading the executable file, configuring hardware according to the executable file, reading the first input data, starting simulation of the neural network model according to the first input data, and generating a second fixed-point characteristic file;
and comparing the first fixed point characteristic file with the second fixed point characteristic file, and if the first fixed point characteristic file and the second fixed point characteristic file are different, storing error data in the second fixed point characteristic file.
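The element-wise comparison of the two fixed-point characteristic files, with mismatches kept as error data, might look like this (the structure of an error record is an assumption):

```python
def compare_fixed_point(first, second):
    """Compare compiler-generated and simulation-generated fixed-point
    characteristic data element-wise; mismatches are collected as error
    data (record layout is illustrative)."""
    errors = [
        {"index": i, "expected": a, "actual": b}
        for i, (a, b) in enumerate(zip(first, second))
        if a != b
    ]
    return errors  # an empty list means the simulation matched exactly
```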
Preferably, the method further comprises the steps of:
establishing a first folder, and automatically generating a first main folder under the first folder, wherein the first main folder is used for storing the executable files;
automatically generating a first sub-folder under a first folder, wherein the first sub-folder is used for storing the first fixed-point characteristic file;
and automatically generating an input data folder under a first folder, wherein the input data folder is used for storing the first input data.
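Using the folder names given later in the embodiment (SPE_PATH1, data_input, conv1_1 — the exact layout below is an assumption), the automatic folder creation could be sketched as:

```python
import os


def create_simulation_folders(root, network_name):
    """Create the first folder, first main folder (executable files),
    first sub-folder (fixed-point characteristic files) and input-data
    folder described in the text; names are illustrative."""
    first_folder = os.path.join(root, "SPE_PATH1")
    main_folder = os.path.join(first_folder, network_name)    # executable files
    feature_folder = os.path.join(main_folder, "conv1_1")     # first fixed-point characteristic files
    data_folder = os.path.join(main_folder, "data_input")     # first input data
    for path in (main_folder, feature_folder, data_folder):
        os.makedirs(path, exist_ok=True)
    return main_folder, feature_folder, data_folder
```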
Preferably, different types of neural network models and quantized set pictures are prepared, and the method specifically comprises the following steps of:
and establishing a second folder, and generating a second main folder under the second folder, wherein the second main folder is used for storing the neural network models of different types, the quantized set pictures and the floating point characteristic files.
Preferably, different ten thousand person test sets are prepared according to different neural network models, and the method specifically comprises the following steps:
and establishing a second auxiliary folder under the second main folder, wherein the second auxiliary folder is used for storing the ten-thousand-person test set.
A neural network compiler, applied to the above simulation implementation method for improving simulation efficiency, comprises: a network analysis module, a network quantization module, a network merging module, a network storage module and a network forward execution module, which are connected in sequence;
The network analysis module is used for receiving the quantized set pictures, the multiple different types of neural network models and the ten-thousand-person test set, analyzing and reconstructing the structure of the neural network models layer by layer, and at least acquiring one of the input layer, the output layer and the layer operation name, the layer parameter information and the layer association information of the middle layer of the neural network model;
the network quantization module is used for generating an offset value, a conversion value and converting a floating point type weight value into a fixed point type weight value according to the reconstructed neural network model;
The network merging module is used for merging the pipeline operation instructions of the convolution layer, the pooling layer and the activation layer in the neural network model;
The network storage module is used for storing the data in the network analysis module, the network quantization module and the network merging module to generate an executable file;
The network forward execution module is used for generating the first input data, the first fixed-point characteristic file and the floating-point characteristic file from the ten-thousand-person test set, comparing the first fixed-point characteristic file with the floating-point characteristic file, and outputting a precision table summarizing the neural network model.
A computer readable storage medium having stored thereon computer instructions which when executed by a processor perform the steps of the method described above.
The beneficial effects provided by the application are as follows:
1. The quantization set pictures are used by the neural network compiler to quantize different neural network models and generate different executable files; if the statistical result of the precision table falls within the preset precision range, the executable file and the first input data are read to simulate the neural network model. This realizes batch simulation of a plurality of different types of neural network models, takes various edge cases into account in the simulation, and ensures the correctness of the neural network models ported to a chip or FPGA. Hardware is configured through the executable files, and simulation is conducted layer by layer for the different types of neural network models, covering more simulation verification points, reducing the risk of chip tape-out, saving cost and improving simulation efficiency; meanwhile, comprehensive accuracy verification is performed through the precision table summarizing the neural network models.
2. The number of neural network models is preset, the initial cycle count is set to 0, and whether the cycle count matches the preset number of neural network models is judged. Checking the number of neural network models saves the time of generating the executable file, the first input data, the first fixed-point characteristic file and the floating-point characteristic file, and avoids the time consumed by quantizing the neural network models in the forward process. The different generated data are automatically stored under different folders through pre-stored paths, providing the corresponding data for simulating multiple types of neural network models, simplifying the simulation flow and speeding up simulation.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments or the description of the prior art will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flowchart of a simulation implementation method for improving simulation efficiency in embodiment 1.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. The components of the embodiments of the present application generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the application, as presented in the figures, is not intended to limit the scope of the application, as claimed, but is merely representative of selected embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
Example 1:
As shown in fig. 1, the present embodiment includes a simulation implementation method for improving simulation efficiency, including the following steps:
And constructing a neural network compiler which is used for receiving the quantized set pictures, the neural network models of different types and the ten-thousand-person test set, and simulating the neural network models layer by layer after the neural network compiler performs accuracy verification.
The quantized set picture quantizes the neural network model through a neural network compiler to generate an executable file, and the ten-thousand-person test set generates first input data, a first fixed-point characteristic file and a floating-point characteristic file through the neural network compiler.
Comparing the first fixed point characteristic file with the floating point characteristic file, and outputting a precision table for the statistical neural network model; if the statistical result of the precision table accords with the preset precision range, the executable file and the first input data are read to simulate the neural network model.
The method realizes batch simulation of a plurality of different types of neural network models, takes various edge cases into account in the simulation, and ensures the correctness of the neural network models ported to a chip or FPGA. Hardware is configured through the executable files, and simulation is conducted layer by layer for the different types of neural network models, covering more simulation verification points, reducing the risk of chip tape-out, saving cost and improving simulation efficiency; meanwhile, comprehensive accuracy verification is performed through the precision table summarizing the neural network models.
The method also comprises the steps of: and setting up the environment of the neural network compiler, installing the neural network compiler, and testing whether the neural network compiler is successfully installed, wherein the setting up environment of the neural network compiler is set to be the same operating system as the simulation system. Specifically, the neural network compiler is packaged into whl format, whl format is a compressed file format, and the installation test under the operating system is convenient.
The quantization set picture quantizes the neural network model through a neural network compiler to generate an executable file, and specifically comprises the following steps: different types of neural network models and quantized set pictures in different scenes are prepared.
And operating a neural network compiler, and quantizing the neural network model according to the quantized set picture to generate an executable file. The executable file comprises a neural network name identifier, a layer identifier of an input layer, a layer identifier of an intermediate layer, a layer identifier of an output layer, a quantized weight value, a quantized offset value, a layer operation name, layer parameter information, layer association information and layer memory information.
Specifically, the network analysis module of the neural network compiler analyzes and reconstructs the structure of the original neural network model layer by layer, generates offset values and conversion values according to the reconstructed neural network model, and converts floating point type weight values into fixed point type weight values. The network merging module and the network quantifying module operate simultaneously to merge pipeline operation instructions in a convolution layer, a pooling layer and an activation layer in the neural network model. And the network storage module generates executable files from the quantized data operated by the network analysis module, the network quantization module and the network merging module.
The formula for generating the offset value is as follows:
Formula one: x'_m = (x'_max − x'_min) × 2^bw
where x'_m denotes the offset value, x'_max denotes the maximum floating-point weight value, x'_min denotes the minimum floating-point weight value, and bw denotes the converted bit width; in this embodiment, a bit width of 12 bits is currently supported.
The formula for generating the conversion value is as follows:
Formula two: f = max(bw − ceil(log2(x'_m) + 1), bw)
where f denotes the conversion value, max denotes the maximum-value built-in function of the system library, bw denotes the converted bit width, log2 denotes the base-2 logarithm built-in function of the system library, x'_m denotes the offset value, and ceil denotes the round-up built-in function of the system library.
A floating-point weight value is converted into a fixed-point weight value; the formula for converting floating-point characteristic data into fixed-point characteristic data is expressed as follows:
Formula three: X = round(X_float × 2^f) + x'_m
where X denotes fixed-point characteristic data (in this embodiment, a fixed-point weight value), X_float denotes floating-point characteristic data (in this embodiment, a floating-point weight value), round denotes the rounding built-in function of the system library, f denotes the conversion value, and x'_m denotes the offset value.
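Formulas one to three transcribe directly into Python as below; note that the `max(..., bw)` in formula two follows the patent text verbatim, which keeps f no smaller than bw:

```python
import math


def offset_value(x_max, x_min, bw=12):
    """Formula one: x'_m = (x'_max - x'_min) * 2^bw."""
    return (x_max - x_min) * (2 ** bw)


def conversion_value(x_m, bw=12):
    """Formula two: f = max(bw - ceil(log2(x'_m) + 1), bw)."""
    return max(bw - math.ceil(math.log2(x_m) + 1), bw)


def to_fixed_point(x_float, f, x_m):
    """Formula three: X = round(X_float * 2^f) + x'_m."""
    return round(x_float * (2 ** f)) + x_m
```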
Specifically, the layer operation names include at least one of convolution, deconvolution, pooling, full join, culling, join, point addition, point multiplication, normalization, and activation layer operations. The layer parameter information includes at least one of a convolution kernel size, a convolution kernel span, a grouping, a padding value, whether to bring an active layer, a quantized weight value, and a quantized offset value. The layer association information includes at least one of an input layer operation name of the current layer, layer parameter information, an output layer operation name of the current layer, and layer parameter information. The intra-layer memory information includes at least one of a memory size of a current layer and whether to multiplex the memories of other layers.
Specifically, the different types of neural network models include a detection network, an identification network, a classification network and the like, and there are at least 50 quantized set pictures for the different scenes.
The method also comprises the steps of: presetting the number of neural network models, setting the initial cycle count to 0, and judging whether the cycle count matches the preset number of neural network models.
If the cycle count does not match the preset number of neural network models, the quantized set pictures quantize the neural network models through the neural network compiler to generate executable files, and the ten-thousand-person test set generates the first input data, the first fixed-point characteristic file and the floating-point characteristic file through the neural network compiler.
If the cycle count matches the preset number of neural network models, the flow ends. The cycle count is incremented by 1 each time an executable file is simulated.
By judging the number of the neural network models, the time for generating the executable file, the first input data, the first fixed point characteristic file and the floating point characteristic file is saved, and the time consumption of quantification of the neural network models in the forward process is avoided.
The ten-thousand-person test set generates first input data, a first fixed-point characteristic file and a floating-point characteristic file through a neural network compiler, and specifically comprises the following steps:
according to different neural network models, different ten-thousand-person test sets are prepared, the ten-thousand-person test sets generate first input data with the network resolution through a scaling function, and simulation is carried out on the ten-thousand-person test sets to generate a first fixed-point feature file and a floating-point feature file.
Specifically, the ten-thousand-person test sets are picture sets, each containing ten thousand pictures, and the ten-thousand-person test set generates the first input data, the first fixed-point characteristic file and the floating-point characteristic file through the network forward execution module.
The method also comprises the steps of: and establishing a first folder, automatically generating a first main folder under the first folder, wherein the first main folder is used for storing executable files.
And automatically generating a first sub-folder under the first folder, wherein the first sub-folder is used for storing the first fixed-point characteristic file. And automatically generating an input data folder under the first folder, wherein the input data folder is used for storing the first input data.
Preparing different types of neural network models and quantized set pictures, and specifically comprising the following steps of: and establishing a second folder, and generating a second main folder under the second folder, wherein the second main folder is used for storing different types of neural network models, quantized set pictures and floating point characteristic files.
According to different neural network models, different ten thousand person test sets are prepared, and the method specifically comprises the following steps: and establishing a second auxiliary folder under the second main folder, wherein the second auxiliary folder is used for storing the ten-thousand-person test set.
Specifically, under the current PATH, a first folder and a second folder are established, the file name of the first folder is defined as SPE_PATH1, the file name of the second folder is defined as SPE_PATH2, under the SPE_PATH2 file, a second main folder named by the name of the neural network is established to store the neural network model and quantized set pictures generated by the GPU, and a second auxiliary folder is established under the second main folder to store the ten-thousand-person test set.
And generating a second main folder named by the neural network name under the SPE_PATH1 file by the neural network compiler every time an executable file is generated, and storing the executable file generated by the neural network compiler.
An input data folder is automatically generated under the SPE_PATH1 file. In this embodiment, the neural network name parsed by the neural network compiler is defined as resnet, so the file name of the generated input data folder is defined as SPE_PATH1/resnet/data_input; it stores the first input data generated from the ten-thousand-person test set at the network resolution through the scaling function, stored in hexadecimal format with one value per line for the convenience of simulation.
A first sub-folder is automatically generated under the SPE_PATH1 file. With the parsed neural network name resnet, the network layer name conv1_1 and the layer serial number 1, the file name of the generated first sub-folder is defined as SPE_PATH1/resnet/conv1_1; it stores the first fixed-point characteristic files generated by the intermediate layers and the output layer when the ten-thousand-person test set is simulated, so that the simulation can check the correctness of the data, with one value per line in hexadecimal format.
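Writing a characteristic or input-data file with one hexadecimal value per line, as described above, can be sketched as follows (the two-digit field width and two's-complement masking are assumptions):

```python
def write_hex_lines(path, values, width=2):
    """Write each fixed-point value on its own line in hexadecimal, the
    layout the simulation expects; field width is an assumption."""
    mask = (1 << (4 * width)) - 1  # masks negatives to two's complement
    with open(path, "w") as fh:
        for v in values:
            fh.write(f"{v & mask:0{width}x}\n")
```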
The generated different data are automatically stored under different folders through pre-stored paths, corresponding data are provided for realizing simulation of multiple types of neural network models, the simulation flow is simplified, and the simulation efficiency is accelerated.
The method also comprises the steps of: and presetting the number of executable files, and judging whether the number of the executable files under the first main folder exceeds the number of the preset executable files.
If the number of the executable files under the first main folder does not exceed the number of the preset executable files, the neural network compiler simulates the ten-thousand-person test set to generate a first fixed-point characteristic file.
If the number of the executable files under the first main folder exceeds the number of the preset executable files, ending the process of simulating the ten-thousand-person test set by the neural network compiler.
And determining whether the ten-thousand-person test set is simulated by judging the number of executable files under the first main folder, and ending the simulation flow if the simulation is finished, so that the simulation efficiency is improved.
Comparing the first fixed point characteristic file with the floating point characteristic file and outputting a precision table for the neural network model statistics specifically comprises the following steps:
the floating point characteristic file comprises first floating point characteristic data; the fixed point characteristic data in the first fixed point characteristic file are converted into floating point characteristic data, generating second floating point characteristic data;
the similarity of the first floating point characteristic data and the second floating point characteristic data is compared: if the similarity is within the preset variable, the precision requirement is met; if not, the precision requirement is not met;
the similarity statistics of the first floating point characteristic data and the second floating point characteristic data are output in the form of a table.
Specifically, the fixed point characteristic data in the first fixed point characteristic file is converted into floating point characteristic data through a conversion formula, wherein the conversion formula is as follows:
Equation four: x'_float = (X - x'_m) / 2^f
Where x'_float represents the floating point characteristic data, which in this embodiment may be the second floating point characteristic data; X represents the fixed point characteristic data, which in this embodiment may be the fixed point characteristic data in the first fixed point characteristic file; x'_m represents the offset value; and f represents the conversion value.
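Equation four can be exercised with a small sketch; the sample offset and conversion values below are made up for illustration.

```python
def to_float(x_fixed, offset, f):
    """Equation four: x'_float = (X - x'_m) / 2**f, converting fixed point
    characteristic data back to floating point."""
    return (x_fixed - offset) / (1 << f)

# e.g. with offset value 0 and conversion value 7 (illustrative numbers),
# fixed-point values map back to floats
second_float = [to_float(x, 0, 7) for x in (64, -32, 128)]
```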
Specifically, the similarity between the first floating point feature data and the second floating point feature data is compared, and the similarity distance formula is as follows:
Formula five:
Where n represents the total number of floating point feature data, xi represents the first floating point feature data, and yi represents the second floating point feature data, i.e., the value of x'float in equation four. θ represents the similarity of distances, and closer to 1 indicates higher accuracy.
In this embodiment, a ten-thousand-person test set corresponding to a neural network model is tested with the preset variable set to a similarity distance of 0.8. The similarity of the first floating point characteristic data and the second floating point characteristic data is compared, that is, the similarity of each picture in the ten-thousand-person test set is counted; a similarity distance greater than or equal to 0.8 indicates that the precision requirement is met. The proportion of qualifying pictures for each neural network model over the ten-thousand-person test set is counted, and a precision table for the neural network model statistics is output. The statistical results of the precision table show intuitively whether the hardware design meets the precision requirement.
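The similarity distance of formula five can be sketched assuming a cosine-style measure (consistent with the description that θ is computed over n feature values and approaches 1 as accuracy rises; the exact formula is an assumption), together with the 0.8 preset-variable statistics of this embodiment:

```python
import math

def similarity(x, y):
    """Similarity distance theta between two feature vectors; assumed here to
    be cosine similarity, so values closer to 1 mean higher accuracy."""
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return num / den

def precision_ratio(pairs, threshold=0.8):
    """Fraction of test pictures whose similarity meets the preset variable."""
    hits = sum(1 for x, y in pairs if similarity(x, y) >= threshold)
    return hits / len(pairs)
```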
If the statistical result of the precision table accords with the preset precision range, reading the executable file and the first input data to simulate the neural network model, wherein the method specifically comprises the following steps:
the precision table is checked, and its statistical result is required to fall within the preset precision range. The executable file is read, the hardware is configured according to the executable file, the first input data are read, and simulation of the neural network model is started according to the first input data to generate a second fixed point characteristic file.
And comparing the first fixed point characteristic file with the second fixed point characteristic file, and if the first fixed point characteristic file and the second fixed point characteristic file are different, storing error data in the second fixed point characteristic file.
The simulation problem can be conveniently located through the error data in the second fixed-point characteristic file, the simulation efficiency can be improved, and the simulation coverage is wider.
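The file comparison can be sketched as follows; the files are taken to hold one hexadecimal value per line as described above, and the error-record layout is an assumption for illustration.

```python
def diff_fixed_point(lines_a, lines_b):
    """Compare the first and second fixed point characteristic files (given as
    lists of hex lines) and collect the mismatching entries as error data for
    locating the simulation problem."""
    errors = []
    for i, (a, b) in enumerate(zip(lines_a, lines_b)):
        if a != b:
            errors.append({"line": i, "expected": a, "actual": b})
    return errors
```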
Example 2:
This embodiment provides a neural network compiler applied to the simulation implementation method for improving simulation efficiency of Example 1. The compiler comprises: a network analysis module, a network quantization module, a network merging module, a network storage module and a network forward execution module which are connected in sequence.
The network analysis module is used for receiving the quantization set pictures, the different types of neural network models and the ten-thousand-person test set, analyzing and reconstructing the structure of each neural network model layer by layer, and acquiring at least one of the input layer, the output layer, and the layer operation names, layer parameter information and layer association information of the intermediate layers of the neural network model.
Specifically, the network analysis module analyzes the structure of the original neural network model layer by layer and acquires at least one of the input layer, the output layer, and the layer operation names, layer parameter information and layer association information of the intermediate layers. After analysis, the internally executed sequential structure is reconstructed and the data structures of the internal relevant network layers are redefined; the network layers comprise a convolution layer, a pooling layer and an activation layer. Content such as the layer execution sequence, layer operation type, layer operation name, layer parameter information and layer association information is filled into the data structures of the internal relevant network layers.
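The redefined internal layer data structure can be sketched as a simple record; the class and field names are illustrative assumptions, not the patent's actual definitions.

```python
from dataclasses import dataclass, field

@dataclass
class LayerInfo:
    """Internal data structure filled in by the network analysis module."""
    exec_order: int          # layer execution sequence
    op_type: str             # layer operation type, e.g. "conv", "pool", "relu"
    op_name: str             # layer operation name, e.g. "conv1_1"
    params: dict = field(default_factory=dict)   # layer parameter information
    inputs: list = field(default_factory=list)   # layer association information

conv1 = LayerInfo(1, "conv", "conv1_1", {"kernel": 3, "stride": 1}, ["data"])
```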
And the network quantization module is used for generating an offset value, a conversion value and converting a floating point type weight value into a fixed point type weight value according to the reconstructed neural network model.
Specifically, floating point characteristic data of the storage address space is converted into a data format supported by hardware, and conversion values are calculated, so that the calculated amount of hardware and the number of multipliers are reduced.
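The conversion-value calculation and weight quantization can be sketched with a common power-of-two scheme; the exact rule and the 8-bit width are assumptions, as the text does not spell them out.

```python
import math

def conversion_value(weights, bits=8):
    """Pick the conversion value f so the largest-magnitude weight fits in a
    signed fixed point number of the given bit width (assumed scheme)."""
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0:
        return bits - 1
    return (bits - 1) - math.ceil(math.log2(max_abs))

def quantize(weights, f):
    """Convert floating point weights to fixed point: X = round(x * 2**f)."""
    return [int(round(w * (1 << f))) for w in weights]
```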
And the network merging module is used for merging the running water operation instructions of the convolution layer, the pooling layer and the activation layer in the neural network model.
Specifically, following the principle of reducing external memory bandwidth, the pipeline operation instructions in the convolution layer, the pooling layer and the activation layer are optimized: equivalent transformation optimization is performed on these layers, and the internal data structures are optimized and merged again, which reduces resource consumption and improves execution efficiency. Data interaction between the internal memory and the external memory is reduced, improving bandwidth utilization, and layers in the same pipeline stage are merged; the layers mainly merged are the convolution layer and the pooling layer.
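Merging layers in the same pipeline stage can be sketched as a pass over the layer list; this simplification only fuses adjacent convolution and pooling layers and ignores the instruction rewriting a real compiler performs.

```python
def merge_conv_pool(layers):
    """Merge each adjacent ("conv", "pool") pair into one fused pipeline stage
    to cut data interaction with the external memory."""
    merged = []
    i = 0
    while i < len(layers):
        if (layers[i]["type"] == "conv" and i + 1 < len(layers)
                and layers[i + 1]["type"] == "pool"):
            merged.append({"type": "conv+pool",
                           "name": layers[i]["name"] + "+" + layers[i + 1]["name"]})
            i += 2
        else:
            merged.append(layers[i])
            i += 1
    return merged
```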
And the network storage module is used for storing the data in the network analysis module, the network quantization module and the network merging module to generate an executable file.
The network forward execution module is used for generating the first input data, the first fixed point characteristic file and the floating point characteristic file from the ten-thousand-person test set, comparing the first fixed point characteristic file with the floating point characteristic file, and outputting the precision table for the neural network model statistics.
Specifically, the standardization part is implemented by adopting an open-source deep learning architecture so as to ensure the correct comparison standard, and the simulation part keeps the forward logic of the network consistent with the logic of the hardware execution network so as to ensure the consistency of the simulation result of the data and the hardware.
For related details, see the description of Example 1.
Example 3:
a computer readable storage medium having stored thereon computer instructions which, when executed by a processor, perform the steps of the method of Example 1.
It will be apparent to those skilled in the art that embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal device to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal device, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It should be noted that:
Reference in the specification to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the application. Thus, the appearances of the phrase "one embodiment" or "an embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment.
While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the application.
In addition, the specific embodiments described in the present specification may differ in terms of parts, shapes of components, names, and the like. All equivalent or simple changes of the structure, characteristics and principle according to the inventive concept are included in the protection scope of the present application. Those skilled in the art may make various modifications or additions to the described embodiments or substitutions in a similar manner without departing from the scope of the application as defined in the accompanying claims.

Claims (9)

CN202210321357.4A | 2021-12-31 | 2021-12-31 | Simulation implementation method for improving simulation efficiency | Active | CN114707650B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202210321357.4A (CN114707650B (en)) | 2021-12-31 | 2021-12-31 | Simulation implementation method for improving simulation efficiency

Applications Claiming Priority (2)

Application Number | Priority Date | Filing Date | Title
CN202111653883.2A (CN114004352B (en)) | 2021-12-31 | 2021-12-31 | Simulation implementation method, neural network compiler and computer readable storage medium
CN202210321357.4A (CN114707650B (en)) | 2021-12-31 | 2021-12-31 | Simulation implementation method for improving simulation efficiency

Related Parent Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202111653883.2A (Division; CN114004352B (en)) | Simulation implementation method, neural network compiler and computer readable storage medium | 2021-12-31 | 2021-12-31

Publications (2)

Publication Number | Publication Date
CN114707650A (en) | 2022-07-05
CN114707650B (en) | 2024-06-14

Family

ID=79932421

Family Applications (3)

Application NumberTitlePriority DateFiling Date
CN202111653883.2AActiveCN114004352B (en)2021-12-312021-12-31Simulation implementation method, neural network compiler and computer readable storage medium
CN202210315323.4AActiveCN114676830B (en)2021-12-312021-12-31Simulation implementation method based on neural network compiler
CN202210321357.4AActiveCN114707650B (en)2021-12-312021-12-31Simulation implementation method for improving simulation efficiency

Family Applications Before (2)

Application NumberTitlePriority DateFiling Date
CN202111653883.2AActiveCN114004352B (en)2021-12-312021-12-31Simulation implementation method, neural network compiler and computer readable storage medium
CN202210315323.4AActiveCN114676830B (en)2021-12-312021-12-31Simulation implementation method based on neural network compiler

Country Status (1)

Country | Link
CN (3) | CN114004352B (en)



Also Published As

Publication Number | Publication Date
CN114004352B (en) | 2022-04-26
CN114004352A (en) | 2022-02-01
CN114676830A (en) | 2022-06-28
CN114676830B (en) | 2024-06-14
CN114707650A (en) | 2022-07-05


Legal Events

Date | Code | Title | Description
PB01 | Publication
SE01 | Entry into force of request for substantive examination
CB02 | Change of applicant information
Country or region after: China
Address after: 311400 4th floor, building 9, Yinhu innovation center, No.9 Fuxian Road, Yinhu street, Fuyang District, Hangzhou City, Zhejiang Province
Applicant after: Zhejiang Xinmai Microelectronics Co.,Ltd.
Address before: 311400 4th floor, building 9, Yinhu innovation center, No.9 Fuxian Road, Yinhu street, Fuyang District, Hangzhou City, Zhejiang Province
Applicant before: Hangzhou xiongmai integrated circuit technology Co.,Ltd.
Country or region before: China
GR01 | Patent grant
