CN108734288B - Operation method and device - Google Patents

Operation method and device

Info

Publication number
CN108734288B
CN108734288B (application CN201710269049.0A)
Authority
CN
China
Prior art keywords
data
model
neural network
processed
input
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710269049.0A
Other languages
Chinese (zh)
Other versions
CN108734288A (en)
Inventor
Inventor not announced
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Cambricon Information Technology Co Ltd
Original Assignee
Shanghai Cambricon Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority to CN201710269049.0A (CN108734288B)
Application filed by Shanghai Cambricon Information Technology Co Ltd
Priority to CN202410405915.4A (CN118690805A)
Priority to EP19214320.4A (EP3654172A1)
Priority to EP19214371.7A (EP3786786B1)
Priority to EP18788355.8A (EP3614259A4)
Priority to KR1020197038135A (KR102258414B1)
Priority to CN201880000923.3A (CN109121435A)
Priority to CN201811097653.0A (CN109376852B)
Priority to PCT/CN2018/083415 (WO2018192500A1)
Priority to US16/476,262 (US11531540B2)
Priority to JP2019549467A (JP6865847B2)
Priority to KR1020197025307A (KR102292349B1)
Publication of CN108734288A
Priority to US16/697,637 (US11720353B2)
Priority to US16/697,727 (US11698786B2)
Priority to US16/697,533 (US11531541B2)
Priority to US16/697,687 (US11734002B2)
Priority to JP2019228383A (JP6821002B2)
Application granted
Publication of CN108734288B
Legal status: Active (Current)
Anticipated expiration


Abstract

Translated from Chinese

An operation method and device. The operation device includes an input module for inputting data; a model generation module for constructing a model from the input data; a neural network operation module for generating and caching operation instructions based on the model and for operating on the data to be processed according to those instructions to obtain an operation result; and an output module for outputting the operation result. The disclosed device and method avoid the extra overhead of running the entire software architecture that traditional methods require.

Figure 201710269049

Description

Operation method and device
Technical Field
The present disclosure relates to the fields of computer architecture, deep learning, and neural networks, and more particularly to an operation method and apparatus.
Background
Deep learning is a branch of machine learning that attempts to model high-level abstractions in data using multiple processing layers that contain complex structures or are composed of multiple nonlinear transformations.
Deep learning is a representation-learning approach within machine learning. An observation (e.g., an image) can be represented in many ways, such as a vector of per-pixel intensity values, or more abstractly as a set of edges, regions of particular shapes, and so on. Certain tasks (e.g., face recognition or facial expression recognition) are easier to learn from examples under certain specific representations.
Several deep learning architectures, such as deep neural networks, convolutional neural networks, deep belief networks, and recurrent neural networks, have been applied in computer vision, speech recognition, natural language processing, audio recognition, and bioinformatics, and have achieved excellent results. Indeed, "deep learning" has become a near-synonym for, or a rebranding of, neural networks.
With the rise of deep learning (neural networks), neural network accelerators have also emerged. Through dedicated memory and operation-module designs, a neural network accelerator can achieve speedups of tens or even hundreds of times over a general-purpose processor on deep learning workloads, with smaller area and lower power consumption.
To make it easier to apply neural network accelerators to diverse network architectures, programming libraries and programming frameworks built on them have been, and are still being, developed. In a conventional application scenario, the programming framework of a neural network accelerator sits at the uppermost layer; commonly used frameworks include Caffe, TensorFlow, and Torch. As shown in fig. 1, the stack runs from the bottom layer to the upper layer as follows: the neural network accelerator (dedicated hardware for neural network operations), the hardware driver (which lets software call the accelerator), the accelerator's programming library (which provides an interface for calling the accelerator), the accelerator's programming framework, and the high-level application that needs to perform neural network operations. In application scenarios with little memory and strong real-time requirements, running this whole software architecture consumes excessive computing resources. How to streamline the operation flow for such specific application scenarios is therefore one of the problems to be solved.
Disclosure of Invention
Based on the above problems, the present disclosure provides an operation method and device to solve at least one of them.
In order to achieve the above object, as one aspect of the present disclosure, the present disclosure proposes an arithmetic method comprising the steps of:
when the input data comprise data to be processed, a network structure, and weight data, the following steps are executed:
step 11, inputting and reading the input data;
step 12, constructing an offline model according to the network structure and the weight data;
step 13, parsing the offline model to obtain operation instructions, and caching them for subsequent computation calls;
step 14, operating on the data to be processed according to the operation instructions to obtain an operation result for output;
when the input data comprise data to be processed and an offline model, the following steps are executed:
step 21, inputting and reading the input data;
step 22, parsing the offline model to obtain operation instructions, and caching them for subsequent computation calls;
step 23, operating on the data to be processed according to the operation instructions to obtain an operation result for output;
when the input data comprise only data to be processed, the following steps are executed:
step 31, inputting and reading the input data;
step 32, calling the cached operation instructions, and operating on the data to be processed to obtain an operation result for output.
Further, the step of operating on the data to be processed according to the operation instructions to obtain the operation result is implemented by a neural network processing unit.
Further, the neural network processing unit has an instruction cache unit for caching the operation instructions for subsequent computation calls.
Further, the offline model may be any of various neural network models, including Cambricon_model, AlexNet_model, GoogleNet_model, VGG_model, R-CNN_model, GAN_model, LSTM_model, RNN_model, ResNet_model, and the like.
Further, the data to be processed is any input that a neural network can process.
Further, the data to be processed includes continuous single pictures, voice, or a video stream.
Further, the network structure may be any of various neural network structures, including AlexNet, GoogleNet, ResNet, VGG, R-CNN, GAN, LSTM, and RNN.
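To make the three-branch control flow above concrete, the following is a minimal sketch in Python. Every name in it (NNProcessor, build_offline_model, parse_to_instructions, operate) is a hypothetical stand-in chosen for illustration; the disclosure does not define a programming interface.

```python
class NNProcessor:
    """Stand-in for the neural network processing unit with an instruction cache."""
    def __init__(self):
        self.instruction_cache = None

    def run(self, instructions, data):
        self.instruction_cache = list(instructions)   # cache for later calls
        return self.run_cached(data)

    def run_cached(self, data):
        if self.instruction_cache is None:
            raise RuntimeError("instruction cache is empty")
        result = data
        for insn in self.instruction_cache:           # "execute" each instruction
            result = insn(result)
        return result


def build_offline_model(network_structure, weight_data):
    # Stand-in: serialize structure + weights into an offline model object.
    return {"structure": network_structure, "weights": weight_data}


def parse_to_instructions(offline_model):
    # Stand-in: parsing an offline model yields executable operation instructions.
    return [lambda x: x]  # a single identity "instruction" for illustration


def operate(processor, inputs):
    if {"data", "structure", "weights"} <= inputs.keys():      # steps 11-14
        model = build_offline_model(inputs["structure"], inputs["weights"])
        return processor.run(parse_to_instructions(model), inputs["data"])
    if {"data", "offline_model"} <= inputs.keys():             # steps 21-23
        return processor.run(parse_to_instructions(inputs["offline_model"]),
                             inputs["data"])
    return processor.run_cached(inputs["data"])                # steps 31-32
```

Calling operate first with structure, weights, and data, and thereafter with data alone, exercises the build-parse-cache path once and then the cached path, mirroring the three cases above.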
In order to achieve the above object, as another aspect of the present disclosure, the present disclosure proposes an arithmetic device comprising:
the input module is used for inputting data, and the data comprises data to be processed, network structure and weight data and/or offline model data;
the model generation module is used for constructing an offline model according to the input network structure and the weight data;
the neural network operation module is used for generating an operation instruction based on the offline model, caching the operation instruction, and operating the data to be processed based on the operation instruction to obtain an operation result;
the output module is used for outputting the operation result;
the control module is used for detecting the type of the input data and performing the following operations:
when the input data comprise data to be processed, a network structure, and weight data, the control module directs the input module to send the network structure and weight data to the model generation module to construct an offline model, and directs the neural network operation module to operate on the data to be processed, supplied by the input module, based on the offline model constructed by the model generation module;
when the input data comprise data to be processed and an offline model, the control module directs the input module to send both to the neural network operation module, and directs the neural network operation module to generate and cache operation instructions based on the offline model and to operate on the data to be processed based on those instructions;
when the input data comprise only data to be processed, the control module directs the input module to send the data to be processed to the neural network operation module, and directs the neural network operation module to call the cached operation instructions and operate on the data to be processed.
Further, the neural network operation module comprises a model analysis unit and a neural network processing unit, wherein:
the model analysis unit is used for generating operation instructions based on the offline model;
the neural network processing unit is used for caching the operation instructions for subsequent computation calls, or, when the input data comprise only data to be processed, for calling the cached operation instructions and operating on the data to be processed based on them to obtain an operation result.
Further, the neural network processing unit has an instruction cache unit for caching the operation instruction for subsequent calculation call.
The operation method and the device provided by the disclosure have the following beneficial effects:
1. With the method and device of the present disclosure, once the offline model has been generated, operations can be performed directly from it, avoiding the extra overhead of running the whole software stack, including the deep learning framework;
2. The device and method of the present disclosure repurpose the neural network processor more efficiently, so that it can deliver its full performance in application environments with little memory and strict real-time requirements, and the operation flow becomes simpler and faster.
Drawings
FIG. 1 is a diagram of a prior art programming framework;
FIG. 2 is a flowchart of an operation method according to an embodiment of the disclosure;
FIG. 3 is a structural block diagram of a computing device according to another embodiment of the disclosure.
Detailed Description
For the purpose of promoting a better understanding of the objects, aspects and advantages of the present disclosure, reference is made to the following detailed description taken in conjunction with the accompanying drawings.
In this specification, the various embodiments described below are meant to be illustrative only and should not be construed in any way to limit the scope of the disclosure. The following description with reference to the accompanying drawings is provided to assist in a comprehensive understanding of exemplary embodiments of the present disclosure as defined by the claims and their equivalents. The following description includes various specific details to aid understanding, but such details are to be regarded as illustrative only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Moreover, descriptions of well-known functions and constructions are omitted for clarity and conciseness. Moreover, throughout the drawings, the same reference numerals are used for similar functions and operations.
The present disclosure discloses an operation method, comprising the steps of:
when the input data comprise data to be processed, a network structure, and weight data, the following steps are executed:
step 11, inputting and reading the input data;
step 12, constructing an offline model according to the network structure and the weight data;
step 13, parsing the offline model to obtain operation instructions, and caching them for subsequent computation calls;
step 14, operating on the data to be processed according to the operation instructions to obtain an operation result for output;
when the input data comprise data to be processed and an offline model, the following steps are executed:
step 21, inputting and reading the input data;
step 22, parsing the offline model to obtain operation instructions, and caching them for subsequent computation calls;
step 23, operating on the data to be processed according to the operation instructions to obtain an operation result for output;
when the input data comprise only data to be processed, the following steps are executed:
step 31, inputting and reading the input data;
step 32, calling the cached operation instructions, and operating on the data to be processed to obtain an operation result for output.
In some embodiments of the present disclosure, the neural network processing unit operates on the data to be processed according to the operation instructions to obtain the operation result. Preferably, the neural network processing unit has an instruction cache unit configured to cache received operation instructions; an operation instruction that has been cached in advance is an instruction from a previous operation retained by the instruction cache unit.
In some embodiments of the present disclosure, the neural network processing unit further includes a data caching unit, configured to cache the data to be processed.
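To picture the two cache units just described, here is a minimal sketch, assuming hypothetical class and method names (nothing below is an API defined by this disclosure):

```python
from collections import deque

class NeuralNetworkProcessingUnit:
    """Sketch of a processing unit with separate instruction and data cache units."""
    def __init__(self, data_cache_size=8):
        self.instruction_cache = []                       # instruction cache unit
        self.data_cache = deque(maxlen=data_cache_size)   # data cache unit

    def load_instructions(self, instructions):
        # Cache operation instructions for subsequent computation calls.
        self.instruction_cache = list(instructions)

    def submit(self, data):
        # Cache a datum to be processed.
        self.data_cache.append(data)

    def step(self):
        # Run the cached instructions on the oldest cached datum.
        if not self.instruction_cache:
            raise RuntimeError("no cached instructions: parse an offline model first")
        result = self.data_cache.popleft()
        for insn in self.instruction_cache:
            result = insn(result)
        return result
```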
Based on the above operation method, the present disclosure also discloses an operation device, including:
the input module is used for inputting data, and the data comprises data to be processed, network structure and weight data and/or offline model data;
the model generation module is used for constructing an offline model according to the input network structure and the weight data;
the neural network operation module is used for generating an operation instruction based on the offline model, caching the operation instruction, and operating the data to be processed based on the operation instruction to obtain an operation result;
the output module is used for outputting the operation result;
the control module is used for detecting the type of the input data and performing the following operations:
when the input data comprise data to be processed, a network structure, and weight data, the control module directs the input module to send the network structure and weight data to the model generation module to construct an offline model, and directs the neural network operation module to operate on the data to be processed, supplied by the input module, based on the offline model constructed by the model generation module;
when the input data comprise data to be processed and an offline model, the control module directs the input module to send both to the neural network operation module, and directs the neural network operation module to generate and cache operation instructions based on the offline model and to operate on the data to be processed based on those instructions;
when the input data comprise only data to be processed, the control module directs the input module to send the data to be processed to the neural network operation module, and directs the neural network operation module to call the cached operation instructions and operate on the data to be processed.
The neural network operation module comprises a model analysis unit and a neural network processing unit, wherein:
the model analysis unit is used for generating operation instructions based on the offline model;
the neural network processing unit is used for caching the operation instructions for subsequent computation calls, or, when the input data comprise only data to be processed, for calling the cached operation instructions and operating on the data to be processed based on them to obtain an operation result.
In some embodiments of the present disclosure, the neural network processing unit has an instruction cache unit configured to cache the operation instructions for subsequent computation calls.
In some embodiments of the disclosure, the offline model is a text file defined according to a specific structure and may describe any of various neural network models, such as Cambricon_model, AlexNet_model, GoogleNet_model, VGG_model, R-CNN_model, GAN_model, LSTM_model, RNN_model, ResNet_model, etc., and is not limited to the models listed in this embodiment.
In some embodiments of the present disclosure, the data to be processed is any input that a neural network can process, such as continuous single pictures, voice, or a video stream.
In some embodiments of the present disclosure, the network structure may be any of various neural network structures, such as AlexNet, GoogleNet, ResNet, VGG, R-CNN, GAN, LSTM, RNN, etc., and is not limited to the structures listed in this embodiment.
Specifically, depending on the input data received by the input module, the arithmetic device of the present disclosure operates according to one of three working principles:
1. When the input module receives the network structure, the weight data, and the data to be processed, the control module directs the input module to transmit the network structure and weight data to the model generation module and the data to be processed to the model analysis unit; it directs the model generation module to generate an offline model from the network structure and weight data and to pass the generated offline model to the model analysis unit; it directs the model analysis unit to parse the received offline model into operation instructions that the neural network processing unit can recognize, and to transmit the operation instructions and the data to be processed to the neural network processing unit; the neural network processing unit operates on the data to be processed according to the received operation instructions to obtain the determined operation result, which is passed to the output module for output.
2. When the input module receives the offline model and the data to be processed, the control module directs the input module to transmit them directly to the model analysis unit; the subsequent working principle is the same as in the first case.
3. When the input module receives only the data to be processed, the control module directs the input module to transmit it, via the model analysis unit, directly to the neural network processing unit, which operates on the data according to the cached operation instructions to obtain the operation result. This case does not normally occur on the first use of the neural network processor, since the instruction cache must already hold the relevant operation instructions.
Therefore, when the offline model of the current network operation differs from that of the previous network operation, the data input by the input module comprise a network structure, weight data, and data to be processed, and the model generation module generates a new offline model before the subsequent network operation proceeds; when the current network operation is the first one and a corresponding offline model has been obtained in advance, the data input by the input module comprise the offline model and the data to be processed; when the current network operation is not the first and uses the same offline model as the previous one, the data input by the input module comprise only the data to be processed.
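The input-selection rule above can be summarized in a small caller-side sketch; the Task type and make_request function below are hypothetical illustrations of the rule, not part of the disclosure:

```python
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class Task:
    model_id: str                        # identifies the offline model this task needs
    data: Any                            # the data to be processed
    structure: Optional[Any] = None      # network structure, if a rebuild is needed
    weights: Optional[Any] = None        # weight data, if a rebuild is needed
    offline_model: Optional[Any] = None  # prebuilt offline model, if available

def make_request(task: Task, previous_model_id: Optional[str]) -> dict:
    if task.model_id != previous_model_id and task.structure is not None:
        # Offline model differs from the last run: rebuild from structure + weights.
        return {"structure": task.structure, "weights": task.weights,
                "data": task.data}
    if task.model_id != previous_model_id and task.offline_model is not None:
        # First run with a prebuilt offline model: send it along with the data.
        return {"offline_model": task.offline_model, "data": task.data}
    # Same offline model as the previous run: data alone suffices.
    return {"data": task.data}
```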
In some embodiments of the present disclosure, the computing device described here is integrated as a sub-module into the central processor module of an entire computer system. The data to be processed and the offline model are transmitted to the arithmetic device under the control of the central processing unit. The model analysis unit parses the incoming neural network offline model and generates operation instructions. The operation instructions and the data to be processed are then passed to the neural network processing unit, which computes the operation result and returns it to the main memory unit. In subsequent computations the network structure no longer changes, so the neural network computation completes by simply streaming in data to be processed, yielding operation results.
The following describes the computing device and method proposed in the present disclosure in detail by specific embodiments.
Example 1
As shown in fig. 2, the present embodiment provides an operation method, including the following steps:
when the input data comprise data to be processed, a network structure, and weight data, the following steps are executed:
step 11, inputting and reading the input data;
step 12, constructing an offline model according to the network structure and the weight data;
step 13, parsing the offline model to obtain operation instructions, and caching them for subsequent computation calls;
step 14, operating on the data to be processed according to the operation instructions to obtain a neural network operation result for output;
when the input data comprise data to be processed and an offline model, the following steps are executed:
step 21, inputting and reading the input data;
step 22, parsing the offline model to obtain operation instructions, and caching them for subsequent computation calls;
step 23, operating on the data to be processed according to the operation instructions to obtain a neural network operation result for output;
when the input data comprise only data to be processed, the following steps are executed:
step 31, inputting and reading the input data;
step 32, calling the cached operation instructions, and operating on the data to be processed to obtain a neural network operation result for output.
The data to be processed are operated on by a neural network processing unit according to the operation instructions to obtain the operation result; the neural network processing unit is provided with an instruction cache unit and a data cache unit, which cache the received operation instructions and the data to be processed, respectively.
The input network structure provided in this embodiment is AlexNet, the weight data is bvlc_alexnet.caffemodel, the data to be processed are continuous single pictures, and the offline model is Cambricon_model.
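Reusing the hypothetical helpers sketched after the method steps (NNProcessor, build_offline_model, parse_to_instructions), an end-to-end run of this embodiment might look as follows; the disclosure specifies the flow, not this API:

```python
def run_embodiment_1(pictures):
    processor = NNProcessor()

    # First run: full inputs (steps 11-14). The AlexNet structure plus the
    # bvlc_alexnet.caffemodel weights yield the offline model Cambricon_model.
    cambricon_model = build_offline_model("AlexNet", "bvlc_alexnet.caffemodel")
    instructions = parse_to_instructions(cambricon_model)    # step 13
    results = [processor.run(instructions, pictures[0])]     # step 14 (caches insns)

    # Later runs on the same network: data only (steps 31-32),
    # reusing the instructions cached inside the processor.
    for picture in pictures[1:]:
        results.append(processor.run_cached(picture))
    return results
```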
In summary, the method provided by this embodiment greatly simplifies the flow of using a neural network processor and avoids the extra memory and IO overhead of invoking the whole conventional programming framework. With this method, a neural network accelerator can deliver its full operational performance in environments with little memory and strict real-time requirements.
Example 2
As shown in fig. 3, the present embodiment provides an arithmetic device, including: an input module 101, a model generation module 102, a neural network operation module 103, an output module 104, and a control module 105, wherein the neural network operation module 103 comprises a model analysis unit 106 and a neural network processor 107.
The key feature of the device is that it executes in an offline mode: it first generates an offline model, then uses the offline model to directly generate the related operation instructions, transfers the weight data, and processes the data to be processed. More specifically:
theinput module 101 is configured to input a combination of a network structure, weight data, and to-be-processed data, or a combination of an offline model and to-be-processed data. When the input is the network structure, the weight data and the data to be processed, the network structure and the weight data are transmitted to themodel generation module 102 to generate an offline model for executing the following operations. When the input is the offline model and the data to be processed, the offline model and the data to be processed are directly transmitted to themodel analysis unit 106 to perform the following operations.
Theoutput module 104 is configured to output the determined operation data generated according to the specific network structure and the set of data to be processed. Wherein the output data is computed by theneural network processor 107.
Themodel generating module 102 is configured to generate an offline model for use by a lower layer according to the input network structure parameter and the weight data.
Themodel analyzing unit 106 is configured to analyze the incoming offline model, generate an operation instruction that can be directly sent to theneural network processor 107, and send the data to be processed, which is sent from theinput module 101, to theneural network processor 107.
Theneural network processor 107 is configured to perform an operation according to the transmitted operation instruction and data to be processed, obtain a determined operation result, transmit the operation result to theoutput module 104, and has an instruction cache unit and a data cache unit.
Thecontrol module 105 is configured to detect an input data type and perform the following operations:
when the input data comprises data to be processed, a network structure and weight data, thecontrol input module 101 inputs the network structure and the weight data into themodel generation module 102 to construct an offline model, and controls the neuralnetwork operation module 103 to perform neural network operation on the data to be processed input by theinput module 101 based on the offline model input by themodel generation module 102;
when the input data comprises data to be processed and an offline model, thecontrol input module 101 inputs the data to be processed and the offline model into the neuralnetwork operation module 103, controls the neuralnetwork operation module 103 to generate and cache an operation instruction based on the offline model, and performs neural network operation on the data to be processed based on the operation instruction;
when the input data only includes the data to be processed, thecontrol input module 101 inputs the data to be processed into the neuralnetwork operation module 103, and controls the neuralnetwork operation module 103 to call the cached operation instruction, so as to perform neural network operation on the data to be processed.
The input network structure provided in this embodiment is AlexNet and the weight data is bvlc_alexnet.caffemodel, as in Embodiment 1. The model generation module 102 generates a new offline model Cambricon_model according to the input network structure and weight data; the generated offline model Cambricon_model can also be used alone as the next input. The model analysis unit 106 parses the offline model Cambricon_model to generate a series of operation instructions, transmits those operation instructions to the instruction cache unit on the neural network processor 107, and transmits the input image received from the input module 101 to the data cache unit on the neural network processor 107.
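A schematic wiring of these numbered modules, reusing the hypothetical NeuralNetworkProcessingUnit sketched earlier, might look like this; the reference numerals appear in the names purely for readability, and none of this is an API from the disclosure:

```python
class OperationDevice:
    """Sketch of the fig. 3 wiring: modules 101-107 with the control module
    routing inputs, as described in the text above."""

    def __init__(self):
        self.processor_107 = NeuralNetworkProcessingUnit()

    def model_generation_102(self, structure, weights):
        return {"structure": structure, "weights": weights}   # offline-model stand-in

    def model_analysis_106(self, offline_model):
        return [lambda x: x]                                  # stand-in instruction list

    def control_105(self, inputs):
        # Detect the input combination and route it between the modules.
        if "structure" in inputs:                             # structure + weights + data
            model = self.model_generation_102(inputs["structure"], inputs["weights"])
            self.processor_107.load_instructions(self.model_analysis_106(model))
        elif "offline_model" in inputs:                       # offline model + data
            self.processor_107.load_instructions(
                self.model_analysis_106(inputs["offline_model"]))
        # data only: previously cached instructions are reused
        self.processor_107.submit(inputs["data"])
        return self.processor_107.step()                      # result goes to module 104
```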
The processes or methods depicted in the preceding figures may be performed by processing logic that comprises hardware, software, or a combination of both. Although the processes or methods are described above in terms of some sequential operations, it should be understood that some of the operations described may be performed in a different order. Further, some operations may be performed in parallel rather than sequentially.
The above-mentioned embodiments are intended to illustrate the objects, aspects and advantages of the present disclosure in further detail, and it should be understood that the above-mentioned embodiments are only illustrative of the present disclosure and are not intended to limit the present disclosure, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present disclosure should be included in the scope of the present disclosure.

Claims (12)

Translated from Chinese

1. An operation method, comprising the following steps: when the input data comprise data to be processed, a network structure, and weight data, executing the following steps: inputting and reading the input data; constructing an offline model according to the network structure and the weight data; parsing the offline model to obtain operation instructions recognizable by a neural network processing unit, and caching them for subsequent computation calls; the neural network processing unit operating on the data to be processed according to the operation instructions to obtain an operation result for output; when the input data comprise data to be processed and an offline model, executing the following steps: inputting and reading the input data; parsing the offline model to obtain operation instructions recognizable by the neural network processing unit, and caching them for subsequent computation calls; the neural network processing unit operating on the data to be processed according to the operation instructions to obtain an operation result for output; when the input data comprise only data to be processed, executing the following steps: inputting and reading the input data; calling the cached operation instructions recognizable by the neural network processing unit, and operating on the data to be processed to obtain an operation result for output.

2. The operation method of claim 1, wherein the neural network processing unit has an instruction cache unit for caching the operation instructions for subsequent computation calls.

3. The operation method of any one of claims 1 to 2, wherein the offline model is a neural network model; the neural network model comprises Cambricon_model, AlexNet_model, GoogleNet_model, VGG_model, R-CNN_model, GAN_model, LSTM_model, RNN_model, and ResNet_model.

4. The operation method of claim 1, wherein the data to be processed are input that can be processed by a neural network.

5. The operation method of claim 4, wherein the data to be processed comprise continuous single pictures, voice, or a video stream.

6. The operation method of claim 1, wherein the network structure is a neural network structure; the neural network structure comprises AlexNet, GoogleNet, ResNet, VGG, R-CNN, GAN, LSTM, and RNN.

7. An operation device, comprising: an input module for inputting data, the data comprising data to be processed, a network structure and weight data, and/or offline model data; a model generation module for constructing an offline model according to the input network structure and weight data; a neural network operation module comprising: a model analysis unit for generating, based on the offline model, operation instructions recognizable by a neural network processing unit; and a neural network processing unit for caching the operation instructions for subsequent computation calls, and for operating on the data to be processed based on the operation instructions to obtain an operation result; an output module for outputting the operation result; and a control module for detecting the input data type and performing the following operations: when the input data comprise data to be processed, a network structure, and weight data, controlling the input module to input the network structure and the weight data into the model generation module to construct an offline model, and controlling the neural network operation module to operate, based on the offline model constructed by the model generation module, on the data to be processed input by the input module; when the input data comprise data to be processed and an offline model, controlling the input module to input the data to be processed and the offline model into the neural network operation module, and controlling the neural network operation module to generate and cache operation instructions based on the offline model and to operate on the data to be processed based on the operation instructions; when the input data comprise only data to be processed, controlling the input module to input the data to be processed into the neural network operation module, and controlling the neural network operation module to call the cached operation instructions to operate on the data to be processed.

8. The operation device of claim 7, wherein the neural network processing unit has an instruction cache unit for caching the operation instructions for subsequent computation calls.

9. The operation device of claim 7, wherein the offline model is a neural network model; the neural network model comprises Cambricon_model, AlexNet_model, GoogleNet_model, VGG_model, R-CNN_model, GAN_model, LSTM_model, RNN_model, and ResNet_model.

10. The operation device of claim 7, wherein the data to be processed are input that can be processed by a neural network.

11. The operation device of claim 10, wherein the data to be processed comprise continuous single pictures, voice, or a video stream.

12. The operation device of claim 7, wherein the network structure is a neural network structure; the neural network structure comprises AlexNet, GoogleNet, ResNet, VGG, R-CNN, GAN, LSTM, and RNN.
CN201710269049.0A2017-04-192017-04-21Operation method and deviceActiveCN108734288B (en)

Priority Applications (17)

Application Number | Publication | Priority Date | Filing Date | Title
CN201710269049.0A | CN108734288B (en) | 2017-04-21 | 2017-04-21 | Operation method and device
US16/476,262 | US11531540B2 (en) | 2017-04-19 | 2018-04-17 | Processing apparatus and processing method with dynamically configurable operation bit width
EP19214371.7A | EP3786786B1 (en) | 2017-04-19 | 2018-04-17 | Processing device, processing method, chip, and electronic apparatus
EP18788355.8A | EP3614259A4 (en) | 2017-04-19 | 2018-04-17 | PROCESSING DEVICE AND PROCESSING METHODS
KR1020197038135A | KR102258414B1 (en) | 2017-04-19 | 2018-04-17 | Processing apparatus and processing method
CN201880000923.3A | CN109121435A (en) | 2017-04-19 | 2018-04-17 | Processing device and processing method
CN201811097653.0A | CN109376852B (en) | 2017-04-21 | 2018-04-17 | Arithmetic device and arithmetic method
PCT/CN2018/083415 | WO2018192500A1 (en) | 2017-04-19 | 2018-04-17 | Processing apparatus and processing method
CN202410405915.4A | CN118690805A (en) | 2017-04-19 | 2018-04-17 | Processing device and processing method
JP2019549467A | JP6865847B2 (en) | 2017-04-19 | 2018-04-17 | Processing equipment, chips, electronic equipment and methods
KR1020197025307A | KR102292349B1 (en) | 2017-04-19 | 2018-04-17 | Processing device and processing method
EP19214320.4A | EP3654172A1 (en) | 2017-04-19 | 2018-04-17 | Fused vector multiplier and method using the same
US16/697,637 | US11720353B2 (en) | 2017-04-19 | 2019-11-27 | Processing apparatus and processing method
US16/697,727 | US11698786B2 (en) | 2017-04-19 | 2019-11-27 | Processing apparatus and processing method
US16/697,533 | US11531541B2 (en) | 2017-04-19 | 2019-11-27 | Processing apparatus and processing method
US16/697,687 | US11734002B2 (en) | 2017-04-19 | 2019-11-27 | Counting elements in neural network input data
JP2019228383A | JP6821002B2 (en) | 2017-04-19 | 2019-12-18 | Processing equipment and processing method

Applications Claiming Priority (1)

Application Number | Publication | Priority Date | Filing Date | Title
CN201710269049.0A | CN108734288B (en) | 2017-04-21 | 2017-04-21 | Operation method and device

Publications (2)

Publication Number | Publication Date
CN108734288A (en) | 2018-11-02
CN108734288B (en) | 2021-01-29

Family

ID=63934137

Family Applications (2)

Application Number | Status | Publication | Priority Date | Filing Date | Title
CN201710269049.0A | Active | CN108734288B (en) | 2017-04-19 | 2017-04-21 | Operation method and device
CN201811097653.0A | Active | CN109376852B (en) | 2017-04-19 | 2018-04-17 | Arithmetic device and arithmetic method

Family Applications After (1)

Application Number | Status | Publication | Priority Date | Filing Date | Title
CN201811097653.0A | Active | CN109376852B (en) | 2017-04-19 | 2018-04-17 | Arithmetic device and arithmetic method

Country Status (1)

Country | Link
CN (2) | CN108734288B (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN109685203B (en)* | 2018-12-21 | 2020-01-17 | Cambricon Technologies Corporation Ltd | Data processing method, device, computer system and storage medium
CN109726797B (en)* | 2018-12-21 | 2019-11-19 | Beijing Zhongke Cambricon Technology Co Ltd | Data processing method, device, computer system and storage medium
CN109697500B (en)* | 2018-12-29 | 2020-06-09 | Cambricon Technologies Corporation Ltd | Data processing method and device, electronic equipment and storage medium
CN110070176A (en)* | 2019-04-18 | 2019-07-30 | Beijing Zhongke Cambricon Technology Co Ltd | The processing method of off-line model, the processing unit of off-line model and Related product
WO2020192587A1 (en) | 2019-03-22 | 2020-10-01 | Cambricon Technologies Corporation Ltd | Artificial intelligence computing device and related product
CN111832737B (en) | 2019-04-18 | 2024-01-09 | Cambricon Technologies Corporation Ltd | Data processing method and related product
CN110309917B (en)* | 2019-07-05 | 2020-12-18 | Anhui Cambricon Information Technology Co Ltd | Verification method of off-line model and related device
CN113490943B (en)* | 2019-07-31 | 2023-03-10 | Huawei Technologies Co Ltd | An integrated chip and method for processing sensor data
CN111582459B (en)* | 2020-05-18 | 2023-10-20 | Guangdong Oppo Mobile Telecommunications Corp Ltd | Method for executing operation, electronic equipment, device and storage medium
CN112613597B (en)* | 2020-11-30 | 2023-06-30 | Henan Huixiang Communication Equipment Co Ltd | Comprehensive pipe rack risk automatic identification convolutional neural network model and construction method
CN112947935B (en)* | 2021-02-26 | 2024-08-13 | Shanghai SenseTime Intelligent Technology Co Ltd | Operation method and device, electronic equipment and storage medium
CN115906969A (en)* | 2022-11-22 | 2023-04-04 | China FAW Co Ltd | A model computing system, method, electronic device and storage medium

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
KR20130090147A (en)* | 2012-02-03 | 2013-08-13 | 안병익 | Neural network computing apparatus and system, and method thereof
US9378455B2 (en)* | 2012-05-10 | 2016-06-28 | Yan M. Yufik | Systems and methods for a computer understanding multi modal data streams
US20160162779A1 (en)* | 2014-12-05 | 2016-06-09 | RealMatch, Inc. | Device, system and method for generating a predictive model by machine learning
EP3035249B1 (en)* | 2014-12-19 | 2019-11-27 | Intel Corporation | Method and apparatus for distributed and cooperative computation in artificial neural networks
CN105005911B (en)* | 2015-06-26 | 2017-09-19 | Shenzhen Tencent Computer Systems Co Ltd | The arithmetic system and operation method of deep neural network
CN106355246B (en)* | 2015-10-08 | 2019-02-15 | Shanghai Zhaoxin Integrated Circuit Co Ltd | Three Configuration Neural Network Units
CN107578099B (en)* | 2016-01-20 | 2021-06-11 | Cambricon Technologies Corporation Ltd | Computing device and method
CN105930902B (en)* | 2016-04-18 | 2018-08-10 | Institute of Computing Technology, Chinese Academy of Sciences | A neural network processing method and system
CN106228238B (en)* | 2016-07-27 | 2019-03-22 | Suzhou Institute, University of Science and Technology of China | Accelerate the method and system of deep learning algorithm on field programmable gate array platform
CN106529670B (en)* | 2016-10-27 | 2019-01-25 | Institute of Computing Technology, Chinese Academy of Sciences | A neural network processor, design method and chip based on weight compression
CN106557332A (en)* | 2016-11-30 | 2017-04-05 | Shanghai Cambricon Information Technology Co Ltd | A kind of multiplexing method and device of instruction generating process

Also Published As

Publication number | Publication date
CN109376852A (en) | 2019-02-22
CN108734288A (en) | 2018-11-02
CN109376852B (en) | 2021-01-29

Similar Documents

Publication | Title
CN108734288B (en) | Operation method and device
Hu et al. | Dynamic adaptive DNN surgery for inference acceleration on the edge
US12333794B2 (en) | Emotion recognition in multimedia videos using multi-modal fusion-based deep neural network
US20200097795A1 (en) | Processing apparatus and processing method
RU2771008C1 (en) | Method and apparatus for processing tasks based on a neural network
US20220044358A1 (en) | Image processing method and apparatus, device, and storage medium
CN106845631B (en) | Stream execution method and device
CN113313241B (en) | Method and computing device for determining tensor information of deep learning model
KR102501773B1 (en) | Apparatus and method for generating speech video that creates landmarks together
WO2022028220A1 (en) | Neural network model computing chip, method and apparatus, device and medium
US12149708B2 (en) | Machine learning of encoding parameters for a network using a video encoder
JP2022078286A (en) | Training method and training device for data processing model, electronic equipment and storage medium
US11580736B2 (en) | Parallel video processing neural networks
EP4232959A1 (en) | Improved processing of sequential data via machine learning models featuring temporal residual connections
CN115049780A (en) | Deep rendering model training method and device, and target rendering method and device
CN113705799B (en) | Processing unit, computing device, and computational graph processing method for deep learning model
CN119106717A (en) | A multi-modal large model edge accelerator system and its implementation method, device, equipment, and medium
CN111199276B (en) | Data processing method and related product
Chen et al. | Hardware/software co-design for machine learning accelerators
WO2024007938A1 (en) | Multi-task prediction method and apparatus, electronic device, and storage medium
CN112861687B (en) | Mask wearing detection method, device, equipment and medium for access control system
KR20240014179A (en) | An electronic device for providing video call service and method for controlling the same
CN114816742A (en) | Request processing method and device, electronic equipment and storage medium
US12443847B2 (en) | Task processing method and device based on neural network
CN115796284B (en) | Reasoning method, device, storage medium and equipment based on TVM compiler

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
